The Principal Dev – Masterclass for Tech Leads

The Principal Dev – Masterclass for Tech LeadsJuly 17-18

Join

xilt

Go Report Card Go Reference Go Coverage libs.tech recommends

A utility for parsing Common and Combined Log Format (CLF) log files and storing them in SQLite for further analysis. Built with concurrency in mind.

Usage

Install

go install go.vxn.dev/xilt/cmd/xilt@latest

Run

xilt [logFilePath] [dbFilePath]

Flags

$ xilt -h
Usage of xilt:
  -avgLogSize float
        Defines the average size of one log in MB. Used for calculating the number of goroutines to spin up. (default 0.001)
  -batchSize int
        Defines the batch size. Used for calculating the number of goroutines to spin up. (default 5000)
  -i    Defines whether indexes should be created in the parsed logs' table.
  -maxMemUsage int
        Defines the maximum allowed memory usage in Megabytes. Used for calculating the number of goroutines to spin up. (default 100)
  -v    Defines whether verbose mode should be used.

Indexes

Currently, if the -i flag is used, the following indexes are created:

This may be subject to change in the future depending on further fine-tuning and configurable parameters extension.

Example

logs.txt:

192.168.1.101 user-identifier alice [12/Nov/2023:14:23:45 -0400] "POST /index.html?query=search HTTP/1.1" 404 1234 "https://example.com" "Mozilla/5.0"
10.0.0.15 user-identifier bob [15/Dec/2023:09:12:33 -0500] "GET /about.html HTTP/1.0" 200 5678 "https://google.com" "Chrome/120.0"
172.16.254.1 user-identifier charlie [20/Jan/2024:18:45:22 -0700] "PUT /upload/file.txt HTTP/1.1" 201 9876 "https://dropbox.com" "PostmanRuntime/7.26"
203.0.113.45 user-identifier dave [05/Feb/2024:22:11:09 +0000] "DELETE /resource/123 HTTP/1.1" 204 0 "https://api.example.com" "curl/7.68"
198.51.100.22 user-identifier eve [10/Mar/2024:07:34:56 -0800] "PATCH /update/profile HTTP/1.1" 400 2345 "https://app.example.com" "Python-requests/2.25"
192.0.2.33 user-identifier frank [25/Apr/2024:12:55:12 +0200] "HEAD /status HTTP/1.0" 302 0 "https://redirect.com" "Safari/15.0"
203.0.113.78 user-identifier grace [30/May/2024:16:44:33 -0600] "OPTIONS /options HTTP/1.1" 200 3456 "https://options.example.com" "Edge/120.0"
198.51.100.99 user-identifier heidi [15/Jun/2024:03:22:11 -0300] "GET /download/file.zip HTTP/1.1" 500 6789 "https://download.com" "Wget/1.21"
192.168.0.55 user-identifier ivan [20/Jul/2024:10:11:12 -0400] "POST /submit/form HTTP/1.1" 201 4321 "https://form.example.com" "Firefox/120.0"
10.0.1.100 user-identifier jack [25/Aug/2024:19:33:44 -0700] "GET /images/logo.png HTTP/1.0" 304 0 "https://cache.example.com" "Opera/90.0"

Run xilt -i logs.txt logs.db:

$ xilt -i logs.txt logs.db
2025/03/11 22:38:44 beginning reading from log file and the parsing process...
2025/03/11 22:38:44 creating table indexes...
2025/03/11 22:38:44 table indexes created...
2025/03/11 22:38:44 log parsing finished
2025/03/11 22:38:44 elapsed time: 624.7111ms

The resulting logs.db file: log.db

Benchmark

Currently, a basic benchmark is available using the following hardware:

A large log file containing 17 379 453 logs (~1.63 GB in size) takes 1 minute, 55 seconds on average to be parsed and stored in SQLite if not using indexes. Index creation on a table of such size prolongs the duration to ~4 minutes.

All logs in the file are the same: 127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326.

Without indexes:

$ xilt logs benchmark.db
2025/03/11 22:47:18 beginning reading from log file and the parsing process...
2025/03/11 22:49:12 log parsing finished
2025/03/11 22:49:12 elapsed time: 1m54.7359703s

With indexes:

$ xilt -i logs benchmark.db
2025/03/11 22:54:47 beginning reading from log file and the parsing process...
2025/03/11 22:56:38 creating table indexes...
2025/03/11 22:58:46 table indexes created...
2025/03/11 22:58:46 log parsing finished
2025/03/11 22:58:46 elapsed time: 3m59.774765s

Memory Usage

Using the default presets (batchSize=5000, avgLogSize=0.001, maxMemUsage=100) to parse the aforementioned benchmark file leads to an average RAM usage of ~85 MB.

Build & Run Locally

git clone https://github.com/thevxn/xilt
go build ./cmd/xilt
./xilt

Join libs.tech

...and unlock some superpowers

GitHub

We won't share your data with anyone else.