Blazing-fast line counting for Go. go-countline does one thing — count the lines in an io.Reader — and does it at memory speed: a 1 GiB buffer in about 13 ms (~85 GB/s). For large files and in-memory readers it counts concurrently across CPU cores; small inputs and streaming readers use a serial fallback.
On a 1 GiB file it counts lines about 32× faster than wc -l (≈26 ms vs ≈820 ms on an Apple M4). Verify it yourself with make bench_vs_wc.
Unlike wc -l, it also counts the final line when the input does not end with a line feed.
Install the command-line wrapper:
go install "github.com/KEINOS/go-countline/cmd/countline@latest"
Run it with one file path:
countline ./path/to/file.txt
Add it to your module:
go get "github.com/KEINOS/go-countline"
Then pass any io.Reader to CountLines:
import (
	"fmt"
	"log"
	"strings"

	"github.com/KEINOS/go-countline/countline"
)

func ExampleCountLines() {
	for _, sample := range []struct {
		Input string
	}{
		{""},            // --> 0
		{"Hello"},       // --> 1
		{"Hello\n"},     // --> 1
		{"\n"},          // --> 1
		{"\n\n"},        // --> 2
		{"\nHello"},     // --> 2
		{"\nHello\n"},   // --> 2
		{"\n\nHello"},   // --> 3
		{"\n\nHello\n"}, // --> 3
	} {
		readerFile := strings.NewReader(sample.Input)

		count, err := countline.CountLines(readerFile)
		if err != nil {
			log.Fatal(err)
		}

		fmt.Printf("%#v --> %v\n", sample.Input, count)
	}
	// Output:
	// "" --> 0
	// "Hello" --> 1
	// "Hello\n" --> 1
	// "\n" --> 1
	// "\n\n" --> 2
	// "\nHello" --> 2
	// "\nHello\n" --> 2
	// "\n\nHello" --> 3
	// "\n\nHello\n" --> 3
}

Counting itself is cheap (a SIMD bytes.Count); the work is bound by memory bandwidth. For inputs of 4 MiB or more that support random access, the input is split into regions counted in parallel across CPU cores, so throughput scales past a single core's bandwidth. Smaller inputs and non-seekable readers (pipes, sockets) use a serial stream.
Measured on Apple M4 (10 cores), 16 GB RAM, Go 1.26.
Counting the 1 GiB test file (data_Giant.txt, warm in the page cache) as a command-line tool, via hyperfine --warmup 3 --runs 10:
| Tool | Time (mean) | Relative |
|---|---|---|
| countline | 26 ms | baseline |
| wc -l | 823 ms | ≈32× slower |
Both report the same 72,323,529 lines. wc here is the BSD build shipped with macOS; GNU wc on Linux uses a faster newline scan, so expect a smaller gap there. Reproduce with make bench_vs_wc.
The benchmark (BenchmarkCountLines_IO, -count=6 medians) re-reads the same file in a loop, so the kernel serves it from RAM. These numbers measure the counting work plus syscall overhead — not cold-disk read speed. Reproduce with make bench.
| File Size | Time | Throughput | Notes |
|---|---|---|---|
| 1 KiB | 12 μs | 89 MB/s | serial; dominated by open() overhead |
| 1 MiB | 53 μs | 20 GB/s | serial; fits in CPU cache |
| 10 MiB | 0.34 ms | 31 GB/s | parallel |
| 50 MiB | 1.4 ms | 37 GB/s | parallel |
| 100 MiB | 2.6 ms | 41 GB/s | parallel |
| 1 GiB | 23 ms | ~47 GB/s | parallel |

Counting a 1 GiB in-memory buffer:

| File Size | Time | Throughput |
|---|---|---|
| 1 GiB | 13 ms | ~85 GB/s |
Note: A first read from cold storage is bound by your disk, not by these figures — expect your SSD/NVMe sequential read speed for uncached files. These results show how little overhead the counting itself adds once the bytes are available.
Found a faster way to count lines? Contributions are welcome.
Alternative implementations live in countline/_alt. If an alternative passes the shared spec and benchmarks faster, it can replace the main implementation in a later release after review.
- Issues:
  - Please provide a reproducible code snippet.
- Pull requests:
  - Branch: main
  - Any pull request that makes it better is welcome!
- Branch: