KEINOS/go-countline

Go 1.26+ Go Reference

go-countline

Blazing-fast line counting for Go. go-countline does one thing: it counts the lines in an io.Reader, and it does so at close to memory bandwidth. An in-memory 1 GiB buffer takes about 13 ms (~85 GB/s). Large files and in-memory readers are counted concurrently across CPU cores; small inputs and streaming readers use a serial fallback.

On a 1 GiB file it counts lines about 32× faster than wc -l (≈26 ms vs ≈820 ms on an Apple M4). Verify it yourself with make bench_vs_wc.

Unlike wc -l, it also counts the final line when the input does not end with a line feed.

Usage

As a CLI

Install the command-line wrapper:

go install "github.com/KEINOS/go-countline/cmd/countline@latest"

Run it with one file path:

countline ./path/to/file.txt

As a package

Add it to your module:

go get "github.com/KEINOS/go-countline"

Then pass any io.Reader to CountLines:

import (
    "fmt"
    "log"
    "strings"

    "github.com/KEINOS/go-countline/countline"
)

func ExampleCountLines() {
    for _, sample := range []struct {
        Input string
    }{
        {""},            // --> 0
        {"Hello"},       // --> 1
        {"Hello\n"},     // --> 1
        {"\n"},          // --> 1
        {"\n\n"},        // --> 2
        {"\nHello"},     // --> 2
        {"\nHello\n"},   // --> 2
        {"\n\nHello"},   // --> 3
        {"\n\nHello\n"}, // --> 3
    } {
        readerFile := strings.NewReader(sample.Input)

        count, err := countline.CountLines(readerFile)
        if err != nil {
            log.Fatal(err)
        }

        fmt.Printf("%#v --> %v\n", sample.Input, count)
    }
    // Output:
    // "" --> 0
    // "Hello" --> 1
    // "Hello\n" --> 1
    // "\n" --> 1
    // "\n\n" --> 2
    // "\nHello" --> 2
    // "\nHello\n" --> 2
    // "\n\nHello" --> 3
    // "\n\nHello\n" --> 3
}

Performance

Counting itself is cheap (bytes.Count, which is SIMD-accelerated on common architectures), so the work is bound by memory bandwidth. For inputs of 4 MiB or more that support random access, the input is split into regions that are counted in parallel across CPU cores, so throughput scales past a single core's bandwidth. Smaller inputs and non-seekable readers (pipes, sockets) are counted serially as a stream.

Measured on Apple M4 (10 cores), 16 GB RAM, Go 1.26.

Versus wc -l

Counting the 1 GiB test file (data_Giant.txt, warm in the page cache) as a command-line tool, via hyperfine --warmup 3 --runs 10:

| Tool      | Time (mean) | Relative    |
|-----------|-------------|-------------|
| countline | 26 ms       | baseline    |
| wc -l     | 823 ms      | ≈32× slower |

Both report the same 72,323,529 lines. wc here is the BSD build shipped with macOS; GNU wc on Linux uses a faster newline scan, so expect a smaller gap there. Reproduce with make bench_vs_wc.

Throughput (file already in the OS page cache)

The benchmark (BenchmarkCountLines_IO, -count=6 medians) re-reads the same file in a loop, so the kernel serves it from RAM. These numbers measure the counting work plus syscall overhead — not cold-disk read speed. Reproduce with make bench.

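| File size | Time    | Throughput | Notes                                |
|-----------|---------|------------|--------------------------------------|
| 1 KiB     | 12 μs   | 89 MB/s    | serial; dominated by open() overhead |
| 1 MiB     | 53 μs   | 20 GB/s    | serial; fits in CPU cache            |
| 10 MiB    | 0.34 ms | 31 GB/s    | parallel                             |
| 50 MiB    | 1.4 ms  | 37 GB/s    | parallel                             |
| 100 MiB   | 2.6 ms  | 41 GB/s    | parallel                             |
| 1 GiB     | 23 ms   | ~47 GB/s   | parallel                             |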

In-memory (bytes.Reader, no syscalls)

| Size  | Time  | Throughput |
|-------|-------|------------|
| 1 GiB | 13 ms | ~85 GB/s   |

Note: A first read from cold storage is bound by your disk, not by these figures — expect your SSD/NVMe sequential read speed for uncached files. These results show how little overhead the counting itself adds once the bytes are available.

Contributing

Contribute

Found a faster way to count lines? Contributions are welcome.

Alternative implementations live in countline/_alt. If an alternative passes the shared spec and benchmarks faster, it can replace the main implementation in a later release after review.

  • Issues
    • Please provide a reproducible code snippet.
  • Pull requests
    • Branch: main
    • Any pull request that makes it better is welcome!
