>>107709618
>So whatever Vec does seems to prefault more pages than it should.
I think the allocator and the kernel just don't like doing the bookkeeping. And growing the buckets wasn't that expensive to begin with.
For that first test I was pre-allocating 2 TB of data, or roughly 500 million 4 KiB pages, in the single-threaded section of the program, and that only took a couple of milliseconds.
I've looked at the implementation of Vec before and it shouldn't be faulting any pages until you actually push elements.
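Rough sketch of what I mean, not taken from either program in this thread. `reserve_then_push` is a made-up helper and the sizes are arbitrary. It only checks what the std docs guarantee: `Vec::with_capacity` gives you an empty Vec with at least that much capacity, and pushing within that capacity never reallocates. Whether the untouched pages stay unmapped is up to the allocator and the kernel, so treat that part as an assumption (typical for large allocations on Linux, not something this code proves).

```rust
/// Hypothetical helper: reserve `cap` slots, push `n <= cap` elements,
/// and report (len right after reserving, capacity >= cap, buffer never moved).
fn reserve_then_push(cap: usize, n: usize) -> (usize, bool, bool) {
    assert!(n <= cap);

    // Only reserves space. No element is written here, so the Vec itself
    // has no reason to touch the pages behind the buffer yet.
    let mut v: Vec<u64> = Vec::with_capacity(cap);
    let len_before = v.len();
    let cap_ok = v.capacity() >= cap;
    let ptr_before = v.as_ptr();

    // Each push writes exactly one new slot; this is where memory
    // actually starts getting touched.
    for i in 0..n {
        v.push(i as u64);
    }

    // Staying within the reserved capacity means no reallocation,
    // so the buffer pointer must be unchanged.
    let stable = ptr_before == v.as_ptr() && v.len() == n;
    (len_before, cap_ok, stable)
}

fn main() {
    // 1 << 20 u64 slots = 8 MiB reserved, only 1000 slots written.
    let (len_before, cap_ok, stable) = reserve_then_push(1 << 20, 1000);
    println!("len after with_capacity: {len_before}");
    println!("capacity at least requested: {cap_ok}");
    println!("buffer stable across pushes: {stable}");
}
```

If you want to see the faults themselves you'd have to compare page-fault counts from something like perf or /proc before and after the push loop, which is platform-specific so I left it out.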
>>107709681
That helps:
$ hyperfine 'target/release/sort < bible.txt' './a.out < bible.txt'
Benchmark 1: target/release/sort < bible.txt
  Time (mean ± σ):     14.3 ms ±  1.4 ms    [User: 22.1 ms, System: 12.0 ms]
  Range (min … max):   12.0 ms … 20.2 ms    187 runs
Benchmark 2: ./a.out < bible.txt
  Time (mean ± σ):     22.6 ms ±  1.4 ms    [User: 39.0 ms, System: 13.4 ms]