| author | Olivier Bourgain <olivierbourgain02@gmail.com> | 2024-01-07 20:15:53 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2024-01-07 20:15:53 +0100 |
| commit | 143132e8dff054708f9695229d8ec3b3e2344246 (patch) | |
| tree | 3b22fad48b341b9a6e41deca685a9c4eb7538b0f /src/main/java/dev/morling/onebrc/CalculateAverage_yehwankim23.java | |
| parent | 2bb44311064d84e0aa01af9b89605349afcc0de4 (diff) | |
My implementation is in dev.morling.onebrc.CalculateAverage_obourgain and runnable with provided script calculate_average_obourgain.sh (#75)
Runs with standard JDK 21.
On my computer (i5 13500, 20 cores, 32 GB RAM) my best run, with the file fully in the page cache, is:
49.78user 0.69system 0:02.81elapsed 1795%CPU
A slightly older version of the code on a Mac pro M1 with 32 GB:
real 0m2.867s
user 0m23.956s
sys 0m1.329s
As noted in the code comments, a few of my results are rounded differently from the reference implementation. I have seen that there is an issue open about this, but no specific rule has been decided yet.
Main points:
- use MemorySegment, it's faster than ByteBuffer
- split the work into many chunks and distribute them to a thread pool
- fast measurement parser by using a lot of domain knowledge
- very low allocation
- visit each byte only once
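
To illustrate the parser, allocation and single-pass points above, here is a minimal sketch. It is not the code from this commit. It assumes each line has the form name;value followed by a newline, and that the value always has one or two integer digits and exactly one fractional digit. The class LineScanSketch, the Cursor holder and the readLine method are hypothetical names, and a plain byte[] stands in for the MemorySegment so the example runs without extra flags.

```java
import java.nio.charset.StandardCharsets;

public class LineScanSketch {

    /** Mutable holder reused across lines, so scanning allocates nothing per line. */
    static final class Cursor {
        int pos;        // index of the next unread byte
        int nameHash;   // hash of the station name just read
        int nameLength; // length in bytes of the station name just read
        int tenths;     // measurement in tenths of a degree, e.g. -12.3 becomes -123
    }

    /**
     * Reads one "name;value\n" line starting at c.pos, touching each byte once.
     * Assumption: the value looks like [-]d.d or [-]dd.d, so no general
     * floating-point parsing is needed.
     */
    static void readLine(byte[] data, Cursor c) {
        int i = c.pos;
        int start = i;
        int hash = 0;
        byte b;
        // Hash the name while searching for the separator, in the same pass.
        while ((b = data[i++]) != ';') {
            hash = 31 * hash + b;
        }
        c.nameHash = hash;
        c.nameLength = i - 1 - start;

        boolean negative = data[i] == '-';
        if (negative) {
            i++;
        }
        int value = data[i++] - '0';            // first integer digit
        if (data[i] != '.') {
            value = value * 10 + (data[i++] - '0'); // optional second integer digit
        }
        i++;                                    // skip '.'
        value = value * 10 + (data[i++] - '0'); // single fractional digit
        c.tenths = negative ? -value : value;
        c.pos = i + 1;                          // skip '\n'
    }

    public static void main(String[] args) {
        byte[] data = "Hamburg;12.0\nOslo;-3.4\n".getBytes(StandardCharsets.UTF_8);
        Cursor c = new Cursor();
        while (c.pos < data.length) {
            int start = c.pos;
            readLine(data, c);
            // The String is built only for display; aggregation could work on hash and bytes.
            String name = new String(data, start, c.nameLength, StandardCharsets.UTF_8);
            System.out.println(name + " -> " + c.tenths);
        }
    }
}
```

Keeping the measurement as an integer number of tenths avoids floating-point work in the hot loop. Any rounding then happens only once, when the final results are formatted.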
Things I tried that were in fact pessimizations:
- use some internal JDK code to vectorize the hashCode computation
- use a MemorySegment to represent the keys instead of byte[], to avoid
copying
Hope I won't have a bad surprise when running on the target server 😱
Diffstat (limited to 'src/main/java/dev/morling/onebrc/CalculateAverage_yehwankim23.java')
0 files changed, 0 insertions, 0 deletions
