From 886f0cdb4df4fdbee7c99fb1fdd5c72d9608d635 Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Mon, 29 Jan 2024 20:51:52 +0100 Subject: mtopolnik submission 3 (#637) * calculate_average_mtopolnik * short hash (just first 8 bytes of name) * Remove unneeded checks * Remove archiving classes * 2x larger hashtable * Add "set" to setters * Simplify parsing temperature, remove newline search * Reduce the size of the name slot * Store name length and use to detect collision * Reduce memory loads in parseTemperature * Use short for min/max * Extract constant for semicolon * Fix script header * Explicit bash shell in shebang * Inline usage of broadcast semicolon * Try vectorization * Remove vectorization * Go Unsafe * Use SWAR temperature parsing by merykitty * Inline some things * Remove commented-out MemorySegment usage * Inline namesMem.asSlice() invocation * Try out JVM JIT flags * Implement strcmp * Remove unused instance variables * Optimize hashing * Put station name into hashtable * Reorder method * Remove usage of MemorySegment.getUtf8String Replace with UNSAFE.copyMemory() and new String() * Fix hashing bug * Remove outdated comments * Fix informative constants * Use broadcastByte() more * Improve method naming * More hashing * Revert more hashing * Add commented-out code to hash 16 bytes * Slight cleanup * Align hashtable at cacheline boundary * Add Graal Native image * Revert Graal Native image This reverts commit d916a42326d89bd1a841bbbecfae185adb8679d7. * Simplify shell script (no SDK selection) * Move a constant, zero out hashtable on start * Better name comparison * Add prepare_mtopolnik.sh * Cleaner idiom in name comparison * AND instead of MOD for hashtable indexing * Improve word masking code * Fix formatting * Reduce memory loads * Remove endianness checks * Avoid hash == 0 problem * Fix subtle bug * MergeSort of parellel results * Touch up perf * Touch up perf * Remove -Xmx256m * Extract result printing method * Print allocation details on OOME * Single mmap * Use global allocation arena * Add commented-out Xmx64m XXMaxDirectMemorySize=1g * withinSafeZone * Update cursor earlier * Better assert * Fix bug in addrOfSemicolonSafe * Move declaration lower * Simplify code * Add rounding error test case * Fix DANGER_ZONE_LEN * Deoptimize parseTemperatureSimple() * Inline parseTemperatureAndAdvanceCursor() * Skip masking until the last load * Conditionally fetch name words * Cleanup * Use native image * Use supbrocess * Simpler code * Cleanup * Avoid extra condition on hot path --- calculate_average_mtopolnik.sh | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) (limited to 'calculate_average_mtopolnik.sh') diff --git a/calculate_average_mtopolnik.sh b/calculate_average_mtopolnik.sh index 24b5a1c..acd1024 100755 --- a/calculate_average_mtopolnik.sh +++ b/calculate_average_mtopolnik.sh @@ -15,5 +15,11 @@ # limitations under the License. # -java --enable-preview \ - --class-path target/average-1.0.0-SNAPSHOT.jar dev.morling.onebrc.CalculateAverage_mtopolnik +if [ -f target/CalculateAverage_mtopolnik_image ]; then + echo "Using native image 'target/CalculateAverage_mtopolnik_image'" 1>&2 + target/CalculateAverage_mtopolnik_image +else + JAVA_OPTS="--enable-preview" + echo "Native image not found, using JVM mode." 1>&2 + java $JAVA_OPTS --class-path target/average-1.0.0-SNAPSHOT.jar dev.morling.onebrc.CalculateAverage_mtopolnik +fi -- cgit v1.2.3