author: Olivier Bourgain <olivierbourgain02@gmail.com> 2024-01-07 20:15:53 +0100
committer: GitHub <noreply@github.com> 2024-01-07 20:15:53 +0100
commit: 143132e8dff054708f9695229d8ec3b3e2344246
tree: 3b22fad48b341b9a6e41deca685a9c4eb7538b0f /calculate_average_obourgain.sh
parent: 2bb44311064d84e0aa01af9b89605349afcc0de4
My implementation is in dev.morling.onebrc.CalculateAverage_obourgain and runnable with provided script calculate_average_obourgain.sh (#75)
Runs with standard JDK 21. On my computer (i5 13500, 20 cores, 32 GB RAM) my best run, with the file fully in the page cache, is:

    49.78user 0.69system 0:02.81elapsed 1795%CPU

A slightly older version of the code on a Mac pro M1 with 32 GB:

    real 0m2.867s
    user 0m23.956s
    sys 0m1.329s

As I wrote in comments in the code, a few of my results are rounded differently than in the reference implementation. I have seen that there is an issue open about that, but no specific rule yet.

Main points:
- use MemorySegment, it's faster than ByteBuffer
- split the work into a lot of chunks and distribute them to a thread pool
- fast measurement parser that relies on a lot of domain knowledge
- very low allocation
- visit each byte only once

Things I tried that were in fact pessimizations:
- use some internal JDK code to vectorize the hashCode computation
- use a MemorySegment to represent the keys instead of byte[], to avoid copying

Hope I won't have a bad surprise when running on the target server 😱
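The Java source of CalculateAverage_obourgain is not part of this diff, so the following is only a minimal sketch of what a "fast measurement parser using domain knowledge" can look like, not the code from this commit. It assumes the 1BRC value format (optional '-', one or two integer digits, '.', exactly one fractional digit) and reads from a plain byte[] rather than a MemorySegment to stay free of preview APIs. The class name MeasurementParserSketch and the method parseTenths are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the parser used in CalculateAverage_obourgain.
final class MeasurementParserSketch {

    // Parses a value such as "-12.3" starting at offset and returns it in
    // tenths of a degree (-123), so no floating-point work is needed.
    // Assumed format: optional '-', one or two integer digits, '.', one digit.
    static int parseTenths(byte[] line, int offset) {
        int pos = offset;
        boolean negative = line[pos] == '-';
        if (negative) {
            pos++;
        }
        // First integer digit is always present under the assumed format.
        int value = line[pos++] - '0';
        // Optional second integer digit.
        if (line[pos] != '.') {
            value = value * 10 + (line[pos++] - '0');
        }
        pos++; // skip the '.'
        // Exactly one fractional digit.
        value = value * 10 + (line[pos] - '0');
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        byte[] line = "Hamburg;-12.3\n".getBytes(StandardCharsets.UTF_8);
        int separator = 7; // index of ';' in the sample line
        System.out.println(parseTenths(line, separator + 1));
    }
}
```

Working in integer tenths means sums, minimums and maximums can be kept as int or long values, with the division by 10 deferred until the results are printed. Each byte of the value is read once, in line with the "visit each byte only once" point above.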
Diffstat (limited to 'calculate_average_obourgain.sh')
-rwxr-xr-x calculate_average_obourgain.sh 31
1 files changed, 31 insertions, 0 deletions
diff --git a/calculate_average_obourgain.sh b/calculate_average_obourgain.sh
new file mode 100755
index 0000000..67c91b3
--- /dev/null
+++ b/calculate_average_obourgain.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+#
+# Copyright 2023 The original authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# runs with -Xmx24m on my machine, playing it safe with a larger heap
+JAVA_OPTS="-Xmx64m --enable-preview"
+# to use some black magic options
+JAVA_OPTS="$JAVA_OPTS -XX:+UnlockExperimentalVMOptions"
+# no GC, not needed
+JAVA_OPTS="$JAVA_OPTS -XX:+UseEpsilonGC -XX:+AlwaysPreTouch"
+# my finals are really final
+JAVA_OPTS="$JAVA_OPTS -XX:+TrustFinalNonStaticFields"
+# to get CalculateAverage_obourgain$OpenAddressingMap::getOrCreate to inline. A compile command wasn't enough, it was still hitting 'already compiled into a big method'
+JAVA_OPTS="$JAVA_OPTS -XX:InlineSmallCode=10000"
+# seems to be a bit faster
+JAVA_OPTS="$JAVA_OPTS -XX:-TieredCompilation -XX:CICompilerCount=2 -XX:CompileThreshold=1000"
+
+time java $JAVA_OPTS --class-path target/average-1.0.0-SNAPSHOT.jar dev.morling.onebrc.CalculateAverage_obourgain