1brc - my implementations for the billion row challenge

	Commit message	Author	Age	Files	Lines
*	Removing superfluous time calls	Gunnar Morling	2024-01-10	2	-2/+2
\|
*	Leaderboard update	Gunnar Morling	2024-01-10	1	-2/+4
\|
*	Add CalculateAverage_couragelee Java class and shell script	CourageLee	2024-01-10	2	-0/+355
\| \| \|	This commit introduces a new Java class, CalculateAverage_couragelee, and a shell script for calculating averages. The Java class utilizes NIO's memory-mapping and parallel computing techniques to perform calculations. These changes should improve the efficiency and speed of average calculations.
*	Implementation by rprabhu	Prabhu R	2024-01-10	2	-0/+157
\| \| \|	Co-authored-by: Prabhu R <prabhu.rengaswamy@outlook.com>
*	gabrielreid take 2	greid	2024-01-10	1	-162/+141
\| \| \| \| \|	Clear up some TODOS, simplify the code a bit, which appears to result in a 25% performance increase.
*	Second submission by flippingbits - 50% performance improvement	Stefan Sprenger	2024-01-10	1	-92/+90
\|	* feat(flippingbits): Improve parsing of measurement and few cleanups
\|	* feat(flippingbits): Reduce chunk size to 10MB
\|	* feat(flippingbits): Improve parsing of station names
\|	* chore(flippingbits): Remove obsolete import
\|	* chore(flippingbits): Few cleanups
*	merykitty's second attempt	Quan Anh Mai	2024-01-10	2	-192/+139
\|
*	Leaderboard update	Gunnar Morling	2024-01-10	1	-3/+3
\|
*	Consume four bytes at a time from buffer using getInt. Store key with unsafe ↵	Elliot Barlas	2024-01-10	2	-84/+143
\| \| \| \|	int array rather than byte array. Use custom equals rather than Arrays equals.
*	Leaderboard update	Gunnar Morling	2024-01-10	1	-2/+2
\|
*	Second tuning for thomaswue	Thomas Wuerthinger	2024-01-10	1	-96/+103
\|	* Optimize checking for collisions by doing this a long at a time always.
\|	* Use a long at a time scanning for delimiter.
\|	* Minor tuning. Now below 0.80s on Intel i9-13900K.
\|	* Add number parsing code from Quan Anh Mai. Fix name length issue.
\|	* Include suggestion from Alfonso Peterssen for another 1.5%.
\|	* Optimize hash collision check compare for ~4% gain.
\|	* Add perf stats based on latest version.
*	Removing nstng's entry as he's retracting from the challenge	Gunnar Morling	2024-01-10	1	-1/+0
\|
*	Revert "Adding Nils Semmelrock's submission"	Nils Semmelrock	2024-01-10	2	-290/+0
\| \| \| \|	This reverts commit 12ae36ad
*	Leaderboard update -- Backfilling remaining entries	Gunnar Morling	2024-01-10	1	-7/+33
\|
*	Update test*.sh to support input file pattern	Alexander Yastrebov	2024-01-10	2	-11/+40
\| \| \| \|	This is useful for testing fork(s) against subset of test samples
*	New leaderboard (WIP) after environment change	Gunnar Morling	2024-01-10	1	-13/+57
\|
*	more robust error message	Jason Nochlin	2024-01-10	1	-1/+1
\|
*	catch hyperfine command failed	Jason Nochlin	2024-01-10	1	-0/+5
\|
*	Add small test cases	Alexander Yastrebov	2024-01-10	6	-0/+10
\| \| \| \|	For https://github.com/gunnarmorling/1brc/issues/276
*	remove debug line	Jason Nochlin	2024-01-10	1	-1/+0
\|
*	Validate that ./calculate_average_<fork>.sh exists for each fork	Jason Nochlin	2024-01-10	1	-0/+8
\|
*	grep returns exit code 1 when no match, `|| true` prevents the script from ↵	Jason Nochlin	2024-01-10	1	-1/+2
\| \| \| \|	exiting early
*	Fix test.sh to use prepare script	Alexander Yastrebov	2024-01-10	17	-17/+21
\|
*	Update README.md	Gunnar Morling	2024-01-10	1	-0/+2
\|
*	#281 Trimming slowest/fastest run, not first/last in evaluate2.sh	Gunnar Morling	2024-01-10	1	-2/+2
\|
*	evaluate2.sh improvements - leaderboard, default SDK	Jason Nochlin	2024-01-10	3	-11/+145
\|	* reset the JDK to the default (21.0.1-open) when no prepare script is provided
\|	* leaderboard improvements - sorting and content
\|	* run sdk install once at the beginning of the script for all the SDKs detected in any of the evaluated prepare scripts
\|	* remove unnecessary code and tweak doc comments
\|	* one more nit
\|	* Don't print rankings values when only 1 fork is being evaluated
\|	* It's been a few hours, so I now have some more rate limit :)
\|	---------
\|	Co-authored-by: Jason Nochlin <hundredwatt@users.noreply.github.com>
*	Hyperfine: Script re-org	Gunnar Morling	2024-01-09	74	-123/+369
\|
*	Use hyperfine and jq to improve evaluate.sh	Jason Nochlin	2024-01-09	1	-0/+194
\|	* create new version of evaluate.sh using hyperfine + jq
\|	* output the raw times for each command
\|	* nit: s/command/fork/
\|	* update evaluate2.sh for new fork file structure
\|	* review changes
\|	* use numactl on linux
\|	* 1 warmup
\|	* verify output
\|	* leaderboard
\|	* do not early exit on hyperfine error
\|	* check if SMT and turbo boost are disabled
\|	* fix bug
\|	---------
\|	Co-authored-by: Jason Nochlin <hundredwatt@users.noreply.github.com>
*	Update README.md	Gunnar Morling	2024-01-09	1	-1/+9
\|
*	Committing line separator changes on Linux systems (enforced by gitattr).	Dimitar Dimitrov	2024-01-08	1	-205/+205
\|
*	Added UseTransparentHugePages after testing on a box	Dimitar Dimitrov	2024-01-07	1	-4/+7
\|
*	Add davery22 impl	Daniel Avery	2024-01-07	2	-0/+335
\|
*	Leaderboard update	Gunnar Morling	2024-01-07	1	-1/+1
\|
*	isolgpus: submission 2 - about a 25% improvement on submission 1. (#168)	Jamie Stansfield	2024-01-07	1	-47/+51
\|	* isolgpus: fix chunk sizing when not at 8 threads
\|	  use as many cores as are available
\|	  don't buffer the station name, only use it when we need it.
\|	  get rid of the main branch
\|	  move variables inside the loop
\|	* isolgpus: optimistically assume we can read a whole int for the station name, but roll back if we get it wrong. This should be very beneficial on a dataset where station names are mostly over 4 chars
\|	---------
\|	Co-authored-by: Jamie Stansfield <jalstansfield@gmail.com>
*	Leaderboard update	Gunnar Morling	2024-01-07	1	-3/+5
\|
*	Use SIMD for search for delimiter and name compare	Thomas Wuerthinger	2024-01-07	1	-52/+100
\|
*	Add yehwankim23 (#148)	김예환 Ye-Hwan Kim (Sam)	2024-01-07	2	-0/+125
\|
*	My implementation is in dev.morling.onebrc.CalculateAverage_obourgain and ↵	Olivier Bourgain	2024-01-07	2	-0/+512
\|	runnable with provided script calculate_average_obourgain.sh (#75)
\|	Runs with standard JDK 21.
\|	On my computers (i5 13500, 20 cores, 32GB ram) my best run is (file fully in page cache):
\|	49.78user 0.69system 0:02.81elapsed 1795%CPU
\|	A bit older version of the code on Mac pro M1 32 GB:
\|	real 0m2.867s
\|	user 0m23.956s
\|	sys 0m1.329s
\|	As I wrote in comments in the code, I have a few different roundings than the reference implementation. I have seen that there is an issue about that, but no specific rule yet.
\|	Main points:
\|	- use MemorySegment, it's faster than ByteBuffer
\|	- split the work in a lot of chunks and distribute to a thread pool
\|	- fast measurement parser by using a lot of domain knowledge
\|	- very low allocation
\|	- visit each byte only once
\|	Things I tried that were in fact pessimizations:
\|	- use some internal JDK code to vectorize the hashCode computation
\|	- use a MemorySegment to represent the keys instead of byte[], to avoid copying
\|	Hope I won't have a bad surprise when running on the target server 😱
*	Leaderboard update	Gunnar Morling	2024-01-07	1	-2/+2
\|
*	Roy: Adding a bit of unsafe...	Roy van Rijn	2024-01-07	1	-254/+155
\| \| \|	Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
*	Removing App CDS from Roy's submission	Gunnar Morling	2024-01-07	1	-7/+2
\|
*	Leaderboard update	Gunnar Morling	2024-01-07	1	-0/+1
\|
*	first attempt	ags	2024-01-07	2	-0/+272
\|
*	Leaderboard update	Gunnar Morling	2024-01-07	1	-0/+1
\|
*	Initial Implementation - coolmineman (#196)	Cool_Mineman	2024-01-07	2	-0/+310
\|	* start
\|	* slower
\|	* still bad
\|	* finally faster than baseline :)
\|	* starting to go fast
\|	* improve
\|	* we ball
\|	* fix race condition and newline
\|	* change threadpool
\|	* ~18sec on my machine
*	Update pull_request_template.md	Gunnar Morling	2024-01-07	1	-3/+4
\|
*	Leaderboard update	Gunnar Morling	2024-01-07	1	-0/+2
\|
*	1brc submission - Kevin McMurtrie (#195)	Kevin McMurtrie	2024-01-07	2	-0/+539
\|	* v1
\|	* Fix sorting
*	An implementation optimised for simplicity/readability.	John	2024-01-07	2	-0/+340
\|
*	Leaderboard update	Gunnar Morling	2024-01-07	2	-5/+10
\|