path: root/src/kernel.rs
Age | Commit message | Author
2022-04-09 | kernel_3d_16 and xof functions (branch: kernel) | Jack O'Connor
2022-03-26 | xor_xof variants for the 2d kernel | Jack O'Connor
2022-03-20 | blake3_avx512_xof_stream_4 | Jack O'Connor
2022-03-20 | blake3_avx2_xof_stream_2 | Jack O'Connor
2022-03-20 | blake3_avx512_xof_stream_2 | Jack O'Connor
2022-03-20 | initial xof_stream functions | Jack O'Connor
2022-03-16 | rename kernel_1 to kernel2d_1 and add degree args | Jack O'Connor
2022-03-15 | generate blake3_{avx512,sse41,sse2}_compress with asm.py | Jack O'Connor
2022-03-11 | replace tail calls with jumps | Jack O'Connor
2022-03-11 | blake3_avx512_chunks_8 and blake3_avx512_parents_8 | Jack O'Connor
2022-03-09 | blake3_avx512_xof_xor_16 | Jack O'Connor
2022-03-09 | test unaligned writes | Jack O'Connor
2022-03-09 | broadcast the block length and domain flags inside blake3_avx512_kernel_16 | Jack O'Connor
2022-03-09 | move third row initialization into blake3_avx512_kernel_16 | Jack O'Connor
2022-03-09 | interleave the write ops in blake3_avx512_xor_stream_16 | Jack O'Connor
2022-03-09 | blake3_avx512_xof_stream_16 | Jack O'Connor
2022-03-08 | split the left and right child CVs for blake3_avx512_parents_16 | Jack O'Connor
2022-03-08 | blake3_avx512_parents_16 | Jack O'Connor
2022-03-08 | use a memory argument for vpbroadcastd | Jack O'Connor
2022-03-08 | describe the transposition in comments | Jack O'Connor
2022-03-08 | now using only 3 scratch zmm registers | Jack O'Connor
2022-03-08 | interleave the first pass -- good performance | Jack O'Connor
2022-03-08 | try it with 4 times as many loads | Jack O'Connor
2022-03-08 | add a benchmark | Jack O'Connor
2022-03-08 | blake3_avx512_chunks_16 | Jack O'Connor
2022-03-08 | unroll the block loop and load the key | Jack O'Connor
2022-03-08 | correct the last two transposition passes | Jack O'Connor
2022-03-08 | nonzero message | Jack O'Connor
2022-03-08 | start working on a refactored assembly implementation | Jack O'Connor