Age | Commit message | Author
2022-04-09 | kernel_3d_16 and xof functions [kernel] | Jack O'Connor
2022-03-26 | xor_xof variants for the 2d kernel | Jack O'Connor
2022-03-20 | blake3_avx512_xof_stream_4 | Jack O'Connor
2022-03-20 | blake3_avx2_xof_stream_2 | Jack O'Connor
2022-03-20 | blake3_avx512_xof_stream_2 | Jack O'Connor
2022-03-20 | initial xof_stream functions | Jack O'Connor
2022-03-20 | add some comments | Jack O'Connor
2022-03-16 | rename kernel_1 to kernel2d_1 and add degree args | Jack O'Connor
2022-03-15 | generate blake3_{avx512,sse41,sse2}_compress with asm.py | Jack O'Connor
2022-03-11 | replace tail calls with jumps | Jack O'Connor
2022-03-11 | blake3_avx512_chunks_8 and blake3_avx512_parents_8 | Jack O'Connor
2022-03-09 | blake3_avx512_xof_xor_16 | Jack O'Connor
2022-03-09 | test unaligned writes | Jack O'Connor
2022-03-09 | broadcast the block length and domain flags inside blake3_avx512_kernel_16 | Jack O'Connor
blake3_avx512_xof_stream_16 was also incorrectly hardcoding a block length of 64. The block length parameter is the *input* block length, which is independent of the output block length. (The output block length is not a compression function parameter.)
2022-03-09 | move third row initialization into blake3_avx512_kernel_16 | Jack O'Connor
2022-03-09 | interleave the write ops in blake3_avx512_xor_stream_16 | Jack O'Connor
This seems to give a small but consistent performance boost.
2022-03-09 | blake3_avx512_xof_stream_16 | Jack O'Connor
2022-03-08 | split the left and right child CVs for blake3_avx512_parents_16 | Jack O'Connor
There's no reason to force the caller to allocate them together.
2022-03-08 | blake3_avx512_parents_16 | Jack O'Connor
2022-03-08 | use a memory argument for vpbroadcastd | Jack O'Connor
2022-03-08 | describe the transposition in comments | Jack O'Connor
2022-03-08 | now using only 3 scratch zmm registers | Jack O'Connor
2022-03-08 | interleave the first pass -- good performance | Jack O'Connor
2022-03-08 | try it with 4 times as many loads | Jack O'Connor
2022-03-08 | add a benchmark | Jack O'Connor
2022-03-08 | blake3_avx512_chunks_16 | Jack O'Connor
2022-03-08 | unroll the block loop and load the key | Jack O'Connor
2022-03-08 | correct the last two transposition passes | Jack O'Connor
2022-03-08 | nonzero message | Jack O'Connor
2022-03-08 | start working on a refactored assembly implementation | Jack O'Connor
The main goal is to eventually have extended outputs benefit from the same SIMD optimizations as inputs. To make this easier, I want to factor out a shared "kernel" routine that can be shared among several different interfaces:
- compressing chunks
- compressing parents
- producing XOF output
- xor'ing XOF output

The timing here partly coincides with Rust stabilizing inline asm. That's certainly not necessary for any of this to work, but it gives me the confidence to try this without needing to master the rules of three different calling conventions.
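As a rough illustration of the "shared kernel" idea described in this commit message, here is a scalar Rust sketch modeled on the BLAKE3 reference compression function. It is not the assembly kernel from these commits: the names `kernel`, `compress_cv`, `parent_cv`, `xof_stream`, `xof_xor`, and the `Root` struct are hypothetical, and it handles one input at a time where the real routines process several in parallel. It only shows how four thin interfaces can sit on top of a single routine, and that `block_len` describes the input block while each XOF output block is always 64 bytes.

```rust
const IV: [u32; 8] = [
    0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
    0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19,
];
const MSG_PERMUTATION: [usize; 16] = [2, 6, 3, 10, 7, 0, 4, 13, 1, 11, 12, 5, 9, 14, 15, 8];
const CHUNK_START: u32 = 1;
const CHUNK_END: u32 = 2;
const PARENT: u32 = 4;
const ROOT: u32 = 8;

// The quarter-round mixing function.
fn g(s: &mut [u32; 16], a: usize, b: usize, c: usize, d: usize, mx: u32, my: u32) {
    s[a] = s[a].wrapping_add(s[b]).wrapping_add(mx);
    s[d] = (s[d] ^ s[a]).rotate_right(16);
    s[c] = s[c].wrapping_add(s[d]);
    s[b] = (s[b] ^ s[c]).rotate_right(12);
    s[a] = s[a].wrapping_add(s[b]).wrapping_add(my);
    s[d] = (s[d] ^ s[a]).rotate_right(8);
    s[c] = s[c].wrapping_add(s[d]);
    s[b] = (s[b] ^ s[c]).rotate_right(7);
}

// Hypothetical shared kernel: one full compression, returning all 16 state words.
fn kernel(cv: &[u32; 8], block: &[u8; 64], counter: u64, block_len: u32, flags: u32) -> [u32; 16] {
    let mut m = [0u32; 16];
    for (w, b) in m.iter_mut().zip(block.chunks_exact(4)) {
        *w = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    }
    let mut s = [0u32; 16];
    s[..8].copy_from_slice(cv);
    s[8..12].copy_from_slice(&IV[..4]);
    s[12] = counter as u32;
    s[13] = (counter >> 32) as u32;
    s[14] = block_len; // length of the *input* block, not of the output
    s[15] = flags;
    for round in 0..7 {
        g(&mut s, 0, 4, 8, 12, m[0], m[1]); // columns
        g(&mut s, 1, 5, 9, 13, m[2], m[3]);
        g(&mut s, 2, 6, 10, 14, m[4], m[5]);
        g(&mut s, 3, 7, 11, 15, m[6], m[7]);
        g(&mut s, 0, 5, 10, 15, m[8], m[9]); // diagonals
        g(&mut s, 1, 6, 11, 12, m[10], m[11]);
        g(&mut s, 2, 7, 8, 13, m[12], m[13]);
        g(&mut s, 3, 4, 9, 14, m[14], m[15]);
        if round < 6 {
            let old = m;
            for i in 0..16 {
                m[i] = old[MSG_PERMUTATION[i]];
            }
        }
    }
    for i in 0..8 {
        s[i] ^= s[i + 8];
        s[i + 8] ^= cv[i];
    }
    s
}

// Interface 1 (chunks): keep only the first 8 words as the next chaining value.
fn compress_cv(cv: &[u32; 8], block: &[u8; 64], counter: u64, block_len: u32, flags: u32) -> [u32; 8] {
    let s = kernel(cv, block, counter, block_len, flags);
    let mut out = [0u32; 8];
    out.copy_from_slice(&s[..8]);
    out
}

// Interface 2 (parents): the left and right child CVs are separate arguments.
fn parent_cv(key: &[u32; 8], left: &[u32; 8], right: &[u32; 8]) -> [u32; 8] {
    let mut block = [0u8; 64];
    for (i, w) in left.iter().chain(right.iter()).enumerate() {
        block[4 * i..4 * i + 4].copy_from_slice(&w.to_le_bytes());
    }
    compress_cv(key, &block, 0, 64, PARENT)
}

// Everything needed to rerun the root compression at any output counter.
struct Root { cv: [u32; 8], block: [u8; 64], block_len: u32, flags: u32 }

// Interface 3 (XOF output): one 64-byte output block per counter value.
fn xof_stream(root: &Root, start_counter: u64, out: &mut [u8]) {
    for (i, chunk) in out.chunks_mut(64).enumerate() {
        let counter = start_counter + i as u64;
        let s = kernel(&root.cv, &root.block, counter, root.block_len, root.flags | ROOT);
        for (j, byte) in chunk.iter_mut().enumerate() {
            *byte = s[j / 4].to_le_bytes()[j % 4];
        }
    }
}

// Interface 4 (XOF xor): XOR the same output stream into a caller's buffer.
fn xof_xor(root: &Root, start_counter: u64, buf: &mut [u8]) {
    let mut stream = vec![0u8; buf.len()];
    xof_stream(root, start_counter, &mut stream);
    for (b, k) in buf.iter_mut().zip(&stream) {
        *b ^= k;
    }
}
```

Under these assumptions, a 3-byte input would be described by `Root { cv: IV, block, block_len: 3, flags: CHUNK_START | CHUNK_END }` with the input bytes at the start of a zero-padded `block`. Calling `xof_stream` at counters 0, 1, 2, ... then yields successive 64-byte output blocks, and calling `xof_xor` twice with the same arguments restores the original buffer.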
2022-03-05 | link to reference impl ports from the main readme too | Jack O'Connor
2022-03-04 | link to ports of the reference implementation | Jack O'Connor
2022-03-04 | add "(if any)" regarding keying in the security notes | Jack O'Connor
2022-03-03 | correct the security notes for the C API | Jack O'Connor
2022-03-03 | simplify a bit more | Jack O'Connor
2022-03-02 | simplify the security notes, avoid referring to entropy | Jack O'Connor
2022-03-02 | copy the same notes to the C docs | Jack O'Connor
2022-03-02 | document the extended output security issue found by Aldo Gunsing | Jack O'Connor
https://eprint.iacr.org/2022/283
2022-01-25 | version 1.3.1 [1.3.1] | Jack O'Connor
Changes since 1.3.0:
- The unstable `traits-preview` feature now includes an implementation of `crypto_common::BlockSizeUser`, AKA `digest::core_api::BlockSizeUser`. This allows `blake3::Hasher` to be used with `hmac::SimpleHmac`.
2022-01-25 | add a release checklist | Jack O'Connor
2022-01-24 | check the HMAC output bytes | Jack O'Connor
2022-01-24 | Adds test | jbis9051
2022-01-23 | Add blocksize trait | jbis9051
2022-01-18 | add a RAYON_NUM_THREADS=1 run to CI | Jack O'Connor
2022-01-10 | silence a couple more warnings on 32-bit Windows | Jack O'Connor
https://github.com/BLAKE3-team/BLAKE3/issues/218#issuecomment-1009510462
2022-01-08 | fix some compiler warnings | Samuel Neves
2022-01-08 | version 1.3.0 [1.3.0] | Jack O'Connor
Changes since 1.2.0:
- Added blake3_hasher_reset to the C API, for parity with the Rust API.
- Updated digest to v0.10. This version merged the crypto-mac crate with digest, so the dependency on crypto-mac has been removed. These trait implementations are still gated behind the "traits-preview" feature.
- Updated clap to v3.
2022-01-08 | add Samuel Neves as a listed author of the Rust crate | Jack O'Connor
Samuel wrote all of the assembly implementations, with the sole exception of the SSE2 port.
2022-01-07 | update clap to v3 | Jack O'Connor
2022-01-07 | add blake3_hasher_reset to the C API | Jack O'Connor