Age | Commit message | Author
2020-10-02 | add the default CodeQL workflow [codeql] | Jack O'Connor
2020-10-01 | version 0.3.7 [0.3.7] | Jack O'Connor
Changes since 0.3.6:
- BUGFIX: The C implementation was incorrect on big endian systems for inputs longer than 1024 bytes. This bug affected all previous versions of the C implementation. Little endian platforms like x86 were unaffected. The Rust implementation was also unaffected. @jakub-zwolakowski and @pascal-cuoq from TrustInSoft reported this bug: https://github.com/BLAKE3-team/BLAKE3/pull/118
- BUGFIX: The C build on x86-64 was producing binaries with an executable stack. @tristanheaven reported this bug: https://github.com/BLAKE3-team/BLAKE3/issues/109
- @mkrupcale added optimized implementations for SSE2. This improves performance on older x86 processors that don't support SSE4.1.
- The C implementation now exposes the `blake3_hasher_init_derive_key_raw` function, to make it easier to implement language bindings. Added by @k0001.
2020-09-29 | add cross_test.sh for the C bindings | Jack O'Connor
This will let us add big endian testing to CI for our C code. (We were already doing it for our Rust code.) This is adapted from test_vectors/cross_test.sh. It works around the limitation that the `cross` tool can't reach parent directories. It's an unfortunate hack, but at least it's only for testing. It might've been less hacky to use symlinks for this somehow, but I worry that would break things on Windows, and I don't want to have to add workarounds for my workarounds.
2020-09-29 | fix a couple of big-endianness mistakes in blake3.c | Jack O'Connor
Kudos to @pascal-cuoq and @jakub-zwolakowski from TrustInSoft for catching these bugs. Original report: https://github.com/BLAKE3-team/BLAKE3/pull/118
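The log does not say exactly what the mistakes in blake3.c were, so the following is only a general sketch of the kind of code that stays correct on big endian hosts, not the actual fix. The helper names `load32_le` and `store32_le` are hypothetical. The idea is to assemble words from bytes with shifts rather than copying memory into an integer, so the result does not depend on the host byte order.

```c
#include <stdint.h>

/* Read a 32-bit little-endian word one byte at a time. Because the value is
   built with shifts, the result is the same on little-endian and big-endian
   hosts. (Hypothetical helper name, for illustration only.) */
static uint32_t load32_le(const uint8_t *src) {
  return ((uint32_t)src[0] << 0) | ((uint32_t)src[1] << 8) |
         ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/* Write a 32-bit word out as four little-endian bytes, again without relying
   on the host byte order. (Hypothetical helper name.) */
static void store32_le(uint8_t *dst, uint32_t w) {
  dst[0] = (uint8_t)(w >> 0);
  dst[1] = (uint8_t)(w >> 8);
  dst[2] = (uint8_t)(w >> 16);
  dst[3] = (uint8_t)(w >> 24);
}
```

A bug of this class is easy to miss on x86, because a plain memory copy happens to give the same answer there. That is why the big endian CI testing added in the cross_test.sh commit matters.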
2020-09-29 | fix the short_test_cases loop in the C bindings tests | Jack O'Connor
2020-09-29 | update the blake3_c_rust_bindings test cases also | Jack O'Connor
2020-09-29 | add more test cases at shorter input lengths | Jack O'Connor
2020-09-24 | tweak the readme description of the benchmark chart | Jack O'Connor
2020-09-15 | add a docs.rs badge | Jack O'Connor
2020-09-14 | use an absolute url for https://github.com/BLAKE3-team/BLAKE3/blob/master/b3sum/what_does_check_do.md | Jack O'Connor
2020-09-14 | remove an outdated section of the b3sum readme | Jack O'Connor
2020-09-10 | add some horizontal rules to the C readme | Jack O'Connor
2020-09-10 | add a test for blake3_hasher_init_derive_key_raw | Jack O'Connor
2020-09-10 | C readme edits | Jack O'Connor
2020-09-10 | cargo fmt | Jack O'Connor
2020-09-10 | Merge pull request #114 from k0001/no-cstr | Jack O'Connor
C: Add blake3_hasher_init_derive_key_len
2020-09-02 | cover the no_sse2 flags in CI testing | Jack O'Connor
2020-09-01 | s/multi-threading/multithreading/ | Jack O'Connor
2020-09-01 | mention @mkrupcale's SSE2 implementation in the readme | Jack O'Connor
2020-09-01 | C: rename blake3_hasher_init_derive_key_raw and documentation | Renzo Carbonara
2020-08-31 | add i586-unknown-linux-musl as a test target | Jack O'Connor
Samuel noticed that rustc seems to assume (incorrectly?) that all i686 targets support SSE2, but it doesn't make that assumption for i586.
2020-08-31 | add the dynamic check for SSE2 support | Jack O'Connor
It will be very rare that this actually executes, but we should include it for completeness.
2020-08-31 | fix a build break on x86 targets without guaranteed SSE2 support | Jack O'Connor
This is quite hard to trigger, because SSE2 has been guaranteed for a long time. But you could trigger it this way:

    rustup target add i686-unknown-linux-musl
    RUSTFLAGS="-C target-cpu=i386" cargo build --target i686-unknown-linux-musl

Note a relevant gotcha though: The `cross` tool will not forward environment variables like RUSTFLAGS to the container by default, so if you're testing with `cross` you'll need to use the `rustc` command to explicitly pass the flag, as I've done here in ci.yml. (Or you could create a `Cross.toml` file, but I don't want to commit one of those if I can avoid it.)
2020-08-31 | add sse2 tests and benchmarks | Samuel Neves
2020-08-31 | remove avoidable spill | Samuel Neves
2020-08-31 | Merge pull request #110 from mkrupcale/sse2 | Samuel Neves
Add SSE2 implementations
2020-08-31 | C: asm: simplify pblendw emulation | Matthew Krupcale
Use statically calculated ~mask. This reduces the number of moves and registers necessary at the expense of an extra memory load. This is probably a good trade-off since we are not bound by memory uops in this loop.
2020-08-31 | Implement `fmt::Debug` using builders | Nikolai Vazquez
This enables pretty printing via `{:#?}`. The normal style for `{:?}` is kept exactly the same.
2020-08-31 | C: asm: simplify pinsrd emulation | Matthew Krupcale
Use punpckl{,q}dq instead of pinsrw.
2020-08-30 | C: asm: remove blendvps usage altogether | Matthew Krupcale
This simplifies the operation by removing the need to use blendvps at all.
2020-08-30 | C: Add blake3_hasher_init_derive_key_len | Renzo Carbonara
blake3_hasher_init_derive_key_len is an alternative version of blake3_hasher_init_derive_key which takes the context and its length as separate parameters, and not together as a C string.

The motivation for this addition is making it easier for bindings to this C library to call this function without having to first copy over the context bytes just to add one 0x00 byte at the end.

Notice that contrary to blake3_hasher_init_derive_key, blake3_hasher_init_derive_key_len allows the inclusion of a 0x00 byte in the context. Given the rules about context string selection, this byte is unlikely to be used as part of a context string. But if for some reason it is ever given, it will be included in the context string and processed like any other non-alphanumeric byte would. For compatibility with blake3_hasher_init_derive_key, bindings should still check for the absence of 0x00 bytes.
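The compatibility check recommended above could look roughly like the following in a binding's C glue code. This is a minimal sketch: the helper name `context_is_cstring_compatible` is hypothetical and not part of the BLAKE3 API. A binding would presumably run it before passing the pointer and length to the new function (which later entries in this log show renamed to `blake3_hasher_init_derive_key_raw`).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical binding-side helper: returns true when the context bytes
   contain no 0x00 byte, i.e. when the same context could also have been
   passed as a C string to blake3_hasher_init_derive_key. */
static bool context_is_cstring_compatible(const void *context,
                                          size_t context_len) {
  /* An empty context trivially contains no 0x00 byte. Checking the length
     first also avoids calling memchr with a possibly null pointer. */
  if (context_len == 0) {
    return true;
  }
  return memchr(context, 0, context_len) == NULL;
}
```

Rejecting contexts that fail this check keeps a binding's behavior aligned with the C-string entry point, so the same context bytes cannot derive a key through one function that the other could never produce.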
2020-08-26 | wording tweak in the C readme | Jack O'Connor
2020-08-25 | Write _mm_blend_epi16 emulation without multiplication | Matthew Krupcale
Use _mm_and_si128 and _mm_cmpeq_epi16 rather than expensive multiplication _mm_mullo_epi16 with _mm_srai_epi16 that compiler may not be able to optimize.
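To make the AND-plus-compare idea concrete, here is a portable scalar model of it, written without intrinsics so it compiles anywhere. It is a sketch, not the code from this commit: the function names are hypothetical, and a 128-bit register is modelled as eight 16-bit lanes. The semantics assumed are those of SSE4.1 `_mm_blend_epi16`, where bit i of the 8-bit immediate selects lane i from `b` and a clear bit selects it from `a`.

```c
#include <stdint.h>

#define LANES 8 /* a 128-bit register viewed as eight 16-bit lanes */

/* Expand an 8-bit blend immediate into a per-lane mask. Lane i becomes
   0xFFFF when bit i of imm8 is set and 0x0000 otherwise. */
static void blend_mask_from_imm8(uint8_t imm8, uint16_t mask[LANES]) {
  for (int i = 0; i < LANES; i++) {
    uint16_t bit = (uint16_t)(1u << i);
    /* Models _mm_and_si128 of the broadcast immediate with per-lane bits. */
    uint16_t selected = (uint16_t)(imm8 & bit);
    /* Models _mm_cmpeq_epi16: all ones on equality, all zeros otherwise. */
    mask[i] = (selected == bit) ? 0xFFFFu : 0x0000u;
  }
}

/* Blend lane by lane according to (mask & b) | (~mask & a). */
static void blend_epi16_model(const uint16_t a[LANES], const uint16_t b[LANES],
                              uint8_t imm8, uint16_t out[LANES]) {
  uint16_t mask[LANES];
  blend_mask_from_imm8(imm8, mask);
  for (int i = 0; i < LANES; i++) {
    out[i] = (uint16_t)((mask[i] & b[i]) | ((uint16_t)~mask[i] & a[i]));
  }
}
```

With an immediate of 0xB1 (bits 0, 4, 5 and 7 set), lanes 0, 4, 5 and 7 come from `b` and the rest from `a`. No multiplication or arithmetic shift is involved, which is the point of the commit.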
2020-08-24 | Fix Windows MSVC undefined symbol errors | Matthew Krupcale
MSVC returns "error A2006:undefined symbol : FFFFFFFFH", so use 0FFFFFFFFH instead. Also use 0 prefix for 0H to align things.
2020-08-24 | Put PBLENDW masks in the RDATA section | Matthew Krupcale
Previously, these masks were undefined because they were outside of the RDATA section.
2020-08-24 | Fix Windows MSVC undefined symbol errors | Matthew Krupcale
MSVC returns "error A2006:undefined symbol : B1H", so use 0B1H instead.
2020-08-24 | Fix unreachable expression compiler warning | Matthew Krupcale
SSE2 target_feature appears to always be present for x86_64.
2020-08-24 | C: asm: emulate pshufb ROT8 using SSE2 instructions | Matthew Krupcale
Use a simple shift for the rotation.
* c/blake3_sse2_x86-64_unix.S: emulate pshufb using SSE2 instructions for x86_64 unix
* c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
* c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
2020-08-24 | C: asm: emulate pshufb ROT16 using SSE2 instructions | Matthew Krupcale
Use two 16-bit shuffles: one for the low 64-bits and one for the high 64-bits.
* c/blake3_sse2_x86-64_unix.S: emulate pshufb using SSE2 instructions for x86_64 unix
* c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
* c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
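The two rotation emulations above can be illustrated on a single 32-bit word in plain C. This is a scalar sketch with hypothetical function names, not the assembly from these commits. It assumes right rotations, as used in the BLAKE3 round function. A rotation by 8 is two shifts combined with OR, and a rotation by 16 is just a swap of the two 16-bit halves, which is why a 16-bit shuffle can do the job without `pshufb`.

```c
#include <stdint.h>

/* Rotate right by 8 using only shifts and OR (the "simple shift" approach). */
static uint32_t rotr32_by_8(uint32_t x) {
  return (x >> 8) | (x << 24);
}

/* Reference rotate right by 16, written with shifts. */
static uint32_t rotr32_by_16(uint32_t x) {
  return (x >> 16) | (x << 16);
}

/* The same rotation by 16 expressed as a swap of 16-bit halves, which is what
   a 16-bit shuffle does to each 32-bit lane. */
static uint32_t rot16_by_halfword_swap(uint32_t x) {
  uint16_t lo = (uint16_t)x;
  uint16_t hi = (uint16_t)(x >> 16);
  return ((uint32_t)lo << 16) | (uint32_t)hi;
}
```

In the vector version the same swap has to be applied to four 32-bit lanes at once. SSE2 shuffles 16-bit words only within one 64-bit half of the register at a time, which would explain why the commit needs one shuffle for the low half and one for the high half.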
2020-08-24 | C: asm: emulate pinsrd using SSE2 instructions | Matthew Krupcale
Use two pinsrw and a 16-bit shift to insert the 32-bit integer at the desired location.
* c/blake3_sse2_x86-64_unix.S: emulate pinsrd using SSE2 instructions for x86_64 unix
* c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
* c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
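As a rough scalar model of this emulation, the register can be treated as eight 16-bit words with word 0 least significant. Inserting a 32-bit value into lane n then becomes two 16-bit inserts with a shift in between. The function names below are hypothetical, and the code only illustrates the data movement. It is not the assembly from this commit, which the later "simplify pinsrd emulation" entry replaces anyway.

```c
#include <assert.h>
#include <stdint.h>

/* Insert a 32-bit value into 32-bit lane `lane` (0..3) of a register modelled
   as eight 16-bit words, word 0 being the least significant. */
static void insert_u32_via_u16(uint16_t words[8], uint32_t value,
                               unsigned lane) {
  assert(lane < 4);
  words[2 * lane] = (uint16_t)value;     /* first 16-bit insert: low half  */
  value >>= 16;                          /* the 16-bit shift               */
  words[2 * lane + 1] = (uint16_t)value; /* second 16-bit insert: high half */
}

/* Read a 32-bit lane back out of the eight-word model. */
static uint32_t read_u32_lane(const uint16_t words[8], unsigned lane) {
  assert(lane < 4);
  return ((uint32_t)words[2 * lane + 1] << 16) | (uint32_t)words[2 * lane];
}
```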
2020-08-24 | C: asm: emulate blendvps using SSE2 instructions | Matthew Krupcale
Blend according to (mask & b) | ((~mask) & a).
* c/blake3_sse2_x86-64_unix.S: emulate blendvps using SSE2 instructions for x86_64 unix
* c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
* c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
2020-08-24 | C: asm: emulate pblendw using SSE2 instructions | Matthew Krupcale
Use a constant mask to blend according to (mask & b) | ((~mask) & a).
* c/blake3_sse2_x86-64_unix.S: emulate pblendw using SSE2 instructions for x86_64 unix
* c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
* c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
2020-08-24 | SSE2 intrinsic: emulate _mm_shuffle_epi8 SSSE3 intrinsic rot8 with SSE2 intrinsics | Matthew Krupcale
Use a simple shift version for the 8-bit rotation.
* c/blake3_sse2.c: emulate _mm_shuffle_epi8 rot8 using SSE2 intrinsics
2020-08-24 | SSE2 intrinsic: emulate _mm_shuffle_epi8 SSSE3 intrinsic rot16 with SSE2 intrinsics | Matthew Krupcale
Use two 16-bit shuffles: one for the low 64-bits and one for the high 64-bits.
* c/blake3_sse2.c: emulate _mm_shuffle_epi8 rot16 using SSE2 intrinsics
2020-08-24 | SSE2 intrinsic: emulate _mm_blend_epi16 SSE4.1 intrinsic with SSE2 intrinsics | Matthew Krupcale
Use a constant mask to blend according to (mask & b) | ((~mask) & a).
* src/rust_sse2.rs: emulate _mm_blend_epi16 using SSE2 intrinsics
* c/blake3_sse2.c: Likewise.
2020-08-24 | Start SSE2 implementation based on SSE4.1 version | Matthew Krupcale
Wire up basic functions and features for SSE2 support using the SSE4.1 version as a basis without implementing the SSE2 instructions yet.
* Cargo.toml: add no_sse2 feature
* benches/bench.rs: wire SSE2 benchmarks
* build.rs: add SSE2 rust intrinsics and assembly builds
* c/Makefile.testing: add SSE2 C and assembly targets
* c/README.md: add SSE2 to C build instructions
* c/blake3_c_rust_bindings/build.rs: add SSE2 C rust binding builds
* c/blake3_c_rust_bindings/src/lib.rs: add SSE2 C rust bindings
* c/blake3_dispatch.c: add SSE2 C dispatch
* c/blake3_impl.h: add SSE2 C function prototypes
* c/blake3_sse2.c: add SSE2 C intrinsic file starting with SSE4.1 version
* c/blake3_sse2_x86-64_{unix.S,windows_gnu.S,windows_msvc.asm}: add SSE2 assembly files starting with SSE4.1 version
* src/ffi_sse2.rs: add rust implementation using SSE2 C rust bindings
* src/lib.rs: add SSE2 rust intrinsics and SSE2 C rust binding rust SSE2 module configurations
* src/platform.rs: add SSE2 rust platform detection and dispatch
* src/rust_sse2.rs: add SSE2 rust intrinsic file starting with SSE4.1 version
* tools/instruction_set_support/src/main.rs: add SSE2 feature detection
2020-08-23 | Fix #109 | Samuel Neves
The default executable stack setting on Linux can be fixed in two different ways:
- By adding the `.section .note.GNU-stack,"",%progbits` special incantation
- By passing the `--noexecstack` flag to the assembler
This patch implements both, but only one of them is strictly necessary. I've also added some additional hardening flags to the Makefile. May not be portable.
2020-08-19 | assembly authorship in the README | Jack O'Connor
2020-08-14 | the same hex example for rustdocs | Jack O'Connor
2020-08-14 | tweak the readme hex example | Jack O'Connor