| Age | Commit message (Collapse) | Author |
|
Temporarily skip blake3_xof_many_avx512 on Cygwin, as is already done for _WIN32.
This workaround should be removed when blake3_xof_many_avx512 is implemented in blake3_avx512_x86-64_windows_gnu.S .
fixes https://github.com/BLAKE3-team/BLAKE3/issues/494
|
|
the #endif was in the wrong place, causing the error
/Users/runner/work/php-src/php-src/ext/hash/blake3/upstream_blake3/c/blake3_dispatch.c:112:5: error: unused function 'get_cpu_features' [-Werror,-Wunused-function]
full compiler log https://github.com/php/php-src/actions/runs/7762643678/job/21173438425?pr=13194
|
|
caused
```
/Users/runner/work/php-src/php-src/ext/hash/blake3/upstream_blake3/c/blake3_dispatch.c:237:26: error: unused variable 'features' [-Werror,-Wunused-variable]
const enum cpu_feature features = get_cpu_features();
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
If multiple threads try to compute a hash simultaneously before the library has been used for the first time,
the logic in get_cpu_features that detects CPU features will write to g_cpu_features without synchronization,
which is a race condition and flagged by ThreadSanitizer.
This change marks g_cpu_features as an atomic variable to address the race condition.
|
|
SSSE3 is indicated by bit 9 of ECX, not bit 0, which indicates the
presence of SSE3.
There are very few CPUs in use affected by this bug; SSE3 was part of
the Prescott new instructions, introduced in the later Pentium 4 chips,
whereas SSSE3 was introduced in Intel's Core 2 and AMD's Bulldozer. This
leaves a few Pentium 4 and Athlon 64 models that will potentially run an
illegal pshufb or pblendw.
|
|
|
|
|
|
|
|
patch by Samuel Neves ( https://github.com/sneves )
if you tried to compile blake3_dispatch.c with
-Wall -Werror -DBLAKE3_NO_SSE2 -DBLAKE3_NO_SSE41 -DBLAKE3_NO_AVX2 -DBLAKE3_NO_AVX512
something like this would happen:
hans@xDevAd:~/projects/BLAKE3/c$ gcc -O0 -o example example.c blake3.c blake3_dispatch.c blake3_portable.c blake3_sse2_x86-64_unix.S blake3_sse41_x86-64_unix.S blake3_avx2_x86-64_unix.S blake3_avx512_x86-64_unix.S -DBLAKE3_NO_SSE2 -DBLAKE3_NO_SSE41 -DBLAKE3_NO_AVX2 -DBLAKE3_NO_AVX512 -Wall -Wextra -Wpedantic -Werror
blake3_dispatch.c: In function ‘blake3_compress_in_place’:
blake3_dispatch.c:139:26: error: unused variable ‘features’ [-Werror=unused-variable]
139 | const enum cpu_feature features = get_cpu_features();
| ^~~~~~~~
blake3_dispatch.c: In function ‘blake3_compress_xof’:
blake3_dispatch.c:167:26: error: unused variable ‘features’ [-Werror=unused-variable]
167 | const enum cpu_feature features = get_cpu_features();
| ^~~~~~~~
blake3_dispatch.c: In function ‘blake3_hash_many’:
blake3_dispatch.c:195:26: error: unused variable ‘features’ [-Werror=unused-variable]
195 | const enum cpu_feature features = get_cpu_features();
| ^~~~~~~~
blake3_dispatch.c: In function ‘blake3_simd_degree’:
blake3_dispatch.c:244:26: error: unused variable ‘features’ [-Werror=unused-variable]
244 | const enum cpu_feature features = get_cpu_features();
| ^~~~~~~~
cc1: all warnings being treated as errors
|
|
Wire up basic functions and features for SSE2 support using the SSE4.1 version
as a basis without implementing the SSE2 instructions yet.
* Cargo.toml: add no_sse2 feature
* benches/bench.rs: wire SSE2 benchmarks
* build.rs: add SSE2 rust intrinsics and assembly builds
* c/Makefile.testing: add SSE2 C and assembly targets
* c/README.md: add SSE2 to C build instructions
* c/blake3_c_rust_bindings/build.rs: add SSE2 C rust binding builds
* c/blake3_c_rust_bindings/src/lib.rs: add SSE2 C rust bindings
* c/blake3_dispatch.c: add SSE2 C dispatch
* c/blake3_impl.h: add SSE2 C function prototypes
* c/blake3_sse2.c: add SSE2 C intrinsic file starting with SSE4.1 version
* c/blake3_sse2_x86-64_{unix.S,windows_gnu.S,windows_msvc.asm}: add SSE2
assembly files starting with SSE4.1 version
* src/ffi_sse2.rs: add rust implementation using SSE2 C rust bindings
* src/lib.rs: add SSE2 rust intrinsics and SSE2 C rust binding rust SSE2 module
configurations
* src/platform.rs: add SSE2 rust platform detection and dispatch
* src/rust_sse2.rs: add SSE2 rust intrinsic file starting with SSE4.1 version
* tools/instruction_set_support/src/main.rs: add SSE2 feature detection
|
|
Whereas vinserti64x4 is present on AVX512F, vinserti32x8 requires
AVX512DQ, which we do not test for. At this point there is not
much risk of incompatibility, since Skylake-X chips have all the
requires instruction sets, but let's be precise about this.
|
|
|
|
|
|
Thanks to bit4 for spotting this bug.
|
|
Currently this requires setting the BLAKE3_USE_NEON preprocessor flag.
In the future we may enable this automatically on AArch32/64 or include
some kind of dynamic feature detection. (Though ARM makes this harder
than x86.)
As part of this, get rid of the IS_ARM flag. It wasn't being set
properly when I tried it on a Raspberry Pi.
Closes #30.
|
|
|
|
|
|
This recursive function performs parallel parent node hashing, which is
an important optimization.
|
|
|
|
This means that compiling C sources includes all implementations by
default, which is what most callers are going to want.
|
|
|
|
This is commit 4476d9da0e370993823e7ad17592b84e905afd76 of
https://github.com/veorq/BLAKE3-c.
|