diff options
| author | Stefan Boberg <[email protected]> | 2025-11-07 14:49:13 +0100 |
|---|---|---|
| committer | GitHub Enterprise <[email protected]> | 2025-11-07 14:49:13 +0100 |
| commit | 24e43a913f29ac3b314354e8ce5175f135bcc64f (patch) | |
| tree | ca442937ceeb63461012b33a4576e9835099f106 /thirdparty/blake3/src/io.rs | |
| parent | get oplog attachments (#622) (diff) | |
| download | zen-24e43a913f29ac3b314354e8ce5175f135bcc64f.tar.xz zen-24e43a913f29ac3b314354e8ce5175f135bcc64f.zip | |
switch to xmake for package management (#611)
This change removes our dependency on vcpkg for package management, in favour of bringing some code in-tree in the `thirdparty` folder as well as using the xmake build-in package management feature. For the latter, all the package definitions are maintained in the zen repo itself, in the `repo` folder.
It should now also be easier to build the project as it will no longer depend on having the right version of vcpkg installed, which has been a common problem for new people coming in to the codebase. Now you should only need xmake to build.
* Bumps xmake requirement on github runners to 2.9.9 to resolve an issue where xmake on Windows invokes cmake with `v144` toolchain which does not exist
* BLAKE3 is now in-tree at `thirdparty/blake3`
* cpr is now in-tree at `thirdparty/cpr`
* cxxopts is now in-tree at `thirdparty/cxxopts`
* fmt is now in-tree at `thirdparty/fmt`
* robin-map is now in-tree at `thirdparty/robin-map`
* ryml is now in-tree at `thirdparty/ryml`
* sol2 is now in-tree at `thirdparty/sol2`
* spdlog is now in-tree at `thirdparty/spdlog`
* utfcpp is now in-tree at `thirdparty/utfcpp`
* xmake package repo definitions is in `repo`
* implemented support for sanitizers. ASAN is supported on windows, TSAN, UBSAN, MSAN etc are supported on Linux/MacOS though I have not yet tested it extensively on MacOS
* the zencore encryption implementation also now supports using mbedTLS which is used on MacOS, though for now we still use openssl on Linux
* crashpad
* bumps libcurl to 8.11.0 (from 8.8.0) which should address a rare build upload bug
Diffstat (limited to 'thirdparty/blake3/src/io.rs')
| -rw-r--r-- | thirdparty/blake3/src/io.rs | 64 |
1 files changed, 64 insertions, 0 deletions
diff --git a/thirdparty/blake3/src/io.rs b/thirdparty/blake3/src/io.rs new file mode 100644 index 000000000..7e8e154f7 --- /dev/null +++ b/thirdparty/blake3/src/io.rs @@ -0,0 +1,64 @@ +//! Helper functions for efficient IO. + +#[cfg(feature = "std")] +pub(crate) fn copy_wide( + mut reader: impl std::io::Read, + hasher: &mut crate::Hasher, +) -> std::io::Result<u64> { + let mut buffer = [0; 65536]; + let mut total = 0; + loop { + match reader.read(&mut buffer) { + Ok(0) => return Ok(total), + Ok(n) => { + hasher.update(&buffer[..n]); + total += n as u64; + } + // see test_update_reader_interrupted + Err(e) if e.kind() == std::io::ErrorKind::Interrupted => continue, + Err(e) => return Err(e), + } + } +} + +// Mmap a file, if it looks like a good idea. Return None in cases where we know mmap will fail, or +// if the file is short enough that mmapping isn't worth it. However, if we do try to mmap and it +// fails, return the error. +// +// SAFETY: Mmaps are fundamentally unsafe, because you can call invariant-checking functions like +// str::from_utf8 on them and then have them change out from under you. Letting a safe caller get +// their hands on an mmap, or even a &[u8] that's backed by an mmap, is unsound. However, because +// this function is crate-private, we can guarantee that all can ever happen in the event of a race +// condition is that we either hash nonsense bytes or crash with SIGBUS or similar, neither of +// which should risk memory corruption in a safe caller. +// +// PARANOIA: But a data race...is a data race...is a data race...right? Even if we know that no +// platform in the "real world" is ever going to do anything other than compute the "wrong answer" +// if we race on this mmap while we hash it, aren't we still supposed to feel bad about doing this? +// Well, maybe. This is IO, and IO gets special carve-outs in the memory model. Consider a +// memory-mapped register that returns random 32-bit words. (This is actually realistic if you have +// a hardware RNG.) It's probably sound to construct a *const i32 pointing to that register and do +// some raw pointer reads from it. Those reads should be volatile if you don't want the compiler to +// coalesce them, but either way the compiler isn't allowed to just _go nuts_ and insert +// should-never-happen branches to wipe your hard drive if two adjacent reads happen to give +// different values. As far as I'm aware, there's no such thing as a read that's allowed if it's +// volatile but prohibited if it's not (unlike atomics). As mentioned above, it's not ok to +// construct a safe &i32 to the register if you're going to leak that reference to unknown callers. +// But if you "know what you're doing," I don't think *const i32 and &i32 are fundamentally +// different here. Feedback needed. +#[cfg(feature = "mmap")] +pub(crate) fn maybe_mmap_file(file: &std::fs::File) -> std::io::Result<Option<memmap2::Mmap>> { + let metadata = file.metadata()?; + let file_size = metadata.len(); + if !metadata.is_file() { + // Not a real file. + Ok(None) + } else if file_size < 16 * 1024 { + // Mapping small files is not worth it, and some special files that can't be mapped report + // a size of zero. + Ok(None) + } else { + let map = unsafe { memmap2::Mmap::map(file)? }; + Ok(Some(map)) + } +} |