microhash
microhash is a lightweight, non-cryptographic 64-bit hash function for systems ranging from x86-64 servers to resource-constrained embedded targets. It is designed for hash tables, checksums, and data fingerprinting where a small, deterministic implementation matters more than cryptographic strength.
View the specification | Build and test microhash | Browse the source on GitHub
microhash is not a cryptographic hash. Do not use it for passwords, signatures, authentication, or other security-sensitive purposes.
At a glance
| Property | Value |
|---|---|
| Output | 64-bit digest |
| Internal state | Two 32-bit words |
| Processing block | 32 bytes, with the first 16 bytes actively mixed |
| Core working memory | 32-byte block buffer and 8 bytes of state |
| C++ API | Header-only, with no heap allocation in the core overload |
| Implementations | C++17 and C# |
| Cryptographic | No |
The C++ implementation is intentionally small. Its core overload uses fixed-width integer arithmetic, bitwise operations, and a stack-allocated buffer:
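#include <cstddef>  // size_t
#include <cstdint>  // uint8_t, uint64_t
#include "microhash.hpp"
const uint8_t* data = /* ... */;    // pointer to the input bytes
size_t length = /* ... */;          // number of input bytes
uint64_t digest = MicroHash::hashPipe::ComputeHash(data, length);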
Quick start
Build the C++ command-line tool from the repository root:
g++ -std=c++17 -O2 -o microhash src/cpp/main.cpp
./microhash "Hello, World!"
Expected output:
microhash("Hello, World!") = 0x352256EFEDC72BD1
The CLI also accepts multiple words, reads input interactively when run without arguments, and provides a --test option:
./microhash The quick brown fox
./microhash
./microhash --test
For the C# implementation:
dotnet run --project src/csharp/microhash.csproj -- "Hello, World!"
See Building and Testing for debug and release builds, the C++ and C# test suites, benchmarks, and Docker usage.
Algorithm summary
microhash starts from two fixed 32-bit constants:
state[0] = 0x243F6A88
state[1] = 0x85A308D3
The input is padded to a multiple of 32 bytes. The first 16 bytes of each block are read as four little-endian words. Each word updates both state values with rotate, XOR, and addition operations:
state[0] = ROL32(state[0] XOR word, 5) + state[1]
state[1] = ROL32(state[1] + word, 11) XOR state[0]
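A minimal C++ sketch of this per-word update follows. The names `rol32` and `mix_word` are illustrative and not taken from the microhash source. The sketch assumes the two assignments run in order, so the second line sees the already-updated state[0]; the specification remains the reference for the exact semantics.

```cpp
#include <cstdint>

// Rotate a 32-bit value left by n bits (valid for 0 < n < 32).
constexpr std::uint32_t rol32(std::uint32_t x, unsigned n) {
    return (x << n) | (x >> (32u - n));
}

// Hypothetical helper: fold one 32-bit word into the two-word state.
// Assumption: the updates are sequential, so state[1] is XORed with the
// new value of state[0]. Unsigned arithmetic wraps modulo 2^32.
constexpr void mix_word(std::uint32_t (&state)[2], std::uint32_t word) {
    state[0] = rol32(state[0] ^ word, 5) + state[1];
    state[1] = rol32(state[1] + word, 11) ^ state[0];
}
```

Under these assumptions, starting from an all-zero state and mixing the word 1 gives state[0] = 32 and state[1] = 2080, which is easy to check by hand.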
The final 64-bit digest combines both accumulators:
final = state[0] XOR ROL32(state[1], 3)
digest = (final << 32) | state[1]
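In C++ the combination step could be sketched as below. `finalise` and `final_word` are hypothetical names chosen for illustration. The one detail the pseudocode leaves implicit is that the 32-bit intermediate has to be widened to 64 bits before the shift.

```cpp
#include <cstdint>

// Rotate a 32-bit value left by n bits (valid for 0 < n < 32).
constexpr std::uint32_t rol32(std::uint32_t x, unsigned n) {
    return (x << n) | (x >> (32u - n));
}

// Hypothetical helper: combine the two 32-bit accumulators into the digest.
constexpr std::uint64_t finalise(const std::uint32_t (&state)[2]) {
    const std::uint32_t final_word = state[0] ^ rol32(state[1], 3);
    // Widen before shifting: shifting a 32-bit value by 32 is undefined in C++.
    return (static_cast<std::uint64_t>(final_word) << 32) | state[1];
}
```

As the formula implies, the low 32 bits of the digest are state[1] unchanged, and the high 32 bits carry the mixed value.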
The complete specification documents padding, finalisation, output truncation, implementation differences, and porting notes for constrained platforms.
Embedded targets
The C++ core has no operating-system dependency and performs no heap allocation. It can be adapted to environments with fewer than 32 bytes of working RAM by processing one word at a time instead of retaining a complete block buffer.
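One way the word-at-a-time adaptation might look is sketched here. `WordStreamer` is a hypothetical type, not part of the microhash source. It assumes the sequential state update described in the algorithm summary and deliberately omits padding and finalisation, which the specification defines. Its working memory is the 8-byte state, a 4-byte word accumulator, and a 1-byte block offset.

```cpp
#include <cstdint>

// Hypothetical streaming mixer: no 32-byte block buffer is retained.
// Padding and finalisation are defined by the specification and omitted here.
struct WordStreamer {
    std::uint32_t state[2] = {0x243F6A88u, 0x85A308D3u};
    std::uint32_t word = 0;   // little-endian word being assembled
    std::uint8_t offset = 0;  // position within the current 32-byte block

    static std::uint32_t rol32(std::uint32_t x, unsigned n) {
        return (x << n) | (x >> (32u - n));
    }

    void feed(std::uint8_t byte) {
        if (offset < 16) {
            // Place the byte at its little-endian position in the word.
            word |= static_cast<std::uint32_t>(byte) << (8u * (offset & 3u));
            if ((offset & 3u) == 3u) {
                // A full word is ready: mix it, then reuse the accumulator.
                state[0] = rol32(state[0] ^ word, 5) + state[1];
                state[1] = rol32(state[1] + word, 11) ^ state[0];
                word = 0;
            }
        }
        // Bytes 16..31 of the block are skipped, as in the block-based form.
        offset = static_cast<std::uint8_t>((offset + 1u) & 31u);
    }
};
```

A caller would invoke `feed` once per padded input byte and then apply the finalisation step to `state`.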
The specification includes implementation guidance for:
- Z80 and CP/M systems
- Motorola 68000 targets
- MOS 6502 systems
- Other languages with wrapping 32-bit arithmetic and bitwise operators
Known limitations
- The algorithm is not designed to resist collision, preimage, or length-extension attacks.
- Only bytes 0 through 15 of each 32-byte block are mixed. Bytes 16 through 31, including the encoded length field in the final block, do not currently influence the output.
- The C++ implementation assembles words explicitly and is host-endian safe. The C# implementation uses BitConverter.ToUInt32, so matching output is expected on little-endian hosts.
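The endianness point can be illustrated with a short sketch of explicit little-endian word assembly. `load_le32` is a hypothetical name and not necessarily the helper used in the microhash source. Shifting individual bytes into place yields the same word on any host, whereas reinterpreting four bytes in memory order depends on the host's byte order.

```cpp
#include <cstdint>

// Hypothetical helper: build a 32-bit word from four bytes, least
// significant byte first, independent of the host's byte order.
constexpr std::uint32_t load_le32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

For example, the byte sequence 0x88, 0x6A, 0x3F, 0x24 assembles to 0x243F6A88 on both little-endian and big-endian hosts.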
Documentation
- Specification: algorithm, implementations, porting notes, constrained-target adaptations, and design trade-offs.
- Building and Testing: C++, C#, Docker, benchmarks, test coverage, and statistical checks.
- README: repository overview and test vectors.
License
microhash is distributed under the MIT License.