A multithreaded and single threaded string interner that allows strings to be cached with a minimal memory footprint,
associating them with a unique key that can be used to retrieve them at any time. A Rodeo
allows O(1)
internment and resolution and can be turned into a RodeoReader
to allow for contention-free resolutions
with both key to str and str to key operations. It can also be turned into a RodeoResolver
with only
key to str operations for the lowest possible memory usage.
For single-threaded workloads Rodeo
is encouraged, while multi-threaded applications should use ThreadedRodeo
.
Both of these are the only way to intern strings, but most applications will hit a stage where they are done interning
strings, and at that point is where the choice between RodeoReader
and RodeoResolver
. If the user needs to get
keys for strings still, then they must use the RodeoReader
(although they can still transfer into a RodeoResolver
)
at this point. For users who just need key to string resolution, the RodeoResolver
gives contention-free access at the
minimum possible memory usage. Note that to gain access to ThreadedRodeo
the multi-threaded
feature is required.
By default lasso
has one dependency, hashbrown
, and only Rodeo
is exposed. Hashbrown is used since the
raw_entry
api is currently unstable in the standard library’s hashmap.
The raw hashmap API is used for custom hashing within the hashmaps, which works to dramatically reduce memory usage
To make use of ThreadedRodeo
, you must enable the multi-threaded
feature.
multi-threaded
- Enables ThreadedRodeo
, the interner for multi-threaded tasks
ahasher
- Use ahash
’s RandomState
as the default hasher
no-std
- Enables no_std
+ alloc
support for Rodeo
and ThreadedRodeo
- Automatically enables the following required features:
ahasher
- no_std
hashing function
serialize
- Implements Serialize
and Deserialize
for all Spur
types and all interners
inline-more
- Annotate external apis with #[inline]
use lasso::Rodeo;
let mut rodeo = Rodeo::default();
let key = rodeo.get_or_intern("Hello, world!");
assert_eq!("Hello, world!", rodeo.resolve(&key));
assert_eq!(Some(key), rodeo.get("Hello, world!"));
let key2 = rodeo.get_or_intern("Hello, world!");
assert_eq!(key, key2);
use lasso::ThreadedRodeo;
use std::{thread, sync::Arc};
let rodeo = Arc::new(ThreadedRodeo::default());
let key = rodeo.get_or_intern("Hello, world!");
assert_eq!("Hello, world!", rodeo.resolve(&key));
assert_eq!(Some(key), rodeo.get("Hello, world!"));
let key2 = rodeo.get_or_intern("Hello, world!");
assert_eq!(key, key2);
let moved = Arc::clone(&rodeo);
let hello = thread::spawn(move || {
assert_eq!("Hello, world!", moved.resolve(&key));
moved.get_or_intern("Hello from the thread!")
})
.join()
.unwrap();
assert_eq!("Hello, world!", rodeo.resolve(&key));
assert_eq!("Hello from the thread!", rodeo.resolve(&hello));
use lasso::Rodeo;
let mut rodeo = Rodeo::default();
let key = rodeo.get_or_intern("Hello, world!");
assert_eq!("Hello, world!", rodeo.resolve(&key));
let reader = rodeo.into_reader();
assert_eq!("Hello, world!", reader.resolve(&key));
assert_eq!(Some(key), reader.get("Hello, world!"));
use lasso::Rodeo;
let mut rodeo = Rodeo::default();
let key = rodeo.get_or_intern("Hello, world!");
assert_eq!("Hello, world!", rodeo.resolve(&key));
let resolver = rodeo.into_resolver();
assert_eq!("Hello, world!", resolver.resolve(&key));
Sometimes you want your keys to only inhabit (or not inhabit) a certain range of values so that you can have custom niches,
meaning that an Option
<
Spur
>
is the same size as a Spur
. This allows you to pack more data into
what would otherwise be unused space, which can be critical for memory-sensitive applications.
use lasso::{Key, Rodeo};
#[derive(Copy, Clone, PartialEq, Eq)]
struct NicheKey(u32);
const NICHE: usize = 0xFF000000;
unsafe impl Key for NicheKey {
fn into_usize(self) -> usize {
self.0 as usize
}
fn try_from_usize(int: usize) -> Option<Self> {
if int < NICHE {
Some(Self(int as u32))
} else {
None
}
}
}
#[test]
fn value_in_range() {
let key = NicheKey::try_from_usize(0).unwrap();
assert_eq!(key.into_usize(), 0);
let key = NicheKey::try_from_usize(NICHE - 1).unwrap();
assert_eq!(key.into_usize(), NICHE - 1);
}
#[test]
fn value_out_of_range() {
let key = NicheKey::try_from_usize(NICHE);
assert!(key.is_none());
let key = NicheKey::try_from_usize(u32::max_value() as usize);
assert!(key.is_none());
}
let mut rodeo: Rodeo<NicheKey> = Rodeo::new();
let key = rodeo.get_or_intern("It works!");
assert_eq!(rodeo.resolve(&key), "It works!");
use lasso::Rodeo;
use core::iter::FromIterator;
let rodeo: Rodeo = vec!["one string", "two string", "red string", "blue string"]
.into_iter()
.collect();
assert!(rodeo.contains("one string"));
assert!(rodeo.contains("two string"));
assert!(rodeo.contains("red string"));
assert!(rodeo.contains("blue string"));
use lasso::Rodeo;
use core::iter::FromIterator;
let rodeo: Rodeo = Rodeo::from_iter(vec![
"one string",
"two string",
"red string",
"blue string",
]);
assert!(rodeo.contains("one string"));
assert!(rodeo.contains("two string"));
assert!(rodeo.contains("red string"));
assert!(rodeo.contains("blue string"));
Benchmarks were gathered with Criterion.rs
OS: Windows 10
CPU: Ryzen 9 3900X at 3800Mhz
RAM: 3200Mhz
Rustc: Stable 1.44.1
Method | Time | Throughput |
resolve | 1.9251 μs | 13.285 GiB/s |
try_resolve | 1.9214 μs | 13.311 GiB/s |
resolve_unchecked | 1.4356 μs | 17.816 GiB/s |
get_or_intern (empty) | 60.350 μs | 433.96 MiB/s |
get_or_intern (filled) | 57.415 μs | 456.15 MiB/s |
try_get_or_intern (empty) | 58.978 μs | 444.06 MiB/s |
try_get_or_intern (filled) | 57.421 μs | 456.10 MiB/s |
get (empty) | 37.288 μs | 702.37 MiB/s |
get (filled) | 55.095 μs | 475.36 MiB/s |
Method | Time | Throughput |
try_resolve | 1.9282 μs | 13.264 GiB/s |
resolve | 1.9404 μs | 13.181 GiB/s |
resolve_unchecked | 1.4328 μs | 17.851 GiB/s |
get_or_intern (empty) | 38.029 μs | 688.68 MiB/s |
get_or_intern (filled) | 33.650 μs | 778.30 MiB/s |
try_get_or_intern (empty) | 39.392 μs | 664.84 MiB/s |
try_get_or_intern (filled) | 33.435 μs | 783.31 MiB/s |
get (empty) | 12.565 μs | 2.0356 GiB/s |
get (filled) | 26.545 μs | 986.61 MiB/s |
Method | Time | Throughput |
resolve | 1.9014 μs | 13.451 GiB/s |
try_resolve | 1.9278 μs | 13.267 GiB/s |
resolve_unchecked | 1.4449 μs | 17.701 GiB/s |
get_or_intern (empty) | 32.523 μs | 805.27 MiB/s |
get_or_intern (filled) | 30.281 μs | 864.88 MiB/s |
try_get_or_intern (empty) | 31.630 μs | 828.00 MiB/s |
try_get_or_intern (filled) | 31.002 μs | 844.78 MiB/s |
get (empty) | 12.699 μs | 2.0141 GiB/s |
get (filled) | 29.220 μs | 896.28 MiB/s |
Method | Time (1 Thread) | Throughput (1 Thread) | Time (24 Threads) | Throughput (24 Threads) |
resolve | 54.336 μs | 482.00 MiB/s | 364.27 μs | 71.897 MiB/s |
try_resolve | 54.582 μs | 479.82 MiB/s | 352.67 μs | 74.261 MiB/s |
get_or_intern (empty) | 266.03 μs | 98.447 MiB/s | N\A | N\A |
get_or_intern (filled) | 103.04 μs | 254.17 MiB/s | 441.42 μs | 59.331 MiB/s |
try_get_or_intern (empty) | 261.80 μs | 100.04 MiB/s | N\A | N\A |
try_get_or_intern (filled) | 102.61 μs | 255.25 MiB/s | 447.42 μs | 58.535 MiB/s |
get (empty) | 80.346 μs | 325.96 MiB/s | N\A | N\A |
get (filled) | 92.669 μs | 282.62 MiB/s | 439.24 μs | 59.626 MiB/s |
Method | Time (1 Thread) | Throughput (1 Thread) | Time (24 Threads) | Throughput (24 Threads) |
resolve | 22.261 μs | 1.1489 GiB/s | 265.46 μs | 98.658 MiB/s |
try_resolve | 22.378 μs | 1.1429 GiB/s | 268.58 μs | 97.513 MiB/s |
get_or_intern (empty) | 157.86 μs | 165.91 MiB/s | N\A | N\A |
get_or_intern (filled) | 56.320 μs | 465.02 MiB/s | 357.13 μs | 73.335 MiB/s |
try_get_or_intern (empty) | 161.46 μs | 162.21 MiB/s | N\A | N\A |
try_get_or_intern (filled) | 55.874 μs | 468.73 MiB/s | 360.25 μs | 72.698 MiB/s |
get (empty) | 43.520 μs | 601.79 MiB/s | N\A | N\A |
get (filled) | 53.720 μs | 487.52 MiB/s | 360.66 μs | 72.616 MiB/s |
Method | Time (1 Thread) | Throughput (1 Thread) | Time (24 Threads) | Throughput (24 Threads) |
try_resolve | 17.289 μs | 1.4794 GiB/s | 238.29 μs | 109.91 MiB/s |
resolve | 19.833 μs | 1.2896 GiB/s | 237.05 μs | 110.48 MiB/s |
get_or_intern (empty) | 130.97 μs | 199.97 MiB/s | N\A | N\A |
get_or_intern (filled) | 42.630 μs | 614.35 MiB/s | 301.60 μs | 86.837 MiB/s |
try_get_or_intern (empty) | 129.30 μs | 202.55 MiB/s | N\A | N\A |
try_get_or_intern (filled) | 42.508 μs | 616.12 MiB/s | 337.29 μs | 77.648 MiB/s |
get (empty) | 28.001 μs | 935.30 MiB/s | N\A | N\A |
get (filled) | 37.700 μs | 694.68 MiB/s | 292.15 μs | 89.645 MiB/s |
Method | Time (1 Thread) | Throughput (1 Thread) | Time (24 Threads) | Throughput (24 Threads) |
resolve | 1.9398 μs | 13.185 GiB/s | 4.3153 μs | 5.9269 GiB/s |
try_resolve | 1.9315 μs | 13.242 GiB/s | 4.1956 μs | 6.0959 GiB/s |
resolve_unchecked | 1.4416 μs | 17.741 GiB/s | 3.1204 μs | 8.1964 GiB/s |
get (empty) | 38.886 μs | 673.50 MiB/s | N\A | N\A |
get (filled) | 56.271 μs | 465.42 MiB/s | 105.12 μs | 249.14 MiB/s |
Method | Time (1 Thread) | Throughput (1 Thread) | Time (24 Threads) | Throughput (24 Threads) |
resolve | 1.9404 μs | 13.181 GiB/s | 4.1881 μs | 6.1069 GiB/s |
try_resolve | 1.8932 μs | 13.509 GiB/s | 4.2410 μs | 6.0306 GiB/s |
resolve_unchecked | 1.4128 μs | 18.103 GiB/s | 3.1691 μs | 8.0703 GiB/s |
get (empty) | 11.952 μs | 2.1399 GiB/s | N\A | N\A |
get (filled) | 27.093 μs | 966.65 MiB/s | 56.269 μs | 465.44 MiB/s |
Method | Time (1 Thread) | Throughput (1 Thread) | Time (24 Threads) | Throughput (24 Threads) |
resolve | 1.8987 μs | 13.471 GiB/s | 4.2117 μs | 6.0727 GiB/s |
try_resolve | 1.9103 μs | 13.389 GiB/s | 4.2254 μs | 6.0529 GiB/s |
resolve_unchecked | 1.4469 μs | 17.677 GiB/s | 3.0923 μs | 8.2709 GiB/s |
get (empty) | 12.994 μs | 1.9682 GiB/s | N\A | N\A |
get (filled) | 29.745 μs | 880.49 MiB/s | 52.387 μs | 499.93 MiB/s |
Method | Time (1 Thread) | Throughput (1 Thread) | Time (24 Threads) | Throughput (24 Threads) |
resolve | 1.9416 μs | 13.172 GiB/s | 3.9114 μs | 6.5387 GiB/s |
try_resolve | 1.9264 μs | 13.277 GiB/s | 3.9289 μs | 6.5097 GiB/s |
resolve_unchecked | 1.6638 μs | 15.372 GiB/s | 3.1741 μs | 8.0578 GiB/s |
The amount of strings and bytes that an interner can hold before reallocating
An iterator over an interner’s strings and keys
A key type taking up size_of::<usize>()
bytes of space (generally 4 or 8 bytes)
An error encountered while using an interner
Settings for the memory consumption of an interner
A miniature Key utilizing only 8 bits of space
A miniature Key utilizing only 16 bits of space
A string interner that caches strings quickly with a minimal memory footprint,
returning a unique key to re-access it with O(1)
times.
A read-only view of a
Rodeo
or
ThreadedRodeo
that allows contention-free access to interned strings,
both key to string resolution and string to key lookups
A read-only view of a
Rodeo
or
ThreadedRodeo
that allows contention-free access to interned strings
with only key to string resolution
The default key for every Rodeo, uses only 32 bits of space
An iterator over an interner’s strings
A concurrent string interner that caches strings quickly with a minimal memory footprint,
returning a unique key to re-access it with O(1)
internment and resolution.
The kind of error that occurred
A generic interface over any underlying interner, allowing storing and accessing
interned strings
A generic interface over interners that can be turned into a
Reader
.
A generic interface over interners that can be turned into both a
Reader
and a
Resolver
directly.
Types implementing this trait can be used as keys for all Rodeos
A generic interface that allows using any underlying interner for
both its reading and resolution capabilities, allowing both
str -> key
and key -> str
lookups
A generic interface that allows using any underlying interner only
for its resolution capabilities, allowing only key -> str
lookups
A continence type for an error from an interner