Crate lasso

source ·
Expand description

CI Security Audit Coverage LoC Docs.rs Crates.io

A multithreaded and single threaded string interner that allows strings to be cached with a minimal memory footprint, associating them with a unique key that can be used to retrieve them at any time. A Rodeo allows O(1) internment and resolution and can be turned into a RodeoReader to allow for contention-free resolutions with both key to str and str to key operations. It can also be turned into a RodeoResolver with only key to str operations for the lowest possible memory usage.

Which interner do I use?

For single-threaded workloads Rodeo is encouraged, while multi-threaded applications should use ThreadedRodeo. Both of these are the only way to intern strings, but most applications will hit a stage where they are done interning strings, and at that point is where the choice between RodeoReader and RodeoResolver. If the user needs to get keys for strings still, then they must use the RodeoReader (although they can still transfer into a RodeoResolver) at this point. For users who just need key to string resolution, the RodeoResolver gives contention-free access at the minimum possible memory usage. Note that to gain access to ThreadedRodeo the multi-threaded feature is required.

InternerThread-safeIntern Stringstr to keykey to strContention FreeMemory Usage
RodeoN/AMedium
ThreadedRodeoMost
RodeoReaderMedium
RodeoResolverLeast

Cargo Features

By default lasso has one dependency, hashbrown, and only Rodeo is exposed. Hashbrown is used since the raw_entry api is currently unstable in the standard library’s hashmap. The raw hashmap API is used for custom hashing within the hashmaps, which works to dramatically reduce memory usage To make use of ThreadedRodeo, you must enable the multi-threaded feature.

  • multi-threaded - Enables ThreadedRodeo, the interner for multi-threaded tasks
  • ahasher - Use ahash’s RandomState as the default hasher
  • no-std - Enables no_std + alloc support for Rodeo and ThreadedRodeo
    • Automatically enables the following required features:
      • ahasher - no_std hashing function
  • serialize - Implements Serialize and Deserialize for all Spur types and all interners
  • inline-more - Annotate external apis with #[inline]

Example: Using Rodeo

use lasso::Rodeo;

let mut rodeo = Rodeo::default();
let key = rodeo.get_or_intern("Hello, world!");

// Easily retrieve the value of a key and find the key for values
assert_eq!("Hello, world!", rodeo.resolve(&key));
assert_eq!(Some(key), rodeo.get("Hello, world!"));

// Interning the same string again will yield the same key
let key2 = rodeo.get_or_intern("Hello, world!");

assert_eq!(key, key2);

Example: Using ThreadedRodeo

use lasso::ThreadedRodeo;
use std::{thread, sync::Arc};

let rodeo = Arc::new(ThreadedRodeo::default());
let key = rodeo.get_or_intern("Hello, world!");

// Easily retrieve the value of a key and find the key for values
assert_eq!("Hello, world!", rodeo.resolve(&key));
assert_eq!(Some(key), rodeo.get("Hello, world!"));

// Interning the same string again will yield the same key
let key2 = rodeo.get_or_intern("Hello, world!");

assert_eq!(key, key2);

// ThreadedRodeo can be shared across threads
let moved = Arc::clone(&rodeo);
let hello = thread::spawn(move || {
    assert_eq!("Hello, world!", moved.resolve(&key));
    moved.get_or_intern("Hello from the thread!")
})
.join()
.unwrap();

assert_eq!("Hello, world!", rodeo.resolve(&key));
assert_eq!("Hello from the thread!", rodeo.resolve(&hello));

Example: Creating a RodeoReader

use lasso::Rodeo;

// Rodeo and ThreadedRodeo are interchangeable here
let mut rodeo = Rodeo::default();

let key = rodeo.get_or_intern("Hello, world!");
assert_eq!("Hello, world!", rodeo.resolve(&key));

let reader = rodeo.into_reader();

// Reader keeps all the strings from the parent
assert_eq!("Hello, world!", reader.resolve(&key));
assert_eq!(Some(key), reader.get("Hello, world!"));

// The Reader can now be shared across threads, no matter what kind of Rodeo created it

Example: Creating a RodeoResolver

use lasso::Rodeo;

// Rodeo and ThreadedRodeo are interchangeable here
let mut rodeo = Rodeo::default();

let key = rodeo.get_or_intern("Hello, world!");
assert_eq!("Hello, world!", rodeo.resolve(&key));

let resolver = rodeo.into_resolver();

// Resolver keeps all the strings from the parent
assert_eq!("Hello, world!", resolver.resolve(&key));

// The Resolver can now be shared across threads, no matter what kind of Rodeo created it

Example: Making a custom-ranged key

Sometimes you want your keys to only inhabit (or not inhabit) a certain range of values so that you can have custom niches, meaning that an Option<Spur> is the same size as a Spur. This allows you to pack more data into what would otherwise be unused space, which can be critical for memory-sensitive applications.

use lasso::{Key, Rodeo};

// First make our key type, this will be what we use as handles into our interner
#[derive(Copy, Clone, PartialEq, Eq)]
struct NicheKey(u32);

// This will reserve the upper 255 values for us to use as niches
const NICHE: usize = 0xFF000000;

// Implementing `Key` is unsafe and requires that anything given to `try_from_usize` must produce the
// same `usize` when `into_usize` is later called
unsafe impl Key for NicheKey {
    fn into_usize(self) -> usize {
        self.0 as usize
    }

    fn try_from_usize(int: usize) -> Option<Self> {
        if int < NICHE {
            // The value isn't in our niche range, so we're good to go
            Some(Self(int as u32))
        } else {
            // The value interferes with our niche, so we return `None`
            None
        }
    }
}

// To make sure we're upholding `Key`'s safety contract, let's make two small tests
#[test]
fn value_in_range() {
    let key = NicheKey::try_from_usize(0).unwrap();
    assert_eq!(key.into_usize(), 0);

    let key = NicheKey::try_from_usize(NICHE - 1).unwrap();
    assert_eq!(key.into_usize(), NICHE - 1);
}

#[test]
fn value_out_of_range() {
    let key = NicheKey::try_from_usize(NICHE);
    assert!(key.is_none());

    let key = NicheKey::try_from_usize(u32::max_value() as usize);
    assert!(key.is_none());
}

// And now we're done and can make `Rodeo`s or `ThreadedRodeo`s that use our custom key!
let mut rodeo: Rodeo<NicheKey> = Rodeo::new();
let key = rodeo.get_or_intern("It works!");
assert_eq!(rodeo.resolve(&key), "It works!");

Example: Creation using FromIterator

use lasso::Rodeo;
use core::iter::FromIterator;

// Works for both `Rodeo` and `ThreadedRodeo`
let rodeo: Rodeo = vec!["one string", "two string", "red string", "blue string"]
    .into_iter()
    .collect();

assert!(rodeo.contains("one string"));
assert!(rodeo.contains("two string"));
assert!(rodeo.contains("red string"));
assert!(rodeo.contains("blue string"));
use lasso::Rodeo;
use core::iter::FromIterator;

// Works for both `Rodeo` and `ThreadedRodeo`
let rodeo: Rodeo = Rodeo::from_iter(vec![
    "one string",
    "two string",
    "red string",
    "blue string",
]);

assert!(rodeo.contains("one string"));
assert!(rodeo.contains("two string"));
assert!(rodeo.contains("red string"));
assert!(rodeo.contains("blue string"));

Benchmarks

Benchmarks were gathered with Criterion.rs
OS: Windows 10
CPU: Ryzen 9 3900X at 3800Mhz
RAM: 3200Mhz
Rustc: Stable 1.44.1

Rodeo

STD RandomState
MethodTimeThroughput
resolve1.9251 μs13.285 GiB/s
try_resolve1.9214 μs13.311 GiB/s
resolve_unchecked1.4356 μs17.816 GiB/s
get_or_intern (empty)60.350 μs433.96 MiB/s
get_or_intern (filled)57.415 μs456.15 MiB/s
try_get_or_intern (empty)58.978 μs444.06 MiB/s
try_get_or_intern (filled)57.421 μs456.10 MiB/s
get (empty)37.288 μs702.37 MiB/s
get (filled)55.095 μs475.36 MiB/s
AHash
MethodTimeThroughput
try_resolve1.9282 μs13.264 GiB/s
resolve1.9404 μs13.181 GiB/s
resolve_unchecked1.4328 μs17.851 GiB/s
get_or_intern (empty)38.029 μs688.68 MiB/s
get_or_intern (filled)33.650 μs778.30 MiB/s
try_get_or_intern (empty)39.392 μs664.84 MiB/s
try_get_or_intern (filled)33.435 μs783.31 MiB/s
get (empty)12.565 μs2.0356 GiB/s
get (filled)26.545 μs986.61 MiB/s
FXHash
MethodTimeThroughput
resolve1.9014 μs13.451 GiB/s
try_resolve1.9278 μs13.267 GiB/s
resolve_unchecked1.4449 μs17.701 GiB/s
get_or_intern (empty)32.523 μs805.27 MiB/s
get_or_intern (filled)30.281 μs864.88 MiB/s
try_get_or_intern (empty)31.630 μs828.00 MiB/s
try_get_or_intern (filled)31.002 μs844.78 MiB/s
get (empty)12.699 μs2.0141 GiB/s
get (filled)29.220 μs896.28 MiB/s

ThreadedRodeo

STD RandomState
MethodTime (1 Thread)Throughput (1 Thread)Time (24 Threads)Throughput (24 Threads)
resolve54.336 μs482.00 MiB/s364.27 μs71.897 MiB/s
try_resolve54.582 μs479.82 MiB/s352.67 μs74.261 MiB/s
get_or_intern (empty)266.03 μs98.447 MiB/sN\AN\A
get_or_intern (filled)103.04 μs254.17 MiB/s441.42 μs59.331 MiB/s
try_get_or_intern (empty)261.80 μs100.04 MiB/sN\AN\A
try_get_or_intern (filled)102.61 μs255.25 MiB/s447.42 μs58.535 MiB/s
get (empty)80.346 μs325.96 MiB/sN\AN\A
get (filled)92.669 μs282.62 MiB/s439.24 μs59.626 MiB/s
AHash
MethodTime (1 Thread)Throughput (1 Thread)Time (24 Threads)Throughput (24 Threads)
resolve22.261 μs1.1489 GiB/s265.46 μs98.658 MiB/s
try_resolve22.378 μs1.1429 GiB/s268.58 μs97.513 MiB/s
get_or_intern (empty)157.86 μs165.91 MiB/sN\AN\A
get_or_intern (filled)56.320 μs465.02 MiB/s357.13 μs73.335 MiB/s
try_get_or_intern (empty)161.46 μs162.21 MiB/sN\AN\A
try_get_or_intern (filled)55.874 μs468.73 MiB/s360.25 μs72.698 MiB/s
get (empty)43.520 μs601.79 MiB/sN\AN\A
get (filled)53.720 μs487.52 MiB/s360.66 μs72.616 MiB/s
FXHash
MethodTime (1 Thread)Throughput (1 Thread)Time (24 Threads)Throughput (24 Threads)
try_resolve17.289 μs1.4794 GiB/s238.29 μs109.91 MiB/s
resolve19.833 μs1.2896 GiB/s237.05 μs110.48 MiB/s
get_or_intern (empty)130.97 μs199.97 MiB/sN\AN\A
get_or_intern (filled)42.630 μs614.35 MiB/s301.60 μs86.837 MiB/s
try_get_or_intern (empty)129.30 μs202.55 MiB/sN\AN\A
try_get_or_intern (filled)42.508 μs616.12 MiB/s337.29 μs77.648 MiB/s
get (empty)28.001 μs935.30 MiB/sN\AN\A
get (filled)37.700 μs694.68 MiB/s292.15 μs89.645 MiB/s

RodeoReader

STD RandomState
MethodTime (1 Thread)Throughput (1 Thread)Time (24 Threads)Throughput (24 Threads)
resolve1.9398 μs13.185 GiB/s4.3153 μs5.9269 GiB/s
try_resolve1.9315 μs13.242 GiB/s4.1956 μs6.0959 GiB/s
resolve_unchecked1.4416 μs17.741 GiB/s3.1204 μs8.1964 GiB/s
get (empty)38.886 μs673.50 MiB/sN\AN\A
get (filled)56.271 μs465.42 MiB/s105.12 μs249.14 MiB/s
AHash
MethodTime (1 Thread)Throughput (1 Thread)Time (24 Threads)Throughput (24 Threads)
resolve1.9404 μs13.181 GiB/s4.1881 μs6.1069 GiB/s
try_resolve1.8932 μs13.509 GiB/s4.2410 μs6.0306 GiB/s
resolve_unchecked1.4128 μs18.103 GiB/s3.1691 μs8.0703 GiB/s
get (empty)11.952 μs2.1399 GiB/sN\AN\A
get (filled)27.093 μs966.65 MiB/s56.269 μs465.44 MiB/s
FXHash
MethodTime (1 Thread)Throughput (1 Thread)Time (24 Threads)Throughput (24 Threads)
resolve1.8987 μs13.471 GiB/s4.2117 μs6.0727 GiB/s
try_resolve1.9103 μs13.389 GiB/s4.2254 μs6.0529 GiB/s
resolve_unchecked1.4469 μs17.677 GiB/s3.0923 μs8.2709 GiB/s
get (empty)12.994 μs1.9682 GiB/sN\AN\A
get (filled)29.745 μs880.49 MiB/s52.387 μs499.93 MiB/s

RodeoResolver

MethodTime (1 Thread)Throughput (1 Thread)Time (24 Threads)Throughput (24 Threads)
resolve1.9416 μs13.172 GiB/s3.9114 μs6.5387 GiB/s
try_resolve1.9264 μs13.277 GiB/s3.9289 μs6.5097 GiB/s
resolve_unchecked1.6638 μs15.372 GiB/s3.1741 μs8.0578 GiB/s

Structs

The amount of strings and bytes that an interner can hold before reallocating
An iterator over an interner’s strings and keys
A key type taking up size_of::<usize>() bytes of space (generally 4 or 8 bytes)
An error encountered while using an interner
Settings for the memory consumption of an interner
A miniature Key utilizing only 8 bits of space
A miniature Key utilizing only 16 bits of space
A string interner that caches strings quickly with a minimal memory footprint, returning a unique key to re-access it with O(1) times.
A read-only view of a Rodeo or ThreadedRodeo that allows contention-free access to interned strings, both key to string resolution and string to key lookups
A read-only view of a Rodeo or ThreadedRodeo that allows contention-free access to interned strings with only key to string resolution
The default key for every Rodeo, uses only 32 bits of space
An iterator over an interner’s strings
A concurrent string interner that caches strings quickly with a minimal memory footprint, returning a unique key to re-access it with O(1) internment and resolution.

Enums

The kind of error that occurred

Traits

A generic interface over any underlying interner, allowing storing and accessing interned strings
A generic interface over interners that can be turned into a Reader.
A generic interface over interners that can be turned into both a Reader and a Resolver directly.
A generic interface over Readers that can be turned into a Resolver.
Types implementing this trait can be used as keys for all Rodeos
A generic interface that allows using any underlying interner for both its reading and resolution capabilities, allowing both str -> key and key -> str lookups
A generic interface that allows using any underlying interner only for its resolution capabilities, allowing only key -> str lookups

Type Definitions

A continence type for an error from an interner