…now exists! Check it out.
I was recently working on a Rust program and needed a way to give instances a unique identifier. These IDs did not need to be universally unique. I wanted the simplest way to recognize two instances as unique even if they were identical aside from the ID. A serial (or auto-increment) ID seemed like an appropriate choice.
As a rule, I try to write my own code to avoid over-reliance on dependencies. But this case seemed to warrant a dependency. It would be a simple thing to implement. What I wanted was essentially a counter, after all. I expected to find a reliable crate with lots of downloads.
Surprisingly, no such crate existed. I tried searching for “serial”, “increment”, “auto increment”, “serial id”, etc. and found nothing. One crate had a promising name but no documentation and, from what I could tell from the code, did not quite do what I was looking for.
I decided to write and publish such a crate myself. It seemed like a good opportunity to contribute a small but useful crate with relatively simple code to the greater Rust ecosystem.
Serial IDs are used as identifiers because they are simple. Universally unique identifiers (or UUIDs), on the other hand, are usually used when a program needs to create identifiers without access to information about IDs created elsewhere. This is why a distributed database would use them. Rust developers have access to the rand crate and the uuid crate, which uses rand internally but creates IDs in formats that conform to formal standards. The basic strategy is to create a pseudo-random value or to capture a value (such as a UTC timestamp) and hash it. The details of UUID generation are too extensive to get into here. The uuid crate is both well-documented and cleanly coded, and there are countless other descriptions of the standards it implements.
The important difference between serial IDs and UUIDs is the performance cost. Hashing and random number generation are more expensive than increasing the value of an integer by one. Even if threads have to wait to get exclusive access to an ID generator, each thread will only hold it for the number of nanoseconds that it takes to calculate “x + 1”. Serial IDs are also free from the issue of collisions, allowing them to be smaller. A 32-bit serial ID can take 4,294,967,296 unique values. There is no collision probability to consider.
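The locking described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: the function name `generate_concurrently` and the bare `Mutex<u32>` counter are assumptions made for the example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` workers that each take `per_thread` IDs from one shared
/// counter, then returns every ID that was handed out.
/// (Hypothetical helper, written only for this illustration.)
fn generate_concurrently(threads: usize, per_thread: usize) -> Vec<u32> {
    // The shared counter. The Mutex gives one thread at a time exclusive access.
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                let mut ids = Vec::with_capacity(per_thread);
                for _ in 0..per_thread {
                    // Hold the lock only for the read and the "x + 1".
                    let id = {
                        let mut next = counter.lock().unwrap();
                        let id = *next;
                        *next += 1;
                        id
                    }; // The guard is dropped here, releasing the lock.
                    ids.push(id);
                }
                ids
            })
        })
        .collect();

    // Gather the IDs produced by every worker.
    handles
        .into_iter()
        .flat_map(|handle| handle.join().unwrap())
        .collect()
}
```

Calling `generate_concurrently(4, 100)` should yield 400 IDs with no duplicates, in whatever order the threads happened to run. For a plain integer counter, `AtomicU32::fetch_add` from `std::sync::atomic` would avoid the lock altogether; the mutex is used here because it matches the "exclusive access" framing.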
UUIDs are always good enough, but their cost is not always necessary. Git, for example, creates a SHA-1 hash of a commit’s content to create a commit ID. The IDs need to be unique when commits from different machines are pushed to a single remote. But many IDs do not need a high probability of being universally unique. CLI applications or other client-side programs that only work with in-memory values or store data locally are such examples.
Serial IDs are familiar to users of database systems. In SQL, auto-increment integers can be used by setting GENERATED BY DEFAULT AS IDENTITY as the default value for a column.
CREATE TABLE distributors (
    did integer PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    name varchar(40) NOT NULL CHECK (name <> '')
);
Postgres has its own SERIAL type that mimics this behavior.
CREATE TABLE tablename (
    colname SERIAL
);
Postgres provides a reasonable standard to replicate: 32-bit integers by default, with other integer widths available when needed. My crate follows the same idea for unsigned integers. By using a trait to define the types that can be generated, users can make their own types compatible with the API provided by the crate. The crate just needs to provide the generator, the trait, and implementations of the trait for the unsigned integers in the standard library.
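To make that design concrete, here is a rough sketch of how a trait, a generic generator, and the integer implementations could fit together. The names `Serial`, `SerialGenerator`, `start`, `next_increment`, and `TicketId` are chosen for illustration and are not guaranteed to match the published crate's API.

```rust
/// Hypothetical trait: any type with a first value and a successor.
pub trait Serial: Sized {
    /// The first value a generator hands out.
    fn start() -> Self;
    /// The value after `self`, or `None` if the type has run out of values.
    fn next_increment(&self) -> Option<Self>;
}

// Implement the trait for the standard library's unsigned integers.
macro_rules! impl_serial {
    ($($t:ty),*) => {
        $(
            impl Serial for $t {
                fn start() -> Self { 0 }
                fn next_increment(&self) -> Option<Self> { self.checked_add(1) }
            }
        )*
    };
}
impl_serial!(u8, u16, u32, u64, u128, usize);

/// Hypothetical generator. The type parameter defaults to `u32`,
/// mirroring the 32-bit default described above.
pub struct SerialGenerator<T: Serial = u32> {
    // `None` means every value of `T` has already been handed out.
    next: Option<T>,
}

impl<T: Serial> SerialGenerator<T> {
    pub fn new() -> Self {
        SerialGenerator { next: Some(T::start()) }
    }

    /// Returns the next unused ID, or `None` once the type is exhausted.
    pub fn generate(&mut self) -> Option<T> {
        let current = self.next.take()?;
        self.next = current.next_increment();
        Some(current)
    }
}

/// A user-defined ID type (assumed for this example) that becomes
/// compatible with the generator by implementing the trait.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TicketId(pub u8);

impl Serial for TicketId {
    // Tickets start at 1 instead of 0.
    fn start() -> Self { TicketId(1) }
    fn next_increment(&self) -> Option<Self> {
        self.0.checked_add(1).map(TicketId)
    }
}
```

With this sketch, `let mut ids: SerialGenerator = SerialGenerator::new();` produces `u32` values starting from zero, while `SerialGenerator::<TicketId>::new()` produces `TicketId` values from the same generator code. Returning `Option` on exhaustion is one possible choice; wrapping around or panicking would be alternatives with different trade-offs.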
This process proved something about Rust that wasn’t readily apparent to me before. Encouraging flexibility by defining behavior (through traits) makes it intuitive to implement new features. When you know what you want to do, the code starts to seem obvious. If you need a single type that can output many types, define the common behavior of those types and make the single type operate based on that shared behavior.
Rust is often described as flexible, but many languages are flexible. With Rust, it feels like the API writes itself once you clearly define what it should do. The code is not just expressive, it’s obvious.
I also realized that there are still plenty of opportunities to contribute Rust code. Both widely-applicable and more niche libraries have provided the tools that developers need to adopt the language. But while Rust has become mature enough to fit many use cases, the ecosystem is far from bloated. There is still plenty of space to do new or different things and make an impact.
The serial_int crate is on crates.io and lib.rs. Contributions and feedback on the API are welcome!