June 14, 2024
Generating unique ids using Twitter's Snowflake
Snowflake is an algorithm for generating unique ids in a distributed system. Each id is a 64-bit integer.
Structure of a Snowflake ID
- Sign bit (1 bit) - always 0, so the id stays positive when stored as a signed 64-bit integer.
- Timestamp (41 bits) - milliseconds since a custom epoch. This epoch can be any time in the past. 41 bits cover about 69 years.
- Worker ID (10 bits) - identifies the server/process generating the ID. Up to 1024 nodes/servers are supported.
- Sequence Number (12 bits) - counter that resets every millisecond. This allows up to 4096 ids per node per millisecond.
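The limits in the list above follow directly from the bit widths. Here is a small sketch of that arithmetic. The constant names are made up for illustration, and it uses BigInt because JavaScript's regular numbers are only safe up to 53 bits.

```javascript
// Bit widths from the layout above (the sign bit stays 0).
const TIMESTAMP_BITS = 41n;
const WORKER_ID_BITS = 10n;
const SEQUENCE_BITS = 12n;

// Largest value each field can hold: 2^bits - 1.
const MAX_TIMESTAMP = (1n << TIMESTAMP_BITS) - 1n; // 2199023255551
const MAX_WORKER_ID = (1n << WORKER_ID_BITS) - 1n; // 1023
const MAX_SEQUENCE = (1n << SEQUENCE_BITS) - 1n;   // 4095

// How many whole 365-day years of milliseconds fit in 41 bits.
const MS_PER_YEAR = 1000n * 60n * 60n * 24n * 365n;
const yearsOfRange = MAX_TIMESTAMP / MS_PER_YEAR; // BigInt division truncates

console.log(`worker ids: ${MAX_WORKER_ID + 1n}`);            // worker ids: 1024
console.log(`ids per ms per worker: ${MAX_SEQUENCE + 1n}`);  // ids per ms per worker: 4096
console.log(`timestamp range: about ${yearsOfRange} years`); // timestamp range: about 69 years
```

The 69 years are counted from the custom epoch, so a recent epoch keeps the ids valid for longer.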
How does Snowflake work?
Snowflake plays with 3 variables - timestamp, workerId and sequenceNumber
- Get the current timestamp in milliseconds.
- If the timestamp is the same as the previous timestamp, increment the sequenceNumber. If it overflows past 4095, wait for the next millisecond.
- If the timestamp is different, reset the sequenceNumber to 0.
- Combine all three variables to get the final id - (1) shift the timestamp (milliseconds since the custom epoch) left by (10+12) bits, (2) shift the workerId left by 12 bits, and (3) bitwise OR all components together.
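The combine step is easiest to see in code. This is only a sketch: composeId, decomposeId and the EPOCH value (2024-01-01 UTC) are hypothetical names and choices, not part of any official library. BigInt is used because JavaScript's bitwise operators on regular numbers truncate to 32 bits.

```javascript
const WORKER_ID_BITS = 10n;
const SEQUENCE_BITS = 12n;
const TIMESTAMP_SHIFT = WORKER_ID_BITS + SEQUENCE_BITS; // 22

// Assumed custom epoch: 2024-01-01T00:00:00Z in Unix milliseconds.
const EPOCH = 1704067200000n;

// Pack the three fields into one 64-bit value.
function composeId(timestampMs, workerId, sequence) {
  return ((BigInt(timestampMs) - EPOCH) << TIMESTAMP_SHIFT)
    | (BigInt(workerId) << SEQUENCE_BITS)
    | BigInt(sequence);
}

// Reverse the packing with shifts and masks.
function decomposeId(id) {
  return {
    timestampMs: (id >> TIMESTAMP_SHIFT) + EPOCH,
    workerId: (id >> SEQUENCE_BITS) & ((1n << WORKER_ID_BITS) - 1n),
    sequence: id & ((1n << SEQUENCE_BITS) - 1n),
  };
}

// One millisecond after the epoch, worker 1, first id in that millisecond:
// (1 << 22) | (1 << 12) | 0
console.log(composeId(EPOCH + 1n, 1, 0)); // 4198400n
```

Since the timestamp sits in the most significant bits, comparing two ids as integers compares their timestamps first.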
Why is Snowflake useful?
- There is no coordination required between the server nodes.
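- All ids are time sortable, because the timestamp occupies the most significant bits.
- The ids are globally unique, as long as every node has a distinct worker ID.
- The ids are compact: 64 bits, half the size of a 128-bit UUID.
- Supports high throughput. One server can generate up to 4096 ids per millisecond.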
Mitigating clock synchronization issues in Snowflake
- Use redundant NTP servers to synchronize the clock.
- Monitor clock drift.
- Disable ID generation on servers whose clock has moved back significantly.
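One way to act on the last point is to compare the current time with the last timestamp used before generating an id. The sketch below is an assumption about how such a guard could look. The checkClock name and the 5 ms tolerance are made up for illustration and should be tuned to your setup.

```javascript
// Hypothetical tolerance: rollbacks up to this many milliseconds are waited out.
const MAX_BACKWARD_DRIFT_MS = 5;

// Decide what to do given the last timestamp used and the current clock reading.
function checkClock(lastTimestampMs, nowMs) {
  const driftMs = lastTimestampMs - nowMs; // positive means the clock went backwards
  if (driftMs <= 0) {
    return { action: 'generate', driftMs: 0 };
  }
  if (driftMs <= MAX_BACKWARD_DRIFT_MS) {
    // Small rollback: pause until the clock catches up to the last timestamp.
    return { action: 'wait', driftMs };
  }
  // Large rollback: stop generating ids on this node and raise an alert.
  return { action: 'refuse', driftMs };
}

console.log(checkClock(1000, 1001)); // { action: 'generate', driftMs: 0 }
console.log(checkClock(1000, 998));  // { action: 'wait', driftMs: 2 }
console.log(checkClock(1000, 900));  // { action: 'refuse', driftMs: 100 }
```

Waiting out small rollbacks keeps the node available, while refusing on large ones avoids reusing a timestamp and sequence pair that was already handed out.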

Implementation
Here's a quick JavaScript implementation of the Snowflake algorithm.
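Treat the following as a minimal sketch and not a production-ready generator. It assumes BigInt arithmetic, a default epoch of 2024-01-01 UTC, and an injectable now function so the clock can be faked in tests. The class name and these parameters are illustrative choices. It simply throws if the clock moves backwards, and it busy-waits when the sequence is exhausted.

```javascript
class Snowflake {
  // workerId: integer in 0..1023. epoch: custom epoch in Unix ms (assumed default).
  // now: clock function returning Unix ms (assumed parameter, handy for tests).
  constructor(workerId, epoch = 1704067200000n, now = () => Date.now()) {
    if (!Number.isInteger(workerId) || workerId < 0 || workerId > 1023) {
      throw new RangeError('workerId must be an integer in 0..1023');
    }
    this.workerId = BigInt(workerId);
    this.epoch = epoch;
    this.now = now;
    this.lastTimestamp = -1n;
    this.sequence = 0n;
  }

  nextId() {
    let timestamp = BigInt(this.now());

    if (timestamp < this.lastTimestamp) {
      throw new Error('Clock moved backwards; refusing to generate ids');
    }

    if (timestamp === this.lastTimestamp) {
      // Same millisecond: bump the 12-bit sequence, wrapping at 4096.
      this.sequence = (this.sequence + 1n) & 4095n;
      if (this.sequence === 0n) {
        // Sequence exhausted: spin until the next millisecond starts.
        while (timestamp <= this.lastTimestamp) {
          timestamp = BigInt(this.now());
        }
      }
    } else {
      // New millisecond: start the sequence over.
      this.sequence = 0n;
    }

    this.lastTimestamp = timestamp;

    // Timestamp in the top bits, then workerId, then sequence.
    return ((timestamp - this.epoch) << 22n)
      | (this.workerId << 12n)
      | this.sequence;
  }
}

const generator = new Snowflake(1);
console.log(generator.nextId().toString());
```

The ids are larger than Number.MAX_SAFE_INTEGER, so keep them as BigInt in code and send them as strings over JSON, since JSON.stringify throws on BigInt values.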