A 64-Bit Distributed ID Generator
Introduction
Unique ID
generation is a crucial part of any application, because it is used to represents a single unique row in the table. Furthermore, this ID
is often used as the primary-key
of the table and heavily used in searching a specific row uniquely.
Example Query:
select * from short_urls where internal_id = 286435872092454913;
You might wonder why we are emphasizing the ID so much…..?. Couldn’t we just generate a unique value using something like Math.random() + epoch
..? or some random_string_of_some_size + epoch
While that could produce unique values, but it would be a poor choice for several reasons, especially at scale:
Meaningful Data: A simple timestamp (or a random number) doesn’t convey any useful information other than the time of creation. Our Snowflake-like approach embeds a timestamp and a shard ID, which can be useful for operational purposes (e.g., identifying which shard a URL was created on).
Database Performance (Crucially): The data type of the primary key has a major impact on database performance, particularly for search operations. Purely numeric IDs (integers or long integers) are vastly superior to strings for primary keys and indexed columns or both as well(A Indexed Primary).
Why Not Random Strings?
Using random strings (e.g., generated by Math.random() + epoch
or random string of some size) as primary keys is generally a bad idea for performance reasons:
Slower Comparisons: String comparisons are inherently slower than numeric comparisons. Databases can compare integers much more efficiently at the hardware level.
Larger Index Size: String primary keys create larger database indexes. Indexes are data structures (usually B-trees or B+trees) that speed up lookups. A larger index means:
- More
disk I/O
to read the index. - More memory consumption to keep parts of the
index
incache
. - Slower query performance.
- More
Index Fragmentation: Inserting random strings into a B-tree index leads to fragmentation. A fragmented index has its data scattered across the disk. This requires more disk seeks (which are slow) to retrieve data. Numeric, sequential IDs minimize fragmentation because new entries are usually appended to the end of the index.
Poor Locality of Reference: Random strings have poor locality of reference. Successive ID lookups are likely to access very different parts of the index and the table data, leading to more disk seeks and cache misses.
Why Numeric, Sequential IDs are Better:
Numeric IDs, especially when generated in a roughly sequential manner (like our Snowflake-like approach), offer substantial advantages:
- Fast Comparisons: Databases can compare numbers extremely quickly.
- Smaller Indexes: Numeric IDs (e.g., 64-bit integers) take up less space than long, random strings, resulting in smaller, faster indexes.
- Reduced Fragmentation: Sequential IDs cause new rows to be inserted at the end of the B-tree index, minimizing fragmentation and improving write performance.
- Improved Locality of Reference: Sequential IDs tend to be clustered together in the index and on disk. This improves cache utilization and reduces disk seeks, speeding up both reads and writes.
Database Internals: B-Trees (Simplified)
Most relational databases (MySQL, PostgreSQL, etc.) use B-trees (or variants like B+trees) to implement indexes. Here’s a simplified explanation:
Structure: A B-tree is a self-balancing tree data structure. Data is kept sorted, and searches, insertions, and deletions can be performed in logarithmic time (O(log n)).
Nodes: Each node in the B-tree contains multiple keys (in our case, the
internal_id
values) and pointers to child nodes.Searching: To find a specific ID, the database:
- Starts at the root node.
- Compares the target ID with the keys in the node.
- Follows the appropriate pointer to a child node.
- Repeats this process until it finds the ID or determines it doesn’t exist. Because the keys are sorted within each node, the database can very efficiently narrow down the search.
Insertions:
- Sequential IDs: New keys are usually added to the rightmost leaf node of the B-tree. This is very efficient.
- Random Strings: The new key could need to be inserted anywhere in the B-tree. This might require splitting nodes (if a node is full), leading to multiple write operations and fragmentation.
Time Complexity: The Time complexity for searching using the
internal_id
which is primary key, will be O(log n). Where ‘n’ is the number of rows in your table. And, If we had chosen a string-based ID, and even if that string column were indexed, the search complexity within the index itself would still involve string comparisons, which are slower than integer comparisons. The complexity would still be logarithmic (O(log n)) in terms of the number of index nodes accessed, but each node comparison would take longer. And, if your are quering on non-indexed string column then it will be O(n*m), Where ‘n’ is the number of rows, and ‘m’ is the average length of your string ID.
Method: generateInternalId(int shardId)
The generateInternalId
method creates unique IDs without relying on a central coordinating service (like a database auto-increment). It achieves this by combining a timestamp, a shard ID, and a sequence number into a single 64-bit integer.
ID Structure
The generated ID is a 64-bit long integer, structured as follows:
+-------------------------------------------------------------------------------------+
| Timestamp (41 bits) | ShardID(10) | Sequence(12) | extra bit |
+--------------------------+---------------+------------------+-----------------------+
| 63 | 22 | 12 | 0 (Bit positions) |
+-------------------------------------------------------------------------------------+
Timestamp (41 bits): Represents the number of milliseconds since a custom epoch. The epoch in this implementation is
2023-01-01 00:00:00 UTC
(defined by theepoch
variable).- Calculation:
System.currentTimeMillis() - epoch
- Capacity: 41 bits allow for 241 milliseconds, which is approximately 69.7 years:
- 241 milliseconds = 2,199,023,255,552 milliseconds
- 2,199,023,255,552 ms / 1000 ms/s / 60 s/min / 60 min/hr / 24 hr/day / 365 days/year ≈ 69.7 years
- Lifespan: The system can generate unique IDs for about 69.7 years. After this, the timestamp would wrap around, potentially causing ID collisions. To prevent this, you’d need a strategy like re-sharding with a new epoch or implementing collision detection (not done in this simplified version).
- Calculation:
Shard ID (10 bits): Identifies the specific database shard or application instance that generated the ID. This is essential for distributed systems. 10 bits allow for 210 = 1024 unique shard IDs.
Sequence Number (12 bits): A counter that increments for each ID generated within the same millisecond on the same shard. 12 bits allow for 212 = 4096 unique sequence numbers per millisecond per shard. This handles situations where many IDs are generated very quickly.
- Atomic Increment:
sequence.getAndIncrement()
ensures the sequence number is incremented atomically, preventing race conditions in a multi-threaded environment. - Bitwise AND (
& 0xFFF
): Keeps the sequence number within the 12-bit range (0-4095).0xFFF
is the hexadecimal representation of 4095 (binary111111111111
). The bitwise AND effectively masks all but the last 12 bits.
- Atomic Increment:
Bitwise Operations
The code uses bitwise operations to combine the timestamp, shard ID, and sequence number:
Left Shift (
<<
):timestamp << 22
: Shifts the timestamp 22 bits to the left. This creates 22 empty bits (filled with zeros) on the right-hand side. These bits are reserved for the shard ID (10 bits) and sequence number (12 bits). Left shifting by n bits is equivalent to multiplying by 2n.Timestamp (binary, simplified): 1001...00100 Timestamp << 22: 1001...00100000000000000000000000000
shardId << 12
: Shifts the shard ID 12 bits to the left, creating 12 empty bits on the right for the sequence number (multiplying by 212).Shard ID (binary): 0000000101 Shard ID << 12: 0000000101000000000000
Bitwise OR (
|
): Combines the shifted timestamp, shifted shard ID, and sequence number. Because the left shifts created non-overlapping sections, the bitwise OR effectively concatenates the three values.
+------------------------------------------------------------------------------------+
| (timestamp << 22) | (shardId << 12) | sequenceNumber |
+---------------------------------------+--------------------------+-----------------+
| 1001...0010000000000000000000000000 | 0000000101000000000000 | 000001111011 |
+------------------------------------------------------------------------------------+
Resulting 64-bit ID: 1001…001000000000101000001111011
Visual Representation
+--------------------------------------------------------------------------------------+
| ID (64 bits) |
+--------------------------------------------------------------------------------------+
| Timestamp (41 bits) | Shard ID (10 bits) | Seq (12 bits) |
+---------------------------------+-----------------------------+----------------------+
| 10010011...00100000000000 | 0000000101 | 000001111011 |
+---------------------------------+-----------------------------+----------------------+
Example
Let’s say:
- Current time:
2024-03-01 10:30:15.500 UTC
- Epoch:
2023-01-01 00:00:00.000 UTC
- Shard ID:
5
- Sequence:
123
Timestamp: 39,486,615,500 milliseconds (difference between current time and epoch).
Binary Representations (simplified):
- Timestamp:
10010011111000000110000110101100000100100
- Shard ID:
0000000101
- Sequence Number:
000001111011
- Timestamp:
Bitwise Operations: The code performs the shifts and OR as described above, resulting in the final 64-bit ID.
Code
Here is the java
implementation of the above explanation:
private final long epoch ;
private final AtomicInteger sequence ;
private synchronized long
Also i have used this in one of my personal project Url Shortener
Parameters
shardId
: The ID of the database shard (an integer between 0 and 1023, inclusive).
Return Value
- The generated 64-bit unique internal ID.
Important Considerations (Not Handled in this Simplified Version)
- Sequence Overflow: This code does not handle the case where more than 4096 IDs are generated within the same millisecond on the same shard. In a production system, you must address this (e.g., by waiting for the next millisecond, throwing an exception, or using a larger sequence number space).
- Clock Drift/Jump: This code does not handle system clock changes (clock drift or NTP adjustments). Clock changes can cause the timestamp to go backward, potentially leading to duplicate IDs. A production system must handle this (e.g., by tracking the last generated timestamp and waiting or throwing an exception if the clock moves backward).
- Shard ID Assignment: You need a robust mechanism to assign shard IDs and to determine the shard ID for a given operation. Consistent hashing is a common approach.
This detailed explanation should cover all aspects of the generateInternalId
method and the Snowflake-like ID generation strategy. Remember to use the more robust version of this method (with clock drift and sequence overflow handling) in a production environment.
If you like the post please connect me on LinkedIn And Twitter