In the realm of database design, there is a seemingly simple question that often leads to long-term technical debt: Should timestamps be stored in seconds or milliseconds?
Many developers new to the field believe this is merely a difference in "digits" (10 vs. 13 digits), or they opt for the database's native DATETIME string type for the sake of readability. However, as business scale grows, distributed systems become ubiquitous, and high-concurrency query demands surge, the granularity of timestamp storage directly impacts query performance, data precision, cross-system compatibility, and the complexity of timezone handling.
This article will deeply explore why modern high-availability systems generally tend to use millisecond-level timestamps (or finer granularity) as the core time storage solution in databases, analyzing the logic behind performance optimization.
1. Three Common Forms of Timestamp Storage
Before analyzing, let's clarify the three main ways to store time in databases and their characteristics:
| Storage Type | Example | Pros | Cons |
|---|---|---|---|
| String Type (VARCHAR) | 2023-10-01 12:00:00.123 | Strong readability, easy debugging | Large storage footprint, extremely low indexing efficiency, cannot perform numerical operations directly |
| Second-level Integer (INT/BIGINT) | 1698765432 | Small footprint, fast indexing, good cross-language compatibility | Insufficient precision, cannot distinguish sequence within the same second |
| Millisecond-level Integer (BIGINT) | 1698765432123 | High precision, extremely fast indexing, highest numerical operation efficiency | Poor readability (requires conversion), slightly larger footprint than second-level integers |
Core Viewpoint: In architectures pursuing high performance and high reliability, Millisecond-level Integer (BIGINT storing Unix Millisecond Timestamp) is the optimal solution.
2. The Precision Debate: The Irreplaceability of Milliseconds
2.1 Granularity Requirements for Business Scenarios
In modern internet applications, concurrency is extremely high. Suppose you are designing a Seckill System or an Order System:
Second-level Timestamp: If second-level storage is used, when 1,000 orders occur within the same second, the
created_attimestamps in the database are identical. This leads to:- Inability to precisely reconstruct the user's operation sequence via timestamps.
- In distributed transactions, inability to determine precedence, potentially leading to data race conditions.
Millisecond-level Timestamp: Even for operations within the same second, millisecond-level precision provides sufficient distinction (theoretically 1,000 unique values per second). Combined with Auto-Increment IDs or Distributed IDs, it allows for precise reconstruction of event chronology.
2.2 Integrity of Logs and Auditing
For financial and security auditing systems, time precision often has legal implications. Millisecond-level recording is key evidence to distinguish between "before operation" and "after operation." If stored only to the second, it will be impossible to provide a sufficiently granular timeline proof in the event of a dispute.
3. Performance Optimization: Why Integers are Faster than Time Types
Many developers are accustomed to using MySQL's DATETIME or TIMESTAMP types. While they store dates, numerical timestamps hold an absolute advantage in indexing and query performance at the underlying level.
3.1 Index Tree Comparison
Database B+Tree indexing relies on numerical comparison.
DATETIME/TIMESTAMP: Although they are time types, for low-level sorting and comparison, the database needs to first convert the time into an internal numerical format (e.g., days+seconds), involving extra computational overhead.BIGINT(Millisecond): It is pure numbers. Index page splitting and node comparisons are direct CPU instruction-level comparisons, which are extremely fast.
3.2 Efficiency of Range Queries
In data analysis and reporting queries, we often need to filter by time range (e.g., WHERE create_time BETWEEN A AND B).
Test Scenario: Querying data for a specific day in a table with tens of millions of rows.
-- Scheme A: Time Type
SELECT * FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2024-01-02';
-- Database needs to convert string/time boundaries to internal representation, scanning index
-- Scheme B: Millisecond Integer
SELECT * FROM orders
WHERE timestamp_ms >= 1704067200000 AND timestamp_ms < 1704153600000;
-- Pure numerical comparison, typically 20%-30% more efficient when walking the index 3.3 Storage Space Considerations
DATETIMEin MySQL 5.6+ occupies 5-8 bytes.TIMESTAMPoccupies 4 bytes (but has the Year 2038 problem).BIGINToccupies 8 bytes.
Although BIGINT takes up 4 bytes more than TIMESTAMP, compared to the dozens of bytes for string types, it is still extremely compact. In modern SSDs and in-memory databases, the precision boost and timezone peace of mind gained from this tiny amount of space are well worth it.
4. The "Time Problem" in Distributed Systems
In microservices and distributed architectures, the storage granularity of timestamps directly affects system scalability.
4.1 Completely Avoiding Timezone Chaos
This is the biggest hidden benefit of using millisecond integers.
- Problem: If your database uses
DATETIMEwithout a unified timezone, servers deployed in the US West and Singapore might write times that deviate due to different timezone settings. When data aggregates in the data warehouse, the timeline becomes completely scrambled. - Solution: Uniformly generate UTC Millisecond Timestamps at the application layer and store them in the
BIGINTfield in the database.- Write:
const now = Date.now();(UTC Millis). - Read: Frontend formats based on the user's local timezone.
- Result: The database stores absolute numerical values on the timeline, completely eliminating data inconsistency issues caused by timezone conversion.
- Write:
4.2 Supporting Distributed ID Generation Algorithms
Distributed IDs generated by algorithms like Snowflake inherently contain timestamp information (usually millisecond-level). Storing database timestamps as milliseconds facilitates correlation analysis with these IDs.
For example, you can directly restore the precise generation time of a record via (snowflake_id >> 22) + 1288834974657 without needing to store an extra time field. This design is extremely common in high-performance distributed systems.
5. The "Cost" of Millisecond Storage and Mitigation Strategies
Of course, no architectural choice is perfect. Storing millisecond timestamps has two main "pain points," but both have mature solutions.
5.1 Poor Readability
Pain Point: The number 1698765432123 is not intuitively perceivable by humans when viewing directly in a database client (like Navicat).
Mitigation Strategy:
- Create Virtual Columns: In MySQL 5.7+, you can create a virtual generated column to display formatted time without affecting write performance.
ALTER TABLE orders ADD COLUMN created_at_readable DATETIME GENERATED ALWAYS AS (FROM_UNIXTIME(timestamp_ms / 1000)) VIRTUAL; - View Encapsulation: Create views for query personnel that automatically convert millisecond timestamps to local time strings.
5.2 The Year 2038 Problem
Pain Point: If using 32-bit integers (INT) to store second-level timestamps, overflow occurs on January 19, 2038. However, millisecond-level timestamps often use 64-bit integers (BIGINT).
Mitigation Strategy: When creating tables,be sure to use BIGINT instead of INT. The maximum value of BIGINT is approximately 9.22 * 10^18, which is sufficient to last until the heat death of the universe, completely removing worries about overflow.
CREATE TABLE events (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
event_timestamp_ms BIGINT NOT NULL COMMENT 'Millisecond Timestamp (UTC)',
-- Add index to improve query performance
INDEX idx_timestamp (event_timestamp_ms)
); 6. Practical Comparison: Impact of Different Storage Schemes on Queries
To demonstrate the difference more intuitively, let's establish a simple test model. Assume a log table with 50 million rows.
| Storage Scheme | Est. Space per Record | Range Query Time (5% data) | Concurrent Write TPS | Timezone Handling |
|---|---|---|---|---|
| VARCHAR(30) | ~30 Bytes | 1.2s | 800 | Extremely Complex (App layer conversion) |
| DATETIME | 5 Bytes | 0.5s | 2000 | Prone to DB timezone influence |
| INT (Second-level) | 4 Bytes | 0.4s | 2500 | Good compatibility (UTC), but insufficient precision |
| BIGINT (Millisecond) | 8 Bytes | 0.35s | 2800 | Perfect (App layer control, absolute values) |
Conclusion: Millisecond BIGINT ranks in the first tier for query performance while providing the highest precision and the cleanest timezone handling solution.
7. Best Practices Summary
Based on the analysis above, I offer the following architectural design suggestions for database timestamp storage:
- Storage Layer: Uniformly use
BIGINTtype to store UTC Millisecond Timestamps. - Application Layer:
- Write: Generate using
System.currentTimeMillis()(Java) orDate.now()(JS/Node.js). - Read: Convert to business-required
LocalDateTimeor formatted strings in the service layer (DTO).
- Write: Generate using
- Indexing Strategy: Create a
BTREEindex on the timestamp field. If dealing with time-series data (e.g., IoT data), consider using Descending Indexes (supported in MySQL 8.0+), as the newest data is often queried most frequently. - Sharding: When database sharding is required, millisecond timestamps are natural shard keys. You can easily perform data lifecycle management (hot/cold data separation) based on time ranges (e.g., monthly sharding).
- Compatibility: If interaction with external legacy systems is mandatory (which only accept seconds), perform unit conversion at the API Gateway layer, do not let conversion logic pollute the core storage layer.
8. Conclusion
Database design is an art of trade-offs. Choosing to store millisecond-level integers as timestamps seemingly sacrifices direct readability but yields huge returns in precision, performance, distributed scalability, and timezone uniformity.
For any new system about to go live, or any legacy system undergoing refactoring, it is recommended to migrate time fields to millisecond BIGINT storage as early as possible. This is not just a technical choice, but an architectural mindset oriented towards the future—exchanging minimal storage costs for data accuracy and maintenance convenience throughout the system's entire lifecycle.
In the pursuit of high concurrency and high availability today, details determine success or failure, and the storage granularity of timestamps is precisely that overlooked yet extremely critical technical detail.
Disclaimer: The content of this article is original technical exploration, combining database principles with frontline architectural practice. The data test results involved are estimates based on general environments; actual performance may vary depending on hardware configuration, database version, and data distribution. It is recommended to perform stress testing based on business data before specific implementation.