Horizontal vs. Vertical Scaling in Modern Data Engines
As the Automotive Dataset grew from thousands of records to millions, we reached the limits of vertical scaling. Throwing more RAM and CPU at a monolithic database eventually becomes an equation with diminishing returns.
The Transition to Sharding
We abandoned monolithic structures in favor of horizontal scaling. By sharding our databases based on regional lookup frequencies, we distributed the compute across a wider array of nodes. This dramatically reduced lock-contention on write operations during market data ingestion periods.
Eventual Consistency Models
Not all data needs to be ACID compliant across every node simultaneously. By embracing eventual consistency for non-critical reads, we massively accelerated the availability of our API layer without risking data corruption during heavy writes.