The structure of today's most popular relational database management systems (DBMS) is not efficient for data warehouse applications. Today's database systems are "write-optimized": they store information in rows, meaning that all fields of a given record are stored contiguously.
The "row store" design has proved very successful for commercial transaction-processing applications, because a record can be inserted or deleted with a single disk operation, but it is less appropriate for warehouse applications. In these applications, data from transaction systems is loaded periodically into a historical store, and analysts run ad-hoc queries to garner intelligence about the business. Such queries typically read a few fields across a very large number of records. A DBMS that stores the values of the same field from different records together, in a "column store" structure, would be at least an order of magnitude faster on these queries than "row-store" database systems. Such a "column store" would produce a "read-optimized" DBMS.
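The difference between the two layouts can be illustrated with a minimal in-memory sketch. It is not the project's implementation: the sample records, the field names (`order_id`, `region`, `amount`), and the helper functions are all hypothetical, and Python lists stand in for on-disk storage.

```python
# Hypothetical sales records; the field names are illustrative only.
records = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 42.0},
]

# Row store: all fields of one record are kept together.
row_store = [(r["order_id"], r["region"], r["amount"]) for r in records]

# Column store: all values of one field are kept together.
column_store = {
    "order_id": [r["order_id"] for r in records],
    "region": [r["region"] for r in records],
    "amount": [r["amount"] for r in records],
}


def total_amount_rows(rows):
    """Analytic query on the row layout: every whole record is visited."""
    return sum(row[2] for row in rows)


def total_amount_columns(columns):
    """Same query on the column layout: only the 'amount' values are read."""
    return sum(columns["amount"])


def insert_into_rows(rows, record):
    """Insert on the row layout: a single append."""
    rows.append((record["order_id"], record["region"], record["amount"]))


def insert_into_columns(columns, record):
    """Insert on the column layout: one append per field."""
    for field, values in columns.items():
        values.append(record[field])
```

Both query functions return the same total, but the column version reads only one field per record, while the row version has to pass over every field. Inserts show the opposite trade-off: one write for the row layout, one write per field for the column layout.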
Commercial vendors are loath to build and maintain two radically different DBMSs, and have therefore focused on transaction processing and lived with lower efficiency in their burgeoning data warehouses. This project aims to design a hybrid system that delivers the best of both worlds: a write-optimized engine that performs updates efficiently, coupled with a back-end, read-optimized engine that performs massive queries efficiently.
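One way to picture such a hybrid is a small row-oriented buffer that absorbs inserts, plus a column-oriented store that serves queries, with a periodic step that moves data from the first to the second. The toy sketch below assumes exactly that arrangement. The class name `HybridStore` and its methods are hypothetical, and nothing here is taken from the project's actual design.

```python
class HybridStore:
    """Toy hybrid: a row-oriented write buffer in front of a column-oriented read store."""

    def __init__(self, fields):
        self.fields = tuple(fields)
        self.write_buffer = []                          # recent inserts, one tuple per record
        self.read_store = {f: [] for f in self.fields}  # bulk data, one list per field

    def insert(self, record):
        """Write path: a single append to the row-oriented buffer."""
        self.write_buffer.append(tuple(record[f] for f in self.fields))

    def merge(self):
        """Move buffered rows into the column layout, then empty the buffer."""
        for row in self.write_buffer:
            for field, value in zip(self.fields, row):
                self.read_store[field].append(value)
        self.write_buffer.clear()

    def column_sum(self, field):
        """Read path: scan one column, plus any rows not yet merged."""
        index = self.fields.index(field)
        merged = sum(self.read_store[field])
        pending = sum(row[index] for row in self.write_buffer)
        return merged + pending


store = HybridStore(["order_id", "amount"])
store.insert({"order_id": 1, "amount": 120.0})
store.insert({"order_id": 2, "amount": 75.5})
store.merge()                                   # buffered rows become column data
store.insert({"order_id": 3, "amount": 42.0})   # stays in the write buffer for now
total = store.column_sum("amount")              # covers merged and pending data
```

In this sketch a query must consult both parts so that recent inserts are not missed. A real system would also have to handle deletes, concurrency, and durability, none of which is modeled here.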
The technology from this project was spun out into a startup company, Vertica Systems (later acquired by HP).
For the story of this project and its journey to a startup company, see Case Studies.