This is an excerpt; the original is: attachments/pdf/3/The Vertica Analytic Database- C-Store 7 Years Later (p1790_andrewlamb_vldb2012).pdf
1 ABSTRACT
- Vertica is the commercialization of C-Store
2 BACKGROUND
2.1 Design Overview
2.1.1 Design Goals
Designed for analytic workloads rather than for transactional workloads
Transactional workloads are characterized by:
- a large number of transactions per second (e.g. thousands),
- each transaction involves only a handful of tuples,
- most transactions take the form of single-row insertions or modifications to existing rows.
- Example: inserting a new sales record or updating a bank account balance.
Analytic workloads are characterized by:
- a smaller transaction volume (e.g. tens per second),
- each transaction examines a significant fraction of the tuples in a table.
- Example: aggregating sales data across time and geography dimensions, or analyzing the behavior of different users on a web site.
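Shared-nothing storage
Compute locally where possible: the optimizer and executor avoid moving large amounts of data over the network
Loading, especially bulk loading, must be fast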
3 DATA MODEL
3.1 Projections
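- Vertica physically organizes table data into projections
- A projection is a sorted subset of the table's attributes
- Any number of projections is allowed, each with its own sort order and subset of the table's columns.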
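To make the idea of a projection concrete, here is a minimal sketch in Python. It is not Vertica's storage format: the `sales` table, its column names, and the helper `make_projection` are all hypothetical, and the projection is modeled simply as a per-column list kept in the chosen sort order.

```python
from operator import itemgetter

# Hypothetical logical table: one dict per row.
sales = [
    {"sale_id": 1, "cust": "bob", "date": "2012-03-01", "price": 30},
    {"sale_id": 2, "cust": "amy", "date": "2012-01-15", "price": 10},
    {"sale_id": 3, "cust": "amy", "date": "2012-02-20", "price": 20},
]


def make_projection(rows, columns, sort_key):
    """Build a projection: keep only `columns`, stored column by column,
    with every column in the row order defined by `sort_key`."""
    ordered = sorted(rows, key=itemgetter(*sort_key))
    return {col: [row[col] for row in ordered] for col in columns}


# Two projections of the same table, with different column subsets
# and different sort orders.
by_date = make_projection(sales, ["date", "price"], ["date"])
by_cust = make_projection(sales, ["cust", "price", "sale_id"], ["cust", "price"])

print(by_date["price"])    # prices in date order
print(by_cust["sale_id"])  # sale ids in (cust, price) order
```

A query that filters on `date` can be answered from `by_date` alone, while one that groups by `cust` is better served by `by_cust`; that is the motivation for keeping several projections of one table.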
3.2 Join Indexes
Vertica does not use the join indexes that C-Store used
3.3 Prejoin Projections
3.4 Encoding and Compression
3.5 Partitioning
3.6 Segmentation: Cluster Distribution
3.7 Read and Write Optimized Stores
- ROS: Read Optimized Store
- WOS: Write Optimized Store
3.7.1 Data Modifications and Delete Vectors
- A delete vector is a list of positions of rows that have been deleted.
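The definition above can be sketched in a few lines of Python. This is an illustration only, assuming that stored values are left untouched and that deletes are recorded on the side; the class `Container` and its methods are hypothetical names, and a `set` is used for the positions where the notes say "list", purely to keep the lookup cheap.

```python
class Container:
    """Hypothetical column container with a separate delete vector."""

    def __init__(self, values):
        self._values = tuple(values)  # stored data is not modified in place
        self.delete_vector = set()    # positions of rows marked as deleted

    def delete(self, position):
        """Mark the row at `position` as deleted without touching the data."""
        if not 0 <= position < len(self._values):
            raise IndexError(position)
        self.delete_vector.add(position)

    def scan(self):
        """Return the values of all rows that have not been deleted."""
        return [
            value
            for position, value in enumerate(self._values)
            if position not in self.delete_vector
        ]


column = Container(["a", "b", "c", "d"])
column.delete(1)
column.delete(3)
print(column.scan())                  # rows at positions 0 and 2 remain
print(sorted(column.delete_vector))   # the recorded positions
```

Under this model a delete is a small append to the delete vector rather than a rewrite of sorted, compressed data, and readers filter out the marked positions at scan time.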