This note is an excerpt; the original is: attachments/pdf/3/The Vertica Analytic Database- C-Store 7 Years Later (p1790_andrewlamb_vldb2012).pdf

1 ABSTRACT

  • Vertica is the commercial product that grew out of C-Store

2 BACKGROUND

2.1 Design Overview

2.1.1 Design Goals

  • Designed for analytic workloads rather than for transactional workloads

    • Transactional workloads are characterized by:

      • a large number of transactions per second (e.g. thousands),
      • each transaction touches only a handful of tuples,
      • most transactions are single-row insertions or modifications to existing rows,
      • for example, inserting a new sales record or updating a bank account balance.
    • Analytic workloads are characterized by:

      • a smaller transaction volume (e.g. tens per second),
      • each transaction examines a significant fraction of the tuples in a table,
      • for example, aggregating sales data across time and geography dimensions, or analyzing the behavior of different users on a website.
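  • Shared-nothing storage

  • Compute locally wherever possible: the optimizer and executor avoid moving large amounts of data over the network

  • Loading, especially bulk loading, must be fast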

3 DATA MODEL

3.1 Projections

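  • Vertica physically organizes table data into projections
    • A projection is a sorted subset of a table's attributes
    • A table may have any number of projections, each with its own sort order and its own subset of the table's columns.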
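
The idea of a projection can be illustrated with a minimal Python sketch. It is a toy model, not Vertica's storage format: the function name `build_projection`, the sample `sales` table, and the rule that sort keys must be among the projected columns are all assumptions made for this illustration.

```python
from typing import Any

Row = dict[str, Any]


def build_projection(rows: list[Row], columns: list[str],
                     sort_keys: list[str]) -> dict[str, list[Any]]:
    """Return a column-oriented copy of `columns`, ordered by `sort_keys`.

    Hypothetical helper: it only models "a sorted subset of the
    table's attributes", nothing more.
    """
    # Assumption of this sketch: a projection is sorted on columns it contains.
    missing = [k for k in sort_keys if k not in columns]
    if missing:
        raise ValueError(f"sort keys not in projection: {missing}")
    ordered = sorted(rows, key=lambda r: tuple(r[k] for k in sort_keys))
    # Store one list per column, all lists sharing the same row order.
    return {c: [r[c] for r in ordered] for c in columns}


# Made-up sample table.
sales = [
    {"sale_id": 1, "region": "west", "day": 3, "amount": 20.0},
    {"sale_id": 2, "region": "east", "day": 1, "amount": 15.0},
    {"sale_id": 3, "region": "east", "day": 2, "amount": 30.0},
]

# Two projections of the same table: different column subsets, different sort orders.
by_region = build_projection(sales, ["region", "amount"], ["region"])
by_day = build_projection(sales, ["day", "sale_id", "amount"], ["day"])
```

Because each projection keeps its own sort order, a query that filters or groups on `region` could be answered from `by_region`, while one that ranges over `day` could use `by_day`.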

3.2 Join Indexes

Vertica does not use the join indexes that C-Store used.

3.3 Prejoin Projections

3.4 Encoding and Compression

3.5 Partitioning

3.6 Segmentation: Cluster Distribution

3.7 Read and Write Optimized Stores

  • ROS: Read Optimized Store
  • WOS: Write Optimized Store

3.7.1 Data Modifications and Delete Vectors

  • A delete vector is a list of positions of rows that have been deleted.
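
A delete vector can be sketched in a few lines of Python. This is an illustration only, assuming that stored rows are left in place and that a deletion is recorded purely as a position. The class `Container` and its methods are hypothetical names, and a set stands in for the "list of positions" to keep lookups cheap.

```python
class Container:
    """Toy storage container whose stored values never change in place."""

    def __init__(self, values):
        self._values = tuple(values)      # stored data, treated as immutable
        self._deleted: set[int] = set()   # the delete vector: deleted positions

    def delete(self, position: int) -> None:
        """Mark the row at `position` as deleted without touching the data."""
        if not 0 <= position < len(self._values):
            raise IndexError(f"no row at position {position}")
        self._deleted.add(position)

    def scan(self) -> list:
        """Return the visible rows, skipping positions in the delete vector."""
        return [v for pos, v in enumerate(self._values)
                if pos not in self._deleted]

    def delete_vector(self) -> list[int]:
        """Return the deleted positions as a sorted list."""
        return sorted(self._deleted)


amounts = Container([20.0, 15.0, 30.0, 12.5])
amounts.delete(1)
amounts.delete(3)
visible = amounts.scan()
```

In this model an update would be expressed as a delete of the old position plus an insert of the new row elsewhere, so the stored values themselves never need rewriting.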

4 TUPLE MOVER

5 QUERY EXECUTION

5.1 Query Operators and Plan Format

5.2 Query Optimization