gpdb cdb

1 Data structures

1.1 Slice Table

@startuml
class SliceTable {
+ NodeTag type
+ int localSlice
+ int numSlices
+ ExecSlice slices
+ bool hasMotions
+ int instrument_options
+ uint32 ic_instance_id
}

note right of SliceTable::localSlice
Index of the slice to execute
end note

note right of SliceTable::slices
Array of slices, indexed by SliceIndex
end note


note right of SliceTable::hasMotions
Are there any Motion nodes anywhere in the plan?
end note


class ExecSlice {
+ int sliceIndex
+ int rootIndex
+ int parentIndex
+ int planNumSegments
+ List children
+ GangType gangType
+ List segments
+ struct primaryGang
+ List primaryProcesses
+ Bitmapset processesMap
}

note right of ExecSlice::primaryProcesses
A list of CDBProcess nodes corresponding to the worker
processes allocated to implement this plan slice.
end note

note right of ExecSlice::processesMap
A bitmap to identify which QE should execute this slice
end note

SliceTable o-- ExecSlice

class Gang {
+ GangType type
+ int size
+ struct db_descriptors
+ bool allocated
}

note right of Gang::db_descriptors
Array of QEs/segDBs that make up this gang.
Sorted by segment index.
end note


ExecSlice *-- Gang

class CdbProcess {
+ NodeTag type
+ char listenerAddr
+ int listenerPort
+ int pid
+ int contentid
+ int dbid
}

ExecSlice o-- CdbProcess



class SegmentDatabaseDescriptor {
+ struct segment_database_info
+ int segindex
+ int conn
+ int motionListener
+ int backendPid
+ char whoami
+ int isWriter
+ int identifier
}

Gang o-- SegmentDatabaseDescriptor



class CdbComponentDatabases {
+ CdbComponentDatabaseInfo segment_db_info
+ int total_segment_dbs
+ CdbComponentDatabaseInfo entry_db_info
+ int total_entry_dbs
+ int total_segments
+ int fts_version
+ int expand_version
+ int numActiveQEs
+ int numIdleQEs
+ int qeCounter
+ List freeCounterList
}

note right of CdbComponentDatabases::segment_db_info
array of SegmentDatabaseInfo for segment databases
end note

note right of CdbComponentDatabases::entry_db_info
array of SegmentDatabaseInfo for entry databases
end note


class CdbComponentDatabaseInfo {
+ struct config
+ CdbComponentDatabases cdbs
+ int hostSegs
+ List freelist
+ int numIdleQEs
+ int numActiveQEs
}

note right of CdbComponentDatabaseInfo::cdbs
Points to the owning CdbComponentDatabases
end note

CdbComponentDatabases o-- CdbComponentDatabaseInfo



class GpSegConfigEntry {
+ int dbid
+ int segindex
+ char role
+ char preferred_role
+ char mode
+ char status
+ int port
+ char hostname
+ char address
+ char datadir
+ char hostip
+ char hostaddrs
}

CdbComponentDatabaseInfo o-- GpSegConfigEntry

SegmentDatabaseDescriptor o-- CdbComponentDatabaseInfo

@enduml

1.1.1 SliceTable

SliceTable: a list of slices. The slices fall into three groups:

  • root slice:
    Slice 0

  • motion slices:
    Slices 1 ~ n are motion slices; the root of each one is a sending Motion node

  • the remaining slices are initPlans

1.1.2 ExecSlice

  • In MPP, the plan tree (PlanTree) is split into multiple independent execution units, also called slices
  • A slice runs on a worker of a process group (process gang)
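
To make the parent/child relationship between slices concrete, here is a minimal sketch in C. The names MiniSlice, MiniSliceTable, and mini_root_of are hypothetical simplifications that reuse only a few fields from the diagram above; they are not Greenplum's actual definitions. The convention that a root slice has parentIndex -1 is also an assumption made for this example.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for ExecSlice. */
typedef struct MiniSlice {
    int sliceIndex;   /* position of this slice in the table */
    int parentIndex;  /* index of the parent slice; -1 marks a root (assumed) */
} MiniSlice;

/* Hypothetical, simplified stand-in for SliceTable. */
typedef struct MiniSliceTable {
    int        numSlices;
    MiniSlice *slices;    /* array indexed by sliceIndex */
} MiniSliceTable;

/*
 * Follow parentIndex links from the given slice until a root is reached.
 * Returns the root's index, or -1 if the input or the table is inconsistent.
 */
int mini_root_of(const MiniSliceTable *table, int sliceIndex)
{
    int steps = 0;

    if (table == NULL || sliceIndex < 0 || sliceIndex >= table->numSlices)
        return -1;

    while (table->slices[sliceIndex].parentIndex >= 0) {
        int parent = table->slices[sliceIndex].parentIndex;

        /* Guard against out-of-range parents and accidental cycles. */
        if (parent >= table->numSlices || ++steps > table->numSlices)
            return -1;
        sliceIndex = parent;
    }
    return sliceIndex;
}
```

With a table where slice 2 has parent 1, slice 1 has parent 0, and slice 0 is a root, mini_root_of(&table, 2) walks up to slice 0, which roughly corresponds to what the rootIndex field caches in the diagram.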

2 PostgresMain

2.1 Call graph (QE):

main()
  PostmasterMain()
    ServerLoop()
      BackendStartup()
        BackendRun()
          PostgresMain()
            InitPostgres()
              cdb_setup()
                ensureInterconnectAddress()
                InitMotionLayerIPC()
                  InitMotionTCP()
                    setupTCPListeningSocket()
            sendQEDetails()

setupTCPListeningSocket() lets the operating system assign the port and returns it to its caller. InitMotionLayerIPC() stores that port in the global variable Gp_listener_port, and sendQEDetails() later sends it to the client as "qe_listener_port".
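
The general technique for letting the operating system pick a listening port can be sketched as follows: bind to port 0, then read the chosen port back with getsockname(). This is a minimal illustration using plain POSIX sockets on the IPv4 loopback address. The helper name listen_on_os_assigned_port is hypothetical, and the real setupTCPListeningSocket() handles more cases (address selection, socket options, error reporting) than shown here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a TCP listening socket on a port chosen by the
 * kernel. Returns the socket fd and stores the port in *port_out, or
 * returns -1 on failure.
 */
int listen_on_os_assigned_port(int *port_out)
{
    struct sockaddr_in addr;
    socklen_t          len = sizeof(addr);
    int                fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);   /* port 0 asks the kernel to pick a free port */

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0) {
        close(fd);
        return -1;
    }

    /* getsockname() filled in the port that was actually assigned. */
    *port_out = ntohs(addr.sin_port);
    return fd;
}
```

In the flow described above, the value written to *port_out plays the role of Gp_listener_port: it is only known after the socket is bound, which is why the QE has to report it back to the QD afterwards.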

2.2 QD

standard_ExecutorStart()
  CdbDispatchPlan()
    cdbdisp_dispatchX()
      AssignGangs()
        AssignWriterGangFirst()
          AllocateGang()
            cdbgang_createGang()
              cdbgang_createGang_async()
                cdbconn_doConnectComplete()
                  cdbconn_get_motion_listener_port()
          setupCdbProcessList()

AssignGangs() runs on the QD. It assigns the gangs allocated by the executor factory to the slices in the slice table, which completes the global slice table. The process has two steps:

3 Receiver

4 Sender