Symmetric Hash Join Implementation
何衍泓 2023150039
Goal
You will implement the inner-join part of a symmetric hash join by modifying PostgreSQL's executor.
Background
- The executor runs the plan produced by the optimizer, producing the final result step by step.
- The executor works with two trees: the plan tree and the state tree.
  - The plan tree is read-only; it only records what is to be done.
  - The state tree is the "live record": it tracks the current state of each step, with one state node per plan node.
- Demand-pull pipeline:
  - The top node asks its direct child for one tuple.
  - On receiving the request, the child may in turn ask its own children for tuples.
  - The recursion bottoms out at the scan nodes, which read tuples directly from tables.

Example query: SELECT * FROM users JOIN orders ON users.id = orders.user_id;

Execution tree:
HashJoinNode (top node)
├─ SeqScanNode (scans the users table)
└─ HashNode (scans the orders table and builds the hash table)

Execution flow:
1. HashJoinNode asks SeqScanNode for one users tuple.
2. SeqScanNode reads one row from users and returns it to HashJoinNode.
3. HashJoinNode uses this users tuple to look up matching orders tuples in the hash table.
4. On a match the result is returned to the client; otherwise HashJoinNode asks SeqScanNode for the next users tuple.
- Per-tuple execution (ExecProcNode)
  - One tuple per request: like a cook making one dish at a time, finishing it before starting the next.
  - Used by most operations (scans, joins, filters).
- Multi-tuple execution (MultiExecProcNode)
  - Returns all results at once: like a cook preparing every dish in one go and handing them over together.
  - Used by special operations (e.g. parallel query, hash table builds).
- Executor workflow
  - The optimizer generates the plan tree, which tells the executor what to do.
  - The executor creates the state tree and prepares for execution.
  - The top node starts the demand-pull process.
  - Bottom-level nodes read data and pass it upward.
  - Expression nodes evaluate conditions.
  - The final result is returned to the client.
Design
Traditional hash join
- Mind map (figure omitted)
- Mind map notes:
  - File layering: files are grouped by source-tree directory, making each file's responsibility clear.
  - Function flow:
    - Build phase: nodeHashjoin.c -> BuildHashTable() -> the hash-table routines in nodeHash.c.
    - Probe phase: nodeHashjoin.c -> ProbeHashTable() -> the scan routines in nodeHash.c.
  - Key data structure: HashJoinState, defined in execnodes.h, stores the hash join's runtime state.
  - Optimizer/executor interaction: createplan.c generates the plan node; nodeHashjoin.c executes it.
Symmetric hash join (what this lab modifies)
- Mind map (figure omitted)
  - The optimizer, via createplan.c, ensures both the inner and outer relations are handled by Hash nodes.
  - Pipelined hashing: ExecHash in nodeHash.c builds the hash tables one tuple at a time.
  - Symmetric join: nodeHashjoin.c implements inner-join matching with two hash tables probed in both directions.
  - State management: HashJoinState in execnodes.h maintains the state of both tables and the result cache.
Implementation Steps
Step 1: Modify plan generation (createplan.c)
- Make both sides of the HashJoin pass through a Hash node first, so the plan tree becomes:

        HashJoin
        /      \
     Hash      Hash
      |          |
    Outer      Inner

- This way, execution can pull one tuple from each side and insert it into that side's own hash table.
- Locate the create_hashjoin_plan function:
static HashJoin *
create_hashjoin_plan(PlannerInfo *root, HashPath *best_path)
{
    Hash       *inner_hash_plan;
    Hash       *outer_hash_plan;    /* new: hash plan for the outer side */
    /* ... other local declarations and setup as in the original ... */

    /* 1. Recursively generate the outer/inner plans */
    outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath,
                                     CP_SMALL_TLIST);
    inner_plan = create_plan_recurse(root, best_path->jpath.innerjoinpath,
                                     CP_SMALL_TLIST);

    /* 2. Wrap BOTH outer_plan and inner_plan in Hash nodes */
    inner_hash_plan = make_hash(inner_plan, inner_hashkeys,
                                InvalidOid, InvalidAttrNumber, false);
    outer_hash_plan = make_hash(outer_plan, outer_hashkeys,
                                InvalidOid, InvalidAttrNumber, false);

    /* 3. Build the HashJoin node with the two Hash plans as its children */
    join_plan = make_hashjoin(tlist,
                              joinclauses,
                              otherclauses,
                              hashclauses,
                              hashoperators,
                              hashcollations,
                              outer_hashkeys,
                              (Plan *) outer_hash_plan,
                              (Plan *) inner_hash_plan,
                              best_path->jpath.jointype,
                              best_path->jpath.inner_unique);
    /* ... copy cost estimates and return join_plan as before ... */
}
Step 2: Make the Hash node support streaming execution (nodeHash.c)
- Implement ExecHash to work in pipeline mode: each call returns one tuple, i.e. one tuple from the child node.
- The tuple is no longer inserted into the hash table here.
- We need a MinimalTuple, so the slot type must be converted.
/*
 * ExecHash
 *
 * Executes the Hash node's processing logic: each call fetches one tuple
 * from the outer node (normally the child plan node).
 *
 * Parameters:
 *   pstate - PlanState pointer for this plan node; actual type is HashState.
 *
 * Returns:
 *   The fetched TupleTableSlot, or NULL when the input is exhausted.
 */
TupleTableSlot *
ExecHash(PlanState *pstate)
{
    /* Cast the generic PlanState pointer to a HashState pointer */
    HashState  *node = (HashState *) pstate;
    /* The outer node, i.e. the Hash node's input child */
    PlanState  *outerNode = outerPlanState(node);
    TupleTableSlot *slot;
    TupleTableSlot *minimal_slot;

    /* Slot with minimal-tuple ops; ideally this would be created once in
     * ExecInitHash rather than on every call. */
    minimal_slot = ExecInitExtraTupleSlot(node->ps.state,
                                          ExecGetResultType(outerNode),
                                          &TTSOpsMinimalTuple);

    /* If all tuples have already been consumed, just return NULL */
    if (node->finished)
        return NULL;

    /* Pull one tuple from the child */
    slot = ExecProcNode(outerNode);

    /* No tuple (input exhausted): mark finished and return NULL */
    if (TupIsNull(slot))
    {
        node->finished = true;
        return NULL;
    }

    /* Convert to a minimal tuple by materializing and copying */
    ExecMaterializeSlot(slot);
    ExecCopySlot(minimal_slot, slot);
    return minimal_slot;
}
- Add a state variable (finished) to the HashState struct, indicating whether all tuples have been consumed.
/* ----------------
 *   HashState information
 * ----------------
 */
typedef struct HashState
{
    PlanState   ps;             /* its first field is NodeTag */
    HashJoinTable hashtable;    /* hash table for the hashjoin */
    List       *hashkeys;       /* list of ExprState nodes */
    bool        finished;       /* have all input tuples been consumed? */
    SharedHashInfo *shared_info;        /* one entry per worker */
    HashInstrumentation *hinstrument;   /* this worker's entry */

    /* Parallel hash state. */
    struct ParallelHashJoinState *parallel_state;
} HashState;
- Note: initialize node->finished = false; in ExecInitHash.
Step 3: Extend the HashJoinState struct (execnodes.h)
- Support two hash tables, one each for outer and inner.
- Support bidirectional probing: each time a tuple is taken from one side, it is inserted into that side's hash table and then used to probe the other side's table.
typedef struct HashJoinState
{
    JoinState   js;             /* its first field is NodeTag */
    ExprState  *hashclauses;
    List       *hj_OuterHashKeys;   /* list of ExprState nodes */
    List       *hj_InnerHashKeys;
    List       *hj_HashOperators;   /* list of operator OIDs */
    List       *hj_Collations;
    HashJoinTable hj_HashTable;
    /* state needed by the symmetric hash join */
    HashJoinTable hj_OuterHashTable;    /* outer-side hash table */
    HashJoinTable hj_InnerHashTable;    /* inner-side hash table */
    uint32      hj_CurHashValue;
    int         hj_CurBucketNo;
    int         hj_CurSkewBucketNo;
    HashJoinTuple hj_CurTuple;
    HashJoinTuple cur_outer_probe_tuple;    /* probe cursor into inner table */
    HashJoinTuple cur_inner_probe_tuple;    /* probe cursor into outer table */
    TupleTableSlot *hj_OuterTupleSlot;
    TupleTableSlot *hj_InnerTupleSlot;
    TupleTableSlot *hj_HashTupleSlot;
    TupleTableSlot *hj_InnerHashTupleSlot;
    TupleTableSlot *hj_OuterHashTupleSlot;
    TupleTableSlot *hj_NullOuterTupleSlot;
    TupleTableSlot *hj_NullInnerTupleSlot;
    TupleTableSlot *hj_FirstOuterTupleSlot;
    int         hj_JoinState;
    bool        hj_MatchedOuter;
    bool        hj_OuterNotEmpty;
    bool        hj_OuterDone;       /* outer input exhausted? */
    bool        hj_InnerDone;       /* inner input exhausted? */
    bool        process_outer;      /* whose turn: true = outer side */
} HashJoinState;
Step 4: Implement the symmetric hash join algorithm (nodeHashjoin.c)
- Replace the original hash join algorithm with the symmetric one.
- On each call, take one tuple alternately from outer/inner, insert it into that side's hash table, and probe the other side's table; output a result whenever a match is found.
- Modify ExecInitHashJoin: it originally initialized only the inner-side Hash node; now both outer and inner must be initialized.
HashJoinState *
ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
{
    HashJoinState *hjstate;
    Hash       *outerhashNode;
    Hash       *innerhashNode;
    TupleDesc   outerDesc,
                innerDesc;
    const TupleTableSlotOps *ops;
    List       *outerHashKeys = NIL;
    List       *innerHashKeys = NIL;
    ListCell   *lc;

    /* ... makeNode(HashJoinState), ExprContext and projection setup as in
     * the original function, omitted here ... */

    /* 1. Get the outer/inner Hash nodes */
    innerhashNode = (Hash *) innerPlan(node);
    outerhashNode = (Hash *) outerPlan(node);

    /* 2. Initialize the outer/inner PlanStates */
    outerPlanState(hjstate) = ExecInitNode((Plan *) outerhashNode, estate, eflags);
    outerDesc = ExecGetResultType(outerPlanState(hjstate));
    innerPlanState(hjstate) = ExecInitNode((Plan *) innerhashNode, estate, eflags);
    innerDesc = ExecGetResultType(innerPlanState(hjstate));

    /* 3. Initialize both hash table pointers and the per-side state */
    hjstate->hj_OuterHashTable = NULL;
    hjstate->hj_InnerHashTable = NULL;
    hjstate->hj_OuterDone = false;
    hjstate->hj_InnerDone = false;
    hjstate->process_outer = false;
    hjstate->cur_outer_probe_tuple = NULL;
    hjstate->cur_inner_probe_tuple = NULL;

    /* 4. Split the hash clauses into the two sides' hash keys */
    foreach(lc, node->hashclauses)
    {
        Node   *expr = lfirst(lc);
        OpExpr *op;

        if (!IsA(expr, OpExpr))
            elog(ERROR, "hashkey is not OpExpr, tag=%d", (int) nodeTag(expr));
        op = (OpExpr *) expr;
        if (op->args == NIL || list_length(op->args) != 2)
            elog(ERROR, "op->args is NIL or length != 2");
        outerHashKeys = lappend(outerHashKeys, linitial(op->args));
        innerHashKeys = lappend(innerHashKeys, lsecond(op->args));
    }
    hjstate->hj_OuterHashKeys = ExecInitExprList(outerHashKeys, (PlanState *) hjstate);
    hjstate->hj_InnerHashKeys = ExecInitExprList(innerHashKeys, (PlanState *) hjstate);

    /* 5. Initialize the tuple slots */
    ops = ExecGetResultSlotOps(outerPlanState(hjstate), NULL);
    hjstate->hj_OuterTupleSlot = ExecInitExtraTupleSlot(estate, outerDesc, &TTSOpsMinimalTuple);
    ops = ExecGetResultSlotOps(innerPlanState(hjstate), NULL);
    hjstate->hj_InnerTupleSlot = ExecInitExtraTupleSlot(estate, innerDesc, &TTSOpsMinimalTuple);
    hjstate->hj_OuterHashTupleSlot = ExecInitExtraTupleSlot(estate, outerDesc, &TTSOpsMinimalTuple);
    hjstate->hj_InnerHashTupleSlot = ExecInitExtraTupleSlot(estate, innerDesc, &TTSOpsMinimalTuple);

    return hjstate;
}
- Modify ExecHashJoinImpl: on each call, take one tuple alternately from outer/inner, insert it into that side's hash table, and probe the other side's table, emitting a result whenever a match is found.
- Make sure hj_OuterHashTable and hj_InnerHashTable are initialized to NULL in ExecInitHashJoin, and hj_OuterDone and hj_InnerDone to false.
- Note: a tuple may match more than once, so an unfinished probe must be resumed on the next call; otherwise some matches would be lost.
static pg_attribute_always_inline TupleTableSlot *
ExecHashJoinImpl(PlanState *pstate, bool parallel)
{
    HashJoinState *node = castNode(HashJoinState, pstate);
    ExprContext *econtext = node->js.ps.ps_ExprContext;
    ExprState  *joinqual = node->js.joinqual;
    ExprState  *otherqual = node->js.ps.qual;
    int         alldone = 0;

    /* only the non-parallel path is supported */
    Assert(!parallel);

    /* create both hash tables on the first call */
    if (node->hj_OuterHashTable == NULL)
    {
        node->hj_OuterHashTable = ExecHashTableCreate((HashState *) outerPlanState(node),
                                                      node->hj_HashOperators,
                                                      node->hj_Collations,
                                                      false);
    }
    if (node->hj_InnerHashTable == NULL)
    {
        node->hj_InnerHashTable = ExecHashTableCreate((HashState *) innerPlanState(node),
                                                      node->hj_HashOperators,
                                                      node->hj_Collations,
                                                      false);
    }

    /* alternate between the two input streams */
    for (;;)
    {
        CHECK_FOR_INTERRUPTS();

        /* handle outer and full joins once both inputs are exhausted */
        if (node->hj_OuterDone && node->hj_InnerDone)
        {
            if (node->process_outer)
            {
                if (HJ_FILL_OUTER(node))
                {
                    if (__ExecScanHashBucket(node, econtext))
                    {
                        econtext->ecxt_innertuple = node->hj_NullInnerTupleSlot;
                        return ExecProject(node->js.ps.ps_ProjInfo);
                    }
                    else
                    {
                        /* everything on this side has been emitted */
                        node->process_outer = !node->process_outer;
                        alldone++;
                        if (alldone == 1 + (HJ_FILL_INNER(node) == true))
                            return NULL;
                    }
                }
                else
                {
                    node->process_outer = !node->process_outer;
                    if (alldone == (HJ_FILL_OUTER(node) == true) + (HJ_FILL_INNER(node) == true))
                        return NULL;
                }
            }
            else
            {
                if (HJ_FILL_INNER(node))
                {
                    if (__ExecScanHashBucket(node, econtext))
                    {
                        econtext->ecxt_outertuple = node->hj_NullOuterTupleSlot;
                        return ExecProject(node->js.ps.ps_ProjInfo);
                    }
                    else
                    {
                        /* everything on this side has been emitted */
                        node->process_outer = !node->process_outer;
                        alldone++;
                        if (alldone == 1 + (HJ_FILL_OUTER(node) == true))
                            return NULL;
                    }
                }
                else
                {
                    node->process_outer = !node->process_outer;
                    if (alldone == (HJ_FILL_OUTER(node) == true) + (HJ_FILL_INNER(node) == true))
                        return NULL;
                }
            }
        }

        /* resume a probe that has not finished yet */
        if (node->process_outer && node->cur_outer_probe_tuple)
        {
            while (node->cur_outer_probe_tuple)
            {
                /* load the inner-bucket tuple into the inner slot */
                TupleTableSlot *innerslot = node->hj_InnerHashTupleSlot;

                ExecStoreMinimalTuple(HJTUPLE_MINTUPLE(node->cur_outer_probe_tuple), innerslot, false);
                /* set the inner tuple in the expression context */
                econtext->ecxt_innertuple = innerslot;
                /* check the join condition and the other quals */
                if (ExecQual(node->hashclauses, econtext))
                {
                    if ((!joinqual || ExecQual(joinqual, econtext)) &&
                        (!otherqual || ExecQual(otherqual, econtext)))
                    {
                        /* matched: mark both tuples and emit the join result */
                        if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->cur_outer_probe_tuple)))
                            HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->cur_outer_probe_tuple));
                        if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)))
                            HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple));
                        node->cur_outer_probe_tuple = node->cur_outer_probe_tuple->next.unshared;
                        if (!node->hj_InnerDone && node->cur_outer_probe_tuple == NULL)
                            node->process_outer = !node->process_outer;
                        return ExecProject(node->js.ps.ps_ProjInfo);
                    }
                }
                /* keep walking the bucket chain */
                node->cur_outer_probe_tuple = node->cur_outer_probe_tuple->next.unshared;
            }
            node->cur_outer_probe_tuple = NULL;
            if (!node->hj_InnerDone)
                node->process_outer = !node->process_outer;
            continue;
        }
        else if (!node->process_outer && node->cur_inner_probe_tuple)
        {
            while (node->cur_inner_probe_tuple)
            {
                /* load the outer-bucket tuple into the outer slot */
                TupleTableSlot *outerslot = node->hj_OuterHashTupleSlot;

                ExecStoreMinimalTuple(HJTUPLE_MINTUPLE(node->cur_inner_probe_tuple), outerslot, false);
                /* set the outer tuple in the expression context */
                econtext->ecxt_outertuple = outerslot;
                /* check the join condition and the other quals */
                if (ExecQual(node->hashclauses, econtext))
                {
                    if ((!joinqual || ExecQual(joinqual, econtext)) &&
                        (!otherqual || ExecQual(otherqual, econtext)))
                    {
                        /* matched: mark both tuples and emit the join result */
                        if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->cur_inner_probe_tuple)))
                            HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->cur_inner_probe_tuple));
                        if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)))
                            HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple));
                        node->cur_inner_probe_tuple = node->cur_inner_probe_tuple->next.unshared;
                        if (!node->hj_OuterDone && node->cur_inner_probe_tuple == NULL)
                            node->process_outer = !node->process_outer;
                        return ExecProject(node->js.ps.ps_ProjInfo);
                    }
                }
                /* keep walking the bucket chain */
                node->cur_inner_probe_tuple = node->cur_inner_probe_tuple->next.unshared;
            }
            node->cur_inner_probe_tuple = NULL;
            if (!node->hj_OuterDone)
                node->process_outer = !node->process_outer;
            continue;
        }

        /* pull a new tuple */
        /* process the outer stream (left side), if not yet exhausted */
        if (node->process_outer && !node->hj_OuterDone)
        {
            /* fetch the tuple to process, if we do not have one */
            if (TupIsNull(node->hj_OuterTupleSlot))
                node->hj_OuterTupleSlot = ExecProcNode(outerPlanState(node));

            /* the outer tuple to process now */
            TupleTableSlot *slot = node->hj_OuterTupleSlot;

            /* if the outer stream has no more tuples, mark it done */
            if (TupIsNull(slot))
            {
                node->hj_OuterDone = true;
                if (!node->hj_InnerDone)
                    node->process_outer = !node->process_outer;
                continue;
            }
            else
            {
                uint32 hashvalue;

                /* put the current outer tuple into the expression context */
                econtext->ecxt_outertuple = slot;
                /* compute its hash value (returns true unless the keys are NULL) */
                if (ExecHashGetHashValue(node->hj_OuterHashTable, econtext,
                                         node->hj_OuterHashKeys, false,
                                         HJ_FILL_OUTER(node), &hashvalue))
                {
                    int     bucketno, batchno;
                    uint32  inner_hashvalue;

                    /* insert the outer tuple into the outer-side hash table */
                    ExecHashTableInsert(node->hj_OuterHashTable, slot, hashvalue);
                    ExecHashGetBucketAndBatch(node->hj_OuterHashTable, hashvalue, &bucketno, &batchno);
                    node->hj_CurTuple = node->hj_OuterHashTable->buckets.unshared[bucketno];

                    /* recompute the hash value with the inner table's parameters */
                    econtext->ecxt_innertuple = slot;   /* note: deliberately the same slot */
                    if (ExecHashGetHashValue(node->hj_InnerHashTable, econtext,
                                             node->hj_OuterHashKeys, true, false, &inner_hashvalue))
                    {
                        /* find the bucket for this hash value in the inner table */
                        ExecHashGetBucketAndBatch(node->hj_InnerHashTable, inner_hashvalue, &bucketno, &batchno);
                        /* head of the inner table's bucket chain */
                        node->cur_outer_probe_tuple = node->hj_InnerHashTable->buckets.unshared[bucketno];
                        /* walk the chain looking for all matching inner tuples */
                        while (node->cur_outer_probe_tuple)
                        {
                            TupleTableSlot *innerslot = node->hj_InnerHashTupleSlot;

                            ExecStoreMinimalTuple(HJTUPLE_MINTUPLE(node->cur_outer_probe_tuple), innerslot, false);
                            econtext->ecxt_innertuple = innerslot;
                            if (ExecQual(node->hashclauses, econtext))
                            {
                                if ((!joinqual || ExecQual(joinqual, econtext)) &&
                                    (!otherqual || ExecQual(otherqual, econtext)))
                                {
                                    /* matched: mark both tuples and emit */
                                    if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->cur_outer_probe_tuple)))
                                        HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->cur_outer_probe_tuple));
                                    if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)))
                                        HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple));
                                    node->cur_outer_probe_tuple = node->cur_outer_probe_tuple->next.unshared;
                                    if (!node->hj_InnerDone && node->cur_outer_probe_tuple == NULL)
                                        node->process_outer = !node->process_outer;
                                    node->hj_OuterTupleSlot = ExecProcNode(outerPlanState(node));
                                    return ExecProject(node->js.ps.ps_ProjInfo);
                                }
                            }
                            /* keep walking the bucket chain */
                            node->cur_outer_probe_tuple = node->cur_outer_probe_tuple->next.unshared;
                        }
                    }
                }
                if (!node->hj_InnerDone)
                    node->process_outer = !node->process_outer;
                /* advance the outer stream for the next iteration */
                node->hj_OuterTupleSlot = ExecProcNode(outerPlanState(node));
            }
        }
        else if (!node->process_outer && !node->hj_InnerDone)
        {
            /* fetch the tuple to process, if we do not have one */
            if (TupIsNull(node->hj_InnerTupleSlot))
                node->hj_InnerTupleSlot = ExecProcNode(innerPlanState(node));

            TupleTableSlot *slot = node->hj_InnerTupleSlot;

            if (TupIsNull(slot))
            {
                node->hj_InnerDone = true;
                if (!node->hj_OuterDone)
                    node->process_outer = !node->process_outer;
                continue;
            }
            else
            {
                uint32 hashvalue;

                econtext->ecxt_innertuple = slot;
                if (ExecHashGetHashValue(node->hj_InnerHashTable, econtext,
                                         node->hj_InnerHashKeys, false,
                                         HJ_FILL_INNER(node), &hashvalue))
                {
                    int     bucketno, batchno;
                    uint32  outer_hashvalue;

                    ExecHashTableInsert(node->hj_InnerHashTable, slot, hashvalue);
                    ExecHashGetBucketAndBatch(node->hj_InnerHashTable, hashvalue, &bucketno, &batchno);
                    /* head of the inner table's bucket chain */
                    node->hj_CurTuple = node->hj_InnerHashTable->buckets.unshared[bucketno];

                    /* recompute the hash value with the outer table's parameters */
                    econtext->ecxt_outertuple = slot;   /* note: deliberately the same slot */
                    if (ExecHashGetHashValue(node->hj_OuterHashTable, econtext,
                                             node->hj_InnerHashKeys, true, false, &outer_hashvalue))
                    {
                        ExecHashGetBucketAndBatch(node->hj_OuterHashTable, outer_hashvalue, &bucketno, &batchno);
                        node->cur_inner_probe_tuple = node->hj_OuterHashTable->buckets.unshared[bucketno];
                        while (node->cur_inner_probe_tuple)
                        {
                            TupleTableSlot *outerslot = node->hj_OuterHashTupleSlot;

                            ExecStoreMinimalTuple(HJTUPLE_MINTUPLE(node->cur_inner_probe_tuple), outerslot, false);
                            econtext->ecxt_outertuple = outerslot;
                            if (ExecQual(node->hashclauses, econtext))
                            {
                                if ((!joinqual || ExecQual(joinqual, econtext)) &&
                                    (!otherqual || ExecQual(otherqual, econtext)))
                                {
                                    if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->cur_inner_probe_tuple)))
                                        HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->cur_inner_probe_tuple));
                                    if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)))
                                        HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple));
                                    node->cur_inner_probe_tuple = node->cur_inner_probe_tuple->next.unshared;
                                    if (!node->hj_OuterDone && node->cur_inner_probe_tuple == NULL)
                                        node->process_outer = !node->process_outer;
                                    node->hj_InnerTupleSlot = ExecProcNode(innerPlanState(node));
                                    return ExecProject(node->js.ps.ps_ProjInfo);
                                }
                            }
                            node->cur_inner_probe_tuple = node->cur_inner_probe_tuple->next.unshared;
                        }
                    }
                }
                if (!node->hj_OuterDone)
                    node->process_outer = !node->process_outer;
                node->hj_InnerTupleSlot = ExecProcNode(innerPlanState(node));
            }
        }
    }
}
- To support outer and full joins we also need to scan the buckets, because those join types must emit the unmatched tuples.
- So we write our own helper, __ExecScanHashBucket, which finds an unmatched tuple and returns it:
bool
__ExecScanHashBucket(HashJoinState *hjstate, ExprContext *econtext)
{
    if (hjstate->process_outer)
    {
        /* scan every bucket of the outer table for an unmatched tuple */
        for (int i = 0; i < hjstate->hj_OuterHashTable->nbuckets; i++)
        {
            hjstate->cur_outer_probe_tuple = hjstate->hj_OuterHashTable->buckets.unshared[i];
            while (hjstate->cur_outer_probe_tuple)
            {
                if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(hjstate->cur_outer_probe_tuple)))
                {
                    TupleTableSlot *slot = hjstate->hj_OuterHashTupleSlot;

                    ExecStoreMinimalTuple(HJTUPLE_MINTUPLE(hjstate->cur_outer_probe_tuple), slot, false);
                    econtext->ecxt_outertuple = slot;
                    /* mark it so the next call moves on to another tuple */
                    HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(hjstate->cur_outer_probe_tuple));
                    return true;
                }
                hjstate->cur_outer_probe_tuple = hjstate->cur_outer_probe_tuple->next.unshared;
            }
        }
    }
    else
    {
        /* same scan over the inner table */
        for (int i = 0; i < hjstate->hj_InnerHashTable->nbuckets; i++)
        {
            hjstate->cur_inner_probe_tuple = hjstate->hj_InnerHashTable->buckets.unshared[i];
            while (hjstate->cur_inner_probe_tuple)
            {
                if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(hjstate->cur_inner_probe_tuple)))
                {
                    TupleTableSlot *slot = hjstate->hj_InnerHashTupleSlot;

                    ExecStoreMinimalTuple(HJTUPLE_MINTUPLE(hjstate->cur_inner_probe_tuple), slot, false);
                    econtext->ecxt_innertuple = slot;
                    HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(hjstate->cur_inner_probe_tuple));
                    return true;
                }
                hjstate->cur_inner_probe_tuple = hjstate->cur_inner_probe_tuple->next.unshared;
            }
        }
    }
    return false;
}
- Also remember to add the function declarations in nodeHash.h:
extern TupleTableSlot *ExecHash(PlanState *pstate);
extern bool __ExecScanHashBucket(HashJoinState *hjstate, ExprContext *econtext);
Step 5: Configuration changes (postgresql.conf)
- Disable merge join and nested loop join so the optimizer is forced to use hash join only:
enable_mergejoin = off
enable_nestloop = off