查询是不是为遍布键查看
postgres=# explain select * from tbase_1 where f1=1;
QUERY PLAN
——————————————————————————–
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Gather (cost=1000.00..7827.20 rows=1 width=14)
Workers Planned: 2
-> Parallel Seq Scan on tbase_1 (cost=0.00..6827.10 rows=1 width=14)
Filter: (f1 = 1)
(6 rows)
postgres=# explain select * from tbase_1 where f2=1;
QUERY PLAN
——————————————————————————–
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001
-> Gather (cost=1000.00..7827.20 rows=1 width=14)
Workers Planned: 2
-> Parallel Seq Scan on tbase_1 (cost=0.00..6827.10 rows=1 width=14)
Filter: (f2 = 1)
(6 rows)
如上,第一个查看为非遍布键查看,必须发往全部连接点,那样比较慢的连接点决策了全部业务流程的速率,必须维持全部连接点的回应特性一致,如第二个查看所显示,业务流程设计方案查看时尽量携带遍布键。

查询是不是应用数据索引
postgres=# create index tbase_2_f2_idx on tbase_2(f2);
CREATE INDEX
postgres=# explain select * from tbase_2 where f2=1;
QUERY PLAN
————————————————————————————-
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Index Scan using tbase_2_f2_idx on tbase_2 (cost=0.42..4.44 rows=1 width=14)
Index Cond: (f2 = 1)
(4 rows)
postgres=# explain select * from tbase_2 where f3=’1′;
QUERY PLAN
——————————————————————————–
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Gather (cost=1000.00..7827.20 rows=1 width=14)
Workers Planned: 2
-> Parallel Seq Scan on tbase_2 (cost=0.00..6827.10 rows=1 width=14)
Filter: (f3 = ‘1’::text)
(6 rows)
postgres=#
第一个查看应用了数据库索引,第二个沒有应用数据库索引,一般状况下,应用数据库索引能够加快查看速率,但数据库索引也会提升升级的花销。

查询是不是为遍布 key join
postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
QUERY PLAN
————————————————————————————————
Remote Subquery Scan on all (dn001,dn002) (cost=29.80..186.32 rows=3872 width=40)
-> Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..158.40 rows=880 width=40)
Distribute results by S: f1
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=40)
-> Hash (cost=18.80..18.80 rows=880 width=4)
-> Seq Scan on tbase_2 (cost=0.00..18.80 rows=880 width=4)
(8 rows)
postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f2=tbase_2.f1 ;
QUERY PLAN
———————————————————————————
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Hash Join (cost=18904.69..46257.08 rows=500564 width=14)
Hash Cond: (tbase_1.f2 = tbase_2.f1)
-> Seq Scan on tbase_1 (cost=0.00..9225.64 rows=500564 width=14)
-> Hash (cost=9225.64..9225.64 rows=500564 width=4)
-> Seq Scan on tbase_2 (cost=0.00..9225.64 rows=500564 width=4)
(7 rows)
第一个查看必须数据信息重遍布,而第二个不用,遍布键 join 查看特性会高些。

查询 join 产生的连接点
postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
QUERY PLAN
———————————————————————————————–
Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..158.40 rows=880 width=40)
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=40)
-> Hash (cost=126.72..126.72 rows=880 width=4)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..126.72 rows=880 width=4)
-> Seq Scan on tbase_2 (cost=0.00..18.80 rows=880 width=4)
(7 rows)
postgres=# set prefer_olap to on;
SET
postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
QUERY PLAN
————————————————————————————————
Remote Subquery Scan on all (dn001,dn002) (cost=29.80..186.32 rows=3872 width=40)
-> Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..158.40 rows=880 width=40)
Distribute results by S: f1
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=40)
-> Hash (cost=18.80..18.80 rows=880 width=4)
-> Seq Scan on tbase_2 (cost=0.00..18.80 rows=880 width=4)
(8 rows)
第一个 join 在 cn 连接点实行,第二个在 dn 上重遍布后再 join,业务流程设计方案上,一般 OLTP 类业务流程在 cn 上开展少信息量 join ,特性会更好。

查询并行处理的 worker 数
postgres=# explain select count(1) from tbase_1;
QUERY PLAN
—————————————————————————————
Finalize Aggregate (cost=118.81..118.83 rows=1 width=8)
-> Remote Subquery Scan on all (dn001,dn002) (cost=118.80..118.81 rows=1 width=0)
-> Partial Aggregate (cost=18.80..18.81 rows=1 width=8)
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=0)
(4 rows)
postgres=# analyze tbase_1;
ANALYZE
postgres=# explain select count(1) from tbase_1;
QUERY PLAN
—————————————————————————————————-
Parallel Finalize Aggregate (cost=14728.45..14728.46 rows=1 width=8)
-> Parallel Remote Subquery Scan on all (dn001,dn002) (cost=14728.33..14728.45 rows=1 width=0)
-> Gather (cost=14628.33..14628.44 rows=1 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=13628.33..13628.34 rows=1 width=8)
-> Parallel Seq Scan on tbase_1 (cost=0.00..12586.67 rows=416667 width=0)
(6 rows)
上边第一个查看没走并行处理,第二个查看 analyze 后走并行处理才算是恰当的,提议大信息量升级再实行 analyze。

查询各连接点的执行计划是不是一致
./tbase_run_sql_dn_master.sh “explain select * from tbase_2 where f2=1”
dn006 — psql -h 172.16.0.13 -p 11227 -d postgres -U tbase -c “explain select * from tbase_2 where f2=1”
QUERY PLAN
—————————————————————————–
Bitmap Heap Scan on tbase_2 (cost=2.18..7.70 rows=4 width=40)
Recheck Cond: (f2 = 1)
-> Bitmap Index Scan on tbase_2_f2_idx (cost=0.00..2.18 rows=4 width=0)
Index Cond: (f2 = 1)
(4 rows)
dn002 — psql -h 172.16.0.42 -p 11012 -d postgres -U tbase -c “explain select * from tbase_2 where f2=1”
QUERY PLAN
——————————————————————————-
Index Scan using tbase_2_f2_idx on tbase_2 (cost=0.42..4.44 rows=1 width=14)
Index Cond: (f2 = 1)
(2 rows)
2个 dn 的执行计划不一致,较大可能是数据倾斜或是是执行计划被禁止使用。
如有可能,DBA 能够配备在系统软件空余时实行全库 analyze 和 vacuum。