下面是测试not in/not exists对连接列有空值的处理方法
1,环境介绍
orcl1123.htz.pw > select * from v$version where rownum=1; BANNER —————————————————————————— Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production orcl1123.htz.pw > !uname -a Linux orcl9i 2.6.9-89.EL #1 Mon Apr 20 10:22:29 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux orcl1123.htz.pw > @parameter_hide.sql orcl1123.htz.pw > set echo off Enter Search Parameter (i.e. max|all) : _optimizer_null_aware_antijoin
PARAMETER SESSION_VALUE INSTANCE_VALUE DESCRIPTION —————————————- ——————– ——————– —————————– _optimizer_null_aware_antijoin TRUE TRUE null-aware antijoin parameter |
2,创建测试环境
orcl1123.htz.pw > create table htz.ht1 (id number,name varchar2(10)); Table created. orcl1123.htz.pw > create table htz.ht2 (id number,name varchar2(10)); Table created. orcl1123.htz.pw > insert into htz.ht1 values (1,’htz1′); 1 row created. orcl1123.htz.pw > insert into htz.ht2 values (1,’htz1′); 1 row created. orcl1123.htz.pw > insert into htz.ht2 values (2,’htz2′); 1 row created. orcl1123.htz.pw > insert into htz.ht1 values (2,’htz2′); 1 row created. orcl1123.htz.pw > insert into htz.ht1 values (3,’htz3′); 1 row created. orcl1123.htz.pw > commit; Commit complete. orcl1123.htz.pw > select * from htz.ht1; ID NAME ———- ———- 1 htz1 2 htz2 3 htz3 orcl1123.htz.pw > select * from htz.ht2; ID NAME ———- ———- 1 htz1 2 htz2 |
3,两个表无空值的情况
在两个表无空值的情况,not in /not exists结果相等
orcl1123.htz.pw > select a.* from htz.ht1 a where a.id not in (select b.id from htz.ht2 b);
ID NAME ———- ———- 3 htz3 orcl1123.htz.pw > select a.* from htz.ht1 a where not exists (select 1 from htz.ht2 b where a.id=b.id);
ID NAME ———- ———- 3 htz3 |
4,主表有空值的情况
向主表插入一张空值 orcl1123.htz.pw > insert into htz.ht1(name) values(‘NULL’); 1 row created. orcl1123.htz.pw > commit; Commit complete. orcl1123.htz.pw > select a.* from htz.ht1 a where not exists (select 1 from htz.ht2 b where a.id=b.id); ID NAME ———- ———- NULL 3 htz3 orcl1123.htz.pw > select a.* from htz.ht1 a where a.id not in (select b.id from htz.ht2 b); ID NAME ———- ———- 3 htz3 |
这里可以看到not exists将空值显示出来了。
下面是半连接的测试
orcl1123.htz.pw > select a.* from htz.ht1 a where a.id in (select b.id from htz.ht2 b); ID NAME ———- ———- 1 htz1 2 htz2 orcl1123.htz.pw > select a.* from htz.ht1 a where exists (select 1 from htz.ht2 b where a.id=b.id); ID NAME ———- ———- 1 htz1 2 htz2 |
半连接不受空值的影响
5 子表有空值的情况
将主表空值删除 orcl1123.htz.pw > delete htz.ht1 where name=’NULL’; 1 row deleted. orcl1123.htz.pw > commit; Commit complete. 子表插入空值 orcl1123.htz.pw > insert into htz.ht2(name) values(‘NULL’); 1 row created. orcl1123.htz.pw > select a.* from htz.ht1 a where not exists (select 1 from htz.ht2 b where a.id=b.id); ID NAME ———- ———- 3 htz3 orcl1123.htz.pw > select a.* from htz.ht1 a where a.id not in (select b.id from htz.ht2 b); no rows selected |
空值对not exists不影响,但是not in 直接还回0行。
下面测试半连接情况
orcl1123.htz.pw > select a.* from htz.ht1 a where a.id in (select b.id from htz.ht2 b); ID NAME ———- ———- 1 htz1 2 htz2 orcl1123.htz.pw > select a.* from htz.ht1 a where exists (select 1 from htz.ht2 b where a.id=b.id); ID NAME ———- ———- 1 htz1 2 htz2 |
半连接不受空值的影响
6 测试结果
1,not exists当主表有空值时会显示空值,not in不显示主表的空值。
2,not exists不显示子表空值,not in当子表有空值,直接还回0行
3,半连接不受主子表空值的影响。
4,not exists与not in只有当主子表的连接都非空时才相等
7 反连接执行计划说明
7.1 HASH连接
orcl1123.htz.pw > select a.* from htz.ht1 a where a.id not in (select b.id from htz.ht2 b); no rows selected Execution Plan ———————————————————- Plan hash value: 3284385735 ————————————————————————— | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ————————————————————————— | 0 | SELECT STATEMENT | | 3 | 99 | 7 (15)| 00:00:01 | |* 1 | HASH JOIN ANTI NA | | 3 | 99 | 7 (15)| 00:00:01 | | 2 | TABLE ACCESS FULL| HT1 | 3 | 60 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL| HT2 | 3 | 39 | 3 (0)| 00:00:01 | ————————————————————————— Predicate Information (identified by operation id): ————————————————— 1 – access("A"."ID"="B"."ID") 这里看到有NA,是由于11G新功能,受_optimizer_null_aware_antijoin参数控制,MOS:Optimizer Null-Aware Anti Join (文档 ID 1082123.1) |
反连接走HASH连接,我们可以更改驱动表。
7.2 FILTER方式
FILTER方式原理跟NL差不多,在生产环境一般我们看到FILTER出现的时候,产生应该考虑FILTER在这里是否合理。如果是OLTP环境,通过gather_plan_statistics收集一次统计信息一下就可以判断,我们也可以通过手动写SQL来查询。这里为什么说OLTP环境,因为OLTP环境SQL一般正常下都是S内还回结果。
orcl1123.htz.pw > alter session set "_optimizer_null_aware_antijoin"=false; Session altered. orcl1123.htz.pw > select a.* from htz.ht1 a where a.id not in (select b.id from htz.ht2 b); Execution Plan ———————————————————- Plan hash value: 3667279750
————————————————————————— | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ————————————————————————— | 0 | SELECT STATEMENT | | 1 | 20 | 5 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL| HT1 | 3 | 60 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| HT2 | 3 | 39 | 2 (0)| 00:00:01 | —————————————————————————
Predicate Information (identified by operation id): —————————————————
1 – filter( NOT EXISTS (SELECT 0 FROM "HTZ"."HT2" "B" WHERE LNNVL("B"."ID"<>:B1))) 3 – filter(LNNVL("B"."ID"<>:B1)) |
7.3 NL方式
orcl1123.htz.pw > select a.* from htz.ht1 a where a.id is not null and a.id not in (select b.id from htz.ht2 b where b.id is not null);
Execution Plan ———————————————————- Plan hash value: 3924610515
——————————————————————————— | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ——————————————————————————— | 0 | SELECT STATEMENT | | 3 | 99 | 3 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 3 | 99 | 3 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL| HT1 | 3 | 60 | 3 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | IND_HT2_ID | 1 | 13 | 0 (0)| 00:00:01 | ———————————————————————————
Predicate Information (identified by operation id): —————————————————
2 – filter("A"."ID" IS NOT NULL) 3 – access("A"."ID"="B"."ID") |
如果使用NL方式,是不能更换驱动表的。
另外在抒写NOT IN的时候,建议在连接列上面增加上is not null的限制或者表上增加约束,不然很有可能走FILTER方式的。
not in/not exists对连接列空值的处理:等您坐沙发呢!