今天在9i中遇到了library cache lock/pin等待事件,结果是由于出帐还没有完成,用户还在执行create table相关的表,其它一些应用就已经在对表进行查询了,下面是模拟整个事件。
测试环境为:DB :RHEL 4.8 OS :10.2.0.4
1,会话1创建一张TEST1表,数据来至于test表,这里test表最好大一些,不然还没有暂停进程就已经创建完成 。
www.htz.pw >create table scott.test1 as select * from scott.test;
2,暂停会话1
www.htz.pw >oradebug setospid 8729
Oracle pid: 15, Unix process pid: 8729, image: oracle@rhel4.htz.pw (TNS V1-V3)
www.htz.pw >oradebug event 10046 trace name context forever,level 12;
Statement processed.
www.htz.pw >oradebug suspend;
Statement processed.
www.htz.pw >oradebug tracefile_name;
/u01/app/oracle/admin/orcl10g/udump/orcl10g_ora_8729.trc
这里的trace 10046主要是想确认进程是否真的没有动了
3,会话2对test1表进行查询,结果hang住
[oracle10g@rhel4 sql]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Mon Jun 3 21:56:34 2013
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
www.htz.pw >select * from scott.test1;
结果hang住了
4,通过dump systemstate来分析
使用library cache来查找 PROCESS 20: ---------------------------------------- SO: 0x79e65108, type: 2, owner: (nil), flag: INIT/-/-/0x00 (process) Oracle pid=20, calls cur/top: 0x79faf308/0x79fae7f0, flag: (0) - int error: 0, call error: 0, sess error: 0, txn error 0 (post info) last post received: 0 0 0 last post received-location: No post last process to post me: none last post sent: 0 0 0 last post sent-location: No post last process posted by me: none (latch info) wait_event=0 bits=0 Process Group: DEFAULT, pseudo proc: 0x79ea6808 O/S info: user: oracle10g, term: pts/3, ospid: 8867 OSD pid info: Unix process pid: 8867, image: oracle@rhel4.htz.pw (TNS V1-V3) Dump of memory from 0x0000000079E4AAD8 to 0x0000000079E4ACE0 079E4AAD0 00000005 00000000 [........] 079E4AAE0 76BA8B30 00000000 00000010 000313A7 [0..v............] 079E4AAF0 79FAE7F0 00000000 00000003 000313A7 [...y............] 079E4AB00 7958C260 00000000 0000000B 000313A7 [`.Xy............] 079E4AB10 79F72930 00000000 00000004 0003129B [0).y............] 079E4AB20 78E0BB28 00000000 00000007 000313A7 [(..x............] 079E4AB30 00000000 00000000 00000000 00000000 [................] Repeat 26 times (FOB) flags=2 fib=0x78f697c8 incno=6 pending i/o cnt=0 fname=/u01/app/oracle/oradata/orcl10g/system01.dbf fno=1 lblksz=8192 fsiz=61440 ---------------------------------------- SO: 0x79f72930, type: 4, owner: 0x79e65108, flag: INIT/-/-/0x00 (session) sid: 147 trans: (nil), creator: 0x79e65108, flag: (80000041) USR/- BSY/-/-/-/-/- DID: 0001-0014-00000001, short-term DID: 0000-0000-00000000 txn branch: (nil) oct: 3, prv: 0, sql: 0x75340208, psql: (nil), user: 0/SYS service name: SYS$USERS O/S info: user: oracle10g, term: pts/3, ospid: 8866, machine: rhel4.htz.pw program: sqlplus@rhel4.htz.pw (TNS V1-V3) application name: sqlplus@rhel4.htz.pw (TNS V1-V3), hash value=3044132538 waiting for 'library cache lock' blocking sess=0x(nil) seq=11 wait_time=0 seconds since wait started=270 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 Dumping Session Wait History for 'library cache lock' count=1 wait_time=2933014 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2930481 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2930503 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2932879 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2938262 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2930560 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2930628 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2953161 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2930606 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 for 'library cache lock' count=1 wait_time=2934815 handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9 通过handle address地址找到了如下信息 PROCESS 15: ---------------------------------------- SO: 0x79e62980, type: 2, owner: (nil), flag: INIT/-/-/0x00 (process) Oracle pid=15, calls cur/top: 0x79fab8d0/0x79fa86f0, flag: (0) - int error: 0, call error: 0, sess error: 0, txn error 0 (post info) last post received: 0 0 25 last post received-location: ksasnr last process to post me: 79e5da70 1 6 last post sent: 0 0 24 last post sent-location: ksasnd last process posted by me: 79e5e258 1 6 (latch info) wait_event=0 bits=0 Process Group: DEFAULT, pseudo proc: 0x79ea6808 O/S info: user: oracle10g, term: pts/0, ospid: 8729 OSD pid info: Unix process pid: 8729, image: oracle@rhel4.htz.pw (TNS V1-V3) Short stack dump: ......................................................................... 首先是找到下面的信息,再使用ctrl+b向前找,就可以找到前面的会话信息 ---------------------------------------- SO: 0x79faf308, type: 3, owner: 0x79fae7f0, flag: INIT/-/-/0x00 (call) sess: cur 79f72930, rec 0, usr 79f72930; depth: 1 ---------------------------------------- SO: 0x76b9b5f0, type: 53, owner: 0x79faf308, flag: INIT/-/-/0x00 LIBRARY OBJECT LOCK: lock=76b9b5f0 handle=75346180 request=S call pin=(nil) session pin=(nil) hpc=0003 hlc=0000 htl=0x76b9b670[0x76f1c7c8,0x76f1c7c8] htb=0x76f1c7c8 ssga=0x76f1b8c0 user=79f72930 session=79f72930 count=0 flags=RES/[0010] savepoint=0x20 LIBRARY OBJECT HANDLE: handle=75346180 mtx=0x753462b0(0) cdp=0 name=SCOTT.TEST1 SO: 0x76b24238, type: 54, owner: 0x784d4cb8, flag: INIT/-/-/0x00 LIBRARY OBJECT PIN: pin=76b24238 handle=75346180 mode=X lock=0 user=79f82a10 session=79f82a10 count=1 mask=0709 savepoint=0x15a615 flags=[00] ---------------------------------------- SO: 0x76bb7a60, type: 53, owner: 0x784d4cb8, flag: INIT/-/-/0x00 LIBRARY OBJECT LOCK: lock=76bb7a60 handle=75346180 mode=X call pin=(nil) session pin=(nil) hpc=0000 hlc=0000 htl=0x76bb7ae0[0x76b23058,0x76b23058] htb=0x76b23058 ssga=0x76b22150 user=79f82a10 session=79f82a10 count=1 flags=[0000] savepoint=0x15a615 LIBRARY OBJECT HANDLE: handle=75346180 mtx=0x753462b0(0) cdp=0 name=SCOTT.TEST1 hash=df25314504e786643e240bf673cd59df timestamp=06-03-2013 21:56:00 previous=NULL namespace=TABL flags=KGHP/TIM/PTM/SML/[02000000] kkkk-dddd-llll=0000-0709-0201 lock=X pin=X latch#=2 hpc=0004 hlc=0002 lwt=0x75346228[0x76b9b620,0x76b9b620] ltm=0x75346238[0x75346238,0x75346238] pwt=0x753461f0[0x753461f0,0x753461f0] ptm=0x75346200[0x75346200,0x75346200] ref=0x75346258[0x75346258,0x75346258] lnd=0x75346270[0x75346270,0x75346270] LIBRARY OBJECT: object=74f77a58 这一段进程的开始部分以" name="就可以找到下面的SQL语句 LIBRARY OBJECT HANDLE: handle=7534ad10 mtx=0x7534ae40(1) cdp=1 name=create table scott.test1 as select * from scott.test hash=8ec1c218b47eeb3b9ca907ad0a10a75a timestamp=06-03-2013 21:55:59 namespace=CRSR flags=RON/KGHP/TIM/PN0/SML/KST/DBN/MTX/[120100d0]
5,通过手动查询视图
www.htz.pw >set lines 170 www.htz.pw >desc dba_kgllock; Name Null? Type ----------------------------------------------------------------------------------------------- -------- ---------------------------------------------------------------- KGLLKUSE RAW(8) KGLLKHDL RAW(8) KGLLKMOD NUMBER KGLLKREQ NUMBER KGLLKTYPE VARCHAR2(4) www.htz.pw >select * from dba_kgllock where kgllkreq>0; KGLLKUSE KGLLKHDL KGLLKMOD KGLLKREQ KGLL ---------------- ---------------- ---------- ---------- ---- 0000000079F72930 0000000075346180 0 2 Lock www.htz.pw >select * from dba_kgllock where kgllkhdl='0000000075346180' order by kgllkuse; KGLLKUSE KGLLKHDL KGLLKMOD KGLLKREQ KGLL ---------------- ---------------- ---------- ---------- ---- 0000000079F72930 0000000075346180 0 2 Lock 0000000079F82A10 0000000075346180 3 0 Pin 0000000079F82A10 0000000075346180 3 0 Lock www.htz.pw >select sid,program,command from v$session where saddr='0000000079F82A10'; SID PROGRAM COMMAND ---------- ------------------------------------------------ ---------- 159 sqlplus@rhel4.htz.pw (TNS V1-V3) 1 www.htz.pw >select action,name from audit_actions where action=1; ACTION NAME ---------- ---------------------------- 1 CREATE TABLE www.htz.pw >select KGLNAOWN ,KGLNAOBJ from x$kglob where kglhdadr='0000000075346180'; KGLNAOWN KGLNAOBJ -------------------- ------------------------------ SCOTT TEST1 如果发现是library cache lock我们可以能x$kgllk等视图我们可以马上查找到一些更详细的信息。 如下。 www.htz.pw >l 1* select kgllkses,kgllkhdl,kgllkmod,kgllkreq,user_name,kglnaobj from x$kgllk where kgllkhdl='0000000075346180' www.htz.pw >. www.htz.pw >/ KGLLKSES KGLLKHDL KGLLKMOD KGLLKREQ USER_NAME KGLNAOBJ ---------------- ---------------- ---------- ---------- ------------------------------ -------------------- 0000000079F82A10 0000000075346180 3 0 SYS TEST1 0000000079F72930 0000000075346180 0 2 SYS TEST1 select kgllkses,kgllkhdl,kgllkmod,kgllkreq,user_name,kglnaobj from x$kgllk where kgllkses='0000000079F82A10' and kgllkmod>0; 这样就可以看到是那句话导致的hang住的了。
6,通过写的脚本来直接查询
www.htz.pw >@event_library_cache_pin_9i.sql Wrote file sqlplusset USERNAME LOCK SID:SERIAL:SPID:LAST_CALL_ET STATUS HASH_VALUE HAVE:REQ HANDLE COMMAND LOCK LOGON_TI PROGRAM CLIENT ----------------------------------- -------------------- ------------- -------- ---------------- --------------- ---- -------- -------------------- -------------------- Holder: 1:159:3:8729.2138 SYS:ACTIVE 168863578 3.0 0000000075346180 CREATE TABLE Lock 03 21:43 sqlplus@rhel4.htz.pw oracle10g@rhel4.htz Holder: 1:159:3:8729.2138 SYS:ACTIVE 168863578 3.0 0000000075346180 CREATE TABLE Pin 03 21:43 sqlplus@rhel4.htz.pw oracle10g@rhel4.htz Waiter: 1:147:28:8867.2096 SYS:ACTIVE 3526267018 0.2 0000000075346180 SELECT Lock 03 21:56 sqlplus@rhel4.htz.pw oracle10g@rhel4.htz ADDR handle KGLHDPAR KGLNAOWN KGLNAOBJ KGLNAHSH KGLHDOBJ ---------------- ---------------- ---------------- -------------------- -------------------- ---------- ---------------- 0000002A97420498 0000000075346180 0000000075346180 SCOTT TEST1 1942837727 0000000074F77A58 www.htz.pw >select event from v$session_wait where sid=159; EVENT ---------------------------------------------------------------- Data file init write www.htz.pw >select sql_text from v$sql where hash_value=168863578; SQL_TEXT -------------------------------------------------------------------------------- create table scott.test1 as select * from scott.test 这里发生通过hash_value找不到sql的具体内容,下面是通过errorstack的方式来查询到的内容 www.htz.pw >select sql_text from v$sql where hash_value=3526267018; no rows selected www.htz.pw >oradebug dump errorstack 2; Statement processed. www.htz.pw >oradebug tracefile_name; /u01/app/oracle/admin/orcl10g/udump/orcl10g_ora_8867.trc www.htz.pw >!vi /u01/app/oracle/admin/orcl10g/udump/orcl10g_ora_8867.trc /u01/app/oracle/admin/orcl10g/udump/orcl10g_ora_8867.trc Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1 System name: Linux Node name: rhel4.htz.pw Release: 2.6.9-89.0.0.0.1.ELsmp Version: #1 SMP Tue May 19 05:31:20 EDT 2009 Machine: x86_64 Instance name: orcl10g Redo thread mounted by this instance: 1 Oracle process number: 20 Unix process pid: 8867, image: oracle@rhel4.htz.pw (TNS V1-V3) *** 2013-06-03 22:35:00.278 *** ACTION NAME:() 2013-06-03 22:35:00.274 *** MODULE NAME:(sqlplus@rhel4.htz.pw (TNS V1-V3)) 2013-06-03 22:35:00.274 *** SERVICE NAME:(SYS$USERS) 2013-06-03 22:35:00.274 *** SESSION ID:(147.28) 2013-06-03 22:35:00.274 Received ORADEBUG command 'dump errorstack 2' from process Unix process pid: 9029, image: *** 2013-06-03 22:35:00.278 ksedmp: internal or fatal error Current SQL statement for this session: select * from scott.test1 ----- Call Stack Trace ----
create table 引起select语句hang住出现library cache lock事件:等您坐沙发呢!