今天在9i中遇到了library cache lock/pin等待事件,结果是由于出帐还没有完成,用户还在执行create table相关的表,其它一些应用就已经在对表进行查询了,下面是模拟整个事件。
测试环境为:DB :RHEL 4.8 OS :10.2.0.4
1,会话1创建一张TEST1表,数据来至于test表,这里test表最好大一些,不然还没有暂停进程就已经创建完成 。
www.htz.pw >create table scott.test1 as select * from scott.test;
2,暂停会话1
www.htz.pw >oradebug setospid 8729
Oracle pid: 15, Unix process pid: 8729, image: oracle@rhel4.htz.pw (TNS V1-V3)
www.htz.pw >oradebug event 10046 trace name context forever,level 12;
Statement processed.
www.htz.pw >oradebug suspend;
Statement processed.
www.htz.pw >oradebug tracefile_name;
/u01/app/oracle/admin/orcl10g/udump/orcl10g_ora_8729.trc
这里的trace 10046主要是想确认进程是否真的没有动了
3,会话2对test1表进行查询,结果hang住
[oracle10g@rhel4 sql]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Mon Jun 3 21:56:34 2013
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
www.htz.pw >select * from scott.test1;
结果hang住了
4,通过dump systemstate来分析
使用library cache来查找
PROCESS 20:
  ----------------------------------------
  SO: 0x79e65108, type: 2, owner: (nil), flag: INIT/-/-/0x00
  (process) Oracle pid=20, calls cur/top: 0x79faf308/0x79fae7f0, flag: (0) -
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: none
              last post sent: 0 0 0
              last post sent-location: No post
              last process posted by me: none
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0x79ea6808
    O/S info: user: oracle10g, term: pts/3, ospid: 8867
    OSD pid info: Unix process pid: 8867, image: oracle@rhel4.htz.pw (TNS V1-V3)
Dump of memory from 0x0000000079E4AAD8 to 0x0000000079E4ACE0
079E4AAD0                   00000005 00000000          [........]
079E4AAE0 76BA8B30 00000000 00000010 000313A7  [0..v............]
079E4AAF0 79FAE7F0 00000000 00000003 000313A7  [...y............]
079E4AB00 7958C260 00000000 0000000B 000313A7  [`.Xy............]
079E4AB10 79F72930 00000000 00000004 0003129B  [0).y............]
079E4AB20 78E0BB28 00000000 00000007 000313A7  [(..x............]
079E4AB30 00000000 00000000 00000000 00000000  [................]
  Repeat 26 times
    (FOB) flags=2 fib=0x78f697c8 incno=6 pending i/o cnt=0
     fname=/u01/app/oracle/oradata/orcl10g/system01.dbf
     fno=1 lblksz=8192 fsiz=61440
    ----------------------------------------
    SO: 0x79f72930, type: 4, owner: 0x79e65108, flag: INIT/-/-/0x00
    (session) sid: 147 trans: (nil), creator: 0x79e65108, flag: (80000041) USR/- BSY/-/-/-/-/-
              DID: 0001-0014-00000001, short-term DID: 0000-0000-00000000
              txn branch: (nil)
              oct: 3, prv: 0, sql: 0x75340208, psql: (nil), user: 0/SYS
    service name: SYS$USERS
    O/S info: user: oracle10g, term: pts/3, ospid: 8866, machine: rhel4.htz.pw
              program: sqlplus@rhel4.htz.pw (TNS V1-V3)
    application name: sqlplus@rhel4.htz.pw (TNS V1-V3), hash value=3044132538
    waiting for 'library cache lock' blocking sess=0x(nil) seq=11 wait_time=0 seconds since wait started=270
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
    Dumping Session Wait History
     for 'library cache lock' count=1 wait_time=2933014
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2930481
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2930503
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2932879
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2938262
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2930560
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2930628
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2953161
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2930606
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
     for 'library cache lock' count=1 wait_time=2934815
                handle address=75346180, lock address=76b9b5f0, 100*mode+namespace=c9
                
通过handle address地址找到了如下信息
PROCESS 15:
  ----------------------------------------
  SO: 0x79e62980, type: 2, owner: (nil), flag: INIT/-/-/0x00
  (process) Oracle pid=15, calls cur/top: 0x79fab8d0/0x79fa86f0, flag: (0) -
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 25
              last post received-location: ksasnr
              last process to post me: 79e5da70 1 6
              last post sent: 0 0 24
              last post sent-location: ksasnd
              last process posted by me: 79e5e258 1 6 
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0x79ea6808
    O/S info: user: oracle10g, term: pts/0, ospid: 8729
    OSD pid info: Unix process pid: 8729, image: oracle@rhel4.htz.pw (TNS V1-V3)
    Short stack dump: 
    
    
.........................................................................
首先是找到下面的信息,再使用ctrl+b向前找,就可以找到前面的会话信息
      ----------------------------------------
      SO: 0x79faf308, type: 3, owner: 0x79fae7f0, flag: INIT/-/-/0x00
      (call) sess: cur 79f72930, rec 0, usr 79f72930; depth: 1
        ----------------------------------------
        SO: 0x76b9b5f0, type: 53, owner: 0x79faf308, flag: INIT/-/-/0x00
        LIBRARY OBJECT LOCK: lock=76b9b5f0 handle=75346180 request=S
        call pin=(nil) session pin=(nil) hpc=0003 hlc=0000
        htl=0x76b9b670[0x76f1c7c8,0x76f1c7c8] htb=0x76f1c7c8 ssga=0x76f1b8c0
        user=79f72930 session=79f72930 count=0 flags=RES/[0010] savepoint=0x20
        LIBRARY OBJECT HANDLE: handle=75346180 mtx=0x753462b0(0) cdp=0
        name=SCOTT.TEST1
        
        
         SO: 0x76b24238, type: 54, owner: 0x784d4cb8, flag: INIT/-/-/0x00
        LIBRARY OBJECT PIN: pin=76b24238 handle=75346180 mode=X lock=0
        user=79f82a10 session=79f82a10 count=1 mask=0709 savepoint=0x15a615 flags=[00]
        ----------------------------------------
        SO: 0x76bb7a60, type: 53, owner: 0x784d4cb8, flag: INIT/-/-/0x00
        LIBRARY OBJECT LOCK: lock=76bb7a60 handle=75346180 mode=X
        call pin=(nil) session pin=(nil) hpc=0000 hlc=0000
        htl=0x76bb7ae0[0x76b23058,0x76b23058] htb=0x76b23058 ssga=0x76b22150
        user=79f82a10 session=79f82a10 count=1 flags=[0000] savepoint=0x15a615
        LIBRARY OBJECT HANDLE: handle=75346180 mtx=0x753462b0(0) cdp=0
        name=SCOTT.TEST1
        hash=df25314504e786643e240bf673cd59df timestamp=06-03-2013 21:56:00 previous=NULL
        namespace=TABL flags=KGHP/TIM/PTM/SML/[02000000]
        kkkk-dddd-llll=0000-0709-0201 lock=X pin=X latch#=2 hpc=0004 hlc=0002
        lwt=0x75346228[0x76b9b620,0x76b9b620] ltm=0x75346238[0x75346238,0x75346238]
        pwt=0x753461f0[0x753461f0,0x753461f0] ptm=0x75346200[0x75346200,0x75346200]
        ref=0x75346258[0x75346258,0x75346258] lnd=0x75346270[0x75346270,0x75346270]
          LIBRARY OBJECT: object=74f77a58
这一段进程的开始部分以" name="就可以找到下面的SQL语句
      LIBRARY OBJECT HANDLE: handle=7534ad10 mtx=0x7534ae40(1) cdp=1
      name=create table scott.test1 as select * from scott.test
      hash=8ec1c218b47eeb3b9ca907ad0a10a75a timestamp=06-03-2013 21:55:59
      namespace=CRSR flags=RON/KGHP/TIM/PN0/SML/KST/DBN/MTX/[120100d0]     
5,通过手动查询视图
www.htz.pw >set lines 170 www.htz.pw >desc dba_kgllock; Name Null? Type ----------------------------------------------------------------------------------------------- -------- ---------------------------------------------------------------- KGLLKUSE RAW(8) KGLLKHDL RAW(8) KGLLKMOD NUMBER KGLLKREQ NUMBER KGLLKTYPE VARCHAR2(4) www.htz.pw >select * from dba_kgllock where kgllkreq>0; KGLLKUSE KGLLKHDL KGLLKMOD KGLLKREQ KGLL ---------------- ---------------- ---------- ---------- ---- 0000000079F72930 0000000075346180 0 2 Lock www.htz.pw >select * from dba_kgllock where kgllkhdl='0000000075346180' order by kgllkuse; KGLLKUSE KGLLKHDL KGLLKMOD KGLLKREQ KGLL ---------------- ---------------- ---------- ---------- ---- 0000000079F72930 0000000075346180 0 2 Lock 0000000079F82A10 0000000075346180 3 0 Pin 0000000079F82A10 0000000075346180 3 0 Lock www.htz.pw >select sid,program,command from v$session where saddr='0000000079F82A10'; SID PROGRAM COMMAND ---------- ------------------------------------------------ ---------- 159 sqlplus@rhel4.htz.pw (TNS V1-V3) 1 www.htz.pw >select action,name from audit_actions where action=1; ACTION NAME ---------- ---------------------------- 1 CREATE TABLE www.htz.pw >select KGLNAOWN ,KGLNAOBJ from x$kglob where kglhdadr='0000000075346180'; KGLNAOWN KGLNAOBJ -------------------- ------------------------------ SCOTT TEST1 如果发现是library cache lock我们可以能x$kgllk等视图我们可以马上查找到一些更详细的信息。 如下。 www.htz.pw >l 1* select kgllkses,kgllkhdl,kgllkmod,kgllkreq,user_name,kglnaobj from x$kgllk where kgllkhdl='0000000075346180' www.htz.pw >. www.htz.pw >/ KGLLKSES KGLLKHDL KGLLKMOD KGLLKREQ USER_NAME KGLNAOBJ ---------------- ---------------- ---------- ---------- ------------------------------ -------------------- 0000000079F82A10 0000000075346180 3 0 SYS TEST1 0000000079F72930 0000000075346180 0 2 SYS TEST1 select kgllkses,kgllkhdl,kgllkmod,kgllkreq,user_name,kglnaobj from x$kgllk where kgllkses='0000000079F82A10' and kgllkmod>0; 这样就可以看到是那句话导致的hang住的了。
6,通过写的脚本来直接查询
www.htz.pw >@event_library_cache_pin_9i.sql
Wrote file sqlplusset
                                    USERNAME                           LOCK
SID:SERIAL:SPID:LAST_CALL_ET        STATUS                  HASH_VALUE HAVE:REQ HANDLE           COMMAND         LOCK LOGON_TI PROGRAM              CLIENT
----------------------------------- -------------------- ------------- -------- ---------------- --------------- ---- -------- -------------------- --------------------
Holder: 1:159:3:8729.2138           SYS:ACTIVE               168863578 3.0      0000000075346180 CREATE TABLE    Lock 03 21:43 sqlplus@rhel4.htz.pw oracle10g@rhel4.htz
Holder: 1:159:3:8729.2138           SYS:ACTIVE               168863578 3.0      0000000075346180 CREATE TABLE    Pin  03 21:43 sqlplus@rhel4.htz.pw oracle10g@rhel4.htz
Waiter: 1:147:28:8867.2096          SYS:ACTIVE              3526267018 0.2      0000000075346180 SELECT          Lock 03 21:56 sqlplus@rhel4.htz.pw oracle10g@rhel4.htz
ADDR             handle           KGLHDPAR         KGLNAOWN             KGLNAOBJ               KGLNAHSH KGLHDOBJ
---------------- ---------------- ---------------- -------------------- -------------------- ---------- ----------------
0000002A97420498 0000000075346180 0000000075346180 SCOTT                TEST1                1942837727 0000000074F77A58
www.htz.pw >select event from v$session_wait where sid=159;
EVENT
----------------------------------------------------------------
Data file init write
www.htz.pw >select sql_text from v$sql where hash_value=168863578;
SQL_TEXT
--------------------------------------------------------------------------------
create table scott.test1 as select * from scott.test
这里发生通过hash_value找不到sql的具体内容,下面是通过errorstack的方式来查询到的内容
www.htz.pw >select sql_text from v$sql where hash_value=3526267018;
no rows selected
www.htz.pw >oradebug dump errorstack 2;
Statement processed.
www.htz.pw >oradebug tracefile_name;
/u01/app/oracle/admin/orcl10g/udump/orcl10g_ora_8867.trc
www.htz.pw >!vi /u01/app/oracle/admin/orcl10g/udump/orcl10g_ora_8867.trc
/u01/app/oracle/admin/orcl10g/udump/orcl10g_ora_8867.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
System name:    Linux
Node name:      rhel4.htz.pw
Release:        2.6.9-89.0.0.0.1.ELsmp
Version:        #1 SMP Tue May 19 05:31:20 EDT 2009
Machine:        x86_64
Instance name: orcl10g
Redo thread mounted by this instance: 1
Oracle process number: 20
Unix process pid: 8867, image: oracle@rhel4.htz.pw (TNS V1-V3)
*** 2013-06-03 22:35:00.278
*** ACTION NAME:() 2013-06-03 22:35:00.274
*** MODULE NAME:(sqlplus@rhel4.htz.pw (TNS V1-V3)) 2013-06-03 22:35:00.274
*** SERVICE NAME:(SYS$USERS) 2013-06-03 22:35:00.274
*** SESSION ID:(147.28) 2013-06-03 22:35:00.274
Received ORADEBUG command 'dump errorstack 2' from process Unix process pid: 9029, image:
*** 2013-06-03 22:35:00.278
ksedmp: internal or fatal error
Current SQL statement for this session:
select * from scott.test1
----- Call Stack Trace ----


create table 引起select语句hang住出现library cache lock事件:等您坐沙发呢!