当前位置: 首页 > EVENT > 正文

      数据库今天早上突然出现大量的log file sync等待事件,如下:

I EVENT                                    SQL_ID               COUNT(*)

— —————————————- —————— ———-

 1 Backup: MML write backup piece                                       8

   PX Deq: Execution Msg                    81prbu1ddkd5c               1

   db file scattered read                   f404sdb940xrp               1

   log file sync                                                        7

   read by other session                    f404sdb940xrp               4

 2 PX Deq: Execute Reply                    81prbu1ddkd5c               1

   PX Deq: Execution Msg                    81prbu1ddkd5c               1

   db file sequential read                  06xny3qtp17j4               1

   db file sequential read                  0pthc62tk3b6d               1

   db file sequential read                  1rmtm7zh5qvyq               1

   db file sequential read                  2m6tz655jw57t               1

   db file sequential read                  3yfa7shq4vvgw               1

   db file sequential read                  48kt9xh5g1tdj               1

   db file sequential read                  4xjm94x3v9pyu               1

   db file sequential read                  5j6c12bw59bvk               1

   db file sequential read                  6dkrf3bzhgxrp               1

   db file sequential read                  6nwm098n4vs5z               1

   db file sequential read                  77zm7d3rhnu5z               1

   db file sequential read                  7r3jkwzkcuz5d               1

   db file sequential read                  81p0m707yjwbs               1

   db file sequential read                  94jsv1hdp6ks9               1

   db file sequential read                  96ncsaz7tfh3v               1

   db file sequential read                  9mcnbjwx7qynf               1

   db file sequential read                  9nx093r7hr0g3               1

   db file sequential read                  a4j25uk2q0jdp               1

   db file sequential read                  asphg45zk6d7m               1

   db file sequential read                  bcsjzna3fvrb1               1

   db file sequential read                  dp2jf1dpvd7x3               1

   db file sequential read                  dwbxgb7838xht               1

   db file sequential read                  fpabwmm0c5812               1

   db file sequential read                  gb995a2pwuqu2               1

   db file sequential read                  gf25ng83zukxx               1

   db file sequential read                  gm42ngbs5ac9a               1

   db file sequential read                  gmk8cs80tb9u9               1

   enq: TX – contention                     3dhx24xm7u1xp               1

   library cache lock                       85spadpadkvhj               1

   log file sync                                                       41

 

37 rows selected.

 

 

 I EVENT                                      COUNT(*)

— —————————————- ———-

 1 Backup: MML write backup piece                    8

   log file sync                                     7

   read by other session                             4

   db file scattered read                            1

   PX Deq: Execution Msg                             1

 2 log file sync                                    38

   db file sequential read                          29

   enq: TX – contention                              1

   PX Deq: Execution Msg                             1

   PX Deq: Execute Reply                             1

   SQL*Net message from client                       1

基本都出现在节点2上,节点1只有少量的log file sync等待事件,不过在节点1上面出现了一个rman的等待事件,并且数量达到了8个,有点异常了。在原来做NBU的时候,一般配置都不会操作过4个。

 

查看lgwrtrace文件,怀疑是由于自适应日志写导致的,这是11G中的新功能。

Warning: log write elapsed time 589ms, size 660KB

Warning: log write elapsed time 635ms, size 1290KB

Warning: log write elapsed time 866ms, size 1312KB

……………..

Warning: log write elapsed time 1414ms, size 930KB

Warning: log write elapsed time 510ms, size 1844KB

通过trace文件,未发现相当的trace信息输出

 

查看io速度

由于aix平台,sar命令需要root用户权限才可以执行,所以这里直接查看光纤的速度就可以了

节点2的光纤速度

lqtopas_nmonqqL=LargePageStatsqqqHost=abmdb02qqqqqqqqRefresh=2 secsqqq09:32

x Fibre-Channel-Adapter qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq

x       Adapter   Receive  Transmit        -Requests-        –Size KB–  

xNumber Name       KB/s     KB/s           In     Out        In      Out  

x    1      fcs0  23594.9  11831.3        546.5   481.7      43.2     24.6

x    2      fcs1      0.0      0.0          0.0     0.0       0.0      0.0

x    3      fcs2  22541.6  10784.7        541.1   471.9      41.7     22.9

x    4      fcs3      0.0      0.0          0.0     0.0       0.0      0.0

lqtopas_nmonqqk=Kernel-statsqqqqqHost=abmdb02qqqqqqqqRefresh=2 secsqqq09:32

x Fibre-Channel-Adapter qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq

x       Adapter   Receive  Transmit        -Requests-        –Size KB–  

xNumber Name       KB/s     KB/s           In     Out        In      Out  

x    1      fcs0  28903.6  21866.5        705.2   811.4      41.0     26.9

x    2      fcs1      0.0      0.0          0.0     0.0       0.0      0.0

x    3      fcs2  28384.6  20360.7        639.4   796.6      44.4     25.6

x    4      fcs3      0.0      0.0          0.0     0.0       0.0      0.0

这里发现节点2的传递速度只有20M/S左右,太不正常了

节点1的光纤速度

lqtopas_nmonqq#=PURR StatsqqqqqqqHost=abmdb01qqqqqqqqRefresh=2 secsqqq09:50

x Fibre-Channel-Adapter qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq

x       Adapter   Receive  Transmit        -Requests-        –Size KB–  

xNumber Name       KB/s     KB/s           In     Out        In      Out  

x    1      fcs0 145064.1   1317.7       1245.3    67.3     116.5     19.6

x    2      fcs1      0.0 224332.0          0.0  1752.6       0.0    128.0

x    3      fcs2 144409.3   1985.3       1259.9    63.9     114.6     31.1

x    4      fcs3      0.0      0.0          0.0     0.0       0.0      0.0

lqtopas_nmonqqc=CPUqqqqqqqqqqqqqqHost=abmdb01qqqqqqqqRefresh=2 secsqqq09:50

x Fibre-Channel-Adapter qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq

x       Adapter   Receive  Transmit        -Requests-        –Size KB–  

xNumber Name       KB/s     KB/s           In     Out        In      Out  

x    1      fcs0 146195.3    133.1       1289.7    11.3     113.4     11.8

x    2      fcs1      0.0 228583.8          0.0  1785.8       0.0    128.0

x    3      fcs2 143075.5    151.9       1267.7    13.2     112.9     11.5

x    4      fcs3      0.0      0.0          0.0     0.0       0.0      0.0

这里发现节点1的传递速度达到了220M/S的速度。

通过两个节点的光纤速度相加差不250M/S的速度了,基本上已经达到了存储的限制值了

 

收集两个节点AWR数据

节点2异常时的IO统计

clip_image001[4]

节点2异常时的IO统计

clip_image002[4]

基本上跟之前的光纤卡的速度对应,并且节点1RMAN占用的IO达到了274M/S,太吓人了

下面来看看前一天的相同时段的AWR数据

节点2

clip_image003[4]

节点1

clip_image004[4]

可以看到在正常时段的IO还不到150M/S

 

分析总结

由于节点1RMAN消耗大量的IO,导致存储IO性能下降,导致节点2出现大量的log file sync,通知TSM备份工程师,暂停RMAN备份后,系统恢复正常。

一次log file sync等待事件分析:等您坐沙发呢!

发表评论

gravatar

? razz sad evil ! smile oops grin eek shock ??? cool lol mad twisted roll wink idea arrow neutral cry mrgreen

快捷键:Ctrl+Enter