ORA-01173的模拟与故障处理

下面的测试来至于一位网友，他们生产数据库异常，在drop表空间，重建控制文件后，报下面的错误：

Sat Jul 19 00:45:47 2014

SMON: enabling cache recovery

Sat Jul 19 00:45:47 2014

Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_12464.trc:

ORA-01173: data dictionary indicates missing data file from system tablespace

Sat Jul 19 00:45:47 2014

Error 1173 happened during db open, shutting down database

USER: terminating instance due to error 1173

Instance terminated by USER, pid = 12464

ORA-1092 signalled during: alter database open resetlogs…

下面是简单的测试一下，提供2种方法来解决此故障。

欢迎大家加入ORACLE超级群：17115662 免费解决各种ORACLE问题,以后BLOG将迁移到http://www.htz.pw

1，数据库版本

www.htz.pw > select * from v$version where rownum<3;

BANNER

—————————————————————-

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64bi

PL/SQL Release 10.2.0.4.0 – Production

www.htz.pw > !lsb_release -a

LSB Version: :core-3.0-amd64:core-3.0-ia32:core-3.0-noarch:graphics-3.0-amd64:graphics-3.0-ia32:graphics-3.0-noarch

Distributor ID: RedHatEnterpriseAS

Description: Red Hat Enterprise Linux AS release 4 (Nahant Update 8)

Release: 4

Codename: NahantUpdate8

2，查询undo段的名字

因为在实验过程中，我们需要使用到undo段的名字，所以这里提前查询出来，如果在生产环境，我们可以直接使用bbed去查询undo$表，或者是使用odu,dul等工具去直接抽取undo$表，另外了可以使用strings system数据文件来过滤UNDO段。

www.htz.pw > @undo_segment.sql

SEGMENT_HEADER

TABLESPACE SEGMENT_NAME FILE#.BLOCK STATUS SEGMENT_SIZE(M)

——————– —————————— ————– ———- —————

SYSTEM.OLD PRI.SYSTEM 1.9 ONLINE 0

UNDOTBS1.CURRENT PUB._SYSSMU1$ 2.9 ONLINE 1

UNDOTBS1.CURRENT PUB._SYSSMU10$ 2.153 ONLINE 1

UNDOTBS1.CURRENT PUB._SYSSMU9$ 2.137 ONLINE 13

UNDOTBS1.CURRENT PUB._SYSSMU8$ 2.121 ONLINE 18

UNDOTBS1.CURRENT PUB._SYSSMU7$ 2.105 ONLINE 0

UNDOTBS1.CURRENT PUB._SYSSMU6$ 2.89 ONLINE 6

UNDOTBS1.CURRENT PUB._SYSSMU5$ 2.73 ONLINE 2

UNDOTBS1.CURRENT PUB._SYSSMU4$ 2.57 ONLINE 0

UNDOTBS1.CURRENT PUB._SYSSMU3$ 2.41 ONLINE 1

UNDOTBS1.CURRENT PUB._SYSSMU2$ 2.25 ONLINE 0

3，生成创建控制文件脚本

[oracle@www.htz.pw sql]$./create_controlfile_sql.sh

please input direcotry default /tmp:

please input file name default control.ctl:

Database altered.

这里生成的默认文件位置在/tmp/control.ctl

4，重建控制文件

www.htz.pw > shutdown abort;

ORACLE instance shut down.

STARTUP NOMOUNT

CREATE CONTROLFILE REUSE DATABASE "ORCL1024" NORESETLOGS NOARCHIVELOG

MAXLOGFILES 16

MAXLOGMEMBERS 3

MAXDATAFILES 100

MAXINSTANCES 8

MAXLOGHISTORY 292

LOGFILE

GROUP 1 ‘/oracle/app/oracle/oradata/orcl1024/redo01.log’ SIZE 50M,

GROUP 2 ‘/oracle/app/oracle/oradata/orcl1024/redo02.log’ SIZE 50M,

GROUP 3 ‘/oracle/app/oracle/oradata/orcl1024/redo03.log’ SIZE 50M

DATAFILE

‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’,

‘/oracle/app/oracle/oradata/orcl1024/undotbs01.dbf’,（需要删除这行）

‘/oracle/app/oracle/oradata/orcl1024/sysaux01.dbf’,

‘/oracle/app/oracle/oradata/orcl1024/users01.dbf’

CHARACTER SET ZHS16GBK

;

RECOVER DATABASE

ALTER DATABASE OPEN;

ALTER TABLESPACE TEMP ADD TEMPFILE ‘/oracle/app/oracle/oradata/orcl1024/temp01.dbf’

SIZE 1482M REUSE AUTOEXTEND ON NEXT 655360 MAXSIZE 32767M;

5，故障现象出现

www.htz.pw > recover database using backup controlfile until cancel;

ORA-00279: change 2170641 generated at 07/19/2014 00:35:54 needed for thread 1

ORA-00289: suggestion :

/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_72

_%u_.arc

ORA-00280: change 2170641 for thread 1 is in sequence #72

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/oracle/app/oracle/oradata/orcl1024/redo02.log

ORA-00310: archived log contains sequence 71; sequence 72 required

ORA-00334: archived log: ‘/oracle/app/oracle/oradata/orcl1024/redo02.log’

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’

www.htz.pw > recover database using backup controlfile until cancel;

ORA-00279: change 2170641 generated at 07/19/2014 00:35:54 needed for thread 1

ORA-00289: suggestion :

/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_72

_%u_.arc

ORA-00280: change 2170641 for thread 1 is in sequence #72

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/oracle/app/oracle/oradata/orcl1024/redo03.log

Log applied.

Media recovery complete.

www.htz.pw > alter database open resetlogs;

alter database open resetlogs

ERROR at line 1:

ORA-01092: ORACLE instance terminated. Disconnection forced

alert中出现下面的报错

Sat Jul 19 00:45:47 2014

SMON: enabling cache recovery

Sat Jul 19 00:45:47 2014

Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_12464.trc:

ORA-01173: data dictionary indicates missing data file from system tablespace

Sat Jul 19 00:45:47 2014

Error 1173 happened during db open, shutting down database

USER: terminating instance due to error 1173

Instance terminated by USER, pid = 12464

ORA-1092 signalled during: alter database open resetlogs…

6，故障处理方法1

在运气比较好的情况下使用此方案是可行的，朋友的数据库使用此方法，数据库能正常的OPEN。

6.1 修改参数文件

这里手动创建pfile文件，直接修改pfile文件比较简单，并且不影响原spfile文件，增加下面红色部分参数

www.htz.pw > !vi /tmp/123.ora

orcl1024.__db_cache_size=54525952

orcl1024.__java_pool_size=4194304

orcl1024.__large_pool_size=8388608

orcl1024.__shared_pool_size=88080384

orcl1024.__streams_pool_size=0

*._backup_ksfq_bufsz=1048576

*._log_parallelism=2

*._log_parallelism_max=4

*._pga_max_size=5368709120

*._smm_max_size=3145728

*.audit_file_dest=’/oracle/app/oracle/admin/orcl1024/adump’

*.background_dump_dest=’/oracle/app/oracle/admin/orcl1024/bdump’

*.compatible=’10.2.0.3.0′

*.control_files=’/oracle/app/oracle/oradata/orcl1024/control01.ctl’,’/oracle/app/oracle/oradata/orcl1024/control02.ctl’,’/oracle/app/oracle/oradata/orcl1024/control03.ctl’

*.core_dump_dest=’/oracle/app/oracle/admin/orcl1024/cdump’

*.cpu_count=3

*.db_block_size=8192

*.db_domain=”

*.db_file_multiblock_read_count=16

*.db_name=’orcl1024′

*.db_recovery_file_dest=’/oracle/app/oracle/flash_recovery_area’

*.db_recovery_file_dest_size=4294967296

*.dbwr_io_slaves=4

*.disk_asynch_io=FALSE

*.dispatchers='(PROTOCOL=TCP) (SERVICE=orcl1024XDB)’

*.event=”

*.job_queue_processes=10

*.open_cursors=300

*.pga_aggregate_target=1073741824

*.processes=150

*.recyclebin=’OFF’

*.remote_login_passwordfile=’EXCLUSIVE’

*.sga_target=167772160

#*.undo_management=’AUTO’

*.undo_management=’manual’

*.undo_tablespace=’UNDOTBS1′

*.user_dump_dest=’/oracle/app/oracle/admin/orcl1024/udump’

_corrupted_rollback_segments=(_SYSSMU1$,_SYSSMU2$,_SYSSMU3$,_SYSSMU4$,_SYSSMU5$,_SYSSMU6$,_SYSSMU7$,_SYSSMU8$,_SYSSMU9$,_SYSSMU10$)

这里通常还需要增加下面的2个参数

_allow_resetlogs_corruption=true

_allow_error_simulation=true

另外还可以会增加一个event，如果smon一些功能的event。

6.2 启动数据库

www.htz.pw > startup mount pfile=’/tmp/123.ora’;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 2082432 bytes

Variable Size 100665728 bytes

Database Buffers 54525952 bytes

Redo Buffers 10498048 bytes

Database mounted.

www.htz.pw > recover database using backup controlfile until cancel;

ORA-00279: change 2171386 generated at 07/19/2014 00:45:47 needed for thread 1

ORA-00289: suggestion :

/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_

%u_.arc

ORA-00280: change 2171386 for thread 1 is in sequence #1

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/oracle/app/oracle/oradata/orcl1024/redo03.log

ORA-00339: archived log does not contain any redo

ORA-00334: archived log: ‘/oracle/app/oracle/oradata/orcl1024/redo03.log’

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’

www.htz.pw > recover database using backup controlfile until cancel;

ORA-00279: change 2171386 generated at 07/19/2014 00:45:47 needed for thread 1

ORA-00289: suggestion :

/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_

%u_.arc

ORA-00280: change 2171386 for thread 1 is in sequence #1

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/oracle/app/oracle/oradata/orcl1024/redo01.log

Log applied.

Media recovery complete.

www.htz.pw > alter database open resetlogs;

Database altered.

这里看到数据库已经正常打开，这里还需要注意观察，alert日志文件是否有异常报错。

6.3 重建undo表空间

www.htz.pw > !rm /oracle/app/oracle/oradata/orcl1024/undotbs01.dbf

www.htz.pw > create undo tablespace undotbs1 datafile ‘/oracle/app/oracle/oradata/orcl1024/undotbs01.dbf’ size 10m;

Tablespace created.

6.4 使用源参数文件启动数据库

www.htz.pw > shutdown immediate;

Database closed.

Database dismounted.

ORACLE instance shut down.

www.htz.pw > startup

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 2082432 bytes

Variable Size 100665728 bytes

Database Buffers 54525952 bytes

Redo Buffers 10498048 bytes

Database mounted.

Database opened.

www.htz.pw >

数据库启动正常，注意观察alert日志中是否有报错。

7 故障处理方法2

使用此方法，要求原来的UNDO数据文件存在，此方法就是将原来的undo数据文件再次增加到控制文件中去，此方法比较复制，因为在开启数据库的都会遇到其它很多的一些问题。

7.1 故障现象重现

www.htz.pw > select open_mode from v$database;

OPEN_MODE

———-

READ WRITE

数据库的状态是正常的

www.htz.pw > select name from v$dbfile;

NAME

——————————————————————————–

/oracle/app/oracle/oradata/orcl1024/users01.dbf

/oracle/app/oracle/oradata/orcl1024/sysaux01.dbf

/oracle/app/oracle/oradata/orcl1024/system01.dbf

/oracle/app/oracle/oradata/orcl1024/undotbs01.dbf

存在的数据文件

www.htz.pw > shutdown abort;

ORACLE instance shut down.

重建控制文件，控制文件中不包括undo表空间

www.htz.pw > @/tmp/control.ctl

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 2082432 bytes

Variable Size 100665728 bytes

Database Buffers 54525952 bytes

Redo Buffers 10498048 bytes

Control file created.

ORA-00283: recovery session canceled due to errors

ORA-01610: recovery using the BACKUP CONTROLFILE option must be done

ALTER DATABASE OPEN

ERROR at line 1:

ORA-01589: must use RESETLOGS or NORESETLOGS option for database open

ALTER TABLESPACE TEMP ADD TEMPFILE ‘/oracle/app/oracle/oradata/orcl1024/temp01.dbf’

ERROR at line 1:

ORA-01109: database not open

www.htz.pw >

www.htz.pw > recover database using backup controlfile until cancel;

ORA-00279: change 2171941 generated at 07/19/2014 00:51:38 needed for thread 1

ORA-00289: suggestion :

/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_

%u_.arc

ORA-00280: change 2171941 for thread 1 is in sequence #1

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

cancel

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’

ORA-01112: media recovery not started

*.remote_login_passwordfile=’EXCLUSIVE’

*.sga_target=167772160

#*.undo_management=’AUTO’

*.undo_management=’manual’

*.undo_tablespace=’UNDOTBS1′

*.user_dump_dest=’/oracle/app/oracle/admin/orcl1024/udump’

_allow_resetlogs_corruption=true

_allow_error_simulation=true

增加上面的参数文件

www.htz.pw > startup force mount pfile=’/tmp/123.ora’;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 2082432 bytes

Variable Size 100665728 bytes

Database Buffers 54525952 bytes

Redo Buffers 10498048 bytes

Database mounted.

www.htz.pw > recover database using backup controlfile until cancel;

ORA-00279: change 2171941 generated at 07/19/2014 00:51:38 needed for thread 1

ORA-00289: suggestion :

/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_

%u_.arc

ORA-00280: change 2171941 for thread 1 is in sequence #1

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/oracle/app/oracle/oradata/orcl1024/redo01.log

Log applied.

Media recovery complete.

www.htz.pw >

www.htz.pw > alter database open resetlogs;

alter database open resetlogs

ERROR at line 1:

ORA-01092: ORACLE instance terminated. Disconnection forced

重现故障现在

Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_14960.trc:

ORA-00704: bootstrap process failure

ORA-00604: error occurred at recursive SQL level 2

ORA-01173: data dictionary indicates missing data file from system tablespace

Sat Jul 19 01:01:40 2014

Error 704 happened during db open, shutting down database

USER: terminating instance due to error 704

Instance terminated by USER, pid = 14960

ORA-1092 signalled during: alter database open resetlogs…

7.2 重建控制文件

重建控制文件，控制文件中包括undo表空间的数据文件

[oracle@www.htz.pw sql]$sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 – Production on Sat Jul 19 01:09:05 2014

Connected to an idle instance.

www.htz.pw > startup nomount pfile=’/tmp/123.ora’;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 2082432 bytes

Variable Size 100665728 bytes

Database Buffers 54525952 bytes

Redo Buffers 10498048 bytes

www.htz.pw > @/tmp/control.ctl

ORA-01081: cannot start already-running ORACLE – shut it down first

CREATE CONTROLFILE REUSE DATABASE "ORCL1024" RESETLOGS NOARCHIVELOG

ERROR at line 1:

ORA-01503: CREATE CONTROLFILE failed

ORA-01189: file is from a different RESETLOGS than previous files

ORA-01110: data file 2: ‘/oracle/app/oracle/oradata/orcl1024/undotbs01.dbf’

这里提示ORA-01189的错误。

1189的错误很简单，因为数据文件头的resetlogs信息不一致导致的。

7.3 bbed修改数据文件头中RESETLOG与SCN信息

www.htz.pw > !cat /tmp/bbed.par

listfile=/tmp/bbed.datafile

www.htz.pw > !cat /tmp/bbed.datafile

1 /oracle/app/oracle/oradata/orcl1024/system01.dbf

2 /oracle/app/oracle/oradata/orcl1024/undotbs01.dbf

3 /oracle/app/oracle/oradata/orcl1024/sysaux01.dbf

4 /oracle/app/oracle/oradata/orcl1024/users01.dbf

[oracle@www.htz.pw ~]$bbed parfile=/tmp/bbed.par

Password:

BBED: Release 2.0.0.0.0 – Limited Production on Sat Jul 19 01:12:13 2014

************* !!! For Oracle Internal Use only !!! ***************

BBED> info

File# Name Size(blks)

—– —- ———-

1 /oracle/app/oracle/oradata/orcl1024/system01.dbf 0

2 /oracle/app/oracle/oradata/orcl1024/undotbs01.dbf 0

3 /oracle/app/oracle/oradata/orcl1024/sysaux01.dbf 0

4 /oracle/app/oracle/oradata/orcl1024/users01.dbf 0

这里只需要修改上次resetlogs与SCN的值就可以了

BBED> assign file 2 block 1 offset 112 = file 1 block 1 offset 112;

Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y

ub4 kcvfhrlc @112 0x32dc2c73

BBED> assign file 2 block 1 offset 116 = file 1 block 1 offset 116;

ub4 kscnbas @116 0x002125e0

BBED> assign file 2 block 1 offset 484 = file 1 block 1 offset 484;

ub1 pad @484 0xe1

BBED> assign file 2 block 1 offset 492 = file 1 block 1 offset 492;

ub1 pad @492 0x74

BBED> sum apply dba 2,1

Check value for File 2, Block 1:

current = 0x093b, required = 0x093b

www.htz.pw > startup force nomount pfile=’/tmp/123.ora’;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 2082432 bytes

Variable Size 100665728 bytes

Database Buffers 54525952 bytes

Redo Buffers 10498048 bytes

7.4 重建控制文件

www.htz.pw > @/tmp/control.ctl

ORA-01081: cannot start already-running ORACLE – shut it down first

Control file created.

ORA-00283: recovery session canceled due to errors

ORA-01610: recovery using the BACKUP CONTROLFILE option must be done

控制文件重建成功

7.5 遇到600错误

www.htz.pw > recover database using backup controlfile until cancel;

ORA-00279: change 2172129 generated at 07/19/2014 00:53:08 needed for thread 1

ORA-00289: suggestion :

/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_

%u_.arc

ORA-00280: change 2172129 for thread 1 is in sequence #1

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

cancel

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’

ORA-01112: media recovery not started

www.htz.pw > alter database open resetlogs;

alter database open resetlogs

ERROR at line 1:

ORA-01092: ORACLE instance terminated. Disconnection forced

ORA-00704: bootstrap process failure

ORA-00600: internal error code, arguments: [kcbgtcr_13], [], [], [], [], [],[], []

这里触发了ORA-00600 kcbgtcr_13错误，只需要手动提交事务就可以了。

7.6 手动提交事务信息

[09:50:33]www.htz.pw > startup mount pfile=’/tmp/123.ora’;

[09:50:34]ORACLE instance started.

[09:50:34]

[09:50:34]Total System Global Area 167772160 bytes

[09:50:34]Fixed Size 2082432 bytes

[09:50:34]Variable Size 100665728 bytes

[09:50:34]Database Buffers 54525952 bytes

[09:50:34]Redo Buffers 10498048 bytes

[09:50:38]Database mounted.

[09:50:51]www.htz.pw > recover database using backup controlfile until cancel;

[09:50:51]ORA-00279: change 2172135 generated at 07/19/2014 01:41:17 needed for thread 1

[09:50:51]ORA-00289: suggestion :

[09:50:51]/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_

[09:50:51]%u_.arc

[09:50:51]ORA-00280: change 2172135 for thread 1 is in sequence #1

[09:50:51]

[09:50:51]Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

[09:50:53]cancel

[09:50:54]ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

[09:50:54]ORA-01194: file 1 needs more recovery to be consistent

[09:50:54]ORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’

[09:50:54]

[09:50:54]ORA-01112: media recovery not started

[09:50:54]

[09:51:06]www.htz.pw > alter database open resetlogs;

[09:51:09]alter database open resetlogs

[09:51:09]*

[09:51:09]ERROR at line 1:

[09:51:09]ORA-01092: ORACLE instance terminated. Disconnection forced

后面alert报下面的错误

Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_32544.trc:

ORA-00600: internal error code, arguments: [kcbgtcr_13], [], [], [], [], [], [], []

Sat Jul 19 01:43:01 2014

Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_32544.trc:

ORA-00704: bootstrap process failure

ORA-00600: internal error code, arguments: [kcbgtcr_13], [], [], [], [], [], [], []

Sat Jul 19 01:43:01 2014

Error 704 happened during db open, shutting down database

在trace文件中查看有那些块没有提交。

[oracle@www.htz.pw ~]$grep -E ‘^Block header dump|^0x0’ /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_32544.trc

0x01 0x1f64 0x02 0x1ef8

Block header dump: 0x0040007a

0x01 0x0003.001.00000191 0x0080002b.014c.03 —- 1 fsc 0x0000.00000000

Block header dump: 0x0040017c

0x01 0x0000.022.00000002 0x00400196.0004.37 –U- 12 fsc 0x0000.00000147

Block header dump: 0x004000da

0x01 0x0004.00c.0000011d 0x0080559d.00d3.02 C— 0 scn 0x0000.0008ab18

Block header dump: 0x004000db

0x01 0x0008.017.00000002 0x00800080.0000.01 CBU- 0 scn 0x0000.00002404

0x02 0x0004.01a.0000017a 0x0080003c.016b.32 –U- 1 fsc 0x000e.001e8de0

Block header dump: 0x0040007a

0x01 0x0003.001.00000191 0x0080002b.014c.03 —- 1 fsc 0x0000.00000000

Block header dump: 0x0040006a

0x01 0x0000.008.00000034 0x0040019e.003b.07 C— 0 scn 0x0000.0021245c

通过10046跟踪报错的SQL语句

ksedmp: internal or fatal error

ORA-00600: internal error code, arguments: [kcbgtcr_13], [], [], [], [], [], [], []

Current SQL statement for this session:

select ctime, mtime, stime from obj$ where obj# = :1

—– Call Stack Trace —–

这里看到了报错的SQL语句，以SQL语句来搜索，直到搜索到如下的

Cursor#5(0x2a97ca18b0) state=FETCH curiob=0x2a97cba468

curflg=f fl2=0 par=0x2a97ca1710 ses=0x69f82a30

sqltxt(0x69a944b0)=select ctime, mtime, stime from obj$ where obj# = :1

hash=fa0bd3f60d6ee4f2495f9af8199b75b9

parent=0x6677c4b8 maxchild=01 plk=0x66f56af0 ppn=n

cursor instantiation=0x2a97cba468 used=1405705379

child#0(0x69a94288) pcs=0x6677c0c8

clk=0x66f56dd0 ci=0x6677b7b0 pn=0x69ad37f0 ctx=0x6616fe90

kgsccflg=0 llk[0x2a97cba470,0x2a97cba470] idx=0

xscflg=e0141476 fl2=45000401 fl3=4022210c fl4=100

Bind bytecodes

Opcode = 1 Unoptimized

Offsi = 48, Offsi = 0

kkscoacd

Bind#0

oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00

oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0

kxsbbbfp=2a97cba020 bln=22 avl=02 flg=05

value=20

这里中可以看到绑定变量的值是20.在相同的版本其它的数据库中执行下面的操作

SQL> select rowid from obj$ where obj# =20;

ROWID

——————

AAAAASAABAAAAB6AAA

SQL> @rowid_to_info.sql

Enter value for rowid: AAAAASAABAAAAB6AAA

ROWID_TYPE: 1

OBJECT_NUMBER: 18

RELATIVE_FNO: 1

BLOCK_NUMBER: 122

ROW_NUMBER: 0

PL/SQL procedure successfully completed.

正在好trace文件中的

Block header dump: 0x0040007a

0x01 0x0003.001.00000191 0x0080002b.014c.03 —- 1 fsc 0x0000.00000000

其实我们还可以从10046trace文件中找到此信息如下：

=====================

PARSING IN CURSOR #5 len=52 dep=1 uid=0 oct=3 lid=0 tim=1372757926978214 hv=429618617 ad=’69a944b0′

select ctime, mtime, stime from obj$ where obj# = :1

END OF STMT

PARSE #5:c=0,e=234,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1372757926978212

BINDS #5:

kkscoacd

Bind#0

oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00

oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0

kxsbbbfp=2a97cba020 bln=22 avl=02 flg=05

value=20

EXEC #5:c=0,e=330,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1372757926978586

WAIT #5: nam=’db file sequential read’ ela= 23 file#=1 block#=218 blocks=1 obj#=-1 tim=1372757926978753

WAIT #5: nam=’db file sequential read’ ela= 9 file#=1 block#=219 blocks=1 obj#=-1 tim=1372757926978804

WAIT #5: nam=’db file sequential read’ ela= 7 file#=1 block#=122 blocks=1 obj#=-1 tim=1372757926978841

这里需要注意的是绑定变量的值。

在trace文件中可以发现下面的内容

tab 0, row 26, @0x18f1

tl: 70 fb: –H-FL– lb: 0x1 cc: 17

col 0: [ 2] c1 02

col 1: [ 4] c3 06 17 08

col 2: [ 1] 80

col 3: [12] 5f 4e 45 58 54 5f 4f 42 4a 45 43 54

col 4: [ 2] c1 02

col 5: *NULL*

col 6: [ 1] 80

col 7: [ 7] 78 6c 03 0c 01 28 31

col 8: [ 7] 78 72 07 13 01 3b 01

col 9: [ 7] 78 6c 03 0c 01 28 31

col 10: [ 1] 80

col 11: *NULL*

col 12: *NULL*

col 13: [ 1] 80

col 14: *NULL*

col 15: [ 1] 80

col 16: [ 4] c3 07 38 24

bbed手动提交事务，需要更改itl与行中的lck值

BBED> p ktbbh

struct ktbbh, 48 bytes @20

ub1 ktbbhtyp @20 0x01 (KDDBTDATA)

union ktbbhsid, 4 bytes @24

ub4 ktbbhsg1 @24 0x00000012

ub4 ktbbhod1 @24 0x00000012

struct ktbbhcsc, 8 bytes @28

ub4 kscnbas @28 0x0021251b

ub2 kscnwrp @32 0x0000

b2 ktbbhict @36 1

ub1 ktbbhflg @38 0x02 (NONE)

ub1 ktbbhfsl @39 0x00

ub4 ktbbhfnx @40 0x00000000

struct ktbbhitl[0], 24 bytes @44

struct ktbitxid, 8 bytes @44

ub2 kxidusn @44 0x0003

ub2 kxidslt @46 0x0001

ub4 kxidsqn @48 0x00000191

struct ktbituba, 8 bytes @52

ub4 kubadba @52 0x0080002b

ub2 kubaseq @56 0x014c

ub1 kubarec @58 0x03

ub2 ktbitflg @60 0x0001 (NONE)

union _ktbitun, 2 bytes @62

b2 _ktbitfsc @62 0

ub2 _ktbitwrp @62 0x0000

ub4 ktbitbas @64 0x00000000

BBED> modify /x 80 offset 61

Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y

File: /oracle/app/oracle/oradata/orcl1024/system01.dbf (1)

Block: 122 Offsets: 61 to 70 Dba:0x0040007a

————————————————————————

80000000 00000000 016c

BBED> x /rn *kdbr[26]

rowdata[5278] @6453

————-

flag@6453: 0x2c (KDRHFL, KDRHFF, KDRHFH)

lock@6454: 0x01

cols@6455: 17

col 0[2] @6456: 1

col 1[4] @6459: 52207

col 2[1] @6464: 0

col 3[12] @6466: -0

col 4[2] @6479: 1

col 5[0] @6482: *NULL*

col 6[1] @6483: 0

col 7[7] @6485: -0

col 8[7] @6493: -0

col 9[7] @6501: -0

col 10[1] @6509: 0

col 11[0] @6511: *NULL*

col 12[0] @6512: *NULL*

col 13[1] @6513: 0

col 14[0] @6515: *NULL*

col 15[1] @6516: 0

col 16[4] @6518: 65535

BBED> modify /x 00 offset 6454

File: /oracle/app/oracle/oradata/orcl1024/system01.dbf (1)

Block: 122 Offsets: 6454 to 6463 Dba:0x0040007a

————————————————————————

001102c1 0204c306 1708

BBED> sum apply

Check value for File 1, Block 122:

current = 0x3d20, required = 0x3d20

BBED> verify

DBVERIFY – Verification starting

FILE = /oracle/app/oracle/oradata/orcl1024/system01.dbf

BLOCK = 122

DBVERIFY – Verification complete

Total Blocks Examined : 1

Total Blocks Processed (Data) : 1

Total Blocks Failing (Data) : 0

Total Blocks Processed (Index): 0

Total Blocks Failing (Index): 0

Total Blocks Empty : 0

Total Blocks Marked Corrupt : 0

Total Blocks Influx : 0

7.7 报00600坏块的错误

www.htz.pw > alter database open resetlogs;

alter database open resetlogs

ERROR at line 1:

ORA-01092: ORACLE instance terminated. Disconnection forced

Sat Jul 19 01:56:37 2014

Errors in file /oracle/app/oracle/admin/orcl1024/udump/orcl1024_ora_1894.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-00607: Internal error occurred while making a change to a data block

ORA-00600: internal error code, arguments: [kddummy_blkchk], [1], [106], [6101], [], [], [], []

Error 604 happened during db open, shutting down database

USER: terminating instance due to error 604

Instance terminated by USER, pid = 1894

ORA-1092 signalled during: alter database open resetlogs…

这里可以看到数据文件1，块106，出现了6101的错误。此错误由于是ITL中的值与LOCK不一致导致的。

bbed修改行的lock值

BBED> set dba 1,106

DBA 0x0040006a (4194410 1,106)

BBED> verify

DBVERIFY – Verification starting

FILE = /oracle/app/oracle/oradata/orcl1024/system01.dbf

BLOCK = 106

Block Checking: DBA = 4194410, Block Type = KTB-managed data block

data header at 0x2a97696244

kdbchk: row locked by non-existent transaction

table=0 slot=10

lockid=1 ktbbhitc=1

Block 106 failed with check code 6101

DBVERIFY – Verification complete

Total Blocks Examined : 1

Total Blocks Processed (Data) : 1

Total Blocks Failing (Data) : 1

Total Blocks Processed (Index): 0

Total Blocks Failing (Index): 0

Total Blocks Empty : 0

Total Blocks Marked Corrupt : 0

Total Blocks Influx : 0

此报错的修改见6101(row locked by non-existent transaction)

7.8 启动数据库

通过上面几步操作，再次启动数据库

www.htz.pw > startup mount pfile=’/tmp/123.ora’;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 2082432 bytes

Variable Size 100665728 bytes

Database Buffers 54525952 bytes

Redo Buffers 10498048 bytes

Database mounted.

www.htz.pw > recover database using backup controlfile unitl cancel;

ORA-00905: missing keyword

www.htz.pw > recover database using backup controlfile until cancel;

ORA-00279: change 2172139 generated at 07/19/2014 01:56:34 needed for thread 1

ORA-00289: suggestion :

/oracle/app/oracle/flash_recovery_area/ORCL1024/archivelog/2014_07_19/o1_mf_1_1_

%u_.arc

ORA-00280: change 2172139 for thread 1 is in sequence #1

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

cancel

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: ‘/oracle/app/oracle/oradata/orcl1024/system01.dbf’

ORA-01112: media recovery not started

www.htz.pw > alter database open resetlogs;

Database altered.

使用原参数能正常启动数据库。

www.htz.pw > startup force;

ORACLE instance started.

Total System Global Area 167772160 bytes

Fixed Size 2082432 bytes

Variable Size 100665728 bytes

Database Buffers 54525952 bytes

Redo Buffers 10498048 bytes

Database mounted.

Database opened.

www.htz.pw >

整个实验测试结束

00600 bbed kcbgtcr_13kddummy_blkchk ORA-01173 ORA-01610

认真就输

ORA-01173的模拟与故障处理

您可能还会对这些文章感兴趣！

ORA-01173的模拟与故障处理：等您坐沙发呢！

发表评论

分类

归档

友情链接

最活跃的读者

最新评论

2014 年 7 月
一	二	三	四	五	六	日
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31