最近几天发现库里有坏块了,环境是11gR2, linux平台的64位的库。以下是我的修复办法,基于dbms_repair做的在线修复,也可以基于备份rman来修复,archivelog,noarchive log可能修复的方式有所不同。
-->首先从alert.log里面发现如下的错误。

DDE: Problem Key 'ORA 1110' was flood controlled (0x1) (no incident)
ORA-01110: data file 8: '/dbTS2/oracle/TESTDB2/oradata3/data/TESTDB2_pool_data_03.dbf'
Byte offset to file# 8 block# 570051 is 374890496
Incident 1567129 created, dump file:

-->从trace文件里有更详细的描述。

/opt/app/oracle/testdb2/admin/TESTDB2/diag/rdbms/TESTDB2/TESTDB2/incident/incdir_1567129/TESTDB2_o
ra_5396_i1567129.trc
ORA-01578: ORACLE data block corrupted (file # 8, block # 570051)
ORA-01110: data file 8: '/dbTS2/oracle/TESTDB2/oradata3/data/TESTDB2_pool_data_03.dbf'

Dump continued from file: /opt/app/oracle/testdb2/admin/TESTDB2/diag/rdbms/TESTDB2/TESTDB2/trace/TESTDB2_ora_5396.trc
ORA-01578: ORACLE data block corrupted (file # 8, block # 570051)
ORA-01110: data file 8: '/dbTS2/oracle/TESTDB2/oradata3/data/TESTDB2_pool_data_03.dbf'

========= Dump for incident 1567129 (ORA 1578) ========
*** 2013-12-11 07:25:21.257
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=7u9gsk798bvrp) -----
SELECT  xxxxxFROM  APP_CONTROL
AC,  APP_BILL_PROC BL WHERE APP.DATA_GROUP IS NOT NULL AND BL.PROCESS_ID = APP.NXT_PGM_NAME AND
APP.FILE_STATUS IN ('RD', 'IU', 'CN')
GROUP BY xxxxxxx

-->尝试查看坏块的segment_type,确认一下是Index还是table segment出问题了。查询没有任何结果。
SQL> select segment_name,tablespace_name,segment_type,block_id,file_id,bytes from dba_extents where block_id=570051 and file_id=8;
no rows selected

-->运行日志中的sql,果断的报错了。

SELECT  xxxxxFROM  APP_CONTROL
AC,  APP_BILL_PROC BL WHERE APP.DATA_GROUP IS NOT NULL AND BL.PROCESS_ID = APP.NXT_PGM_NAME AND
APP.FILE_STATUS IN ('RD', 'IU', 'CN')
GROUP BY xxxxxxx

*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 8, block # 570051)
ORA-01110: data file 8:
'/dbTS2/oracle/TESTDB2/oradata3/data/TESTDB2_pool_data_03.dbf'

-->只是从相关的表里select count没有任何问题。

SQL> select count(*)from APP_CONTROL;
COUNT(*)
----------
1613

SQL> select count(*)from APP_BILL_PROC ;
COUNT(*)
----------
103

-->再次验证,还是报错。

SELECT  xxxxxFROM  APP_CONTROL
AC,  APP_BILL_PROC BL WHERE APP.DATA_GROUP IS NOT NULL AND BL.PROCESS_ID = APP.NXT_PGM_NAME AND
APP.FILE_STATUS IN ('RD', 'IU', 'CN')
GROUP BY xxxxxxx
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 8, block # 570051)
ORA-01110: data file 8:
'/dbTS2/oracle/TESTDB2/oradata3/data/TESTDB2_pool_data_03.dbf'

--通过sys来调用dbms_repair来修复。
SQL> BEGIN
2 DBMS_REPAIR.ADMIN_TABLES (
3 TABLE_NAME => 'REPAIR_TABLE',
4 TABLE_TYPE => dbms_repair.repair_table,
5 ACTION => dbms_repair.create_action,
6 TABLESPACE => '&tablespace_name');
7 END;
8 /
Enter value for tablespace_name:
old 6: TABLESPACE => '&tablespace_name');
new 6: TABLESPACE => 'POOL_DATA');

PL/SQL procedure successfully completed.

-->以上的步骤会生成一个表repair_table
SQL> desc repair_table

Name                   Null?  Type
----------------------------------------- -------- ----------------------------
OBJECT_ID                 NOT NULL NUMBER
TABLESPACE_ID               NOT NULL NUMBER
RELATIVE_FILE_ID             NOT NULL NUMBER
BLOCK_ID                 NOT NULL NUMBER
CORRUPT_TYPE               NOT NULL NUMBER
SCHEMA_NAME                NOT NULL VARCHAR2(30)
OBJECT_NAME                NOT NULL VARCHAR2(30)
BASEOBJECT_NAME                  VARCHAR2(30)
PARTITION_NAME                   VARCHAR2(30)
CORRUPT_DESCRIPTION                VARCHAR2(2000)
REPAIR_DESCRIPTION                 VARCHAR2(200)
MARKED_CORRUPT              NOT NULL VARCHAR2(10)
CHECK_TIMESTAMP              NOT NULL DATE
FIX_TIMESTAMP                   DATE
REFORMAT_TIMESTAMP                 DATE

-->来定位schema object中的坏块情况

SQL> set serveroutput on
DECLARE num_corrupt INT;
SQL>  2 BEGIN
3  num_corrupt := 0;
4  DBMS_REPAIR.CHECK_OBJECT (
5  SCHEMA_NAME => '&schema_name',
6  OBJECT_NAME => '&object_name',
7  REPAIR_TABLE_NAME => 'REPAIR_TABLE',
8  corrupt_count => num_corrupt);
9  DBMS_OUTPUT.PUT_LINE('number corrupt: ' || TO_CHAR (num_corrupt));
10 END;
11 /
Enter value for schema_name: TSTAPPO2
old  5:  SCHEMA_NAME => '&schema_name',
new  5:  SCHEMA_NAME => 'TSTAPPO2',
Enter value for object_name: APP_CONTROL
old  6:  OBJECT_NAME => '&object_name',
new  6:  OBJECT_NAME => 'APP_CONTROL',
number corrupt: 1

PL/SQL procedure successfully completed.

-->查询生成的坏块表,里面有相应的记录。指向的坏块确实是日志中指定的。

SQL> select BLOCK_ID, CORRUPT_TYPE, CORRUPT_DESCRIPTION
2 from REPAIR_TABLE;

BLOCK_ID CORRUPT_TYPE
---------- ------------
CORRUPT_DESCRIPTION
--------------------------------------------------------------------------------
570051     6148

-->修复坏块

SQL> DECLARE num_fix INT;
2 BEGIN
3  num_fix := 0;
4  DBMS_REPAIR.FIX_CORRUPT_BLOCKS (
5  SCHEMA_NAME => '&schema_name',
6  OBJECT_NAME=> '&object_name',
7  OBJECT_TYPE => dbms_repair.table_object,
8  REPAIR_TABLE_NAME => 'REPAIR_TABLE',
9  FIX_COUNT=> num_fix);
10  DBMS_OUTPUT.PUT_LINE('num fix: ' || to_char(num_fix));
11 END;
12 /
Enter value for schema_name: TSTAPPO2
old  5:  SCHEMA_NAME => '&schema_name',
new  5:  SCHEMA_NAME => 'TSTAPPO2',
Enter value for object_name: APP_CONTROL
old  6:  OBJECT_NAME=> '&object_name',
new  6:  OBJECT_NAME=> 'APP_CONTROL',
num fix: 0

PL/SQL procedure successfully completed.

-->对于坏块的操作都能够skip

SQL> BEGIN
DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
2  3  SCHEMA_NAME => '&schema_name',
4  OBJECT_NAME => '&object_name',
5  OBJECT_TYPE => dbms_repair.table_object,
6  FLAGS => dbms_repair.SKIP_FLAG);
7 END;
8 /
Enter value for schema_name: TSTAPPO2
old  3:  SCHEMA_NAME => '&schema_name',
new  3:  SCHEMA_NAME => 'TSTAPPO2',
Enter value for object_name: APP_CONTROL
old  4:  OBJECT_NAME => '&object_name',
new  4:  OBJECT_NAME => 'APP_CONTROL',

PL/SQL procedure successfully completed.

-->再次运行以上的sql,尝试。

SQL> l
SELECT  xxxxxFROM  APP_CONTROL
AC,  APP_BILL_PROC BL WHERE APP.DATA_GROUP IS NOT NULL AND BL.PROCESS_ID = APP.NXT_PGM_NAME AND 
APP.FILE_STATUS IN ('RD', 'IU', 'CN')
GROUP BY xxxxxxx  ;

转载自jeanron100

最后修改:2022 年 02 月 27 日
如果觉得我的文章对你有用,请随意赞赏