Benjamin Zhang

目前就职于中科院计算机网络信息中心,主要兴趣集中在数据库,操作系统,中间件。

【RMAN】数据恢复顾问(Data Recovery Advisor)

01 Feb 2019 » RMAN

数据恢复顾问(Data Recovery Advisor)—Oracle 11gR2新特性

Data Recovery Advisor是一个帮助处理数据故障的工具,Data Recovery Advisor使诊断和修复数据故障很容易,并降低MTTR。Data Recovery Advisor有助于在数据库不能使用之前,主动地检测和修复数据库故障。并以报表的形式提供故障的影响评估。它还给出最好的修复选项,并把决定权留给你。

它帮助诊断以下类型的故障:

  • 不能访问的数据字段和控制文件:
  • 数据文件比其他数据库文件旧
  • 物理讹误
  • I/O故障,如硬件错误或操作系统故障

如何使用Data Recovery Advisor?

0.数据文件和数据块的定位

---查出test表所在的数据文件和开始的块id以及块数量
ARP@prod> select a.file_id,a.block_id,a.blocks,b.name from dba_extents a,v$datafile b where a.file_id=b.file# and a.owner='ARP' and a.segment_name='TEST';

   FILE_ID   BLOCK_ID     BLOCKS NAME
---------- ---------- ---------- --------------------
         5        128          8 /u01/app/oradata/pro
                                 d/apps_arp01.dbf
---查出表中记录所在的块
ARP@prod> select distinct dbms_rowid.rowid_block_number(rowid) from arp.test;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                 132
                                 135
---列行对应的数据块
ARP@prod> col LOCATION format a10;
ARP@prod> set linesize 120
ARP@prod> select id,rowid,dbms_rowid.rowid_relative_fno(rowid) || '_' ||dbms_rowid.rowid_block_number(rowid) || '_' || dbms_rowid.rowid_row_number(rowid) location from arp.test;

        ID ROWID              LOCATION
---------- ------------------ ----------
         1 AAADX3AAFAAAACEAAA 5_132_0
         2 AAADX3AAFAAAACEAAB 5_132_1
         3 AAADX3AAFAAAACEAAC 5_132_2
         4 AAADX3AAFAAAACEAAD 5_132_3
         5 AAADX3AAFAAAACEAAE 5_132_4
         6 AAADX3AAFAAAACEAAF 5_132_5
         7 AAADX3AAFAAAACEAAG 5_132_6
         8 AAADX3AAFAAAACEAAH 5_132_7
         9 AAADX3AAFAAAACHAAA 5_135_0

9 rows selected.
其中可以通过ROWID包通过这个ROWID伪列转换出来的值为5_132_0,代表该行在数据文件5的第132个数据块的第0行。

1.在模拟环境中,模拟坏块

在apps_arp01.dbf的数据文件中,损坏132块。

[oracle@db1 ~]$ dd of=/u01/app/oradata/prod/apps_arp01.dbf bs=8192 conv=notrunc seek=132 <<EOF 
zhangshengdonghahahahahah
EOF

记录了0+1 的读入
记录了0+1 的写出
26字节(26 B)已复制,0.000129843 秒,200 kB/秒
  • dbv查看数据库的坏块
[oracle@db1 ~]$ dbv file=/u01/app/oradata/prod/apps_arp01.dbf blocksize=8192

DBVERIFY: Release 11.2.0.4.0 - Production on Fri Feb 1 15:40:24 2019

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oradata/prod/apps_arp01.dbf
Page 10 is marked corrupt
Corrupt block relative dba: 0x0140000a (file 5, block 10)
Bad header found during dbv: 
Data in bad block:
 type: 122 format: 0 rdba: 0x65687367
 last change scn: 0x676e.6f64676e seq: 0x68 flg: 0x61
 spare1: 0x61 spare2: 0x6e spare3: 0x6168
 consistency value in tail: 0x35b71e01
 check value in block header: 0x6168
 block checksum disabled

Page 11 is marked corrupt
Corrupt block relative dba: 0x0140000b (file 5, block 11)
Bad header found during dbv: 
Data in bad block:
 type: 122 format: 0 rdba: 0x65687367
 last change scn: 0x676e.6f64676e seq: 0x68 flg: 0x61
 spare1: 0x61 spare2: 0x6e spare3: 0x6168
 consistency value in tail: 0x35b91e01
 check value in block header: 0x6168
 block checksum disabled

Page 130 is marked corrupt
Corrupt block relative dba: 0x01400082 (file 5, block 130)
Bad header found during dbv: 
Data in bad block:
 type: 115 format: 4 rdba: 0x666b6c64
 last change scn: 0x6a66.6c64736a seq: 0x6b flg: 0x6c
 spare1: 0x66 spare2: 0x73 spare3: 0x6466
 consistency value in tail: 0x8b372301
 check value in block header: 0x7364
 computed block checksum: 0xcfaf

Page 131 is marked corrupt
Corrupt block relative dba: 0x01400083 (file 5, block 131)
Bad header found during dbv: 
Data in bad block:
 type: 115 format: 4 rdba: 0x666b6c64
 last change scn: 0x6a66.6c64736a seq: 0x6b flg: 0x6c
 spare1: 0x66 spare2: 0x73 spare3: 0x6466
 consistency value in tail: 0x8b370601
 check value in block header: 0x7364
 computed block checksum: 0xaf61

Page 132 is marked corrupt
Corrupt block relative dba: 0x01400084 (file 5, block 132)
Bad header found during dbv: 
Data in bad block:
 type: 100 format: 4 rdba: 0x6c6b6473
 last change scn: 0x6666.6666666a seq: 0x66 flg: 0x66
 spare1: 0x66 spare2: 0x6a spare3: 0x6666
 consistency value in tail: 0x8b3c0601
 check value in block header: 0x6666
 computed block checksum: 0x8e47

Page 133 is marked corrupt
Corrupt block relative dba: 0x01400085 (file 5, block 133)
Bad header found during dbv: 
Data in bad block:
 type: 115 format: 4 rdba: 0x666b6c64
 last change scn: 0x6a66.6c64736a seq: 0x6b flg: 0x6c
 spare1: 0x66 spare2: 0x73 spare3: 0x6466
 consistency value in tail: 0x8b370601
 check value in block header: 0x7364
 computed block checksum: 0xaf61



DBVERIFY - Verification complete

Total Pages Examined         : 64000
Total Pages Processed (Data) : 2
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 127
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 63865
Total Pages Marked Corrupt   : 6
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 691925 (0.691925)

可以看到Page 132 is marked corrupt已经被损坏。其中一共有Total Pages Marked Corrupt : 6。6块被损坏。

2 使用数据恢复顾问

  1. 验证数据是否有问题

    使用了手动触发检查点alter system checkpoint; ,并重启数据库。查看arp.test表的时候出现如下问题:

    SYS@prod> select * from arp.test;
    select * from arp.test
                      *
    ERROR at line 1:
    ORA-01578: ORACLE data block corrupted (file # 5, block # 130)
    ORA-01110: data file 5: '/u01/app/oradata/prod/apps_arp01.dbf'
    
  2. 使用数据恢复顾问

    其核心命令如下:

    • list failure;
    • advise failure;
    • repair failure preview;
    • repair failure;

    List failure 命令直接列出目前数据库的故障

       
    RMAN> list failure;
       
    using target database control file instead of recovery catalog
    List of Database Failures
    =========================
       
    Failure ID Priority Status    Time Detected Summary
    ---------- -------- --------- ------------- -------
    102        HIGH     OPEN      01-FEB-19     Datafile 5: '/u01/app/oradata/prod/apps_arp01.dbf' contains one or more corrupt blocks
    

    advise failure命令列出修复选项

    RMAN> advise failure;
       
    List of Database Failures
    =========================
       
    Failure ID Priority Status    Time Detected Summary
    ---------- -------- --------- ------------- -------
    102        HIGH     OPEN      01-FEB-19     Datafile 5: '/u01/app/oradata/prod/apps_arp01.dbf' contains one or more corrupt blocks
       
    analyzing automatic repair options; this may take some time
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=8 device type=DISK
    analyzing automatic repair options complete
       
    Mandatory Manual Actions
    ========================
    no manual actions available
       
    Optional Manual Actions
    =======================
    no manual actions available
       
    Automated Repair Options
    ========================
    Option Repair Description
    ------ ------------------
    1      Recover multiple corrupt blocks in datafile 5  
      Strategy: The repair includes complete media recovery with no data loss
      Repair script: /u01/app/diag/rdbms/prod/prod/hm/reco_56363893.hm
    

    预览修复的策略(没有实际运行)

    RMAN> repair failure preview;
       
    Strategy: The repair includes complete media recovery with no data loss
    Repair script: /u01/app/diag/rdbms/prod/prod/hm/reco_56363893.hm
       
    contents of repair script:
       # block media recovery for multiple blocks
       recover datafile 5 block 130 to 133;
       
    

    开始修复坏块

    RMAN> repair failure;
       
    Strategy: The repair includes complete media recovery with no data loss
    Repair script: /u01/app/diag/rdbms/prod/prod/hm/reco_56363893.hm
       
    contents of repair script:
       # block media recovery for multiple blocks
       recover datafile 5 block 130 to 133;
       
    Do you really want to execute the above repair (enter YES or NO)? YES
    executing repair script
       
    Starting recover at 01-FEB-19
    using channel ORA_DISK_1
       
    channel ORA_DISK_1: restoring block(s)
    channel ORA_DISK_1: specifying block(s) to restore from backup set
    restoring blocks of datafile 00005
    channel ORA_DISK_1: reading from backup piece /u01/app/flash_recovery_area/PROD/backupset/2019_02_01/o1_mf_nnndf_TAG20190201T144514_g57tqtpz_.bkp
    channel ORA_DISK_1: piece handle=/u01/app/flash_recovery_area/PROD/backupset/2019_02_01/o1_mf_nnndf_TAG20190201T144514_g57tqtpz_.bkp tag=TAG20190201T144514
    channel ORA_DISK_1: restored block(s) from backup piece 1
    channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
       
    starting media recovery
    media recovery complete, elapsed time: 00:00:03
       
    Finished recover at 01-FEB-19
    repair failure complete
    

    修复完成后,查询之前的表

    SYS@prod> select * from arp.test;
       
            ID
    ----------
             1
             2
             3
             4
             5
             6
             7
             8
             9
       
    9 rows selected.