- This article is a complete copy from: http://www.toadworld.com/platforms/oracle/w/wiki/10944.data-block-recovering-process-using-normal-redundancy.aspx
Overview
ASM always reads data from the primary AU. If the primary AU is corrupted, ASM reads the secondary AU instead. If the secondary AU is valid, ASM tries to overwrite the corrupted primary AU with the contents of the secondary AU. If the overwrite succeeds, that AU remains the primary AU. If the corrupted primary AU can't be overwritten, ASM tries to write the data to a new AU at another location on the disk. If that write operation succeeds, the new AU becomes the primary AU.
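The read-and-repair sequence described above can be tried out safely with a toy model in which two ordinary files stand in for the mirrored disks. This is only a sketch under loud assumptions: the function names `block_is_zero`, `read_with_repair` and `demo` are made up for illustration, an all-zero block is treated as "corrupt" (Oracle's real corruption checks are more involved), and the relocation of an AU that cannot be overwritten is not modelled at all.

```shell
# Toy model only: plain files play the role of ASM disks. Nothing here talks to ASM.
BS=8192   # block size in bytes, matching the 8k database block of the test case

# Succeeds when block $2 of file $1 contains nothing but zero bytes.
# ASSUMPTION: "all zero" stands in for "corrupt" in this model.
block_is_zero() {
    [ "$(dd if="$1" bs="$BS" skip="$2" count=1 2>/dev/null | tr -d '\0' | wc -c)" -eq 0 ]
}

# Read block $3: try the primary copy ($1) first, fall back to the mirror ($2),
# and overwrite the bad primary block with the good mirror block.
read_with_repair() {
    local primary="$1"
    local mirror="$2"
    local blk="$3"
    if ! block_is_zero "$primary" "$blk"; then
        echo "primary ok"
    elif ! block_is_zero "$mirror" "$blk"; then
        # conv=notrunc keeps the rest of the primary file intact.
        dd if="$mirror" of="$primary" bs="$BS" skip="$blk" seek="$blk" \
           count=1 conv=notrunc 2>/dev/null
        echo "repaired from mirror"
    else
        echo "both copies corrupt"
        return 1
    fi
}

# Build two identical 4-block "disks", zero one block of the primary, then read it twice.
demo() {
    local dir
    dir=$(mktemp -d)
    head -c $(( 4 * BS )) /dev/zero | tr '\0' 'A' > "$dir/disk1"
    cp "$dir/disk1" "$dir/disk2"
    dd if=/dev/zero of="$dir/disk2" bs="$BS" seek=2 count=1 conv=notrunc 2>/dev/null
    read_with_repair "$dir/disk2" "$dir/disk1" 2   # first read triggers the repair
    read_with_repair "$dir/disk2" "$dir/disk1" 2   # second read finds a healthy primary
    cmp -s "$dir/disk1" "$dir/disk2" && echo "copies identical"
    rm -rf "$dir"
}

demo
```

The first read reports a repair, the second finds the primary healthy again, which mirrors the order of events in the overview: fall back to the secondary copy, then rewrite the primary in place.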
Prepare test case
Checking the OS location of the ASM AUs

    SQL> select PXN_KFFXP,   -- physical extent number
                XNUM_KFFXP,  -- virtual extent number
                DISK_KFFXP,  -- disk number
                AU_KFFXP,    -- allocation unit number
                decode(LXN_KFFXP,0,'Primary',1,'Secondary','header metadata') "AU type"
           from X$KFFXP
          where NUMBER_KFFXP=256  -- ASM file 256
            and GROUP_KFFXP=4     -- group number 4
          order by 1;

     PXN_KFFXP XNUM_KFFXP DISK_KFFXP   AU_KFFXP AU type
    ---------- ---------- ---------- ---------- ---------------
             0          0          0        144 Primary
             1          0          1        144 Secondary
             2          1          1        145 Primary
             3          1          0        145 Secondary

Summary of the setup
--> Data block offset: 133
--> Database block size: 8k
--> ASM file number: 256
--> ASM DG: 4
--> AU size: 1 MByte
--> ASM disks: Disk# 0: /dev/asm_test_1G_disk1 - Disk# 1: /dev/asm_test_1G_disk2
--> ASM disk /dev/asm_test_1G_disk2 is the primary for AU 145

Reading the primary AU

    [root@grac41 Desktop]# dd if=/dev/asm_test_1G_disk2 bs=8k count=1 skip=18565 | od -a
    0017760 stx   A stx  bs   A   S   M   -   T   E   S   T soh ack   i   n

Reading the secondary AU

    [root@grac41 Desktop]# dd if=/dev/asm_test_1G_disk1 bs=8k count=1 skip=18565 | od -a
    0017760 stx   A stx  bs   A   S   M   -   T   E   S   T soh ack   i   n

Erasing the 8k block in our primary AU and verifying the deletion

    # dd if=/dev/zero of=/dev/asm_test_1G_disk2 bs=8k count=1 seek=18565
    1+0 records in
    1+0 records out
    8192 bytes (8.2 kB) copied, 0.0345691 s, 237 kB/s

    # dd if=/dev/asm_test_1G_disk2 bs=8k count=1 skip=18565 | od -a
    0000000 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
    *
    1+0 records in
    1+0 records out

    SQL> select * from test_tab;
             N NAME
    ---------- ----------------
             1 ASM-TEST

--> Data is still valid. Has the primary block already been fixed by ASM?

    [root@grac41 Desktop]# dd if=/dev/zero of=/dev/asm_test_1G_disk2 bs=8k count=1 seek=18565
    1+0 records in
    1+0 records out
    8192 bytes (8.2 kB) copied, 0.0279776 s, 293 kB/s

--> Block is still corrupted

Flushing the buffer cache and monitoring the database alert.log

    SQL> alter system flush buffer_cache;
    System altered.

    SQL> select * from test_tab;
             N NAME
    ---------- ----------------
             1 ASM-TEST

Checking alert.log

    Mon Jul 14 18:09:25 2014
    ALTER SYSTEM: Flushing buffer cache
    Mon Jul 14 18:09:42 2014
    Hex dump of (file 7, block 133) in trace file /u01/app/oracle/diag/rdbms/grac4/grac41/trace/grac41_ora_29037.trc
    Corrupt block relative dba: 0x01c00085 (file 7, block 133)
    Completely zero block found during multiblock buffer read
    Reading datafile '+TEST/grac4/datafile/test_ts.256.852905863' for corruption at rdba: 0x01c00085 (file 7, block 133)
    Read datafile mirror 'TEST_0001' (file 7, block 133) found same corrupt data (no logical check)
    Read datafile mirror 'TEST_0000' (file 7, block 133) found valid data
    Hex dump of (file 7, block 133) in trace file /u01/app/oracle/diag/rdbms/grac4/grac41/trace/grac41_ora_29037.trc
    Repaired corruption at (file 7, block 133)

--> Block fixed by reading data from secondary disk 'TEST_0000' (file 7, block 133)

Verifying the fix using dd

    [root@grac41 Desktop]# dd if=/dev/asm_test_1G_disk2 bs=8k count=1 skip=18565 | od -a
    0000200 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
    *
    0017740 nul nul nul nul nul nul nul nul nul nul nul nul nul   , soh stx
    0017760 stx   A stx  bs   A   S   M   -   T   E   S   T soh ack   i   n
    1+0 records in
    1+0 records out
    8192 bytes (8.2 kB) copied
    0020000
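The value 18565 passed to `skip=` and `seek=` in the dd commands is not explained in the test case, but it can be reproduced from the numbers in the setup (block offset 133, 8k block size, 1 MByte AU, AU 145). The following is a minimal sketch of that arithmetic, assuming each extent is exactly one AU and the file is not fine-grain striped. The function names are illustrative, not Oracle identifiers.

```shell
# Virtual extent that holds a given datafile block (compare XNUM_KFFXP).
#   $1 = block offset inside the datafile
#   $2 = AU size in bytes
#   $3 = database block size in bytes
# ASSUMPTION: one AU per extent.
extent_of_block() {
    local block="$1"
    local au_size="$2"
    local block_size="$3"
    echo $(( block / (au_size / block_size) ))
}

# dd skip/seek value, counted in database blocks from the start of the disk.
#   $1 = AU number on the disk (AU_KFFXP of the extent found above)
#   $2 = block offset inside the datafile
#   $3 = AU size in bytes
#   $4 = database block size in bytes
dd_block_offset() {
    local au="$1"
    local block="$2"
    local au_size="$3"
    local block_size="$4"
    local blocks_per_au=$(( au_size / block_size ))   # 1 MByte / 8k = 128
    local block_in_au=$(( block % blocks_per_au ))    # 133 % 128 = 5
    echo $(( au * blocks_per_au + block_in_au ))
}

extent_of_block 133 1048576 8192        # prints 1
dd_block_offset 145 133 1048576 8192    # prints 18565
```

Block 133 falls into virtual extent 1, which the X$KFFXP query maps to AU 145 on both disks, and 145 * 128 + 5 gives the 18565 used throughout the test.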
Summary
- ASM automatically repairs the primary AU if it is corrupted.
- The secondary AU is not used for normal reads unless a disk failure occurs.
- The secondary AU is used to repair the primary AU.
- If ASM can't overwrite the primary AU, it writes the new primary AU to another location on the disk.
- An entry is written to the alert log whenever such a repair takes place.
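Since each repair leaves a trace in the alert log, a quick way to spot past repairs is to search the log for the messages seen in the test case. The sketch below assumes the message texts are exactly those shown in the alert.log excerpt of this article. The function names are illustrative, and the log path must be supplied by the caller.

```shell
# Print alert-log lines related to corrupt blocks, mirror reads and repairs.
#   $1 = path to the alert log (caller-supplied; its location is not assumed here)
show_block_repairs() {
    grep -E 'Corrupt block relative dba|Read datafile mirror|Repaired corruption at' "$1"
}

# Count how many blocks were reported as repaired.
count_repaired_blocks() {
    grep -c 'Repaired corruption at' "$1"
}

# Usage (path is a placeholder):
#   show_block_repairs /path/to/alert.log
#   count_repaired_blocks /path/to/alert.log
```

A non-zero count is worth following up even though the data stayed readable, because it means one mirror copy was damaged at some point.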