The views expressed on this blog are my own and do not necessarily reflect the views of Oracle

April 26, 2010

kfed - ASM metadata editor


kfed is an undocumented ASM utility that can be used to read and modify ASM metadata. It does not depend on the mount state of the ASM instance or the ASM disk group, so it can be used with the ASM instance down and on a disk group that does not mount. In fact, kfed can be used on ASM disks with corrupt ASM metadata.

Oracle Support may use kfed to correct some problems with damaged ASM metadata. You may want to use kfed to read from ASM disks - e.g. for educational purposes or to check the ASM metadata.

If you don't see the kfed binary in your $ORACLE_HOME/bin directory (in the ASM/Grid Infrastructure home), you can build it as follows:

$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins* ikfed

Check if ASM disk header is healthy

The following is an example of using kfed to read the contents of the ASM disk header from ASMLIB disk DISK4 (I left out the bits of no immediate interest).

$ kfed read /dev/oracleasm/disks/DISK4
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
...
kfbh.check: 1539641569 ; 0x00c: 0x5bc510e1
...
kfdhdb.driver.provstr: ORCLDISKDISK4 ; 0x000: length=13
...
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: PLAY0 ; 0x028: length=5
kfdhdb.grpname: PLAY ; 0x048: length=4
kfdhdb.fgname: P1 ; 0x068: length=2
...
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 4194304 ; 0x0bc: 0x00400000
...
kfdhdb.dsksize: 1221 ; 0x0c4: 0x000004c5
...

A valid ASM disk should have kfbh.type=KFBTYP_DISKHEAD (ASM disk header).

The ASMLIB disk name should follow 'ORCLDISK' in the kfdhdb.driver.provstr field. Note that the ASMLIB disk name does not have to be the same as the ASM disk name.

Is this really the disk with ASM name PLAY0 (kfdhdb.dskname) and disk number 0 (kfdhdb.dsknum=0) in disk group PLAY (kfdhdb.grpname)? If you are not sure, check the ASM alert log entries around the time of the last successful mount of disk group PLAY.

Header status says MEMBER for this disk (kfdhdb.hdrsts=KFDHDR_MEMBER). That is what we want to see.

The ASM metadata block size is 4 KB (kfdhdb.blksize=4096), the allocation unit size is 4 MB for this disk (kfdhdb.ausize=4194304) and the disk size is 1221 AUs, i.e. 4884 MB (1221 x 4 MB). Is that what you think it should be? Is that what you see at the OS level for that device?
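
The size arithmetic is easy to script. The following is a minimal sketch, not a kfed feature: the function name asm_disk_size_mb is made up for illustration, and it assumes you pass in the kfdhdb.ausize (bytes) and kfdhdb.dsksize (AUs) values that kfed reported.

```shell
# Hypothetical helper (not part of kfed): convert a disk size in AUs
# to megabytes.
#   $1 = kfdhdb.ausize in bytes
#   $2 = kfdhdb.dsksize in allocation units
asm_disk_size_mb() {
  echo $(( $2 * $1 / 1048576 ))
}

# Values from the disk header shown above: 4 MB AUs, 1221 AUs
asm_disk_size_mb 4194304 1221
```

Compare the result with the device size you see at the OS level.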

If you see kfbh.type=KFBTYP_INVALID in the disk header on a disk you believe belongs to an ASM disk group, that may indicate that the ASM disk header is damaged. But don't jump to conclusions. Are you looking at the right disk? Is this the right disk partition? Can you access that disk via some other name - in a multipath setup? If you are not sure, or if the disk is in fact damaged, log an SR with Oracle Support to check it out.
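
If you have many disks to look at, the two fields discussed above (kfbh.type and kfdhdb.hdrsts) can be checked with a small filter. This is only a sketch under my own assumptions: the function names kfed_symbol and check_disk_header are hypothetical, and the filter relies on the symbolic name being the last word on the line, as in the output shown above. It says nothing about the rest of the block contents.

```shell
# Hypothetical helper (not part of kfed): read 'kfed read' output on stdin
# and print the symbolic value of the named field (the last word on the line).
kfed_symbol() {
  awk -v f="$1:" '$1 == f { print $NF }'
}

# Print OK only if the block is a disk header with MEMBER status;
# otherwise print what was found so that it can be looked at by hand.
check_disk_header() {
  hdr=$(cat)
  btype=$(printf '%s\n' "$hdr" | kfed_symbol kfbh.type)
  sts=$(printf '%s\n' "$hdr" | kfed_symbol kfdhdb.hdrsts)
  if [ "$btype" = "KFBTYP_DISKHEAD" ] && [ "$sts" = "KFDHDR_MEMBER" ]; then
    echo "OK"
  else
    echo "CHECK: type=$btype hdrsts=$sts"
  fi
}

# Usage (assumes kfed is in your PATH):
#   kfed read /dev/oracleasm/disks/DISK4 | check_disk_header
```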

I should say that the ASM disk header may look fine but in fact be corrupt. For example, the block checksum (kfbh.check) could be wrong, in which case it would need to be corrected. Please log an SR with Oracle Support to assist with that problem.

Note that kfed was used with no additional options. Of note is that no allocation unit number and no block number were specified, which means that default values would be used (0 for both). The command used was:

$ kfed read /dev/oracleasm/disks/DISK4

That is equivalent to:

$ kfed read /dev/oracleasm/disks/DISK4 aun=0 blkn=0

Also note that for reading block 0 or allocation unit 0, there is no need to specify the allocation unit size or the block size.

Check if allocation unit 0 has expected ASM metadata

Read from the same disk, but look at different ASM metadata blocks.

Check block 1:

$ kfed read /dev/oracleasm/disks/DISK4 blkn=1 | grep KFBTYP

Expected result is KFBTYP_FREESPC (free space table).

Check the rest of the blocks in allocation unit 0:

ausize=`kfed read /dev/oracleasm/disks/DISK4 | grep ausize | tr -s ' ' | cut -d' ' -f2`
blksize=`kfed read /dev/oracleasm/disks/DISK4 | grep blksize | tr -s ' ' | cut -d' ' -f2`
let n=$ausize/$blksize

for (( i=2; i<$n; i++ ))
do
  kfed read /dev/oracleasm/disks/DISK4 blkn=$i | grep KFBTYP
done


Expected result for all those blocks is KFBTYP_ALLOCTBL (allocation table).

Again, if you see kfbh.type=KFBTYP_INVALID in any of the results, that may indicate a corrupted ASM metadata block. My advice is the same as above - log an SR with Oracle Support.
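
With hundreds of blocks per allocation unit it is easier to print only the blocks that do not match the expected layout. The following is a sketch of that idea; the function names are hypothetical and the expected types are simply the ones listed above for allocation unit 0 (block 0 disk header, block 1 free space table, the rest allocation table).

```shell
# Hypothetical helper: expected block type for a block number in AU 0.
au0_expected_type() {
  case "$1" in
    0) echo KFBTYP_DISKHEAD ;;
    1) echo KFBTYP_FREESPC ;;
    *) echo KFBTYP_ALLOCTBL ;;
  esac
}

# Read "block-number block-type" pairs on stdin and print only the
# blocks whose type differs from the expected one.
au0_report_mismatches() {
  while read -r blkn btype; do
    want=$(au0_expected_type "$blkn")
    [ "$btype" = "$want" ] || echo "block $blkn: found $btype, expected $want"
  done
}

# Usage (assumes kfed is in your PATH and n is the number of blocks per AU):
#   i=0
#   while [ "$i" -lt "$n" ]; do
#     t=$(kfed read /dev/oracleasm/disks/DISK4 blkn=$i | awk '$1 == "kfbh.type:" { print $NF }')
#     echo "$i $t"
#     i=$((i + 1))
#   done | au0_report_mismatches
```

No output means every block in allocation unit 0 has the expected type.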

I should point out that with the above we only looked at the expected ASM metadata block types. We did not look at the actual metadata block contents. Some ASM metadata block corruptions are in the block contents themselves. Such corruptions are only detected when ASM reads the corrupt block, in which case the ASM instance will report an ORA-15196 error. You should always log an SR with Oracle Support if you are unfortunate enough to encounter that error.

Check if allocation unit 1 has expected ASM metadata

Read from the same disk, this time looking at all but the last two blocks in allocation unit 1.

ausize=`kfed read /dev/oracleasm/disks/DISK4 | grep ausize | tr -s ' ' | cut -d' ' -f2`
blksize=`kfed read /dev/oracleasm/disks/DISK4 | grep blksize | tr -s ' ' | cut -d' ' -f2`
let n=$ausize/$blksize-2

for (( i=0; i<$n; i++ ))
do
  kfed read /dev/oracleasm/disks/DISK4 ausz=$ausize aun=1 blkn=$i | grep KFBTYP
done


Expected result for all these blocks is one of the following: KFBTYP_PST_META, KFBTYP_PST_DTA or KFBTYP_PST_NONE (partnership and status table blocks).

Read the second last block from allocation unit 1.

$ kfed read /dev/oracleasm/disks/DISK4 ausz=$ausize aun=1 blkn=$n | grep KFBTYP

Expected result in ASM versions before 11.1.0.7 is one of the PST block types.

Expected result in ASM version 11.1.0.7 and later is KFBTYP_DISKHEAD. Yes, that is the backup copy of the ASM disk header. Indeed, if the only problem with the disk is a damaged disk header block, this can be easily repaired. But not on your own - please log an SR with Oracle Support to assist with this kind of problem.
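
The number of that second last block depends on the AU size and the metadata block size. A minimal sketch of the arithmetic follows; the function name backup_header_blkn is hypothetical.

```shell
# Hypothetical helper: block number of the second last block in an AU.
# In ASM 11.1.0.7 and later that block of AU 1 holds the disk header backup.
#   $1 = AU size in bytes
#   $2 = ASM metadata block size in bytes
backup_header_blkn() {
  echo $(( $1 / $2 - 2 ))
}

backup_header_blkn 1048576 4096   # 1 MB AU, 256 blocks: prints 254
backup_header_blkn 4194304 4096   # 4 MB AU, 1024 blocks: prints 1022
```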

Finally, read the last block in allocation unit 1.

$ let n=$n+1
$ kfed read /dev/oracleasm/disks/DISK4 ausz=$ausize aun=1 blkn=$n | grep KFBTYP

Expected result is KFBTYP_HBEAT (heartbeat block).

Note that on the kfed command line, in addition to the allocation unit number (aun), I had to specify the allocation unit size (ausz). The ausz parameter is not required for the default allocation unit size (1 MB) or when reading from allocation unit 0.
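
That rule can be captured in a small helper that builds the argument list. This is a sketch only; kfed_read_args is a hypothetical name, and the function merely prints the aun, blkn and (when needed) ausz arguments used in the examples above.

```shell
# Hypothetical helper: print the kfed arguments for reading one block,
# adding ausz only when it is needed (non-default AU size and AU other than 0).
#   $1 = AU size in bytes, $2 = AU number, $3 = block number
kfed_read_args() {
  if [ "$2" -eq 0 ] || [ "$1" -eq 1048576 ]; then
    echo "aun=$2 blkn=$3"
  else
    echo "ausz=$1 aun=$2 blkn=$3"
  fi
}

# Usage (assumes kfed is in your PATH):
#   kfed read /dev/oracleasm/disks/DISK4 $(kfed_read_args 4194304 1 1022)
```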

15 comments:

  1. Where is the backup header located, which can be restored in case the disk header is corrupted?

    1. A backup copy of the ASM disk header is in the second last block of allocation unit 1. If the allocation unit (AU) size is 1 MB there will be 256 blocks per AU. In that case the copy of the disk header will be in block 254 (note that blocks go from 0 to 255) of AU 1. The command to read that block would be:

      $ kfed read <disk> aun=1 blkn=254

      If the AU size is 4 MB, there will be 1024 blocks per AU. In that case the copy of the disk header will be in block 1022 (blocks now go from 0 to 1023) of AU 1. The command to read that block would be:

      $ kfed read <disk> ausz=4194304 aun=1 blkn=1022

      Note that this time I had to specify the AU size in the kfed command, as the AU had a non-default size.

      While we are on this topic, if the only problem is disk header corruption, that can easily be repaired with the 'kfed repair' command. I didn't give an example for that in the original post as you should really consult Oracle Support for assistance with that kind of problem.

      Cheers,
      Bane

    2. Hi Bane,

      thanks for the kfed repair command :-) Just dug me out of a hole.

      We tried using the kfed merge option based on other postings and in the end ran the repair command. This brought the ASM diskgroup online and I was able to bring up 5 of the 7 development databases. Still working on the other 2.

      Cheers

      Dave - Reading, UK

  2. Hi Dave,

    Good to hear you found the repair option useful. Let me know if you get stuck or if you have any questions on this.

    Cheers,
    Bane

  3. Hi Bane,
    I was looking on Google for ASM material to understand it better and found your article. It's really very informative. I can easily understand each and every step as it has been explained very clearly.

    Thanks for your article.

    Best Regards,
    Ramakant

    1. Cool! Thanks for your kind words Ramakant.
      Cheers,
      Bane

  4. Bane,

    A few weeks back we had a production issue where blocks 40 and 41 on one of the ASM disks were corrupted, in the area of the ASM allocation table. During a rebalance the corruption was detected and the disk group went offline. We opened an SR with Oracle and they reviewed logs and asked for DMP files, IMG and kfed output. I have two questions. First, what can you see in kfed output that would tell you whether you can use kfed repair to resolve your problem? Second, have you ever come across a case where something has zeroed consecutive blocks and none of the vendors involved can find what caused the issue? We finally resolved the issue, but Oracle Support made us recreate the disk group and recover the database. Luckily for us we had a standby and good backups. If possible I would like to pick your brain some more about our issues, since no one has been able to give us a good root cause analysis.

    1. Hi Javier,

      There are two types of repairs we can do with kfed. The first one is the actual 'kfed repair' command, which fixes a corrupt or lost disk header block only.

      The other type of repair is manual editing of the damaged block. We basically read the damaged block and a couple of blocks around it (as the damaged block may well be all zeroed out, as it was in your case), and then see if we can repair or reconstruct the block. This type of repair is of course more challenging - you need to understand the structure of the block, you need to know what data it had, and finally you need to understand whether it can be manually repaired or not.

      Now to the specific metadata block in your case - the allocation table block. It's much easier to fix a partially corrupt block than a completely zeroed out block. With completely zeroed blocks we can recreate them as empty blocks. That way they have incorrect contents, but they are valid as far as the structure goes. That allows us to mount the disk group and make sure it stays mounted. We would then attempt disk group repair (ALTER DISKGROUP REPAIR) to see if ASM can fix this (using the file directory data that we hope is not damaged). If that works we are good to go. If not, we can attempt to find what should have been in those blocks (by querying X$KFFXP, which again needs a good file directory). With that info we then attempt to recreate the blocks...

      Now, none of this is trivial and even if you know exactly what you are doing, it can take hours. The biggest challenge may be finding the person that can do this for you. And finally, there is no guarantee that the problem can be fixed. All this is on a best-effort basis, as patching is not a supported method of data recovery. I know customers expect this type of service as a matter of course, but in reality this is out of scope for Oracle Support. The only time you can insist on Oracle fixing this (or at least attempting to fix it) is when it's clear that the problem is caused by an Oracle bug...

      Back to the root cause question. Yes, I have seen this type of problem, but it is rare that we can tell with 100% confidence what caused the problem. The reason we claim it's not Oracle/ASM is because there is no routine/function that writes zeros to ASM metadata blocks. When we write empty blocks, sure they are empty, but they are formatted - they have the header/tail and the checksum - they are never zeroed out blocks. That is our justification that this type of change came from outside.

      And yes, there is no substitute for backups. Unfortunately some people still don't appreciate that...

      Cheers,
      Bane

  5. Hi Bane,

    Is there a way to use kfed to read the entire AU at once ? (not individual blocks).

    regards,
    VK

    1. Hi VK,
      No. That's why I used shell scripting to read multiple blocks in my examples.
      Cheers,
      Bane

  6. Okay, thanks Bane for an instant reply. I think I can give it a try using AMDU then.

    VK

  7. Hi Bane,

    I am getting some conflicting information when I query the v$asm_disk view. The header status for all 114 disks should be "Member". But, with the exception of three disks, all other disks appear as "Candidates".

    There are no errors in either the RDBMS or the ASM alert logs.

    The database is working fine for the time being.
    Is it a case of disk header corruption? or simple misreporting? And more importantly, how can I fix the problem?

    The details are as follows

    O/S: AIX 5.3.12.1
    Database: 11.2.0.3

    select DISK_NUMBER,HEADER_STATUS,substr(PATH,1,20),label
    from v$asm_disk;

    *Output:*

    17 CANDIDATE /dev/rhdisk10
    18 CANDIDATE /dev/rhdisk100
    19 CANDIDATE /dev/rhdisk101
    20 CANDIDATE /dev/rhdisk102
    21 CANDIDATE /dev/rhdisk103
    22 CANDIDATE /dev/rhdisk104
    23 CANDIDATE /dev/rhdisk105
    24 CANDIDATE /dev/rhdisk106
    ...
    112 MEMBER /dev/rhdisk147
    113 MEMBER /dev/rhdisk148
    114 MEMBER /dev/rhdisk149

    1. Hmm, interesting...

      To get complete information about your setup, please connect to ASM with 'sqlplus / as sysasm' and run the following:
      spool /tmp/asm_gv.html
      set markup HTML on
      break on INST_ID on GROUP_NUMBER
      alter session set NLS_DATE_FORMAT='DD-MON-YYYY HH24:MI:SS';
      select SYSDATE "Date and Time" from DUAL;
      select * from GV$ASM_OPERATION order by 1;
      select * from V$ASM_DISKGROUP order by 1, 2;
      select * from V$ASM_DISK order by 1, 2, 3;
      select * from V$ASM_ATTRIBUTE where NAME not like 'template%' order by 1;
      select * from V$VERSION where BANNER like '%Database%' order by 1;
      select * from V$ASM_CLIENT order by 1, 2;
      show parameter asm
      show parameter cluster
      show parameter instance
      show parameter spfile
      show sga
      spool off
      exit

      To determine if the problem is with disk headers, please do the following:
      kfed read /dev/rhdisk10 blkn=0 > /tmp/rhdisk10.kfed
      kfed read /dev/rhdisk10 blkn=1 >> /tmp/rhdisk10.kfed
      kfed read /dev/rhdisk100 blkn=0 > /tmp/rhdisk100.kfed
      kfed read /dev/rhdisk100 blkn=1 >> /tmp/rhdisk100.kfed
      kfed read /dev/rhdisk147 blkn=0 > /tmp/rhdisk147.kfed
      kfed read /dev/rhdisk147 blkn=1 >> /tmp/rhdisk147.kfed

      That will show me block 0 (disk header) and block 1 (free space table) for those 3 disks. If the problem is with the disk header block only, this will be easy to fix.

      Now send me /tmp/asm_gv.html, /tmp/rhdisk10.kfed, /tmp/rhdisk100.kfed and /tmp/rhdisk147.kfed (email to bane dot radulovic at gmail dot com). Once we sort it out, we can post the solution here.

      Cheers,
      Bane

  8. Hi Bane,

    Went through your article and must tell you how good it is... I ran into issues with disk group corruption and had to consult Oracle Support. They told us to use KFED but I never really got what they were up to, until today..

    Keep up.

    Regards
    Hardik
    http://handsonoracle.blogspot.in/
