The views expressed on this blog are my own and do not necessarily reflect the views of Oracle

December 25, 2011

Forcing the issue

Some ASM commands have the "force" option that allows the administrator to override a default behaviour. While some uses of the force option are perfectly safe and indeed required, some may render your disk group unusable. Let's have a closer look.

Mount force

The force option becomes a must when a disk group mount reports missing disks. This is one of the cases when it's safe and required to use the force option. Provided we are not missing too many disks, the mount force should succeed. Basically, at least one partner disk - from every disk partnership in the disk group - must be available.

Let's look at one example. I have created a normal redundancy disk group PLAY with three disks:

SQL> create diskgroup PLAY disk '/dev/ASMPLAY01','/dev/ASMPLAY02','/dev/ASMPLAY03';

Diskgroup created.

I then dismounted the disk group and deleted disk /dev/ASMPLAY01. After that, my disk group mount fails, telling me that a disk is missing:

SQL> alter diskgroup PLAY mount;
alter diskgroup PLAY mount
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "0" is missing from group number "2"

As I am missing only one disk, I should be able to mount force the disk group:

SQL> alter diskgroup PLAY mount force;

Diskgroup altered.

ASM will now do some clean up - it will offline the missing disk and eventually drop it from the disk group. These actions will be logged in the ASM alert log for all to see:

SQL> alter diskgroup PLAY mount force
NOTE: cache registered group PLAY number=2 incarn=0xb71d3834
NOTE: cache began mount (first) of group PLAY number=2 incarn=0xb71d3834
NOTE: Assigning number (2,2) to disk (/dev/ASMPLAY03)
NOTE: Assigning number (2,1) to disk (/dev/ASMPLAY02)
NOTE: process _user5733_+asm (5733) initiating offline of disk 0.3916286251 () with mask 0x7e in group 2
NOTE: checking PST: grp = 2
GMON checking disk modes for group 2 at 29 for pid 19, osid 5733
NOTE: checking PST for grp 2 done.
WARNING: Disk 0 () in group 2 mode 0x7f is now being offlined
SUCCESS: diskgroup PLAY was mounted
SUCCESS: alter diskgroup PLAY mount force
WARNING: PST-initiated drop of 1 disk(s) in group 2(.3072145460))
SQL> alter diskgroup PLAY drop disk PLAY_0000 force /* ASM SERVER */
NOTE: starting rebalance of group 2/0xb71d3834 (PLAY) at power 1
Starting background process ARB0
SUCCESS: alter diskgroup PLAY drop disk PLAY_0000 force /* ASM SERVER */
ARB0 started with pid=21, OS id=5762
NOTE: assigning ARB0 to group 2/0xb71d3834 (PLAY) with 1 parallel I/O
SUCCESS: PST-initiated drop disk in group 2(3072145460))
NOTE: F1X0 copy 1 relocating from 0:2 to 2:2 for diskgroup 2 (PLAY)
NOTE: F1X0 copy 3 relocating from 2:2 to 65534:4294967294 for diskgroup 2 (PLAY)
NOTE: Attempting voting file refresh on diskgroup PLAY
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 2/0xb71d3834 (PLAY)
SUCCESS: grp 2 disk _DROPPED_0000_PLAY going offline

Interestingly, ASM used the force option with the DROP DISK operation. More on that later.

The mount force operation would fail in a clustered environment if the ASM instance is not the first to mount the disk group.

There is a change in disk group mount force behavior in later ASM versions. A disk group mount without the force option will succeed on Exadata and Oracle Database Appliance, as long as the result leaves more than one failgroup for normal redundancy or more than two failgroups for high redundancy disk groups.

It is important to understand that this discussion only applies to normal and high redundancy disk groups. An external redundancy disk group cannot be mounted if it has missing disks.

Disk force

The CREATE DISKGROUP command does not have the force option. But if I am creating a disk group with disks that are not CANDIDATE, PROVISIONED or FORMER, I have to add force next to the disk name. Here is an example.

SQL> create diskgroup PLAY disk '/dev/ASMPLAY01','/dev/ASMPLAY02','/dev/ASMPLAY03';
create diskgroup PLAY disk '/dev/ASMPLAY01','/dev/ASMPLAY02','/dev/ASMPLAY03'
ERROR at line 1:
ORA-15018: diskgroup cannot be created
ORA-15033: disk '/dev/ASMPLAY01' belongs to diskgroup "PLAY"

SQL> select disk_number, path, header_status from v$asm_disk where path like '%PLAY%';

DISK_NUMBER PATH             HEADER_STATUS
----------- ---------------- ----------------
          0 /dev/ASMPLAY01   MEMBER
          2 /dev/ASMPLAY02   FORMER
          1 /dev/ASMPLAY03   FORMER


If I am 100% sure that it is safe to (re)use disk '/dev/ASMPLAY01', I can specify the force option for that disk in my CREATE DISKGROUP statement:

SQL> create diskgroup PLAY disk
'/dev/ASMPLAY01' force,
'/dev/ASMPLAY02',
'/dev/ASMPLAY03';

Diskgroup created.

Let me say that again. My confidence that the disk can be reused has to be 100%. Anything less is not acceptable, as I will be destroying the content on that disk and taking it away from the disk group it belongs to.

The same applies to an ADD DISK operation of the ALTER DISKGROUP command. If the disk to be added to a disk group is not CANDIDATE, PROVISIONED or FORMER, I have to specify force next to the disk name.

This behavior has an interesting and time-consuming consequence. The other day I had to recreate a disk group in an Exadata environment. As it was a full rack, I had 168 disks for that disk group. That would normally be a trivial operation with a create disk group statement like this:

create diskgroup RECO
disk 'o/*/RECO*'

For reasons beyond the scope of this post, some disks had the header marked as MEMBER and some as FORMER. So I had to compile a complete list of MEMBER disks and then specify every single disk in the CREATE DISKGROUP statement, making sure to specify FORCE next to each MEMBER disk and nothing next to any FORMER disk. The create diskgroup statement then looked like this:

create diskgroup RECO disk
'o/' FORCE,
'o/' FORCE,
'o/' FORCE,
'o/' FORCE,
'o/' FORCE,
'o/' FORCE,
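
Compiling such a list by hand is error prone, so here is a minimal Python sketch of how the statement could be generated from a list of (path, header status) pairs, for example spooled from v$asm_disk. The function name and the sample paths are made up for illustration. The sketch only applies the rule described above (FORCE for any disk that is not CANDIDATE, PROVISIONED or FORMER); it does not and cannot decide whether reusing a disk is safe.

```python
# Header states for which ASM accepts a disk without FORCE (per the text above).
REUSABLE_WITHOUT_FORCE = {"CANDIDATE", "PROVISIONED", "FORMER"}


def build_create_diskgroup(name, disks):
    """Return a CREATE DISKGROUP statement for (path, header_status) pairs.

    FORCE is appended only to disks whose header status is not reusable
    as-is; CANDIDATE, PROVISIONED and FORMER disks are listed bare.
    """
    clauses = []
    for path, header_status in disks:
        clause = "'{}'".format(path)
        if header_status.upper() not in REUSABLE_WITHOUT_FORCE:
            clause += " FORCE"
        clauses.append(clause)
    return "create diskgroup {} disk\n{};".format(name, ",\n".join(clauses))


# Hypothetical input, e.g. the result of:
#   select path, header_status from v$asm_disk;
sample = [
    ("/dev/ASMPLAY01", "MEMBER"),
    ("/dev/ASMPLAY02", "FORMER"),
    ("/dev/ASMPLAY03", "FORMER"),
]
print(build_create_diskgroup("PLAY", sample))
```

The generated statement should still be reviewed line by line before it is run, as every FORCE in it destroys the content of that disk.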

Forcing disk drop

As we have seen in the ASM alert log above, a forced disk drop is required when the disk fails or is not accessible by ASM for any reason.

When we issue an ALTER DISKGROUP ... DROP DISK command (without the FORCE option), ASM moves the data from the disk to be dropped to the remaining disks in the disk group. It then marks the disk as FORMER, updates the Partnership and Status Table (PST) and drops the disk.

If ASM cannot access the disk to be dropped for any reason, we have to use DROP DISK FORCE. In that case ASM has to rebuild the data from the partner disks of the dropped disk. Once the data redundancy has been re-established, ASM simply updates the PST to say that the disk is no longer a member of that disk group. As ASM cannot access the disk, it is not able to mark its disk header as FORMER.

Forcing disk group drop

To drop a disk group I have to mount it first. If I cannot mount a disk group, but must drop it, I can use the force option of the DROP DISKGROUP statement, like this:

SQL> drop diskgroup PLAY force including contents;

Diskgroup dropped.

If ASM determines that the disk group is mounted anywhere (in the clustered environment), this operation fails.

Forcing disk group dismount

ASM does not allow a disk group to be dismounted if it's still being accessed. But I can force the disk group dismount even if some files in the disk group are open. Here is an example:

SQL> alter diskgroup PLAY dismount;
alter diskgroup PLAY dismount
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15027: active use of diskgroup "PLAY" precludes its dismount

Yes, a database is using the disk group:

SQL> select group_number, db_name, status from v$asm_client;

GROUP_NUMBER DB_NAME  STATUS
------------ -------- ------------
           1 BR       CONNECTED
           2 BR       CONNECTED

But I am not very considerate today, so I will dismount the disk group anyway:

SQL> alter diskgroup PLAY dismount force;

Diskgroup altered.

Note that the forced disk group dismount will cause all datafiles in that database to go offline, which means they will need recovery (and restore if I drop disk group PLAY).

Undrop disks

The UNDROP DISKS clause of the ALTER DISKGROUP statement cancels all pending drops of disks within disk groups. But the UNDROP DISKS cannot be used to restore disks that are being dropped as the result of a DROP DISKGROUP statement, or for disks that are being dropped using the force clause.

Command line force

The equivalent of the force option in asmcmd is the -f flag on the command line and the FORCE keyword in the XML configuration file.

The asmcmd has an additional feature relevant to this discussion. The asmcmd lsdsk command with a -M flag displays the disks that are visible to some but not all active instances, as explained by asmcmd itself:

$ asmcmd help lsdsk

List Oracle ASM disks.

lsdsk [-kptgMI][-G diskgroup ] [--suppressheader] [ --member|--candidate] [--discovery][--statistics][pattern]

The options for the lsdsk command are described below.

-M - Displays the disks that are visible to some but not all active instances. These are disks that, if included in a disk group, will cause the mount of that disk group to fail on the instances where the disks are not visible.


It is important to understand the power of the force and use it wisely.

November 27, 2011

Rebalancing act

ASM ensures that file extents are evenly distributed across all disks in a disk group. This is true for the initial file creation and for file resize operations. That means we should always have a balanced space distribution across all disks in a disk group.

Rebalance operation

Disk group rebalance is triggered automatically on ADD, DROP and RESIZE disk operations and on moving a file between hot and cold regions. Running rebalance by explicitly issuing ALTER DISKGROUP ... REBALANCE is called a manual rebalance. We might want to do that to change the rebalance power for example. We can also run the rebalance manually if a disk group becomes unbalanced for any reason.

The POWER clause of the ALTER DISKGROUP ... REBALANCE statement specifies the degree of parallelism of the rebalance operation. It can be set to a minimum value of 0, which halts the current rebalance until the statement is either implicitly or explicitly re-run. Higher values may reduce the total time it takes to complete the rebalance operation.

The ALTER DISKGROUP ... REBALANCE command by default returns immediately so that we can run other commands while the rebalance operation takes place in the background. To check the progress of the rebalance operation we can query the V$ASM_OPERATION view.

Three phase power

The rebalance operation has three distinct phases. First, ASM has to come up with the rebalance plan. That will depend on the rebalance reason, disk group size, number of files in the disk group, whether or not partnership has to be modified, etc. In any case this shouldn't take more than a couple of minutes.

The second phase is moving, or relocating, the extents among the disks in the disk group. This is where the bulk of the time will be spent. As this phase progresses, ASM keeps track of the number of extents moved and the actual I/O performance. Based on that, it calculates the estimated time to completion (GV$ASM_OPERATION.EST_MINUTES). Keep in mind that this is an estimate and that the actual time may change depending on the overall (mostly disk related) load. If the reason for the rebalance was one or more failed disks in a redundant disk group, the data mirroring is fully re-established at the end of this phase.
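
To get a feel for how such an estimate behaves, here is a rough Python sketch. It assumes the V$ASM_OPERATION columns SOFAR (allocation units moved so far), EST_WORK (allocation units to be moved) and EST_RATE (allocation units moved per minute), and simply divides the remaining work by the current rate. This is a back-of-the-envelope model, not ASM's internal calculation, and the function name is made up.

```python
import math


def estimate_minutes(sofar, est_work, est_rate):
    """Rough minutes remaining, or None when no rate is known yet.

    sofar, est_work: allocation units moved / to be moved (assumed meaning
    of the SOFAR and EST_WORK columns).
    est_rate: allocation units moved per minute (assumed meaning of EST_RATE).
    """
    remaining = max(est_work - sofar, 0)
    if remaining == 0:
        return 0
    if est_rate <= 0:
        return None  # nothing has moved yet, so no estimate is possible
    return math.ceil(remaining / est_rate)


# Hypothetical numbers: 8000 AUs left at 400 AUs per minute.
print(estimate_minutes(sofar=2000, est_work=10000, est_rate=400))
```

Because the rate follows the actual I/O performance, the same amount of remaining work can produce a very different estimate a few minutes later.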

The third phase is disk compacting (available in later ASM versions only). The idea of the compacting phase is to move the data as close to the outer tracks of the disks as possible. Note that at this stage of the rebalance, EST_MINUTES will keep showing 0. This is a 'feature' that will hopefully be addressed in the future. The time to complete this phase will again depend on the number of disks, the reason for the rebalance, etc. The overall time should be a fraction of that of the second phase.

Some notes about rebalance operations
  • Rebalance is a per-file operation.
  • An ongoing rebalance is restarted if the storage configuration changes, either because we alter the configuration or because of a failure or an outage. If the new rebalance fails because of a user error, a manual rebalance may be required.
  • There can be one rebalance operation per disk group per ASM instance in a cluster.
  • Rebalancing continues across a failure of the ASM instance performing the rebalance.
  • The REBALANCE clause (with its associated POWER and WAIT/NOWAIT keywords) can also be used in ALTER DISKGROUP commands for ADD, DROP or RESIZE disks.

Tuning rebalance operations

If the POWER clause is not specified in an ALTER DISKGROUP statement, or when rebalance is implicitly run by ADD/DROP/RESIZE disk, then the rebalance power defaults to the value of the ASM_POWER_LIMIT initialization parameter. We can adjust the value of this parameter dynamically. A higher power limit should result in a shorter time to complete the rebalance, but this is by no means linear and it will depend on the (storage system) load, available throughput and underlying disk response times.

The power can be changed for a rebalance that is in progress. We just need to issue another ALTER DISKGROUP ... REBALANCE command with a different value for POWER. This interrupts the current rebalance and restarts it with the modified POWER.

Relevant initialization parameters and disk group attributes


The ASM_POWER_LIMIT initialization parameter specifies the default power for disk rebalancing in a disk group. The range of values is 0 to 11 in older versions and 0 to 1024 in newer versions, but that still depends on the disk group compatibility (see the notes below). The default value is 1. A value of 0 disables rebalancing.
  • For disk groups with a sufficiently high COMPATIBLE.ASM setting, the operational range of values is 0 to 1024 for the rebalance power.
  • For disk groups with a lower COMPATIBLE.ASM setting, the operational range of values is 0 to 11 inclusive.
  • Specifying 0 for the POWER in the ALTER DISKGROUP REBALANCE command will stop the current rebalance operation (unless you hit bug 7257618).
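
The defaulting and range rules above can be put together in a small Python sketch. The function and its extended_range flag are made up for illustration: the flag stands for "this disk group's COMPATIBLE.ASM setting allows the 0 to 1024 range". Capping an over-range value at the maximum is an assumption of this sketch, so check how your version actually reacts.

```python
def effective_power(requested=None, asm_power_limit=1, extended_range=False):
    """Resolve the power a rebalance would run with.

    requested: value of the POWER clause, or None when the clause is omitted.
    asm_power_limit: the ASM_POWER_LIMIT initialization parameter (default 1).
    extended_range: True when disk group compatibility allows 0 to 1024.
    """
    max_power = 1024 if extended_range else 11
    # Without a POWER clause the rebalance falls back to ASM_POWER_LIMIT.
    power = asm_power_limit if requested is None else requested
    if power < 0:
        raise ValueError("rebalance power cannot be negative")
    # Assumption: values above the operational range are capped, not rejected.
    return min(power, max_power)


print(effective_power())                         # no POWER clause
print(effective_power(32))                       # lower compatibility
print(effective_power(32, extended_range=True))  # higher compatibility
```

A result of 0 means the rebalance is halted, as described for the POWER clause.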

Setting initialization parameter _DISABLE_REBALANCE_COMPACT=TRUE will disable the compacting phase of the disk group rebalance - for all disk groups.


_REBALANCE_COMPACT is a hidden disk group attribute. Setting _REBALANCE_COMPACT=FALSE will disable the compacting phase of the disk group rebalance - for that disk group only.


This initialization parameter controls the percentage of imbalance between disks. Default value is 3%.


The following table has a brief summary of the background processes involved in the rebalance operation.

Process  Description
ARBn     ASM Rebalance Process. Rebalances data extents within an ASM disk group. Possible processes are ARB0-ARB9 and ARBA.
RBAL     ASM Rebalance Master Process. Coordinates rebalance activity. In an ASM instance, it coordinates rebalance activity for disk groups. In a database instance, it manages ASM disk groups.
Xnnn     Exadata only. ASM Disk Expel Slave Process. Performs ASM post-rebalance activities. This process expels dropped disks at the end of an ASM rebalance.

When a rebalance operation is in progress, the ARBn processes will generate trace files in the background dump destination directory, showing the rebalance progress.


In an ASM instance, V$ASM_OPERATION displays one row for every active long running ASM operation executing in the current ASM instance. GV$ASM_OPERATION will show cluster wide operations.

During the rebalance, OPERATION will show REBAL, STATE will show the state of the rebalance operation, POWER will show the rebalance power and EST_MINUTES will show the estimated time the operation should take.

In an ASM instance, V$ASM_DISK displays information about ASM disks. During the rebalance, the STATE will show the current state of the disks involved in the rebalance operation.

Is your disk group balanced

Run the following query in your ASM instance to get the report on the disk group imbalance.

SQL> column "Diskgroup" format A30
SQL> column "Imbalance" format 99.9 Heading "Percent|Imbalance"
SQL> column "Variance" format 99.9 Heading "Percent|Disk Size|Variance"
SQL> column "MinFree" format 99.9 Heading "Minimum|Percent|Free"
SQL> column "DiskCnt" format 9999 Heading "Disk|Count"
SQL> column "Type" format A10 Heading "Diskgroup|Redundancy"

SQL> SELECT g.name "Diskgroup",
  100*(max((d.total_mb-d.free_mb)/d.total_mb)-min((d.total_mb-d.free_mb)/d.total_mb))/max((d.total_mb-d.free_mb)/d.total_mb) "Imbalance",
  100*(max(d.total_mb)-min(d.total_mb))/max(d.total_mb) "Variance",
  100*(min(d.free_mb/d.total_mb)) "MinFree",
  count(*) "DiskCnt",
  g.type "Type"
FROM v$asm_disk d, v$asm_diskgroup g
WHERE d.group_number = g.group_number and
  d.group_number <> 0 and
  d.state = 'NORMAL' and
  d.mount_status = 'CACHED'
GROUP BY g.name, g.type;

                                           Percent Minimum
                                 Percent Disk Size Percent  Disk Diskgroup
Diskgroup                      Imbalance  Variance    Free Count Redundancy
------------------------------ --------- --------- ------- ----- ----------
ACFS                                  .0        .0    12.5     2 NORMAL
DATA                                  .0        .0    48.4     2 EXTERN
PLAY                                 3.3        .0    98.1     3 NORMAL
RECO                                  .0        .0    82.9     2 EXTERN

NOTE: The above query is from Oracle Press book Oracle Automatic Storage Management, Under-the-Hood & Practical Deployment Guide, by Nitin Vengurlekar, Murali Vallath and Rich Long.
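
To see what the three percentages in the report mean, here is the same arithmetic as a small Python sketch for a single disk group. The function name and the disk sizes are made up. Unlike the query, the sketch returns 0 for the imbalance of a completely empty disk group instead of dividing by zero.

```python
def imbalance_report(disks):
    """Return (imbalance, variance, min_free) percentages for one disk group.

    disks: list of (total_mb, free_mb) tuples, one per disk.
    """
    used_fraction = [(total - free) / total for total, free in disks]
    totals = [total for total, _ in disks]
    max_used = max(used_fraction)
    # Spread between the fullest and emptiest disk, relative to the fullest.
    imbalance = 100 * (max_used - min(used_fraction)) / max_used if max_used else 0.0
    # Difference between the largest and smallest disk size.
    variance = 100 * (max(totals) - min(totals)) / max(totals)
    # Free space on the fullest disk, as a percentage of its size.
    min_free = 100 * min(free / total for total, free in disks)
    return imbalance, variance, min_free


# Hypothetical disk group: three 1000 MB disks, filled 60%, 50% and 48%.
imb, var, min_free = imbalance_report([(1000, 400), (1000, 500), (1000, 520)])
print("imbalance %.1f variance %.1f min free %.1f" % (imb, var, min_free))
```

In this made-up example the disks are the same size, so the variance is zero and all of the imbalance comes from uneven usage.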

October 5, 2011

Offline or drop?

When an ASM disk becomes unavailable, ASM drops it from the disk group, right? Well, that depends on the ASM version and on the disk group redundancy. An external redundancy disk group would be simply dismounted, so we will focus on the normal and high redundancy disk groups.

The disk would simply be dropped in ASM version 10g. Starting from 11gR1, a disk that becomes unavailable is first offlined, the disk repair timer kicks in, and if the time exceeds the value specified by DISK_REPAIR_TIME (disk group) attribute, the disk is dropped from the disk group. If the disk becomes available before the timer expires and its state is changed to online, the disk is not dropped from the disk group. But how would ASM learn that the disk is available and who would put the disk online?


A disk is considered unavailable if it cannot be read from or written to, by ASM or an ASM client. A database is a typical but not the only ASM client.

A disk can become unavailable for many reasons - faulty SCSI cable (for local disks), SAN network/switch issue (for SAN based storage), NFS server problem (for NFS based disks), site failure (in a stretched cluster setup), disk failure, etc. Whatever the reason, the ASM and/or the ASM client would report an I/O error and the ASM would take an action.


In version 10g, the ASM will immediately drop the disk that becomes unavailable. That would trigger a rebalance operation that will attempt to restore the data redundancy. Once the rebalance finishes, the data redundancy is fully restored and the disk is expelled from the disk group. Once the problem is resolved the disk can be added back into the disk group with an alter diskgroup command like this:

alter diskgroup DATA add disk 'ORCL:DISK077';

That will again trigger a rebalance and once it finishes, the disk will again be part of the disk group.

But what if multiple disks fail at the same time? What if one disk fails, the rebalance starts and then another disk fails? The outcome depends on the disk group redundancy, on whether the disks are from the same or different failgroups, and on whether the disks are partners or not.

In a normal redundancy disk group, ASM can tolerate one or more (including all) disks becoming unavailable if they are all from the same failgroup. If disks from different failgroups become unavailable, ASM will tolerate it as long as the disks are not partners. By tolerate I mean the disk group will stay online and there will be no interruption for ASM clients.

In a high redundancy disk group, ASM can tolerate one or more (including all) disks becoming unavailable if they are all from two failgroups only. As for the disks from more than two failgroups, the same partnership rule applies. Basically ASM will tolerate unavailability of any number of disks as long as they are not partners.
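
The partnership rule for a normal redundancy disk group can be illustrated with a short Python sketch. The partnership map below is made up (ASM keeps the real one in the PST, and a disk normally has several partners), and the model covers normal redundancy only: the disk group survives as long as no unavailable disk has an unavailable partner.

```python
def normal_redundancy_survives(partners, unavailable):
    """Return True if a normal redundancy disk group would stay online.

    partners: dict mapping each disk to the set of its partner disks.
    unavailable: iterable of disks that cannot be accessed.
    """
    unavailable = set(unavailable)
    for disk in unavailable:
        # An extent mirrored across two lost partners has no copy left.
        if partners.get(disk, set()) & unavailable:
            return False
    return True


# Hypothetical layout: failgroup FG1 = d0, d1 and failgroup FG2 = d2, d3.
partners = {
    "d0": {"d2"},
    "d1": {"d3"},
    "d2": {"d0"},
    "d3": {"d1"},
}
print(normal_redundancy_survives(partners, {"d0", "d1"}))  # whole failgroup FG1
print(normal_redundancy_survives(partners, {"d0", "d3"}))  # not partners
print(normal_redundancy_survives(partners, {"d0", "d2"}))  # partners
```

Losing a whole failgroup is survivable in this model because partners are always placed in different failgroups.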


So when a disk is dropped, the whole disk group needs to be rebalanced and that takes time. During that time other disks can fail, increasing the risk of data loss. For that reason a fast disk resync(hronization) was introduced in 11gR1. Instead of dropping the disk when it becomes unavailable, ASM simply takes the disk offline. The idea here is that the ASM administrator will be notified and the disk issue resolved during the time it takes for the disk repair timer to expire.

The default value for the disk repair timer is 3.6 hours. That can be adjusted, to say 12 hours, with an alter diskgroup command, like this:

alter diskgroup DATA set attribute 'DISK_REPAIR_TIME' = '12h';

During the time the disk is offline, ASM keeps track of the changes that would have gone to the offlined disk. If the disk is made available before the timer expires and the disk is put back online, ASM applies the outstanding changes and no rebalance is required. That is called the fast disk resync.

If the issue with the disk becoming unavailable is not resolved and the disk repair timer expires, the disk is dropped from the disk group.
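
The timer logic can be summed up in a few lines of Python. This is a simplified model with made-up function names: it assumes the attribute value is a number followed by h (hours) or m (minutes), treats a bare number as hours, and ignores everything else ASM does while the disk is offline.

```python
def repair_time_hours(value):
    """Convert a DISK_REPAIR_TIME style string such as '12h' or '90m' to hours."""
    text = value.strip().lower()
    if text.endswith("h"):
        return float(text[:-1])
    if text.endswith("m"):
        return float(text[:-1]) / 60
    return float(text)  # assumption: a bare number means hours


def disk_fate(hours_offline, disk_repair_time="3.6h"):
    """Return 'offline' while the timer runs, 'dropped' once it has expired."""
    if hours_offline > repair_time_hours(disk_repair_time):
        return "dropped"
    return "offline"


print(disk_fate(2))          # within the default 3.6 hours
print(disk_fate(4))          # default timer has expired
print(disk_fate(4, "12h"))   # timer extended to 12 hours
```

Raising the timer buys time for the repair, at the cost of running with reduced redundancy for longer.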


So a system engineer or an ASM administrator fixes the issue that caused the disk to become unavailable. Say they replace the faulty cable. But who brings the disk online and how? Can that be automated?

Again it depends. If you are on Exadata or Oracle Database Appliance, the disk is put back online automatically. In all other cases an ASM administrator has to put the disk online with an alter diskgroup command, like this:

alter diskgroup DATA online disk 'ORCL:DISK077';

Or, to bring all offline disks in the disk group online:

alter diskgroup DATA online all;


It is always a good idea to understand what happens in a fault scenario, what your version of ASM can and cannot do and what level of protection your disk group redundancy gives you.

September 22, 2011

amdu - ASM Metadata Dump Utility

The ASM Metadata Dump Utility - better known as amdu - is used by Oracle Support and Oracle Development to diagnose and sometimes resolve ASM issues. It can be used to print ASM metadata and extract both ASM metadata and database datafiles from ASM disk groups.

The amdu does not depend on the state of the ASM instance or the mount state of a disk group, so it can be used with the ASM instance down and with dismounted disk groups. It can even be used with damaged or missing ASM disks!

Use amdu to extract a controlfile from a mounted disk group

In the first example I will work with a disk group that is still mounted and will extract one of the controlfiles for database BR.

Let's first locate all controlfiles:

$ asmcmd find --type controlfile + "*"

So we have three copies of the controlfile in disk group DATA and three copies in disk group RECO. I will extract one controlfile, say Current.276.723906721, from disk group DATA.

I need to know the disks for disk group DATA:

$ asmcmd lsdsk -G DATA

So disk group DATA has two disks - DISK1 and DISK2, and these are ASMLIB disks (note the prefix 'ORCL'). Strictly speaking I did not need to know the disk names for this particular exercise, as all I need is the ASM discovery string (the value of ASM_DISKSTRING parameter).

Let's extract that controlfile out of the disk group DATA onto the file system:

$ cd /tmp
$ amdu -diskstring="ORCL:*" -extract DATA.276 -output control.276 -noreport -nodir
AMDU-00204: Disk N0001 is in currently mounted diskgroup DATA
AMDU-00201: Disk N0001: 'ORCL:DISK1'
$ ls -l control.276
-rw-r--r-- 1 grid oinstall 9748480 Sep 22 22:42 control.276

The options used were as follows:
-diskstring: either the full path to disk devices or the value of ASM_DISKSTRING parameter.
-extract: the disk group name, full stop, the ASM file number.
-output: the output file name (in the current directory).
-noreport: do not generate the amdu run report
-nodir: do not create dump directory

Use amdu to extract a datafile from a dismounted disk group

Getting the controlfile out of a mounted disk group was fairly straightforward. But back in real life, a customer calls me to see if I can extract that important datafile from a disk group that doesn't mount. They have no backups and are not sure of the exact name of that datafile. Let's see how I would go about extracting such a file.

The objective is to extract a single datafile, named something like NSA, from disk group DATA that cannot be mounted. That means we cannot use sqlplus or asmcmd to locate the datafile.

Let's dump all metadata for disk group DATA:

$ cd /tmp
$ amdu -dump DATA -noimage
$ cd amdu_2011_09_22_22_57_05
$ ls -l
total 28
-rw-r--r-- 1 grid oinstall  5600 Sep 22 22:57
-rw-r--r-- 1 grid oinstall 10462 Sep 22 22:57 report.txt

This time the dump directory was created and two files were generated.

File report.txt contains the information about the server, amdu command and options used, disks that are possibly members of the disk group DATA and the information about allocation units (AUs) on those disks. Let's review the contents of the report file:

$ more report.txt
******************************* AMDU Settings ********************************
ORACLE_HOME = /u01/app/11.2.0/grid
System name:    Linux
Node name:      
Release:        2.6.18-
Version:        #1 SMP Tue Aug 4 15:10:25 EDT 2009
Machine:        i686
amdu run:      
Endianess:      1
----------------------------- DISK REPORT N0001 ------------------------------
                Disk Path: ORCL:DISK1
           Unique Disk ID:
               Disk Label: DISK1
     Physical Sector Size: 512 bytes
                Disk Size: 4886 megabytes
               Group Name: DATA
                Disk Name: DISK1
       Failure Group Name: DISK1
              Disk Number: 0
            Header Status: 3
       Disk Creation Time: 2010/03/01 15:07:47.135000
          Last Mount Time: 2011/09/02 15:35:52.676000
    Compatibility Version: 0x0b200000(11020000)
         Disk Sector Size: 512 bytes
         Disk size in AUs: 4886 AUs
         Group Redundancy: 1
      Metadata Block Size: 4096 bytes
                  AU Size: 1048576 bytes
                   Stride: 113792 AUs
      Group Creation Time: 2010/03/01 15:07:46.819000
  File 1 Block 1 location: AU 2
              OCR Present: NO
************************** SCANNING DISKGROUP DATA ***************************
            Creation Time: 2010/03/01 15:07:46.819000
         Disks Discovered: 2
               Redundancy: 1
                  AU Size: 1048576 bytes
      Metadata Block Size: 4096 bytes
     Physical Sector Size: 512 bytes
          Metadata Stride: 113792 AU
   Duplicate Disk Numbers: 0
---------------------------- SCANNING DISK N0001 -----------------------------
Disk N0001: 'ORCL:DISK1'
           Allocated AU's: 2563
                Free AU's: 2323
       AU's read for dump: 34
       Block images saved: 6661
        Map lines written: 34
          Heartbeats seen: 0
  Corrupt metadata blocks: 0
        Corrupt AT blocks: 0

The other file, the map file, contains the data map. While much more useful for our purposes, it is also very cryptic. Let's have a look:

$ more
N0001 D0000 R00 A00000000 F00000000 I0 E00000000 U00 C00256 S0000 B0000000000
N0001 D0000 R00 A00000001 F00000000 I0 E00000000 U00 C00256 S0000 B0000000000
N0001 D0000 R00 A00000002 F00000001 I0 E00000000 U00 C00256 S0000 B0000000000
N0001 D0000 R00 A00000003 F00000003 I0 E00000001 U00 C00256 S0000 B0000000000
N0001 D0000 R00 A00000004 F00000003 I0 E00000011 U00 C00256 S0000 B0000000000
N0001 D0000 R00 A00000234 F00000267 I1 E00000000 U00 C00001 S0000 B0000000000
N0004 D0001 R00 A00001512 F00000292 I1 E00000000 U00 C00001 S0000 B0000000000
N0004 D0001 R00 A00002304 F00000290 I1 E00000000 U00 C00003 S0000 B0000000000
N0004 D0001 R00 A00002643 F00000264 I1 E00000000 U00 C00001 S0000 B0000000000

Of immediate interest are fields A and F. A00000234, for example, is telling me that this line is for AU 234. F00000267 is telling me that this line is about ASM file 267. This will come in handy later on.
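
With a larger map file, picking out lines by eye gets tedious, so here is a minimal Python sketch that does the lookup. It relies only on the reading of the fields used in this post: F is the ASM file number, A is the AU number and D is the disk number, all taken as decimal. The function names are made up and the sample lines are copied from the listing above.

```python
def parse_map_line(line):
    """Map each field's leading letter to the digits that follow it."""
    return {token[0]: token[1:] for token in line.split()}


def aus_for_file(lines, file_number):
    """Return (disk number, AU number) pairs for one ASM file number."""
    hits = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = parse_map_line(line)
        if int(fields["F"]) == file_number:
            hits.append((int(fields["D"]), int(fields["A"])))
    return hits


sample = [
    "N0001 D0000 R00 A00000003 F00000003 I0 E00000001 U00 C00256 S0000 B0000000000",
    "N0001 D0000 R00 A00000004 F00000003 I0 E00000011 U00 C00256 S0000 B0000000000",
    "N0001 D0000 R00 A00000234 F00000267 I1 E00000000 U00 C00001 S0000 B0000000000",
    "N0004 D0001 R00 A00001512 F00000292 I1 E00000000 U00 C00001 S0000 B0000000000",
]
print(aus_for_file(sample, 267))
```

The other fields are kept as raw strings, since this post does not say how they should be interpreted.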

Now we need to hunt down that NSA datafile...

ASM metadata file 6 is the alias directory, so that is the first place I will look. From the map file, I can work out the AUs for ASM file 6:

$ grep F00000006
N0004 D0001 R00 A00000008 F00000006 I0 E00000000 U00 C00256 S0000 B0000000000

Pure luck - a single line! That tells me two things - there are not many files in this disk group (all their aliases fit in a single AU) and that the alias directory is in allocation unit 8 (A00000008) on disk 1 (D0001). From report.txt, I know that disk 1 is ORCL:DISK2 and that the AU size is 1MB. Let's use kfed to look at the alias directory.

$ ls -l /dev/oracleasm/disks/DISK2
brw-rw---- 1 grid asmadmin 8, 6 Aug 24 14:38 /dev/oracleasm/disks/DISK2

$ kfed read /dev/oracleasm/disks/DISK2 aun=8 | more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                           11 ; 0x002: KFBTYP_ALIASDIR

Yep, this is indeed the alias directory. Now look for a datafile named NSA:

for (( i=0; i<256; i++ ))
do
  kfed read /dev/oracleasm/disks/DISK2 aun=8 blkn=$i | grep -1 NSA
done

That gave me the following output:

kfade[15].entry.refer.incarn:         0 ; 0x4a4: A=0 NUMM=0x0
kfade[15].name:             NSA_TN_DATA ; 0x4a8: length=11
kfade[15].fnum:                     267 ; 0x4d8: 0x0000010b

So the actual datafile name is NSA_TN_DATA and I can see it is ASM file number 267. With that information I can extract the datafile:

$ amdu -diskstring="ORCL:*" -extract DATA.267 -output NSA_TN_DATA.267 -noreport -nodir

$ ls -l
total 102544
-rw-r--r-- 1 grid oinstall      5600 Sep 22 22:57
-rw-r--r-- 1 grid oinstall 104865792 Sep 22 23:42 NSA_TN_DATA.267
-rw-r--r-- 1 grid oinstall     10462 Sep 22 22:57 report.txt

So what can I do with this file? Well, if I can extract the database controlfile, system and sysaux files, I might be able to use them to open the database. I may be able to plug this file into another database. Maybe I will need to use DUL to extract data from that file...

It is important to note that while the amdu will extract the file, the file itself may be corrupt or damaged in some way. After all, there is a reason for the disk group not mounting. Chances are the ASM metadata is corrupt or missing, but that can be the case with the datafile data as well. The point is that there is no substitute for a backup, so keep that in mind.

The amdu dump triggered on an error

In more recent ASM versions, an amdu dump may be triggered automatically by the ORA-600 [kfd...] class of errors. When that happens, in addition to the error being logged in the ASM alert log, there will be a message indicating an amdu dump as well. The dump will go into the diagnostic dump directory.


The amdu is a very handy utility, but it may be of limited value to an end user or a DBA. Still, knowing about amdu can be useful when dealing with Oracle Support.

September 21, 2011

ASM Toolbox

Here are the ASM tools I recommend you get familiar with.

asmcmd - command line interface to ASM

When ASM was released the asmcmd was useless. Today, in version 11gR2, the asmcmd is a truly versatile and useful tool.

ASMCA - ASM configuration assistant

ASMCA has two flavours - the pretty one, with the graphical user interface, and the silent workhorse. While the GUI version is certainly useful, it's the silent ASMCA that is the powerful one.

kfed - ASM metadata editor

I have talked about using kfed to perform a health check on your ASM metadata blocks, to have a closer look at the ASM disk header, and to map ASM disks to OS device names.

While kfed is not just a reader - it is a true editor - I did not talk about editing ASM metadata blocks as any such activity would render your ASM and database unsupported. As with other Oracle block editing tools, their use is limited to Oracle Support for data salvage and data export purposes only.

amdu - ASM metadata dump utility

This is also an unpublished utility, so I have to be careful with my posts about it. It is most useful to Oracle Support, but it is good to at least understand what it can do. As its name implies, it can read and dump ASM metadata, but unlike kfed, it can extract database datafiles. And like kfed, amdu does not need the ASM instance up or the disk group to be mounted!

kfod - ASM discovery tool

This is a specialised tool that does one thing only - it discovers ASM disks. It is used during the software installation, but it can be used on its own at any time. Indeed, when ASM has problems mounting the disk group, and in particular when it complains that it cannot find one or more disks, a quick kfod run may prove a valuable diagnostic aid.

renamedg - disk group rename

Another specialised utility - it can be used to rename any disk group.

Caution should be exercised with this utility as it is not integrated with the rest of the software stack. That means it does not talk to the Clusterware, ASM or the database. While ASM can simply discover a new disk group, by means of reading disk headers from the disks it finds in the discovery path, your database would not have a clue that its files are now in a renamed disk group.

cellcli - Exadata storage cell command line interface

Now this is not an ASM tool. But understanding cellcli is essential for those lucky enough to have Exadata.


Hey, don't forget the sqlplus! It is still the most powerful ASM management and administration tool.

August 23, 2011

ASM disk header

ASM disk header is probably the best known piece of ASM metadata. Chances are you learned about it when it was damaged or lost and hopefully Oracle Support was able to get you up and running. In this post I will try to explain why ASM disk header is important and what it contains.

Block zero

ASM disks are formatted into Allocation Units. Some Allocation Units contain ASM metadata and some contain database data. Allocation Units that contain ASM metadata are formatted into ASM metadata blocks. Allocation Unit 0 is at the beginning of an ASM disk and it always contains ASM metadata. The very first block (block 0) of Allocation Unit 0 contains the ASM disk header.

ASM disk header contents

Most of the data in the ASM disk header is of interest to that disk only. But some information in the ASM disk header is relevant to the whole disk group and some is even relevant to the whole cluster!

Let's use kfed to have a closer look at block 0 of an ASMLIB disk on Linux.

$ kfed read /dev/oracleasm/disks/ASMD1
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check: 473773689 ; 0x00c: 0x1c3d3679
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKASMD1 ; 0x000: length=13
kfdhdb.driver.reserved[0]: 1145918273 ; 0x008: 0x444d5341
kfdhdb.driver.reserved[1]: 49 ; 0x00c: 0x00000031
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: ASMD1 ; 0x028: length=5
kfdhdb.grpname: DATA ; 0x048: length=4
kfdhdb.fgname: ASMD1 ; 0x068: length=5

The result of the above kfed command shows us that this ASM block has two types of data - block header data - prefixed with kfbh, and ASM disk header data - prefixed with kfdhdb. In fact, every ASM metadata block will have the block header data plus the data specific to its block type.

Important ASM metadata block 0 header data

Data type       Value
kfbh.endian     System endianness. 0 - big endian, 1 - little endian.
kfbh.type       ASM block type. KFBTYP_DISKHEAD tells us this is an ASM disk header block.
kfbh.block.blk  ASM block number. Note the ASM disk header is block number 0.

Important ASM disk header specific data

Data type              Value
kfdhdb.driver.provstr  ORCLDISK+[ASM disk name] for ASMLIB disks. ORCLDISK for non-ASMLIB disks.
kfdhdb.dsknum          ASM disk number.
kfdhdb.grptyp          Disk group redundancy. KFDGTP_EXTERNAL - external, KFDGTP_NORMAL - normal, KFDGTP_HIGH - high.
kfdhdb.hdrsts          ASM disk header status. For possible values see V$ASM_DISK.HEADER_STATUS.
kfdhdb.dskname         ASM disk name.
kfdhdb.grpname         ASM disk group name.
kfdhdb.fgname          ASM failgroup name.
kfdhdb.crestmp.hi|lo   The date and time the disk was added to the disk group.
kfdhdb.mntstmp.hi|lo   Last time the disk was mounted.
kfdhdb.secsize         Disk sector size (bytes).
kfdhdb.blksize         ASM metadata block size (bytes).
kfdhdb.ausize          Allocation unit size (bytes). 1 MB is the default allocation unit size.
kfdhdb.dsksize         Disk size (allocation units). In this case the disk size is 10239 MB.
kfdhdb.fstlocn         Pointer to ASM Free Space Table. 1 = ASM block 1 in this allocation unit.
kfdhdb.altlocn         Pointer to ASM Allocation Table. 2 = ASM block 2 in this allocation unit.
kfdhdb.f1b1locn        Pointer to ASM File Directory. 2 = allocation unit 2.
kfdhdb.dbcompat        Minimum database version. 0x0a100000 = 10.1.
kfdhdb.grpstmp.hi|lo   The date and time the disk group was created.
kfdhdb.vfstart|vfend   Start and end allocation unit number for the clusterware voting disk. If this is zero, the disk does not have voting disk data. Version 11.2 and later only.
kfdhdb.spfile          Allocation unit number of the ASM spfile. Version 11.2 and later only.
kfdhdb.spfflg          ASM spfile flag. If this is 1, the ASM spfile is on this disk in allocation unit kfdhdb.spfile. Version 11.2 and later only.
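
The compatibility values are easier to read once decoded. Here is a small sketch that turns such a value into major.minor form. The bit layout is my assumption (top byte is the major version, the next four bits the minor version); it is consistent with 0x0a100000 = 10.1 from the table, but treat it as an illustration only. The function name decode_compat is made up.

```shell
# Decode the major.minor part of a compatibility value such as kfdhdb.dbcompat.
# Assumption: bits 31-24 hold the major version and bits 23-20 the minor version.
decode_compat() {
    v=$(($1))
    echo "$(( (v >> 24) & 0xff )).$(( (v >> 20) & 0xf ))"
}

decode_compat 0x0a100000   # the dbcompat example from the table
decode_compat 0x0b200000   # kfdhdb.compat from the kfed output above
```

With that layout the two calls print 10.1 and 11.2.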

ASM disk header backup

In ASM versions and later, the ASM disk header block is backed up in the second last ASM metadata block in the allocation unit 1. To work out the second last block number we need to know the allocation unit size and ASM metadata block size.

I talked about this in my post on kfed, but let's do that again - get those values from the block header and calculate the second last block number in allocation unit 1:

$ ausize=`kfed read /dev/oracleasm/disks/ASMD1 | grep ausize | tr -s ' ' | cut -d' ' -f2`
$ blksize=`kfed read /dev/oracleasm/disks/ASMD1 | grep blksize | tr -s ' ' | cut -d' ' -f2`
$ let n=$ausize/$blksize-2
$ echo $n
254

$ kfed read /dev/oracleasm/disks/ASMD1 aun=1 blkn=254
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check: 473773689 ; 0x00c: 0x1c3d3679
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKASMD1 ; 0x000: length=13
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: ASMD1 ; 0x028: length=5
kfdhdb.grpname: DATA ; 0x048: length=4
kfdhdb.fgname: ASMD1 ; 0x068: length=5

So we see the same contents as in block 0 in allocation unit 0.

This can be very handy when the disk header is damaged or lost. All we have to do is run kfed repair [disk_name], and specify the allocation unit size if the value is not default (1MB). But as I said in the kfed post, please do not do this on your own - seek Oracle Support assistance if you suspect problems with ASM disk header.
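
Since the backup block number depends on the allocation unit size, here is the same arithmetic as a small function, as a sketch only. The function name backup_block is made up, and the 4096 byte metadata block size is an assumption - check kfdhdb.blksize on your own disk.

```shell
# Number of the second last ASM metadata block in an allocation unit.
# $1 = allocation unit size in bytes, $2 = ASM metadata block size in bytes.
backup_block() {
    echo $(( $1 / $2 - 2 ))
}

# 4096 is an assumed metadata block size, not a value read from a disk.
backup_block 1048576 4096    # 1 MB allocation unit
backup_block 4194304 4096    # 4 MB allocation unit
```

Under that assumption the backup copy is in block 254 of allocation unit 1 for a 1 MB allocation unit, and in block 1022 for a 4 MB one.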

ASM disk header in Exadata

ASM disks in Exadata are not exposed to the OS via device names. Instead they can be accessed via a special name - "o/[IP address]/[disk name]". kfed understands that syntax, so we can still use it in Exadata.

Let's have a look at the ASM disk header on an Exadata disk:

$ kfed read o/
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname: DBFS_DG_CD_03_EXADATACEL01 ; 0x028: length=26
kfdhdb.grpname: DBFS_DG ; 0x048: length=7
kfdhdb.fgname: EXADATACEL01 ; 0x068: length=12
kfdhdb.ausize: 4194304 ; 0x0bc: 0x00400000

Some Exadata specific values in the ASM disk header are as follows:
  • ASM disk name that consists of the disk group name (DBFS_DG), cell disk label (CD), cell disk number (3) and the storage cell name (exadatacel01)
  • Failgroup name is the same as the storage cell name
  • Default allocation unit size in Exadata is 4 MB
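
The naming pattern in the first point can be taken apart with plain shell parameter expansion. This is a sketch, not an Oracle tool: the function name parse_exadata_disk is made up, and it assumes the storage cell name and the cell disk label contain no underscores, so the fields are read from the right.

```shell
# Split an Exadata style ASM disk name, reading the fields from the right:
# <disk group>_<cell disk label>_<cell disk number>_<storage cell name>
# Assumption: the cell name and the label themselves contain no underscore.
parse_exadata_disk() {
    name=$1
    cell=${name##*_}
    rest=${name%_*}
    num=${rest##*_}
    rest=${rest%_*}
    label=${rest##*_}
    dg=${rest%_*}
    # Drop leading zeros so that "03" is reported as 3.
    num=${num#"${num%%[!0]*}"}
    echo "diskgroup=$dg label=$label number=${num:-0} cell=$cell"
}

parse_exadata_disk DBFS_DG_CD_03_EXADATACEL01
```

Reading from the right matters because the disk group name (DBFS_DG here) can itself contain an underscore.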

ASM disk header contains the metadata essential for the operation and availability of an ASM disk group. To prevent the loss and accidental damage of the ASM disk header, Oracle recommends protecting it by partitioning the disk - thus 'moving' it away from the physical beginning of the disk. The ASM disk header in Exadata is protected by not exposing it to the database server OS. In ASM version and later, the ASM disk header is further protected by maintaining a copy in allocation unit 1.

July 19, 2011

How many partners?

ASM provides data redundancy by placing mirrored extent copies on partner disks in different failgroups. Each ASM disk, in a redundant disk group of course, would have up to 8 partner disks (up to 10 partners in ASM version 10). There is no disk partnership and there are no failgroups in external redundancy disk groups.

If a normal redundancy disk group has two disks, they are partners. Every extent from disk 0 is mirrored on disk 1, and every extent on disk 1 is mirrored on disk 0. So every disk has one partner.

In a normal redundancy disk group with 3 disks - and no manually specified failgroups - every disk would have two partners. Disk 0 partners with disks 1 and 2, disk 1 partners with disks 0 and 2, and finally disk 2 partners with disks 0 and 1. When an extent is placed on disk 0, its mirror copy is placed on either disk 1 or disk 2 - but not both. Remember, this is a normal redundancy disk group, so there are two copies of each extent, not three. Similarly an extent on disk 1 will have a mirror on either disk 0 or disk 2. And an extent on disk 2 will have a mirror on either disk 0 or disk 1. Still fairly simple.

The situation is similar in a high redundancy disk group with 3 disks. The disk partnership will be exactly the same as in the normal redundancy disk group with 3 disks described above. The difference will be in mirroring - each extent on disk 0 will have copies on both partner disks 1 and 2. Same with extents on disk 1 - copies will be on both disks 0 and 2, and finally extents on disk 2 will have copies on both disks 0 and 1.
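
The small cases above can be written down as a toy model. This is only a sketch of the 2 and 3 disk situations described here, where every other disk is a partner; it is not how ASM picks partners in larger disk groups. The function name partners_of is made up.

```shell
# Toy model of the small disk groups described above (no manual failgroups):
# every other disk in the group is a partner.
# $1 = disk number, $2 = number of disks in the disk group.
partners_of() {
    disk=$1
    ndisks=$2
    out=""
    i=0
    while [ "$i" -lt "$ndisks" ]; do
        if [ "$i" -ne "$disk" ]; then
            out="$out $i"
        fi
        i=$((i + 1))
    done
    echo "${out# }"
}

for d in 0 1 2; do
    echo "disk $d partners: $(partners_of "$d" 3)"
done
```

In the normal redundancy case an extent gets its mirror copy on one of the listed partners, and in the high redundancy case on both of them.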

Now, in a normal redundancy disk group with 'a lot of disks', every disk would have 8 partners. That means an extent on any of the disks will have a copy on one of its 8 partners. It is worth repeating this - an extent will always have a mirror copy on a partner disk only.

We can query x$kfdpartner to find out more about the disk partnership. Let's have a look at a disk group with 'a lot of disks':

SQL> SELECT count(disk_number)
FROM v$asm_disk
WHERE group_number = 1;


The result shows that there are more than a few disks in this disk group. Let's see how many partners they have:

SQL> SELECT disk "Disk", count(number_kfdpartner) "Number of partners"
FROM x$kfdpartner
WHERE grp=1
GROUP BY disk
ORDER BY 1;

      Disk Number of partners
---------- ------------------
         0                  8
         1                  8
         2                  8
       165                  8
       166                  8
       167                  8

168 rows selected.

The result shows that each disk has exactly 8 partners.

The following query will show us the partnership information for all disks in all disk groups:

SQL> set pages 1000
SQL> break on Group# on Disk#
SQL> SELECT d.group_number "Group#", d.disk_number "Disk#", p.number_kfdpartner "Partner disk#"
FROM x$kfdpartner p, v$asm_disk d
WHERE p.disk=d.disk_number and p.grp=d.group_number
ORDER BY 1, 2, 3;

    Group#      Disk# Partner disk#
---------- ---------- -------------
         1          0            12
                    1            13

                   29             4

816 rows selected.

The partnership is established automatically by ASM at CREATE DISKGROUP time and is updated on every ADD DISK and DROP DISK operation.

The partnership information is maintained in the Partnership and Status Table (PST) and the Disk Directory - both of which are important ASM metadata structures.

June 20, 2011

ASM file extent map

When ASM creates a file, e.g. on a request from an RDBMS instance, it allocates space in extents. Once the file is created, ASM passes the extent map to the RDBMS instance, which can then access the file without involving ASM. If a file extent needs to be relocated (e.g. due to a disk group rebalance), ASM advises the RDBMS instance about the modifications to the extent map.

We can access ASM file extent maps by querying X$KFFXP in the ASM instance. There is one row in X$KFFXP for every physical extent of every file in every mounted disk group.

The important columns of X$KFFXP are:
  • GROUP_KFFXP The disk group number. Note that the disk group number is not persistent, i.e. it can change every time the disk group is mounted. Same as V$ASM_DISKGROUP.GROUP_NUMBER.
  • NUMBER_KFFXP The file number - same as V$ASM_FILE.FILE_NUMBER. Note that this is an ASM file number, not to be confused with the database datafile number. File numbers under 256 are reserved for ASM metadata files.
  • INCARN_KFFXP The file incarnation number. It is changed every time an ASM file number is reused for a new file. Same as V$ASM_FILE.INCARNATION. Note that ASM file name ends in NUMBER_KFFXP.INCARN_KFFXP.
  • XNUM_KFFXP The virtual extent number. For external redundancy disk groups this is the same as the physical extent. For normal redundancy disk groups this is the physical extent divided by 2. For high redundancy disk groups this is the physical extent divided by 3.
  • PXN_KFFXP The physical extent number. The first physical extent of a file is number 0.
  • LXN_KFFXP The physical extent number within the virtual extent. 0 = primary extent, 1 = secondary extent, 2 = third copy of the extent.
  • DISK_KFFXP The disk number - same as V$ASM_DISK.DISK_NUMBER.
  • AU_KFFXP The allocation unit number.

The following query - in an ASM instance - shows ASM metadata file numbers, names and allocation unit count in disk group number 3:

$ sqlplus / as sysasm

SQL> select NUMBER_KFFXP "ASM file number", DECODE(NUMBER_KFFXP,
1, 'File directory', 2, 'Disk directory', 3, 'Active change directory', 4, 'Continuing operations directory',
5, 'Template directory', 6, 'Alias directory', 7, 'ADVM file directory', 8, 'Disk free space directory',
9, 'Attributes directory', 10, 'ASM User directory', 11, 'ASM user group directory', 12, 'Staleness directory',
253, 'spfile for ASM instance', 254, 'Stale bit map space registry ', 255, 'Oracle Cluster Repository registry') "ASM metadata file name",
count(AU_KFFXP) "Allocation units"
from X$KFFXP
where GROUP_KFFXP=3 and NUMBER_KFFXP<256
group by NUMBER_KFFXP
order by 1;

ASM file number ASM metadata file name             Allocation units
--------------- ---------------------------------- ----------------
              1 File directory                                    3
              2 Disk directory                                    3
              3 Active change directory                          69
              4 Continuing operations directory                   6
              5 Template directory                                3
              6 Alias directory                                   3
              8 Disk free space directory                         3
              9 Attributes directory                              3
             12 Staleness directory                               3
            253 spfile for ASM instance                           2
            254 Stale bit map space registry                      3
            255 Oracle Cluster Repository registry              135

12 rows selected.

As we can see, this disk group does not have all types of metadata files. It is interesting to note that there are at least 3 allocation units for each file (except ASM spfile). More on this in a separate post...

Let's look at the extent map of a database control file.

First see if we have any control files in disk group DATA (run asmcmd as Grid Infrastructure OS user):

$ asmcmd find --type controlfile +DATA "*"

Now check the disk group number for disk group DATA (connect to the ASM instance):

$ sqlplus / as sysasm



Now look at the extent map for ASM file 256 (+DATA/DBM/CONTROLFILE/Current.256.738247649) in disk group 1:

SQL> select XNUM_KFFXP "Virtual extent", PXN_KFFXP "Physical extent", LXN_KFFXP "Extent copy", DISK_KFFXP "Disk", AU_KFFXP "Allocation unit"
from X$KFFXP
where GROUP_KFFXP=1 and NUMBER_KFFXP=256 and XNUM_KFFXP<>2147483648
order by 1,2;

Virtual extent Physical extent Extent copy       Disk Allocation unit
-------------- --------------- ----------- ---------- ---------------
             0               0           0         20               5
             0               1           1         29            1903
             0               2           2          6              82
             1               3           0         22               6
             1               4           1         31               8
             1               5           2          9               3
             2               6           0         30               8
             2               7           1         23            1907
             2               8           2          7              63
             3               9           0         26               2
             3              10           1         16            1904
             3              11           2          6               4
            39             117           0         25            1913
            39             118           1         15            1906
            39             119           2          3              27

120 rows selected.


So this control file is triple mirrored - each virtual extent has 3 physical extents. And the result shows the actual location of every allocation unit for this file.
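
The relationship between the first three columns of the listing can be checked with a little arithmetic. This is a sketch based on the column descriptions above; the function name extent_of is made up, and taking the copy number as the remainder is my reading of the listing rather than a documented formula.

```shell
# Map a physical extent number to its virtual extent and copy number.
# $1 = physical extent number (PXN_KFFXP)
# $2 = number of copies: 1 = external, 2 = normal, 3 = high redundancy
extent_of() {
    echo "virtual=$(( $1 / $2 )) copy=$(( $1 % $2 ))"
}

extent_of 4 3      # physical extent 4 of the control file
extent_of 119 3    # last row of the listing above
```

Both results agree with the listing: physical extent 4 is copy 1 of virtual extent 1, and physical extent 119 is copy 2 of virtual extent 39.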