June 20, 2011

ASM file extent map

When ASM creates a file, e.g. on a request from an RDBMS instance, it allocates space in extents. Once the file is created, ASM passes the extent map to the RDBMS instance that can then access the file without involving ASM. If a file extent needs to be relocated (e.g. due to a disk group rebalance), ASM would advise RDBMS instance about the modifications to the extent map.

We can access ASM file extent maps by querying X$KFFXP in ASM instance. There is one row in X$KFFXP for every physical extent of every file in every mounted disk group.

The important columns of X$KFFXP are:
  • GROUP_KFFXP The disk group number. Note that the disk group number is not persistent, i.e. it can change every time the disk group is mounted. Same as V$ASM_DISKGROUP.GROUP_NUMBER.
  • NUMBER_KFFXP The file number - same as V$ASM_FILE.FILE_NUMBER. Note that this is an ASM file number, not to be confused with the database datafile number. File numbers under 256 are reserved for ASM metadata files.
  • INCARN_KFFXP The file incarnation number. It is changed every time an ASM file number is reused for a new file. Same as V$ASM_FILE.INCARNATION. Note that ASM file name ends in NUMBER_KFFXP.INCARN_KFFXP.
  • XNUM_KFFXP The virtual extent number. For external redundancy disk groups this is the same as the physical extent. For normal redundancy disk groups this is the physical extent divided by 2. For high redundancy disk groups this is the physical extent divided by 3.
  • PXN_KFFXP The physical extent number. The first physical extent of a file is number 0.
  • LXN_KFFXP The physical extent number within the virtual extent. 0 = primary extent, 1 = secondary extent, 2 = third copy of the extent.
  • DISK_KFFXP The disk number -  same as V$ASM_DISK.DISK_NUMBER.
  • AU_KFFXP The allocation unit number.
The following query - in an ASM instance - shows ASM metadata file numbers, names and allocation unit count in disk group number 3:

$ sqlplus / as sysasm
SQL> select NUMBER_KFFXP "ASM file number", DECODE (NUMBER_KFFXP,
1, 'File directory', 2, 'Disk directory', 3, 'Active change directory', 4, 'Continuing operations directory',
5, 'Template directory', 6, 'Alias directory', 7, 'ADVM file directory', 8, 'Disk free space directory',
9, 'Attributes directory', 10, 'ASM User directory', 11, 'ASM user group directory', 12, 'Staleness directory',
253, 'spfile for ASM instance', 254, 'Stale bit map space registry ', 255, 'Oracle Cluster Repository registry') "ASM metadata file name",
count(AU_KFFXP) "Allocation units"
from X$KFFXP
where GROUP_KFFXP=3 and NUMBER_KFFXP<256
group by NUMBER_KFFXP
order by 1;

ASM file number ASM metadata file name             Allocation units
--------------- ---------------------------------- ----------------
              1 File directory                                    3
              2 Disk directory                                    3
              3 Active change directory                          69
              4 Continuing operations directory                   6
              5 Template directory                                3
              6 Alias directory                                   3
              8 Disk free space directory                         3
              9 Attributes directory                              3
             12 Staleness directory                               3
            253 spfile for ASM instance                           2
            254 Stale bit map space registry                      3
            255 Oracle Cluster Repository registry              135

12 rows selected.
SQL>

As we can see, this disk group does not have all types of metadata files. It is interesting to note that there are at least 3 allocation units for each file (except ASM spfile). More on this in a separate post...

Let's look at the extent map of a database control file.

First see if we have any control files in disk group DATA (run asmcmd as Grid Infrastructure OS user):

$ asmcmd find --type controlfile +DATA "*"
+DATA/DBM/CONTROLFILE/Current.256.738247649
+DATA/BR/CONTROLFILE/Current.299.748434267
$

Now check the disk group number for disk group DATA (connect to ASM instance)

$ sqlplus / as sysasm
SQL> select GROUP_NUMBER from V$ASM_DISKGROUP where NAME='DATA';

GROUP_NUMBER
------------
1

SQL>

Now look at the extent map for ASM file 256 (+DATA/DBM/CONTROLFILE/Current.256.738247649) in disk group 1:

SQL> select XNUM_KFFXP "Virtual extent", PXN_KFFXP "Physical extent", LXN_KFFXP "Extent copy", DISK_KFFXP "Disk", AU_KFFXP "Allocation unit"
from X$KFFXP
where GROUP_KFFXP=1 and NUMBER_KFFXP=256 and XNUM_KFFXP<>2147483648
order by 1,2;

Virtual extent Physical extent Extent copy       Disk Allocation unit
-------------- --------------- ----------- ---------- ---------------
             0               0           0         20               5
             0               1           1         29            1903
             0               2           2          6              82
             1               3           0         22               6
             1               4           1         31               8
             1               5           2          9               3
             2               6           0         30               8
             2               7           1         23            1907
             2               8           2          7              63
             3               9           0         26               2
             3              10           1         16            1904
             3              11           2          6               4
...
            39             117           0         25            1913
            39             118           1         15            1906
            39             119           2          3              27

120 rows selected.


SQL>

So this control file is tripple mirrored - each virtual extent has 3 physical extents. And the result shows the actual location of every allocation unit for this file.

18 comments:

  1. Hi Bane. I want say thank you for your blog. Greate job!!! I glad to read it.

    I have one some question. I wrote query:

    select AU_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=2 and DISK_KFFXP = 26 and LXN_KFFXP = 0
    minus
    select AU_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=2 and DISK_KFFXP != 26 and LXN_KFFXP = 1

    and saw some AUs without copy. I use diskgroup with normal redundancy.

    How that can be? Thats true even if i do rebalance manualy.

    ReplyDelete
  2. Thanks Maxim!

    I am not sure what you are trying to show, but the problem with the query is that AU_KFFXP on its own does not mean much.

    Let's look at couple of examples from my system. First your query:

    SQL> select AU_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=1 and DISK_KFFXP = 3 and LXN_KFFXP = 0
    minus
    select AU_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=1 and DISK_KFFXP != 3 and LXN_KFFXP = 1
    /
    AU_KFFXP
    ----------
    8
    ...
    1083

    39 rows selected.

    Now let's have a closer look at AU 8:

    select NUMBER_KFFXP, XNUM_KFFXP, AU_KFFXP, DISK_KFFXP, LXN_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=1 and AU_KFFXP=8 and DISK_KFFXP<>3
    order by 1, 2, 3, 4, 5;

    NUMBER_KFFXP XNUM_KFFXP AU_KFFXP DISK_KFFXP LXN_KFFXP
    ------------ ---------- ---------- ---------- ----------
    3 5 8 0 2
    3 6 8 2 2
    3 7 8 1 0

    This is showing us that we have allocation unit 8 on disks 0, 1 and 2 - not just on disk 3. And it also shows the this file (3) has third copy (LNX_KFFXP=2). Remember, even in a normal redundancy disk group you can have some files that are triple mirrored. This is a metadata file and those are always triple mirrored (of course if you have at least 3 failgroups).

    Let's have a look at another one - AU 1083:

    select NUMBER_KFFXP, XNUM_KFFXP, AU_KFFXP, DISK_KFFXP, LXN_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=1 and AU_KFFXP=1083 and DISK_KFFXP<>3
    order by 1, 2, 3, 4, 5;

    NUMBER_KFFXP XNUM_KFFXP AU_KFFXP DISK_KFFXP LXN_KFFXP
    ------------ ---------- ---------- ---------- ----------
    257 533 1083 2 0
    257 543 1083 0 0
    258 104 1083 1 0

    The interesting information here is that both files 257 and 257 have AU 1083. But this does not mean that AU 1083 for file 258 is not mirrored. Let's have a look:

    SQL> select NUMBER_KFFXP, XNUM_KFFXP, AU_KFFXP, DISK_KFFXP, LXN_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=1 and XNUM_KFFXP=104 and NUMBER_KFFXP=258
    order by 1, 2, 3, 4, 5; 2 3 4

    NUMBER_KFFXP XNUM_KFFXP AU_KFFXP DISK_KFFXP LXN_KFFXP
    ------------ ---------- ---------- ---------- ----------
    258 104 1083 1 0
    258 104 1090 0 1

    So that AU is indeed mirrored.

    Hope all this makes sense. Feel free to contact me via email (bane.radulovic at gmail.com) or G+ if you would like to discuss this further.

    Cheers,
    Bane

    ReplyDelete
  3. Thanks my bad. Need use XNUM_KFFXP instead of AU_KFFXP.

    Execute the query before drop disk to know every fine with our extents.

    select XNUM_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=2 and DISK_KFFXP = 3 and LXN_KFFXP in (0,1)
    minus
    select XNUM_KFFXP
    from X$KFFXP
    where GROUP_KFFXP=2 and DISK_KFFXP != 3 and LXN_KFFXP in (0,1)

    ReplyDelete
  4. Thank you for sharing this useful info, Bane

    On 11.1 with two failgroups I got:

    SQL> select GROUP_NUMBER,FILE_NUMBER,BLOCK_SIZE ,BLOCKS,REDUNDANCY, TYPE from V$ASM_FILE where type = 'CONTROLFILE';

    GROUP_NUMBER FILE_NUMBER BLOCK_SIZE BLOCKS REDUNDANCY TYPE
    ------------ ----------- ---------- ---------- ------------------ --------------------
    1 308 16384 4577 HIGH CONTROLFILE
    ...

    select XNUM_KFFXP "Virtual extent", PXN_KFFXP "Physical extent", LXN_KFFXP "Extent copy", DISK_KFFXP "Disk", AU_KFFXP "Allocation unit"
    from X$KFFXP
    3 where GROUP_KFFXP=1 and NUMBER_KFFXP=308;

    Virtual extent Physical extent Extent copy Disk Allocation unit
    -------------- --------------- ----------- ---------- ---------------
    0 0 0 12 11400
    0 1 1 1 11201
    0 2 2 65534 4294967294
    1 3 0 9 11547
    1 4 1 0 11567
    1 5 2 65534 4294967294
    ...

    "Disk"(DISK_KFFXP) = 65534 - is some type virtual record, produced by insufficient (for HIGH redundancy) failgroup count?

    ReplyDelete
    Replies
    1. Спасибо Igor,
      Yes, I have noticed those 'extreme' disk and allocation unit numbers, but I am not sure what they mean.
      Let me see if I can find out...
      Cheers,
      Bane

      Delete
    2. Hi Igor,

      I finally have some answers for you. You pretty much guessed what those number mean.

      When you have a triple mirrored file in a normal redundancy disk group with only two failgroups, the third extent will not be allocated, but the query from X$KFFXP will show the 'missing' extent. The disk number for such extent will be 65534, which represents a 'missing' disk, or in ASM speak, a hole disk. And the associated allocation unit will be AU number 4294967294.

      Cheers,
      Bane

      Delete
  5. hi, I have a question. You talk about ASM File Extent Map. There is a view to see its data. But where ASM Store File Extent Map ???

    ReplyDelete
  6. ASM keeps that information in file number 1. Have a look at http://asmsupportguy.blogspot.com/2012/01/asm-file-number-1.html.
    Cheers,
    Bane

    ReplyDelete
  7. E.g.
    CREATE DISKGROUP data NORMAL REDUNDANCY
    FAILGROUP controller1 DISK
    '/devices/diska1' NAME diska1,
    '/devices/diska2' NAME diska2,
    '/devices/diska3' NAME diska3,
    '/devices/diska4' NAME diska4
    FAILGROUP controller2 DISK
    '/devices/diskb1' NAME diskb1,
    '/devices/diskb2' NAME diskb2,
    '/devices/diskb3' NAME diskb3,
    '/devices/diskb4' NAME diskb4
    ;

    As per Oracle documentation "when a block is written to a file, each extent in the extent set is written in parallel". It means Primary and secondary extents are written paralelly at the same time.
    Now in above scenario, How ASM decides to write the primary extent and secondary extent? Is it always the case that ASM will consider the controller 1 as primary failgroup and write the primary extents into controller1 disks? How ASM chooses where to write primary extents and where to write secondary extents, what is the logic behind this selection? Please explain.

    Thanks a lot.

    ReplyDelete
    Replies
    1. See http://asmsupportguy.blogspot.com.au/2010/04/about-asm-allocation-units-extents.html?showComment=1371802418441#c6729264458726240169

      Delete
  8. Hi Bane,

    I had gone thru your post on allocation unit but seems I could not catch the point that how ASM decide where to put the primary and secondary extents? Doubt is still not clear, so kindly explain.

    Here I have got another question, in your post you mentioned "If we omit the failgroup specification, then ASM automatically places each disk into its own failgroup.", now e.g.
    CREATE DISKGROUP data NORMAL REDUNDANCY
    FAILGROUP controller1 DISK
    '/devices/diska1' NAME diska1,
    '/devices/diska2' NAME diska2,
    '/devices/diska3' NAME diska3,
    '/devices/diska4' NAME diska4
    ;
    In this normal redundancy case, how ASM will create the second fail group while I have only provided one fail group controller1. Kindly explain this as well.

    Thank you so much.

    ReplyDelete
    Replies
    1. That command will fail as you need to specify at least 2 failgroups in a normal redundancy disk group.

      Delete
  9. Hi Bane,

    CREATE DISKGROUP data NORMAL REDUNDANCY
    FAILGROUP controller1 DISK
    '/devices/diska1' NAME diska1
    FAILGROUP controller2 DISK
    '/devices/diskb1' NAME diskb1
    ;

    Now both failgroups are having only 1 disk, now how the striping will work in this case?

    Thanks a lot.

    ReplyDelete
    Replies
    1. Files will be striped across two disks. A file's extent 0 will be placed on diska1, copy of extent 0 on diskb1, extent 1 on diskb1, copy of extent 1 on diska1, extent 2 on diska1, copy of extent 2 on diskb1, etc...

      Delete
  10. Thanks a lot Bane.
    You have cleared a lot of doubts of mine.
    Great work.

    ReplyDelete