Every ASM disk contains at least one Allocation Table (AT) that describes the contents of the disk. The AT has one entry for every allocation unit (AU) on the disk. If an AU is allocated, the Allocation Table will have the extent number and the file number the AU belongs to.
Finding the Allocation Table
The location of the first block of the Allocation Table is stored in the ASM disk header (field kfdhdb.altlocn). In the following example, that field shows that the AT starts at block 2.
$ kfed read /dev/sdc1 | grep kfdhdb.altlocn
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
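When the same lookup has to be repeated for several fields or disks, it can help to pull the values out of the kfed output with a small script. The following is only a sketch: it assumes the "name: value ; offset: annotation" line layout seen in the output in this post, and the helper name kfed_fields is made up for illustration.

```python
import re

# Sample lines copied from the kfed output shown in this post.
SAMPLE = """\
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
"""

# Matches "field.name: <decimal value> ;" at the start of a line.
FIELD_RE = re.compile(r"^(?P<name>[\w.\[\]]+):\s+(?P<value>\d+)\s*;")


def kfed_fields(text):
    """Return {field name: integer value} for every line that matches."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            fields[match.group("name")] = int(match.group("value"))
    return fields


fields = kfed_fields(SAMPLE)
print(fields["kfdhdb.altlocn"])
```

In practice the text would come from running kfed read against the disk; here it is a hard-coded sample so the sketch runs anywhere.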
Let’s have a closer look at the first block of the Allocation Table.
$ kfed read /dev/sdc1 blkn=2 | more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
...
kfdatb.aunum: 0 ; 0x000: 0x00000000
kfdatb.shrink: 448 ; 0x004: 0x01c0
...
kfdatb.aunum=0 means that AU0 is the first AU described by this AT block. kfdatb.shrink=448 means that this AT block can hold the information for 448 AUs. In the next AT block we should see kfdatb.aunum=448, meaning that it describes the 448 AUs starting at AU448 (AU448 to AU895). Let’s have a look:
$ kfed read /dev/sdc1 blkn=3 | grep kfdatb.aunum
kfdatb.aunum: 448 ; 0x000: 0x000001c0
The next AT block should show kfdatb.aunum=896:
$ kfed read /dev/sdc1 blkn=4 | grep kfdatb.aunum
kfdatb.aunum: 896 ; 0x000: 0x00000380
And so on...
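The pattern above is plain integer division. As a rough sketch (first stride only, using the values from this disk: AT starting at block 2 and 448 entries per AT block), the AT block and the entry index for a given AU could be worked out like this. The function name at_block_for_au is hypothetical.

```python
AT_FIRST_BLOCK = 2      # kfdhdb.altlocn from the disk header in this example
AUS_PER_AT_BLOCK = 448  # kfdatb.shrink from the AT block in this example


def at_block_for_au(au):
    """Return (AT block number, kfdate index) for an AU in the first stride."""
    block_offset, entry_index = divmod(au, AUS_PER_AT_BLOCK)
    return AT_FIRST_BLOCK + block_offset, entry_index


# The first AU described by blocks 2, 3 and 4, as seen in the kfed output.
for au in (0, 448, 896):
    print(au, at_block_for_au(au))
```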
Allocation table entries
Each Allocation Table entry (kfdate[i]) records the state of its allocation unit: allocated (flag V=1) or free (flag V=0). For an allocated AU, the entry also holds the extent number and the file number.
Let’s have a look at Allocation Table block 3.
$ kfed read /dev/sdc1 blkn=3 | more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
...
kfdatb.aunum: 448 ; 0x000: 0x000001c0
...
kfdate[142].discriminator: 1 ; 0x498: 0x00000001
kfdate[142].allo.lo: 0 ; 0x498: XNUM=0x0
kfdate[142].allo.hi: 8388867 ; 0x49c: V=1 I=0 H=0 FNUM=0x103
kfdate[143].discriminator: 1 ; 0x4a0: 0x00000001
kfdate[143].allo.lo: 1 ; 0x4a0: XNUM=0x1
kfdate[143].allo.hi: 8388867 ; 0x4a4: V=1 I=0 H=0 FNUM=0x103
kfdate[144].discriminator: 1 ; 0x4a8: 0x00000001
kfdate[144].allo.lo: 2 ; 0x4a8: XNUM=0x2
kfdate[144].allo.hi: 8388867 ; 0x4ac: V=1 I=0 H=0 FNUM=0x103
kfdate[145].discriminator: 1 ; 0x4b0: 0x00000001
kfdate[145].allo.lo: 3 ; 0x4b0: XNUM=0x3
kfdate[145].allo.hi: 8388867 ; 0x4b4: V=1 I=0 H=0 FNUM=0x103
kfdate[146].discriminator: 1 ; 0x4b8: 0x00000001
kfdate[146].allo.lo: 4 ; 0x4b8: XNUM=0x4
kfdate[146].allo.hi: 8388867 ; 0x4bc: V=1 I=0 H=0 FNUM=0x103
kfdate[147].discriminator: 1 ; 0x4c0: 0x00000001
kfdate[147].allo.lo: 5 ; 0x4c0: XNUM=0x5
kfdate[147].allo.hi: 8388867 ; 0x4c4: V=1 I=0 H=0 FNUM=0x103
kfdate[148].discriminator: 0 ; 0x4c8: 0x00000000
kfdate[148].free.lo.next: 16 ; 0x4c8: 0x0010
kfdate[148].free.lo.prev: 16 ; 0x4ca: 0x0010
kfdate[148].free.hi: 2 ; 0x4cc: V=0 ASZM=0x2
kfdate[149].discriminator: 0 ; 0x4d0: 0x00000000
kfdate[149].free.lo.next: 0 ; 0x4d0: 0x0000
kfdate[149].free.lo.prev: 0 ; 0x4d2: 0x0000
kfdate[149].free.hi: 0 ; 0x4d4: V=0 ASZM=0x0
...
The excerpt shows the Allocation Table entries for file 259 (hexadecimal FNUM=0x103), which start at kfdate[142] and end at kfdate[147]. That shows that ASM file 259 has a total of 6 AUs. The AU number is the index of kfdate[i] plus the offset of this AT block (kfdatb.aunum=448). In other words, 142+448=590, 143+448=591 ... 147+448=595. Let's verify that by querying X$KFFXP:
SQL> select AU_KFFXP
from X$KFFXP
where GROUP_KFFXP=1 -- disk group 1
and NUMBER_KFFXP=259 -- file 259
;
AU_KFFXP
----------
590
591
592
593
594
595
6 rows selected.
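The same arithmetic can be written down as a short sketch. The first part only uses numbers from the kfed output above. The second part decodes kfdate[i].allo.hi; the bit positions used there (V in bit 23, FNUM in the low 20 bits) are my assumption, inferred from the single sample value 8388867 = 0x00800103, and are not a documented layout.

```python
AT_BLOCK_FIRST_AU = 448          # kfdatb.aunum of AT block 3
ENTRY_INDEXES = range(142, 148)  # kfdate[142] .. kfdate[147], file 259

# AU number = index of the entry within the block + first AU of the block.
au_numbers = [AT_BLOCK_FIRST_AU + i for i in ENTRY_INDEXES]
print(au_numbers)  # [590, 591, 592, 593, 594, 595]

# ASSUMPTION: bit positions inferred from one sample, not from documentation.
V_FLAG = 1 << 23     # "valid/allocated" flag
FNUM_MASK = 0xFFFFF  # file number in the low 20 bits


def decode_allo_hi(raw):
    """Split a raw kfdate[i].allo.hi value into the V flag and file number."""
    return {"V": int(bool(raw & V_FLAG)), "FNUM": raw & FNUM_MASK}


print(hex(8388867), decode_allo_hi(8388867))
```

The decoded file number agrees with FNUM=0x103 (259) that kfed prints next to the raw value. This decoding applies to allocated entries only; free entries use a different layout (free.lo, free.hi).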
Free space
In the above kfed output, we see that kfdate[148] and kfdate[149] have the word free next to them, which marks them as free or unallocated allocation units (flagged with V=0). That kfed output is truncated, but there are many more free allocation units described by this AT block.
The stride
Each AT block can describe 448 AUs (the kfdatb.shrink value from the Allocation Table), and the whole AT can have 254 blocks (the kfdfsb.max value from the Free Space Table). This means that one Allocation Table can describe 254 x 448 = 113792 allocation units. This is called the stride. The stride size, expressed in number of allocation units, is stored in the field kfdhdb.mfact in the ASM disk header:
$ kfed read /dev/sdc1 | grep kfdhdb.mfact
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
The stride size in this example is for an AU size of 1MB, which fits 256 metadata blocks in AU0. Block 0 is the disk header and block 1 is the Free Space Table, which leaves 254 blocks for the Allocation Table.
With the AU size of 4MB (default in Exadata), the stride size will be 454272 allocation units or 1817088 MB. With the larger AU size, the stride will also be larger.
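The numbers above can be cross-checked with a few lines of arithmetic. This is a sketch, assuming a 4 KB ASM metadata block (which is what gives 256 blocks in a 1MB AU). For the 4MB case it does not derive the stride; it only takes the 454272 figure quoted above and converts it.

```python
METADATA_BLOCK_SIZE = 4096  # ASSUMPTION: 4 KB ASM metadata blocks
AUS_PER_AT_BLOCK = 448      # kfdatb.shrink

# 1MB AU: blocks in AU0, minus disk header and Free Space Table blocks.
blocks_in_au0 = (1024 * 1024) // METADATA_BLOCK_SIZE
at_blocks_1mb = blocks_in_au0 - 2
stride_1mb = at_blocks_1mb * AUS_PER_AT_BLOCK
print(blocks_in_au0, at_blocks_1mb, stride_1mb)  # 256 254 113792

# 4MB AU: start from the stride size quoted in the text.
stride_4mb = 454272
stride_4mb_in_mb = stride_4mb * 4
at_blocks_4mb = stride_4mb // AUS_PER_AT_BLOCK
print(stride_4mb_in_mb, at_blocks_4mb)  # 1817088 1014
```

Note that the 4MB stride corresponds to 1014 AT blocks of 448 entries each, so it is larger than the 1MB stride both in AUs and in megabytes.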
How many Allocation Tables
Large ASM disks may have more than one stride. Each stride will have its own physically addressed metadata, which means that it will have its own Allocation Table.
The second stride will have its physically addressed metadata in the first AU of the stride. Let's have a look.
$ kfed read /dev/sdc1 | grep mfact
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
This shows the stride size is 113792 AUs. Let's check the AT entries for the second stride. Those should be in blocks 2-255 in AU113792.
$ kfed read /dev/sdc1 aun=113792 blkn=2 | grep type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
...
$ kfed read /dev/sdc1 aun=113792 blkn=255 | grep type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
As expected, we have another AT in AU113792. If we had another stride, there would be another AT at the beginning of that stride. As it happens, I have a large disk with a few strides, so we see the AT at the beginning of the third stride as well:
$ kfed read /dev/sdc1 aun=227584 blkn=2 | grep type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
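Putting the pieces together, here is a sketch of how one might work out where the AT entry for any AU lives on this disk. It uses the values seen in this post (stride of 113792 AUs, AT starting at block 2, 448 entries per AT block) and assumes that the AT blocks of later strides number their AUs from the first AU of the stride, in the same way the first stride does. The function name locate_at_entry is made up for illustration.

```python
STRIDE = 113792         # kfdhdb.mfact on this disk
AT_FIRST_BLOCK = 2      # kfdhdb.altlocn on this disk
AUS_PER_AT_BLOCK = 448  # kfdatb.shrink on this disk


def locate_at_entry(au):
    """Return the kfed aun/blkn and the kfdate index that describe an AU."""
    stride_index, au_in_stride = divmod(au, STRIDE)
    block_offset, entry_index = divmod(au_in_stride, AUS_PER_AT_BLOCK)
    return {
        "stride": stride_index,
        "aun": stride_index * STRIDE,  # first AU of the stride holds the AT
        "blkn": AT_FIRST_BLOCK + block_offset,
        "index": entry_index,
    }


for au in (590, 113792, 227584):
    print(au, locate_at_entry(au))
```

For AU 590 this gives block 3 and kfdate[142], matching the earlier kfed output, and for AU 113792 and AU 227584 it points at block 2 of the second and third strides, which are the blocks read above.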
Conclusion
Every ASM disk contains at least one Allocation Table that describes the contents of the disk. The AT has one entry for every allocation unit on the disk. If the disk has more than one stride, each stride will have its own Allocation Table.