April 25, 2010

About ASM disk groups, disks and files


Oracle ASM uses disk groups to store data files. An ASM disk group is a collection of disks managed as a unit. Within a disk group, ASM exposes a file system interface for Oracle database files. The content of files that are stored in a disk group is evenly distributed to eliminate hot spots and to provide uniform performance across the disks. The performance is comparable to the performance of raw devices. [From Oracle® Automatic Storage Management Administrator's Guide 11g Release 2].


ASM Disk Groups


An ASM disk group consists of one or more disks and is the fundamental object that ASM manages. Each disk group is self-contained and has its own ASM metadata. It is this metadata that an ASM instance manages.


The idea with ASM is to have a small number of disk groups. In ASM versions before 11.2, two disk groups should be sufficient – one for datafiles and one for backups and archive logs. In 11.2 you would want to create a separate disk group for the ASM spfile, Oracle Cluster Registry (OCR) and voting disks – provided you opt to place those objects in an ASM disk group.
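
As a quick illustration, the disk groups an ASM instance currently knows about can be listed from the V$ASM_DISKGROUP view. This is a minimal sketch, assuming a connection to the ASM instance; the rows returned depend entirely on the environment.

```sql
-- List the disk groups known to this ASM instance,
-- with their redundancy type and space usage.
SELECT group_number,
       name,
       type,        -- EXTERN, NORMAL or HIGH redundancy
       state,
       total_mb,
       free_mb
FROM   v$asm_diskgroup
ORDER  BY group_number;
```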


ASM Disks


Disks to be used by ASM have to be set up and provisioned by the OS or storage administrator before ASM is installed and configured. Disks can be local physical devices (IDE, SATA, SCSI, etc), SAN based LUNs (iSCSI, FC, FCoE, etc) or NAS/NFS based disks. Disks to be used for ASM should be partitioned. Even if the whole disk is to be used by ASM, it should have a single partition.


The above is true for all environments except Exadata, where ASM makes use of grid disks, created from cell disks and presented to ASM via the LIBCELL interface.


An ASM disk group can have up to 10,000 disks. The maximum size for an individual ASM disk is 2 TB. Because of bug 6453944, versions without the fix allow disks larger than 2 TB to be added to an ASM disk group. The fix for bug 6453944 is in 10.2.0.4, 11.1.0.7 and 11.2. MOS Doc ID 736891.1 has more on that.
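
The 2 TB limit corresponds to 2,097,152 MB (2 x 1024 x 1024). A minimal sketch of a pre-check for oversized disks before they are added to a disk group, assuming an 11g ASM instance where V$ASM_DISK exposes the OS_MB column, and assuming the disks have already been discovered:

```sql
-- Find discovered disks that are not yet disk group members
-- and whose OS-reported size exceeds 2 TB (2097152 MB).
SELECT path,
       os_mb
FROM   v$asm_disk
WHERE  header_status IN ('CANDIDATE', 'PROVISIONED', 'FORMER')
AND    os_mb > 2 * 1024 * 1024
ORDER  BY path;
```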

ASM looks for disks in the OS location specified by the ASM_DISKSTRING initialization parameter. Every platform has a default value, so this parameter does not have to be specified. In a cluster, ASM disks can have different OS names on different nodes. In fact, ASM does not care about the OS disk names, as those are not kept in ASM metadata.
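
Disk discovery can be inspected along these lines. This is a sketch only: the '/dev/mapper/*' value is a hypothetical discovery string chosen for illustration, not a recommendation.

```sql
-- Current discovery string; an empty value means the platform default is in use.
SELECT value
FROM   v$parameter
WHERE  name = 'asm_diskstring';

-- Hypothetical example: restrict discovery to multipath devices.
ALTER SYSTEM SET asm_diskstring = '/dev/mapper/*';

-- Disks found through the discovery string, with their header status.
SELECT path,
       header_status,
       mount_status
FROM   v$asm_disk
ORDER  BY path;
```

Any new discovery string should still match every disk that belongs to a mounted disk group.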


ASM Files


Any ASM file is allocated from and completely contained within a single disk group. However, a disk group might contain files belonging to several databases and a single database can have files in multiple disk groups.

ASM can store all Oracle database file types – datafiles, control files, redo logs, backup sets, data pump files, etc – but not binaries or text files. In addition to that, ASM also stores its metadata files within the disk group. ASM has its own file numbering scheme – independent of database file numbering.  ASM file numbers under 256 are reserved for ASM metadata files.
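
The ASM file numbers and file types in a disk group can be looked at through the V$ASM_FILE view. A minimal sketch, assuming a connection to the ASM instance:

```sql
-- ASM file numbers, types and sizes, per disk group.
SELECT group_number,
       file_number,
       incarnation,
       type,
       ROUND(bytes / 1024 / 1024) AS size_mb
FROM   v$asm_file
ORDER  BY group_number, file_number;
```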


ASM Cluster File System (ACFS), introduced in 11.2, extends ASM support to database and application binaries, trace and log files, and in fact any files that can be stored on a traditional file system. And most importantly, ACFS is a cluster file system.

12 comments:

  1. You said: "Disks to be used for ASM should be partitioned. Even if the whole disk is to be used by ASM, it should have a single partition.".
    Why?
    Thanks in advance, Davide

  2. Hi Davide,
    There is no technical requirement to use a partition instead of the whole disk. This is more of a recommendation from our 'best practices'. We enforce that on Linux if you make use of ASMLIB for example. ASMLIB will insist that you create at least one partition on the disk.
    We have a lot of cases where OS admins destroy ASM disks because they don't know that the disk is being used by ASM. Having a partition at least tells the OS admin that someone is using the disk.
    Hope this helps.
    Cheers,
    Bane

  3. Hi Bane, thank you very much for your quick answer.
    And obviously I thank you for your really useful articles.
    Cheers, Davide

  4. Hello Bane, what if a single disk is accidentally added to two different disk groups belonging to two separate databases? Would dropping the disk from one disk group solve the problem? Would Oracle do an automatic rebalance? Please advise.

    Thanks in advance
    DG

    Replies
    1. Not sure how you managed to do that as ASM should have told you that the disk was in use.

      Anyway, if the disk was originally from a normal or high redundancy disk group, yes - just drop it. Wait for the rebalance to finish (yes, the rebalance will be automatic) and once you are sure the rebalance has completed, add the disk back to the original disk group.

      If the original disk group was external redundancy, that disk group is gone. You would have to recreate it and restore the database from backups.

      Hope this helps. If the problem is more complicated just drop me a line.

      Cheers,
      Bane

  5. Hello Bane,
    Great reading you have written here. I would like to check one thing not mentioned in the docs.
    Say I have an external redundancy disk group holding one voting disk and one OCR, and I decide to provision a new storage array and get rid of the old LUNs. My voting disk is on one of the LUNs, let's say lunX. Many MOS documents say that when switching to a new array you just add the LUNs from the new array and drop the old ones. I can even do this in a single command so there is less to rebalance. The question is: what happens to the OCR and the voting file? Are they going to be transferred to a random new lunY? Do I need one more dummy disk group to do "crsctl replace votedisk +DUMMY" and then go back to the original after the rebalance?

    Thanks and BR
    Ognyan

    Replies
    1. Thanks Ognyan!
      All you need to do is add the new LUNs and drop the old ones - in a single ALTER DISKGROUP command. Yes, ASM will place the voting disk on one of the new LUNs for you - no need to do anything else with crsctl. As for the OCR - it's like any other file - it will be spread over the new LUNs.
      Cheers,
      Bane
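
      The single-command approach described in this reply can be sketched as follows. The disk group name, device paths, disk names and rebalance power are all hypothetical placeholders; the real disk names can be found in V$ASM_DISK.

      ```sql
      -- Add the new LUNs and drop the old disks in one statement,
      -- so that ASM runs a single rebalance.
      -- OCRVOTE, the /dev/mapper paths and the disk names are hypothetical.
      ALTER DISKGROUP ocrvote
        ADD  DISK '/dev/mapper/new_lun_01', '/dev/mapper/new_lun_02'
        DROP DISK OCRVOTE_0000, OCRVOTE_0001
        REBALANCE POWER 8;
      ```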

  6. Hello Bane,
    It worked well: OCR, voting file, everything is fine. Two interesting things happened that we did not know about and would like to share with you. First, we hit the 2 TB size limit for a LUN on a Linux x64 box:

    SQL> create diskgroup DUMMY external redundancy disk '/dev/mapper/TMP_wa_01_01','/dev/mapper/TMP_wa_02_01';
    create diskgroup DUMMY external redundancy disk '/dev/mapper/TMP_wa_01_01','/dev/mapper/TMP_wa_02_01'
    *
    ERROR at line 1:
    ORA-15018: diskgroup cannot be created
    ORA-15099: disk '/dev/mapper/TMP_wa_02_01' is larger than maximum size of 2097152 MBs
    ORA-15099: disk '/dev/mapper/TMP_wa_01_01' is larger than maximum size of 2097152 MBs

    So according to MOS, only Exadata can have disks over 2 TB.


    Second, v$asm_operation seems to be inaccurate, and I think we can better trust a calculation like this one:

    SQL> select sum(TOTAL_MB) - sum(FREE_MB) from v$asm_disk where PATH like '/dev/mapper/v%';

    SUM(TOTAL_MB)-SUM(FREE_MB)
    --------------------------
    717269

    And when this number gets to 0 we are just waiting for the "compact" phase.

    For example, based on this query I calculated about 6 hours, while v$asm_operation showed 10+ hours.
    Is there something I am missing, or is the v$asm_operation algorithm just inaccurate?

    Cheers and BR
    Ognyan

    Replies
    1. Yes, there is a 2 TB limit for non-Exadata disks.

      And yes, v$asm_operation is not very precise, but it's the best we have at the moment.

      It's hard to comment on the timing in your case, as I don't know if there was any load on the system at the time, what the rate of moving data from one storage array to the other was, when the compacting started, etc. In any case, it was a good idea to run that query to check when all the data would be moved off the old disks. If the compacting took 4 hours, that would account for the difference, but again it's not easy to comment without more information.

      Cheers,
      Bane
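
      For reference, the estimate discussed in this thread comes from a query along these lines. It is a sketch only; EST_MINUTES is an estimate and, as noted above, not a precise one.

      ```sql
      -- Progress and estimated time remaining for running ASM operations.
      SELECT group_number,
             operation,
             state,
             power,
             actual,
             sofar,
             est_work,
             est_rate,
             est_minutes
      FROM   v$asm_operation;
      ```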

  7. It actually was not loaded by any other DBs and the rebalance was hitting ~400 MB/s at rebalance power 11. Total size moved was 7.2 TB. It took around 6 hours, and the compacting phase, as per your blog query, ran no longer than a couple of minutes.
    BR
    Ognyan

  8. Fundamental question regarding your first sentence, Bane: "Oracle ASM uses disk groups to store data files." Doesn't ASM store a mapping to where data is in a data file? It doesn't actually store the data files.
