How to make sure Oracle ASM devices point to multipath devices and not to SCSI paths (sd devices) when using ASMLib to manage ASM disks?
Red Hat Insights can detect this issue
Environment
- Red Hat Enterprise Linux (RHEL) 5, 6, 7, 8
- device-mapper-multipath
- Oracle ASM using ASMLib
Issue
- Oracle application crashes when a single path in multipath fails. The application should be unaware of underlying path failures.
- ASM crashed, taking the Oracle DB down with it.
- Device Mapper Multipathing is used for the Oracle database, and the Oracle LUNs are expected to use the multipath devices, not the sd devices.
- How to make sure Oracle ASM devices point to multipath devices and not to SCSI paths (sd devices) when using ASMLib to manage ASM disks?
- I/O to the SAN is not shared across all the paths of multipath on an Oracle application server configured with oracleasm.
Resolution
ORACLEASM_SCANORDER should be configured to force the use of the multipath pseudo-device. Since ASM uses entries from /proc/partitions, a filter also needs to be set to exclude the underlying paths.
Edit /etc/sysconfig/oracleasm and add dm to SCANORDER, and sd to SCANEXCLUDE, as follows:

    # ORACLEASM_SCANORDER: Matching patterns to order disk scanning
    ORACLEASM_SCANORDER="dm"

    # ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
    ORACLEASM_SCANEXCLUDE="sd"
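The edit above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function name set_scan_filters is hypothetical, and it assumes the two variables appear at the start of a line when they are present. It writes back through the given path so that a symbolic link keeps pointing at the same target file.

```shell
#!/bin/sh
# Hypothetical helper (not part of ASMLib): set the scan filters in a config file.
# Usage: set_scan_filters /etc/sysconfig/oracleasm
set_scan_filters() (
    conf=$1
    tmp=$(mktemp) || exit 1
    # Rewrite existing assignments; every other line is copied unchanged.
    sed -e 's/^ORACLEASM_SCANORDER=.*/ORACLEASM_SCANORDER="dm"/' \
        -e 's/^ORACLEASM_SCANEXCLUDE=.*/ORACLEASM_SCANEXCLUDE="sd"/' \
        "$conf" > "$tmp" || exit 1
    # Append an assignment if the file did not contain it at all.
    grep -q '^ORACLEASM_SCANORDER=' "$tmp" || echo 'ORACLEASM_SCANORDER="dm"' >> "$tmp"
    grep -q '^ORACLEASM_SCANEXCLUDE=' "$tmp" || echo 'ORACLEASM_SCANEXCLUDE="sd"' >> "$tmp"
    # Write through the original path so a symlink target is updated in place.
    cat "$tmp" > "$conf" && rm -f "$tmp"
)
```

Running it against a copy of the file first and comparing the result with diff is a cautious way to confirm that only the two intended lines change.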
The oracleasm configuration then needs to be updated:

    # oracleasm configure
    # oracleasm scandisks
The file /etc/sysconfig/oracleasm is a symbolic link to /etc/sysconfig/oracleasm-_dev_oracleasm, which is the file used by OracleASM. Verify that the symbolic link exists:

    # ls -al /etc/sysconfig/oracleasm
    lrwxrwxrwx 1 root root 39 Feb 22 15:54 /etc/sysconfig/oracleasm -> /etc/sysconfig/oracleasm-_dev_oracleasm
Changes to the oracleasm configuration file require a restart of the OracleASM service to take effect. This can be disruptive in a production environment.

Note: It is recommended to schedule a reboot after setting SCANORDER and SCANEXCLUDE in /etc/sysconfig/oracleasm. Normally a system reboot is not required for oracleasm to start using the multipath devices. However, in (private) RHBZ#1683606, it was noticed that while oracleasm was still allowed to detect single paths (before the configuration change and the restart of oracleasm), it could change the value of counters used in device structures within the kernel (block_device.bd_holders) to invalid (negative) values and make the paths appear to be in use. If this happens, restarting only oracleasm will not clear the counters, and the devices will continue to appear to be in use. In this case, multipath will remain unable to add the paths to the corresponding maps until the system is rebooted, and messages similar to the following will appear in the system logs whenever multipath tries to add one of those paths to the corresponding map:

    device-mapper: table: 253:<dm_num>: multipath: error getting device
    device-mapper: ioctl: error adding target to table
The problem can occur even while multipath is using the paths (i.e. the counters can be "silently" changed while the paths are in use by multipath). In such a scenario, the problem only becomes visible after an outage that causes the paths to be removed from the maps: when the paths return, multipath fails to add them back to the corresponding maps.
For this reason, it is recommended to schedule a reboot after setting SCANORDER and SCANEXCLUDE in /etc/sysconfig/oracleasm.

Once the system has restarted, verify that the multipath device is being used. A major number of 253 should be returned:

    # oracleasm querydisk -d <ASM_DISK_NAME>
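The check on the major number can be sketched as a small script. This is an illustration under stated assumptions: the function names asm_disk_devnum and is_dm_backed are hypothetical, the querydisk output is assumed to end in a bracketed pair such as [8,16] as in the example in this article, and 253 is taken as the device-mapper major because that is the value this article expects (the value on a given system is listed in /proc/devices next to device-mapper).

```shell
#!/bin/sh
# Hypothetical helper: read querydisk output on stdin and print "major:minor".
# Assumes the output contains a bracketed pair such as [8,16].
asm_disk_devnum() {
    sed -n 's/.*\[\([0-9][0-9]*\), *\([0-9][0-9]*\)].*/\1:\2/p'
}

# Hypothetical helper: succeed only if the major number of "major:minor"
# equals the expected device-mapper major (second argument, default 253).
is_dm_backed() {
    devnum=$1
    dm_major=${2:-253}
    [ "${devnum%%:*}" = "$dm_major" ]
}

# Example use on a live system (commented out; requires oracleasm):
#   devnum=$(oracleasm querydisk -d ASM_DATA1 | asm_disk_devnum)
#   if is_dm_backed "$devnum"; then echo "multipath device"; else echo "single path"; fi
```

Passing the major number as a second argument, instead of hard-coding it, keeps the check usable on systems where device-mapper was assigned a different major.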
- Refer to Oracle documentation (Doc ID 868352.1: ASMLib Configuration File "/etc/sysconfig/oracleasm" Not Effective) for further information.
Root Cause
When devices were added to the DISKGROUP, the underlying sd* device was used instead of the multipath pseudo device.

The dm-* devices are intended for internal use and their names are not persistent. However, creating the DISKGROUP writes metadata to the device, so ASM is able to identify the disk from its header regardless of the dm- assignment. The intention here is to force ASM to read from the multipath devices.
Diagnostic Steps
Query the disk to obtain the major:minor number of the disk being used by the disk group:

    # /etc/init.d/oracleasm querydisk -d ASM_DATA1
    Disk "ASM_Data1" is a valid ASM disk on device [8,16]
We can see that 8:16 is the underlying sdb path, not the ASM_DATA1 multipath pseudo device. Failover would not occur with this configuration.

    ASM_DATA1 (3600500000000000001) dm-24 IBM,2107900
    [size=100G][features=1 queue_if_no_path][hwhandler=0][rw]
    \_ round-robin 0 [prio=0][active]
     \_ 3:0:1:1 sdb 8:16 [failed][faulty]
     \_ 5:0:0:1 sdc 8:32 [active][ready]
     \_ 5:0:1:1 sdd 8:48 [active][ready]
     \_ 3:0:0:1 sde 8:64 [failed][faulty]
This can also be seen in /proc/partitions:

       8     0  142577664 sda
       8     1     514048 sda1
       8     2   24579450 sda2
       8     3   12289725 sda3
       8    16   52428800 sdb
       8    32   52428800 sdc
       8    48   52428800 sdd
       8    64   52428800 sde
The major:minor of the multipath ASM_DATA1 pseudo device would be 253:24, or dm-24. This is the device that should be used:

     253    24   52428800 dm-24
Note: To check whether an oracleasm device is mapped correctly in a vmcore, see How to map an oracleasm path in a vmcore.