System configuration.

1x NVMe 1TB M.2 SSD
1x NVMe 3.84TB M.2 SSD
6x SATA 20TB SSD (RAID5)

Examine drive/partition information.

The RAID5 array is currently not working. First, list the storage devices the system recognizes to identify those that belong to the RAID.

rdlab@exxact:~$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0         7:0    0  63.5M  1 loop /snap/core20/1974
loop1         7:1    0  63.5M  1 loop /snap/core20/2015
loop2         7:2    0  40.9M  1 loop /snap/snapd/20290
loop3         7:3    0  40.9M  1 loop /snap/snapd/20092
loop4         7:4    0  67.8M  1 loop /snap/lxd/22753
loop5         7:5    0  91.9M  1 loop /snap/lxd/24061
sda           8:0    0  18.2T  0 disk
└─sda1        8:1    0  18.2T  0 part
sdb           8:16   0  18.2T  0 disk
└─sdb1        8:17   0  18.2T  0 part
sdc           8:32   0  18.2T  0 disk
└─sdc1        8:33   0  18.2T  0 part
sdd           8:48   0  18.2T  0 disk
└─sdd1        8:49   0  18.2T  0 part
sde           8:64   0  18.2T  0 disk
└─sde1        8:65   0  18.2T  0 part
sdf           8:80   0  18.2T  0 disk
└─sdf1        8:81   0  18.2T  0 part
nvme1n1     259:0    0   3.5T  0 disk
└─nvme1n1p1 259:1    0   3.5T  0 part /scratch
nvme0n1     259:2    0 953.9G  0 disk
├─nvme0n1p1 259:3    0   1.1G  0 part /boot/efi
├─nvme0n1p2 259:4    0     1G  0 part /boot
├─nvme0n1p3 259:5    0    10G  0 part [SWAP]
└─nvme0n1p4 259:6    0 941.8G  0 part

The RAID5 array is built from the six 18.2T disks sda, sdb, sdc, sdd, sde, and sdf, each carrying a single partition (sda1 through sdf1).
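As a cross-check on reading the lsblk tree by eye, the member partitions can often be picked out by their file system signature. The sketch below assumes the partitions carry the usual "linux_raid_member" signature; the helper name raid_members is hypothetical and not part of any tool.

```shell
# Print the device path of every partition whose signature marks it as an
# md RAID member. Expects "NAME FSTYPE" pairs on stdin, as produced by
#   lsblk -rno NAME,FSTYPE
# (raid_members is an illustrative helper name, not an existing command.)
raid_members() {
    awk '$2 == "linux_raid_member" { print "/dev/" $1 }'
}

# Example against the live system (read-only):
#   lsblk -rno NAME,FSTYPE | raid_members
```

Devices without any signature produce a line with only a name, so they are skipped by the filter.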

Display the file systems the OS is configured to mount.

rdlab@exxact:~$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
/dev/disk/by-uuid/7639e5c3-3fcb-4b5b-953e-58461c5b9fb9 none swap sw 0 0
# / was on /dev/nvme0n1p4 during curtin installation
/dev/disk/by-uuid/7f5dfbfd-e027-4073-81a6-758484ecc019 / ext4 defaults 0 1
# /scratch was on /dev/nvme1n1p1 during curtin installation
/dev/disk/by-uuid/7ca82a8b-e49f-4af2-9e45-ca84ffcf8d52 /scratch ext4 defaults 0 1

# /data was on /dev/md0p1 during curtin installation
# /dev/disk/by-id/md-uuid-b690237f:da587456:50a2b64e:52c2abc6-part1 /data ext4 defaults 0 1
/dev/disk/by-id/md-uuid-8e1d9efa:a6028ec7:39cd0e4f:c76c036f /data xfs defaults 0 1

# /boot was on /dev/nvme0n1p2 during curtin installation
/dev/disk/by-uuid/6070f6d4-8edf-4032-b0d3-e9709be0326e /boot ext4 defaults 0 1
# /boot/efi was on /dev/nvme0n1p1 during curtin installation
/dev/disk/by-uuid/C232-F548 /boot/efi vfat defaults 0 1
#/swap.img      none    swap    sw      0       0

The fstab entries that refer to the RAID5 array are the following.

# /data was on /dev/md0p1 during curtin installation
# /dev/disk/by-id/md-uuid-b690237f:da587456:50a2b64e:52c2abc6-part1 /data ext4 defaults 0 1
/dev/disk/by-id/md-uuid-8e1d9efa:a6028ec7:39cd0e4f:c76c036f /data xfs defaults 0 1

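The array is expected to be mounted at “/data” with an “xfs” file system. The md UUID in the device path (8e1d9efa:a6028ec7:39cd0e4f:c76c036f) is the reference the OS uses to identify the array, so it is important for the checks that follow.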
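To avoid copying the UUID by hand, the active md entries can be pulled out of fstab with a small filter. This is a minimal sketch; the function name fstab_md_uuids is hypothetical, and it assumes the entries use the /dev/disk/by-id/md-uuid-... form seen above.

```shell
# Print "<md-uuid> <mount point>" for every active (non-comment) fstab entry
# that refers to an md array through /dev/disk/by-id/md-uuid-...
# $1: fstab-style file to read (defaults to /etc/fstab).
# (fstab_md_uuids is an illustrative helper name.)
fstab_md_uuids() {
    awk '$1 !~ /^#/ && $1 ~ /md-uuid-/ {
        uuid = $1
        sub(/.*md-uuid-/, "", uuid)     # drop the path prefix
        sub(/-part[0-9]+$/, "", uuid)   # drop a trailing partition suffix
        print uuid, $2
    }' "${1:-/etc/fstab}"
}

# Example (read-only):
#   fstab_md_uuids /etc/fstab
```

Commented-out entries, such as the older ext4 line, are ignored because their first field starts with "#".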

View the “md127” RAID status.

cate@jupiter:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : inactive sdc1[2](S) sde1[4](S) sdf1[6](S) sdd1[3](S) sdb1[1](S) sda1[0](S)
      117190146048 blocks super 1.2

unused devices: <none>

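The status shows that md127 is “inactive”. All six member partitions are flagged (S), meaning they are currently held as spares rather than as active members.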
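The same check can be scripted, for example for a monitoring job. The sketch below relies only on the line layout shown in the output above ("<name> : inactive ..."); the function name inactive_arrays is hypothetical.

```shell
# Print the name of every md array that /proc/mdstat reports as inactive.
# $1: file to read (defaults to /proc/mdstat).
# (inactive_arrays is an illustrative helper name.)
inactive_arrays() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "${1:-/proc/mdstat}"
}

# Example (read-only):
#   inactive_arrays
```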

View additional details of “md127”.

root@jupiter:/home/exx# mdadm --query /dev/md127
/dev/md127:  (null) 0 devices, 6 spares. Use mdadm --detail for more detail.

root@jupiter:/home/exx# mdadm --detail /dev/md127
/dev/md127:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 6
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 6

              Name : jupiter:0  (local to host jupiter)
              UUID : 03f505ac:d5a96bac:5d4da761:810ad4a6
            Events : 93758

    Number   Major   Minor   RaidDevice

       -       8        1        -        /dev/sda1
       -       8       81        -        /dev/sdf1
       -       8       65        -        /dev/sde1
       -       8       49        -        /dev/sdd1
       -       8       33        -        /dev/sdc1
       -       8       17        -        /dev/sdb1

Why is the RAID Level: RAID0 when it should be RAID5?

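It’s possible that the RAID failed and a recovery was attempted, that the recovery failed, and that a later attempt to rebuild the array was configured incorrectly. Note also that “mdadm --detail” can report “raid0” for an array that is inactive, so the level shown here does not necessarily reflect what is recorded on the member devices; “mdadm --examine” on a member partition shows the level stored in its superblock.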
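One way to see what the member superblocks themselves record is to run mdadm --examine on each partition and keep only the identifying fields. This is a sketch rather than the procedure used in this investigation; the function name examine_summary is hypothetical, and the field labels are those printed for version 1.x metadata.

```shell
# Reduce "mdadm --examine" output (read from stdin) to the fields that
# identify the array a member belongs to.
# (examine_summary is an illustrative helper name.)
examine_summary() {
    grep -E '^[[:space:]]*(Array UUID|Raid Level|Raid Devices|Events|Device Role)[[:space:]]*:'
}

# Example (needs root; read-only):
#   for dev in /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1; do
#       echo "== $dev"
#       mdadm --examine "$dev" | examine_summary
#   done
```

If all six members agree on the array UUID, level, and device count, that points to a consistent but unexpected array rather than to damaged metadata.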

The array UUID can also be compared with the OS configuration to verify that the array is identified correctly.

root@jupiter:/home/exx# mdadm --detail --scan /dev/md127
INACTIVE-ARRAY /dev/md127 metadata=1.2 name=jupiter:0 UUID=03f505ac:d5a96bac:5d4da761:810ad4a6

root@jupiter:/home# cat /etc/mdadm/mdadm.conf
# ARRAY /dev/md0 metadata=1.2 spares=1 name=ubuntu-server:0 UUID=b690237f:da587456:50a2b64e:52c2abc6
MAILADDR root
ARRAY /dev/md/data  metadata=1.2 UUID=8e1d9efa:a6028ec7:39cd0e4f:c76c036f name=sn4622111485:data

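The “mdadm.conf” file records which arrays the OS expects to assemble, so it confirms that a RAID was configured. The array was originally created when the OS was installed: the commented-out line, entered by the installer, refers to the earlier name “md0” with a different UUID (b690237f:da587456:50a2b64e:52c2abc6). The active ARRAY line expects UUID 8e1d9efa:a6028ec7:39cd0e4f:c76c036f, which matches the /data entry in /etc/fstab but not the UUID reported for md127 (03f505ac:d5a96bac:5d4da761:810ad4a6).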
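The comparison between the UUID reported by the scan and the UUIDs listed in mdadm.conf can be automated. The following is a minimal sketch; array_uuids and uuid_in_conf are hypothetical helper names, and the default path assumes the Debian/Ubuntu location shown above.

```shell
# Print the UUID of every active ARRAY / INACTIVE-ARRAY line read from stdin.
# Commented-out lines are ignored because they do not start with ARRAY.
# (array_uuids and uuid_in_conf are illustrative helper names.)
array_uuids() {
    grep -E '^(INACTIVE-)?ARRAY[[:space:]]' | grep -oE 'UUID=[0-9a-f:]+' | cut -d= -f2
}

# Succeed if UUID $1 appears on an active ARRAY line of config file $2
# (defaults to /etc/mdadm/mdadm.conf).
uuid_in_conf() {
    array_uuids < "${2:-/etc/mdadm/mdadm.conf}" | grep -qxF "$1"
}

# Example (needs root for the scan; read-only):
#   for u in $(mdadm --detail --scan | array_uuids); do
#       uuid_in_conf "$u" || echo "UUID $u is not listed in mdadm.conf"
#   done
```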

Stop the array and attempt to reassemble it with verbose output to identify the cause of the RAID failure.

root@jupiter:/home# mdadm --stop /dev/md127
mdadm: stopped /dev/md127

root@jupiter:/home# mdadm --assemble --scan --verbose
mdadm: looking for devices for /dev/md/data
mdadm: /dev/sde1 has wrong uuid.
mdadm: No super block found on /dev/sde (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sde
mdadm: /dev/sdf1 has wrong uuid.
mdadm: No super block found on /dev/sdf (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdf
mdadm: /dev/sdc1 has wrong uuid.
mdadm: No super block found on /dev/sdc (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdb1 has wrong uuid.
mdadm: No super block found on /dev/sdb (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdd1 has wrong uuid.
mdadm: No super block found on /dev/sdd (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sda1 has wrong uuid.
mdadm: No super block found on /dev/sda (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sda
mdadm: No super block found on /dev/nvme0n1p4 (Expected magic a92b4efc, got 00000477)
mdadm: no RAID superblock on /dev/nvme0n1p4
mdadm: No super block found on /dev/nvme0n1p3 (Expected magic a92b4efc, got 0000003f)
mdadm: no RAID superblock on /dev/nvme0n1p3
mdadm: No super block found on /dev/nvme0n1p2 (Expected magic a92b4efc, got 00000081)
mdadm: no RAID superblock on /dev/nvme0n1p2
mdadm: No super block found on /dev/nvme0n1p1 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/nvme0n1p1
mdadm: No super block found on /dev/nvme0n1 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/nvme0n1
mdadm: No super block found on /dev/nvme1n1p1 (Expected magic a92b4efc, got 000005c1)
mdadm: no RAID superblock on /dev/nvme1n1p1
mdadm: No super block found on /dev/nvme1n1 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/nvme1n1
mdadm: No super block found on /dev/loop5 (Expected magic a92b4efc, got a6eff301)
mdadm: no RAID superblock on /dev/loop5
mdadm: No super block found on /dev/loop4 (Expected magic a92b4efc, got a6eff301)
mdadm: no RAID superblock on /dev/loop4
mdadm: No super block found on /dev/loop3 (Expected magic a92b4efc, got fa2c5214)
mdadm: no RAID superblock on /dev/loop3
mdadm: No super block found on /dev/loop2 (Expected magic a92b4efc, got 9dbc89cd)
mdadm: no RAID superblock on /dev/loop2
mdadm: No super block found on /dev/loop1 (Expected magic a92b4efc, got 0000000a)
mdadm: no RAID superblock on /dev/loop1
mdadm: No super block found on /dev/loop0 (Expected magic a92b4efc, got 32138a62)
mdadm: no RAID superblock on /dev/loop0

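mdadm reports a wrong UUID on the partitions of all six RAID drives (sda1, sdb1, sdc1, sdd1, sde1, sdf1): their superblocks carry the UUID reported for md127 (03f505ac:d5a96bac:5d4da761:810ad4a6) rather than the one mdadm.conf expects for /dev/md/data (8e1d9efa:a6028ec7:39cd0e4f:c76c036f). The “No super block found” messages for the whole disks, the NVMe devices, and the loop devices are expected, since none of those are RAID members.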
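Most of the verbose output is noise from devices that were never part of the array. A small filter, sketched below, keeps only the devices that were rejected for a UUID mismatch; the function name wrong_uuid_devices is hypothetical, and the pattern is taken from the message format shown above.

```shell
# From "mdadm --assemble --scan --verbose" output on stdin, print only the
# devices rejected with "has wrong uuid".
# (wrong_uuid_devices is an illustrative helper name.)
wrong_uuid_devices() {
    sed -n 's/^mdadm: \(.*\) has wrong uuid\.$/\1/p'
}

# Example (needs root; 2>&1 is used on the assumption that mdadm prints
# these messages to stderr):
#   mdadm --assemble --scan --verbose 2>&1 | wrong_uuid_devices
```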

Conclusion

At this point in the investigation, the decision was made to wipe the existing /dev/md127 array (reported as RAID0), together with its member devices, and re-create the RAID5 configuration.
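The exact commands used for the re-creation are not recorded on this page. As a hedged sketch of what such a procedure typically looks like with mdadm, the helper below only prints a candidate command sequence for review and executes nothing. The function name print_recreate_plan, the target name /dev/md/data, and the choice to put xfs directly on the array are assumptions based on the mdadm.conf and fstab entries above. Every step is destructive to any data still on the drives.

```shell
# Print (do not run) a candidate command sequence for wiping an md array and
# re-creating it as RAID5.
# $1: existing array, $2: new array device, remaining args: member partitions.
# (print_recreate_plan is an illustrative helper name.)
print_recreate_plan() {
    old_array=$1
    new_array=$2
    shift 2
    echo "mdadm --stop $old_array"
    for dev in "$@"; do
        # Erases the md superblock on each member.
        echo "mdadm --zero-superblock $dev"
    done
    echo "mdadm --create $new_array --level=5 --raid-devices=$# $*"
    echo "mkfs.xfs $new_array"
    # Prints the ARRAY line (with the new UUID) for mdadm.conf.
    echo "mdadm --detail --scan"
}

print_recreate_plan /dev/md127 /dev/md/data \
    /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
```

A newly created array receives a new UUID, so the ARRAY line in /etc/mdadm/mdadm.conf and the /data entry in /etc/fstab would need to be updated to match; on Debian and Ubuntu systems the initramfs is usually regenerated afterwards with update-initramfs -u.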

REF: ZD-4301 / ZD-6131 / ZD-6713 / ZD-7113
