Software-Raid
Expand a software RAID1 partition to include 2 mirrored drives, or convert to RAID10
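I have a 4-drive eSATA enclosure connected to a Fedora 31 server, with three 1.5 TB drives and one 2 TB drive. I created a RAID1 by following this excellent tecmint tutorial. I used
--raid-devices=4
. Well, that does not automatically create mirrored two-drive partitions. It shows only 1.4 TB available. From df -h:
/dev/md0 1.4T 425G 880G 33% /esata
Then: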
lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda      8:0    0  1.4T  0 disk
└─sda1   8:1    0  1.4T  0 part
  └─md0  9:0    0  1.4T  0 raid1
sdb      8:16   0  1.4T  0 disk
└─sdb1   8:17   0  1.4T  0 part
  └─md0  9:0    0  1.4T  0 raid1
sdd      8:48   0  1.4T  0 disk
└─sdd1   8:49   0  1.4T  0 part
  └─md0  9:0    0  1.4T  0 raid1
sde      8:64   0  4.9T  0 disk
├─sde1   8:65   0    2M  0 part
├─sde2   8:66   0  476M  0 part  /boot
└─sde3   8:67   0  3.3T  0 part
sdf      8:80   0 59.8G  0 disk
└─sdf1   8:81   0 59.8G  0 part
sdg      8:96   0  1.8T  0 disk
└─sdg1   8:97   0  1.8T  0 part
  └─md0  9:0    0  1.4T  0 raid1
sr0     11:0    1 1024M  0 rom
And:
cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdg1[3] sdd1[2] sdb1[1]
      1465005464 blocks super 1.2 [4/4] [UUUU]
      bitmap: 0/11 pages [0KB], 65536KB chunk

unused devices: <none>
And:
mdadm -E /dev/sd[a-b]1 /dev/sdg1 /dev/sdd1
/dev/sda1:
  Magic : a92b4efc
  Version : 1.2
  Feature Map : 0x1
  Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
  Name : ourserver:0 (local to host ourserver)
  Creation Time : Fri Mar 13 16:46:35 2020
  Raid Level : raid1
  Raid Devices : 4
  Avail Dev Size : 2930010928 (1397.14 GiB 1500.17 GB)
  Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
  Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
  Data Offset : 264192 sectors
  Super Offset : 8 sectors
  Unused Space : before=264112 sectors, after=48 sectors
  State : clean
  Device UUID : 7df3d233:060aaac3:04eb9f3a:65a9119e
  Internal Bitmap : 8 sectors from superblock
  Update Time : Sat Mar 14 08:32:32 2020
  Bad Block Log : 512 entries available at offset 16 sectors
  Checksum : bbb40149 - correct
  Events : 20558
  Device Role : Active device 0
  Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb1:
  Magic : a92b4efc
  Version : 1.2
  Feature Map : 0x1
  Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
  Name : ourserver:0 (local to host ourserver)
  Creation Time : Fri Mar 13 16:46:35 2020
  Raid Level : raid1
  Raid Devices : 4
  Avail Dev Size : 2930010928 (1397.14 GiB 1500.17 GB)
  Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
  Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
  Data Offset : 264192 sectors
  Super Offset : 8 sectors
  Unused Space : before=264112 sectors, after=48 sectors
  State : clean
  Device UUID : 434684bb:d297cd17:f5391b7b:0d73e9d7
  Internal Bitmap : 8 sectors from superblock
  Update Time : Sat Mar 14 08:32:32 2020
  Bad Block Log : 512 entries available at offset 16 sectors
  Checksum : 11dbfa76 - correct
  Events : 20558
  Device Role : Active device 1
  Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg1:
  Magic : a92b4efc
  Version : 1.2
  Feature Map : 0x1
  Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
  Name : ourserver:0 (local to host ourserver)
  Creation Time : Fri Mar 13 16:46:35 2020
  Raid Level : raid1
  Raid Devices : 4
  Avail Dev Size : 3906762928 (1862.89 GiB 2000.26 GB)
  Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
  Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
  Data Offset : 264192 sectors
  Super Offset : 8 sectors
  Unused Space : before=264112 sectors, after=976752048 sectors
  State : clean
  Device UUID : 45a47922:251b01e7:a920b5ef:aec34c43
  Internal Bitmap : 8 sectors from superblock
  Update Time : Sat Mar 14 08:32:32 2020
  Bad Block Log : 512 entries available at offset 16 sectors
  Checksum : 623a20a2 - correct
  Events : 20558
  Device Role : Active device 3
  Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd1:
  Magic : a92b4efc
  Version : 1.2
  Feature Map : 0x1
  Array UUID : 88b9fcb6:52d0f235:849bd9d6:c079cfc8
  Name : ourserver:0 (local to host ourserver)
  Creation Time : Fri Mar 13 16:46:35 2020
  Raid Level : raid1
  Raid Devices : 4
  Avail Dev Size : 2930012909 (1397.14 GiB 1500.17 GB)
  Array Size : 1465005440 (1397.14 GiB 1500.17 GB)
  Used Dev Size : 2930010880 (1397.14 GiB 1500.17 GB)
  Data Offset : 264192 sectors
  Super Offset : 8 sectors
  Unused Space : before=264112 sectors, after=2029 sectors
  State : clean
  Device UUID : 9f705e06:0b9a6d1a:fe4a0368:8a279a1a
  Internal Bitmap : 8 sectors from superblock
  Update Time : Sat Mar 14 08:32:32 2020
  Bad Block Log : 512 entries available at offset 16 sectors
  Checksum : 8eeef44d - correct
  Events : 20558
  Device Role : Active device 2
  Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
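The mdadm -E figures account for the 1.4 TB: in a four-device RAID1 every member holds a full copy, so the usable size is that of the smallest member, and the extra space on the 2 TB drive goes unused. Here is a rough sketch of that arithmetic using the four Avail Dev Size values (in 512-byte sectors). The function names are made up for illustration, and the RAID10 figure assumes the default near-2 layout and ignores chunk-size rounding, so treat it as an estimate only.

```shell
#!/usr/bin/env bash
# Estimate usable md array capacity from member sizes given in 512-byte sectors.
# Helper names are hypothetical; they are not part of mdadm.

# Print the smallest of the given sizes.
smallest() {
  local min=$1 s
  for s in "$@"; do
    (( s < min )) && min=$s
  done
  echo "$min"
}

# RAID1: every member is a full copy, so capacity = smallest member.
# Dividing sectors by 2 converts to KiB.
raid1_kib() {
  echo $(( $(smallest "$@") / 2 ))
}

# RAID10 (assumed near-2 layout): two copies of everything,
# so capacity = smallest member * number of members / 2.
raid10_kib() {
  local n=$#
  echo $(( $(smallest "$@") / 2 * n / 2 ))
}

# Avail Dev Size of sda1, sdb1, sdg1, sdd1 from the mdadm -E output.
sizes=(2930010928 2930010928 3906762928 2930012909)

echo "RAID1  KiB: $(raid1_kib "${sizes[@]}")"    # 1465005464
echo "RAID10 KiB: $(raid10_kib "${sizes[@]}")"   # 2930010928
```

The RAID1 result matches the 1465005464 blocks reported in /proc/mdstat, which is why growing the component size cannot help while the array stays a four-way mirror.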
So I saw one user on Server Fault, and another user on SE, suggest using
mdadm --assemble --update=devicesize /dev/md0
, which I ran, followed by mdadm -G /dev/md0 -z max
, still with the same result:
mdadm --assemble --update=devicesize /dev/md0 /dev/sd[a-b]1 /dev/sdg1 /dev/sdd1
mdadm: /dev/md0 has been started with 4 drives.
mdadm: component size of /dev/md0 unchanged at 1465005464K
How would I adapt this SF article about growing RAID 1 to RAID 10, or otherwise just end up with a mirrored partition that spans 2 drives?
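I solved this thanks to the excellent write-up by Jean-Christophe Berthon, a software engineer. It would have saved even more time if I had simply deleted the huge backup I had made of the directories and files.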
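For readers who want a picture of what such a conversion can look like, here is a dry-run sketch of one common approach: pull two members out of the mirror, build a degraded RAID10 on them, copy the data across, then add the remaining two members. This is not taken from the referenced write-up and may differ from it. The new array name /dev/md1 and the choice of which partitions move first are assumptions for illustration. The script only prints commands and runs nothing.

```shell
#!/usr/bin/env bash
# Dry run: prints a possible RAID1 -> RAID10 migration plan, executes nothing.
# Device roles are illustrative assumptions, not the author's actual choices.
OLD=/dev/md0                       # existing 4-way RAID1 from the question
NEW=/dev/md1                       # hypothetical name for the new RAID10
MOVE_FIRST=(/dev/sdb1 /dev/sdg1)   # members pulled out of the mirror first
MOVE_LAST=(/dev/sda1 /dev/sdd1)    # members that keep the data until the copy is done

plan() {
  local d
  # 1. Fail and remove two members; the mirror keeps running on the other two.
  for d in "${MOVE_FIRST[@]}"; do
    echo "mdadm $OLD --fail $d --remove $d"
  done
  echo "mdadm --zero-superblock ${MOVE_FIRST[*]}"
  # 2. Create a degraded RAID10: one real device plus one 'missing' slot per mirror pair.
  echo "mdadm --create $NEW --level=10 --raid-devices=4 ${MOVE_FIRST[0]} missing ${MOVE_FIRST[1]} missing"
  # 3. Manual step: make a filesystem on the new array and copy the data over.
  echo "# create a filesystem on $NEW, mount it, copy everything from /esata, verify"
  # 4. Retire the old mirror and hand its members to the RAID10.
  echo "umount /esata"
  echo "mdadm --stop $OLD"
  echo "mdadm --zero-superblock ${MOVE_LAST[*]}"
  echo "mdadm $NEW --add ${MOVE_LAST[*]}"
}

plan
```

Between steps 2 and 4 neither array has full redundancy, so a verified backup is worth having before trying anything like this.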
Although the RAID10 shows as healthy, I see the following logs every day, which I take to mean that sdd1 needs replacing:
Mar 15 06:12:57 ourserver kernel: ata18.00: failed command: READ DMA EXT
Mar 15 06:12:57 ourserver kernel: ata18.00: cmd 25/00:80:22:ba:c4/00:00:ab:00:00/e0 tag 31 dma 65536 in#012 res 51/40:00:6d:ba:c4/00:00:ab:00:00/00 Emask 0x9 (media error)
Mar 15 06:12:57 ourserver kernel: ata18.00: status: { DRDY ERR }
Mar 15 06:12:57 ourserver kernel: ata18.00: error: { UNC }
Mar 15 06:12:57 ourserver kernel: ata18.00: configured for UDMA/133
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=2s
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 Sense Key : Medium Error [current]
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 15 06:12:57 ourserver kernel: sd 17:0:0:0: [sdd] tag#31 CDB: Read(10) 28 00 ab c4 ba 22 00 00 80 00
Mar 15 06:12:57 ourserver kernel: blk_update_request: I/O error, dev sdd, sector 2881796717 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
Mar 15 06:12:57 ourserver kernel: ata18: EH complete
Mar 15 06:13:00 ourserver kernel: ata18.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 15 06:13:00 ourserver kernel: ata18.00: irq_stat 0x40000001
Mar 15 06:13:00 ourserver kernel: ata18.00: failed command: READ DMA EXT
Mar 15 06:13:00 ourserver kernel: ata18.00: cmd 25/00:00:a2:ba:c4/00:09:ab:00:00/e0 tag 0 dma 1179648 in#012 res 51/40:00:41:bd:c4/00:00:ab:00:00/00 Emask 0x9 (media error)
Mar 15 06:13:00 ourserver kernel: ata18.00: status: { DRDY ERR }
Mar 15 06:13:00 ourserver kernel: ata18.00: error: { UNC }
Mar 15 06:13:01 ourserver kernel: ata18.00: configured for UDMA/133
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=3s
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 Sense Key : Medium Error [current]
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 15 06:13:01 ourserver kernel: sd 17:0:0:0: [sdd] tag#0 CDB: Read(10) 28 00 ab c4 ba a2 00 09 00 00
Mar 15 06:13:01 ourserver kernel: blk_update_request: I/O error, dev sdd, sector 2881797441 op 0x0:(READ) flags 0x0 phys_seg 86 prio class 0
Mar 15 06:13:01 ourserver kernel: ata18: EH complete
Mar 15 06:13:04 ourserver kernel: ata18.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 15 06:13:04 ourserver kernel: ata18.00: irq_stat 0x40000001
Mar 15 06:13:04 ourserver kernel: ata18.00: failed command: READ DMA EXT
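To see at a glance which disk these messages point at, the I/O error lines can be tallied per device. A small sketch follows; the function name is made up, and it assumes the kernel message format shown in the log here ("I/O error, dev sdX, sector ..."), which may differ on other kernel versions.

```shell
#!/usr/bin/env bash
# Count block-layer I/O errors per device from kernel log text on stdin.
# io_errors_by_dev is a hypothetical helper name.
io_errors_by_dev() {
  # Keep only the device name from matching lines, then count duplicates.
  sed -n 's/.*I\/O error, dev \([a-z0-9]*\),.*/\1/p' \
    | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example: journalctl -k | io_errors_by_dev
```

Fed the excerpt above, it would report sdd with a count of 2.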
And
smartctl
showed this:
Error 45 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 80 ff ff ff ef 00   1d+07:38:55.963  READ DMA EXT
  27 00 00 00 00 00 e0 00   1d+07:38:55.906  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00   1d+07:38:55.905  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   1d+07:38:55.892  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00   1d+07:38:55.830  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 44 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   1d+07:38:52.146  READ DMA EXT
  35 00 80 ff ff ff ef 00   1d+07:38:52.143  WRITE DMA EXT
  35 00 80 ff ff ff ef 00   1d+07:38:52.142  WRITE DMA EXT
  35 00 80 ff ff ff ef 00   1d+07:38:52.140  WRITE DMA EXT
  27 00 00 00 00 00 e0 00   1d+07:38:52.112  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 43 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 80 ff ff ff ef 00   1d+07:38:49.220  READ DMA EXT
  ea 00 00 00 00 00 a0 00   1d+07:38:49.163  FLUSH CACHE EXT
  ca 00 01 2a 00 00 e0 00   1d+07:38:49.163  WRITE DMA
  ea 00 00 00 00 00 a0 00   1d+07:38:49.162  FLUSH CACHE EXT
  ea 00 00 00 00 00 a0 00   1d+07:38:49.136  FLUSH CACHE EXT

Error 42 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   1d+07:38:46.103  READ DMA EXT
  27 00 00 00 00 00 e0 00   1d+07:38:46.075  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00   1d+07:38:46.074  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   1d+07:38:46.060  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00   1d+07:38:46.033  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 41 occurred at disk power-on lifetime: 3736 hours (155 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 80 ff ff ff ef 00   1d+07:38:42.036  READ DMA EXT
  35 00 00 ff ff ff ef 00   1d+07:38:42.032  WRITE DMA EXT
  35 00 80 ff ff ff ef 00   1d+07:38:42.025  WRITE DMA EXT
  27 00 00 00 00 00 e0 00   1d+07:38:41.997  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00   1d+07:38:41.996  IDENTIFY DEVICE
I also see this:
smartctl -A /dev/sdd1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.8-200.fc31.x86_64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   101   099   006    Pre-fail  Always       -       203989872
  3 Spin_Up_Time            0x0003   099   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       16
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       23
  7 Seek_Error_Rate         0x000f   070   060   030    Pre-fail  Always       -       12419382
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3774
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       10
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       101
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       65537
189 High_Fly_Writes         0x003a   070   070   000    Old_age   Always       -       30
190 Airflow_Temperature_Cel 0x0022   072   060   045    Old_age   Always       -       28 (Min/Max 28/31)
194 Temperature_Celsius     0x0022   028   040   000    Old_age   Always       -       28 (0 19 0 0 0)
195 Hardware_ECC_Recovered  0x001a   044   006   000    Old_age   Always       -       203989872
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       105
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       105
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       3774 (62 166 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       201703664
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1542427917
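The attributes that usually matter most for a "replace this disk?" decision are the sector-health counters. Here is a small filter that picks them out of smartctl -A output. The function name is invented, and treating any non-zero raw value as a warning sign is a rule-of-thumb assumption, not a vendor threshold.

```shell
#!/usr/bin/env bash
# Print sector-health SMART attributes whose raw value is non-zero.
# Reads `smartctl -A` output on stdin; smart_flags is a hypothetical helper name.
smart_flags() {
  awk '
    # Field 2 is ATTRIBUTE_NAME, field 10 is RAW_VALUE in the smartctl -A table.
    $2 ~ /^(Reallocated_Sector_Ct|Reported_Uncorrect|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 {
      printf "%s=%s\n", $2, $10
    }'
}

# Example: smartctl -A /dev/sdd | smart_flags
```

On the table above it would list all four counters (23, 101, 105 and 105), which fits the unrecovered read errors in the kernel log.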