
Identifying the file associated with a bad sector on ext2/ext3/ext4

Monday, April 16, 2018


I got some SMART warnings about a bad sector on my hard drive, and I wanted to know which file was stored on that sector.

First, I looked at the SMART logs to see where the problem was:

# smartctl -x /dev/sdd
...

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 08 00 00 04 4b 5b c0 40 00  Error: UNC at LBA = 0x044b5bc0 = 72047552

...
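As a quick sanity check, the hex and decimal LBA on the error line can be pulled out and compared. This is only a sketch: parse_unc_lba is a hypothetical helper, and the regular expression assumes the error line is shaped exactly like the one in my log above.

```python
import re

# Error line copied from the smartctl output above.
line = "40 -- 51 00 08 00 00 04 4b 5b c0 40 00  Error: UNC at LBA = 0x044b5bc0 = 72047552"

def parse_unc_lba(text):
    """Hypothetical helper: return the decimal LBA from an 'LBA = 0x... = N' line."""
    match = re.search(r"LBA = (0x[0-9a-fA-F]+) = (\d+)", text)
    if match is None:
        return None
    from_hex = int(match.group(1), 16)
    from_dec = int(match.group(2))
    # Both forms should describe the same sector; a mismatch suggests a copy error.
    if from_hex != from_dec:
        raise ValueError("hex and decimal LBA disagree")
    return from_dec

bad_lba = parse_unc_lba(line)
print(bad_lba)  # 72047552
```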

fdisk -l is useful for looking at the partition info and sector size:

# fdisk -l /dev/sdd
Disk /dev/sdd: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x54afc7e9

Device     Boot Start        End    Sectors  Size Id Type
/dev/sdd1        2048 3907029167 3907027120  1.8T 83 Linux
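With more than one partition, the first thing to work out is which partition the bad LBA falls in and how far into it. A minimal sketch, assuming the start and end sectors are copied by hand from the fdisk output (find_partition is a made-up name, not part of any tool):

```python
# (device, start sector, end sector) rows copied from the fdisk output above.
# Both ends are inclusive, as fdisk prints them.
partitions = [("/dev/sdd1", 2048, 3907029167)]

def find_partition(lba, table):
    """Return (device, sector offset within the partition), or None if unpartitioned."""
    for device, start, end in table:
        if start <= lba <= end:
            return device, lba - start
    return None

print(find_partition(72047552, partitions))  # ('/dev/sdd1', 72045504)
```

An LBA below the first partition start (for example, inside the first 2048 sectors here) belongs to no filesystem at all, which is why the helper can return None.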

Then I used badblocks to scan the area around that sector for more bad sectors. I passed -b 512 to match the 512-byte sector size shown above, so the block numbers badblocks reports are sector numbers. Note that badblocks takes the last block number first, followed by the first block number:

# badblocks -b 512 /dev/sdd 72047570 72047540
72047552
72047553
72047554
72047555
72047556
72047557
72047558
72047559
72047560
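For a longer badblocks listing, it helps to collapse the output into contiguous runs. The following is a rough sketch, not anything badblocks does itself; contiguous_runs is a hypothetical helper and the sector list is typed in from the output above.

```python
# Sector numbers copied from the badblocks output above.
bad_sectors = [
    72047552, 72047553, 72047554, 72047555, 72047556,
    72047557, 72047558, 72047559, 72047560,
]

def contiguous_runs(sectors):
    """Group sector numbers into (first, last, count) runs of consecutive values."""
    runs = []
    for sector in sorted(sectors):
        if runs and sector == runs[-1][1] + 1:
            runs[-1][1] = sector  # extend the current run
        else:
            runs.append([sector, sector])  # start a new run
    return [(first, last, last - first + 1) for first, last in runs]

print(contiguous_runs(bad_sectors))  # [(72047552, 72047560, 9)]
```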

Finally, debugfs is useful for finding which files are on those blocks.

Explanation:

  1. First, find the filesystem block number by computing (bad sector - partition start sector) * (logical sector size / filesystem block size), rounded down. The filesystem block size is listed as "Block size" by tune2fs -l; mine is 4096 bytes. In my case, this gives (72047552 − 2048) * (512 / 4096) = 9005688. One block holds 8 sectors and there are 9 contiguous sectors affected, so the bad area stretches into block 9005689 as well.
  2. Use testb to see whether each block is actually in use. If it is not, then no file data is lost.
  3. Use icheck to find the inode corresponding to those blocks. Luckily (?), both bad blocks are associated with the same inode here.
  4. Finally, use ncheck to find the pathname(s) associated with the inode.
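The arithmetic in step 1 can be sketched in Python. The constants are the values from my disk (512-byte sectors, 4096-byte filesystem blocks, partition starting at sector 2048); sector_to_block is just an illustrative name, and the integer division does the rounding down.

```python
SECTOR_SIZE = 512    # bytes per sector, from fdisk
BLOCK_SIZE = 4096    # bytes per filesystem block (my filesystem's value)
PART_START = 2048    # first sector of /dev/sdd1, from fdisk

def sector_to_block(lba):
    """Map a whole-disk sector number to a filesystem block number."""
    return (lba - PART_START) * SECTOR_SIZE // BLOCK_SIZE

# The 9 bad sectors reported by badblocks: 72047552 through 72047560.
bad_sectors = range(72047552, 72047561)
bad_blocks = sorted({sector_to_block(s) for s in bad_sectors})
print(bad_blocks)  # [9005688, 9005689]
```

Mapping every bad sector rather than only the first one is what catches the second block: the first 8 sectors land in block 9005688 and the ninth spills into 9005689.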

# debugfs /dev/sdd1
debugfs 1.43.5 (04-Aug-2017)
debugfs:  testb 9005688
Block 9005688 marked in use
debugfs:  testb 9005689
Block 9005689 marked in use
debugfs:  icheck 9005688
Block   Inode number
9005688 105518423
debugfs:  icheck 9005689
Block   Inode number
9005689 105518423
debugfs:  ncheck 105518423
Inode   Pathname
105518423       /drz/rdiff-backup/artanis/var/lib/pgsql/data/base/21595/26720

The affected file turned out to be just a backup copy, so once I swap out the hard drive or get the sector reallocated, the next backup cycle will replace the lost data.

Tags: linux, filesystem | Posted at 22:57