[Info-mtools] Bug: Corruption in FAT table when copying multiple files.

From: Lenny Bensman
Subject: [Info-mtools] Bug: Corruption in FAT table when copying multiple files.
Date: Wed, 27 Nov 2019 19:01:20 -0500


I just spent a few days chasing an issue in our build, and it turned out to be an apparent bug in the mcopy utility of mtools.

The version I'm using is 4.0.18.  I can't easily test the latest version because ours is a complicated build using NixOS/nixpkgs, for which I'd have to write a proper derivation to pull in a newer version.

Our build process creates a bootable USB stick installation media image with a FAT partition.  The build generates the following directory tree to be copied onto the FAT partition (some names have been changed a bit for IP reasons, with the overall filename length preserved):

[root@nixos:/tmp/usb]# ls -lR .
.:
total 571944
drwxr-xr-x 2 root root      4096 Nov 26 18:35 boot
drwxr-xr-x 3 root root      4096 Nov 26 18:35 EFI
-rwxr-xr-x 1 root root      1181 Dec 31  2097
drwxr-xr-x 3 root root      4096 Nov 26 18:35 loader
-rwxr-xr-x 1 root root 585650176 Dec 31  2097 nix-store.squashfs
-rwxr-xr-x 1 root root        50 Dec 31  2097 version.txt

./boot:
total 13108
-rwxr-xr-x 1 root root 4081200 Dec 31  2097 bzImage
-rwxr-xr-x 1 root root 9336711 Dec 31  2097 initrd

./EFI:
total 4
drwxr-xr-x 2 root root 4096 Nov 26 18:35 boot

./EFI/boot:
total 76
-rwxr-xr-x 1 root root 75285 Dec 31  2097 bootx64.efi

./loader:
total 8
drwxr-xr-x 2 root root 4096 Nov 26 18:35 entries
-rwxr-xr-x 1 root root   35 Dec 31  2097 loader.conf

./loader/entries:
total 4
-rwxr-xr-x 1 root root 324 Dec 31  2097 abcdefg-install.conf

The copying is done not onto the physical media, but onto a file that represents the partition, using the `-i file` option.  The media preparation and copying is done using approximately the following steps:
1) Calculate the needed media size by summing all files' sizes using `du --apparent-size`.  For the above listing, the file was created with size 659570688 (the sum of all file sizes plus 10% padding, rounded up to the nearest boundary).
2) Create the file using `truncate --size=`.
3) Format the file using `mkfs.vfat`.  I tried replacing it with `mformat`, but it made no difference.
4) Use `mcopy -i partitionfile ./* ::` to copy the files over.
5) Run `fsck.vfat -a` to validate the resulting image.
The last step frequently reports validation errors about missing `.` and `..` directory entries, sometimes generating zero-length FSCK000.00x files.
Sometimes the resulting media is still corrupt even after the fsck run, with some unreadable directories (usually `loader`, but it varies).
I also noticed that running in our build system and running a local build produce different outcomes, even though the only differences in input are the version number (which is embedded in some binaries and causes their size to vary) and the version.txt file listed above.
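
For reference, the five steps above can be sketched roughly as follows.  This is not our actual build script: the function names are hypothetical, the 1 MiB rounding boundary is an assumption (the boundary we use is not stated above), `truncate` is assumed for sizing the file, and `-s` is added so that mcopy recurses into directories.  It assumes GNU coreutils, dosfstools and mtools are installed.

```shell
#!/bin/sh
# Hypothetical helper names; only du, truncate, mkfs.vfat, mcopy and
# fsck.vfat are real tools.

# padded_size TOTAL PERCENT BOUNDARY
# Add PERCENT padding to TOTAL bytes, then round up to a multiple of BOUNDARY.
padded_size() {
    total=$1 percent=$2 boundary=$3
    padded=$(( total + (total * percent + 99) / 100 ))
    echo $(( (padded + boundary - 1) / boundary * boundary ))
}

# tree_size DIR
# Apparent size of DIR in bytes, as reported by GNU du.
tree_size() {
    du --apparent-size --block-size=1 -s "$1" | cut -f1
}

# build_image SRC IMG
# Steps 1-5 in order.  The 10% padding is from the report above;
# the 1 MiB boundary is an assumption.
build_image() {
    src=$1 img=$2
    size=$(padded_size "$(tree_size "$src")" 10 1048576) || return 1
    truncate --size="$size" "$img" || return 1
    mkfs.vfat "$img" || return 1
    # -s makes mcopy copy directories and their contents recursively.
    mcopy -s -i "$img" "$src"/* :: || return 1
    fsck.vfat -a "$img"
}
```

It would be called as `build_image /tmp/usb partition.img`; the exit status of `fsck.vfat` is what the function returns.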

I believe mcopy has issues when copying more than one object, when copying objects for which subdirectories need to be created, or both.
I modified our build script to split step 4 above into the following:
1) Pre-create all directories and subdirectories using `mmd`, creating only one directory per invocation of `mmd`.  E.g. something like `find . -type d | xargs -I% mmd -i <filepartition> ::%`.  (The -I option of xargs ensures that a new instance of `mmd` is run for each directory.)
2) Copy each file into its respective folder, again one file per invocation of `mcopy`.  E.g. something like `find . -type f | xargs -I% mcopy -i <filepartition> % ::%`
Then, running `fsck.vfat` on the resulting partition image file produces no complaints about the files or the contained FAT structures.  I get the same results locally and through our build system (despite the version variance noted earlier).  I also varied the padding from 10% to 15%, and that too produced a perfect image with no complaints from `fsck.vfat` (previously it resulted in a number of errors detected in the filesystem).
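
A slightly more defensive sketch of that workaround is below.  It is only an illustration, not the exact script from our build: `populate_image` and `DRY_RUN` are hypothetical names, and it assumes GNU find (for -mindepth and -printf) and file names without newlines.  Skipping the top-level `.` entry and stripping the leading `./` are choices made here, not something the one-liners above do.

```shell
#!/bin/sh
# populate_image SRC IMG
# Copy the tree under SRC into the already formatted FAT image IMG,
# running one mmd per directory and one mcopy per file.
# Set DRY_RUN=1 to print the commands instead of running them.
populate_image() {
    src=$1
    img=$2
    run=${DRY_RUN:+echo}

    # find lists a directory before its contents, so parent
    # directories are created before their children.
    dirs=$(find "$src" -mindepth 1 -type d -printf '%P\n') || return 1
    while IFS= read -r d; do
        [ -n "$d" ] || continue
        # stdin comes from /dev/null so mtools cannot consume the list
        $run mmd -i "$img" "::/$d" </dev/null || return 1
    done <<EOF
$dirs
EOF

    files=$(find "$src" -type f -printf '%P\n') || return 1
    while IFS= read -r f; do
        [ -n "$f" ] || continue
        $run mcopy -i "$img" "$src/$f" "::/$f" </dev/null || return 1
    done <<EOF
$files
EOF
}
```

Running `DRY_RUN=1 populate_image /tmp/usb partition.img` first shows the exact sequence of `mmd` and `mcopy` invocations before anything touches the image.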

My conclusion is that there is a bug in mtools/mcopy that incorrectly updates FAT entries when more than one object is being copied, or when mcopy is the one creating directories, or in a combination of the two cases.

Let me know if I can provide additional information.

