117 lines
4.6 KiB
Diff
117 lines
4.6 KiB
Diff
From: Filipe Manana <fdmanana@suse.com>
|
|
Date: Thu, 14 Feb 2019 14:59:15 +0000
|
|
Subject: Btrfs: fix corruption reading shared and compressed extents after
|
|
hole punching
|
|
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/commit/?h=for-5.1&id=7e8c76f7d19155ebe63c5f62864071f5ce4a3441
|
|
Bug-Debian: https://bugs.debian.org/922306
|
|
|
|
In the past we had data corruption when reading compressed extents that
|
|
are shared within the same file and they are consecutive, this got fixed
|
|
by commit 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and
|
|
shared extents") and by commit 808f80b46790f ("Btrfs: update fix for read
|
|
corruption of compressed and shared extents"). However there was a case
|
|
that was missing in those fixes, which is when the shared and compressed
|
|
extents are referenced with a non-zero offset. The following shell script
|
|
creates a reproducer for this issue:
|
|
|
|
#!/bin/bash
|
|
|
|
mkfs.btrfs -f /dev/sdc &> /dev/null
|
|
mount -o compress /dev/sdc /mnt/sdc
|
|
|
|
# Create a file with 3 consecutive compressed extents, each has an
|
|
# uncompressed size of 128Kb and a compressed size of 4Kb.
|
|
for ((i = 1; i <= 3; i++)); do
|
|
head -c 4096 /dev/zero
|
|
for ((j = 1; j <= 31; j++)); do
|
|
head -c 4096 /dev/zero | tr '\0' "\377"
|
|
done
|
|
done > /mnt/sdc/foobar
|
|
sync
|
|
|
|
echo "Digest after file creation: $(md5sum /mnt/sdc/foobar)"
|
|
|
|
# Clone the first extent into offsets 128K and 256K.
|
|
xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar
|
|
xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar
|
|
sync
|
|
|
|
echo "Digest after cloning: $(md5sum /mnt/sdc/foobar)"
|
|
|
|
# Punch holes into the regions that are already full of zeroes.
|
|
xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar
|
|
xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar
|
|
xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar
|
|
sync
|
|
|
|
echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)"
|
|
|
|
echo "Dropping page cache..."
|
|
sysctl -q vm.drop_caches=1
|
|
echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)"
|
|
|
|
umount /dev/sdc
|
|
|
|
When running the script we get the following output:
|
|
|
|
Digest after file creation: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
|
|
linked 131072/131072 bytes at offset 131072
|
|
128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec)
|
|
linked 131072/131072 bytes at offset 262144
|
|
128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec)
|
|
Digest after cloning: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
|
|
Digest after hole punching: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
|
|
Dropping page cache...
|
|
Digest after hole punching: fba694ae8664ed0c2e9ff8937e7f1484 /mnt/sdc/foobar
|
|
|
|
This happens because after reading all the pages of the extent in the
|
|
range from 128K to 256K for example, we read the hole at offset 256K
|
|
and then when reading the page at offset 260K we don't submit the
|
|
existing bio, which is responsible for filling all the page in the
|
|
range 128K to 256K only, therefore adding the pages from range 260K
|
|
to 384K to the existing bio and submitting it after iterating over the
|
|
entire range. Once the bio completes, the uncompressed data fills only
|
|
the pages in the range 128K to 256K because there's no more data read
|
|
from disk, leaving the pages in the range 260K to 384K unfilled. It is
|
|
just a slightly different variant of what was solved by commit
|
|
005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared
|
|
extents").
|
|
|
|
Fix this by forcing a bio submit, during readpages(), whenever we find a
|
|
compressed extent map for a page that is different from the extent map
|
|
for the previous page or has a different starting offset (in case it's
|
|
the same compressed extent), instead of the extent map's original start
|
|
offset.
|
|
|
|
A test case for fstests follows soon.
|
|
|
|
Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
|
|
Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
|
|
Cc: stable@vger.kernel.org # 4.3+
|
|
Signed-off-by: Filipe Manana <fdmanana@suse.com>
|
|
---
|
|
fs/btrfs/extent_io.c | 4 ++--
|
|
1 file changed, 2 insertions(+), 2 deletions(-)
|
|
|
|
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
|
|
index 52abe4082680..1bfb7207bbf0 100644
|
|
--- a/fs/btrfs/extent_io.c
|
|
+++ b/fs/btrfs/extent_io.c
|
|
@@ -2985,11 +2985,11 @@ static int __do_readpage(struct extent_io_tree *tree,
|
|
*/
|
|
if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags) &&
|
|
prev_em_start && *prev_em_start != (u64)-1 &&
|
|
- *prev_em_start != em->orig_start)
|
|
+ *prev_em_start != em->start)
|
|
force_bio_submit = true;
|
|
|
|
if (prev_em_start)
|
|
- *prev_em_start = em->orig_start;
|
|
+ *prev_em_start = em->start;
|
|
|
|
free_extent_map(em);
|
|
em = NULL;
|
|
--
|
|
2.20.1
|
|
|