Skip to content
  • Jeffle Xu's avatar
    ext4: fix partial cluster initialization when splitting extent · cfb3c85a
    Jeffle Xu authored
    Fix the bug when calculating the physical block number of the first
    block in the split extent.
    
    This bug will cause xfstests shared/298 failure on ext4 with bigalloc
    enabled occasionally. Ext4 error messages indicate that previously freed
    blocks are being freed again, and the following fsck will fail due to
    the inconsistency of block bitmap and bg descriptor.
    
    The following is an example case:
    
    1. First, Initialize a ext4 filesystem with cluster size '16K', block size
    '4K', in which case, one cluster contains four blocks.
    
    2. Create one file (e.g., xxx.img) on this ext4 filesystem. Now the extent
    tree of this file is like:
    
    ...
    36864:[0]4:220160
    36868:[0]14332:145408
    51200:[0]2:231424
    ...
    
    3. Then execute PUNCH_HOLE fallocate on this file. The hole range is
    like:
    
    ..
    ext4_ext_remove_space: dev 254,16 ino 12 since 49506 end 49506 depth 1
    ext4_ext_remove_space: dev 254,16 ino 12 since 49544 end 49546 depth 1
    ext4_ext_remove_space: dev 254,16 ino 12 since 49605 end 49607 depth 1
    ...
    
    4. Then the extent tree of this file after punching is like
    
    ...
    49507:[0]37:158047
    49547:[0]58:158087
    ...
    
    5. Detailed procedure of punching hole [49544, 49546]
    
    5.1. The block address space:
    ```
    lblk        ~49505  49506   49507~49543     49544~49546    49547~
    	  ---------+------+-------------+----------------+--------
    	    extent | hole |   extent	|	hole	 | extent
    	  ---------+------+-------------+----------------+--------
    pblk       ~158045  158046  158047~158083  158084~158086   158087~
    ```
    
    5.2. The detailed layout of cluster 39521:
    ```
    		cluster 39521
    	<------------------------------->
    
    		hole		  extent
    	<----------------------><--------
    
    lblk      49544   49545   49546   49547
    	+-------+-------+-------+-------+
    	|	|	|	|	|
    	+-------+-------+-------+-------+
    pblk     158084  1580845  158086  158087
    ```
    
    5.3. The ftrace output when punching hole [49544, 49546]:
    - ext4_ext_remove_space (start 49544, end 49546)
      - ext4_ext_rm_leaf (start 49544, end 49546, last_extent [49507(158047), 40], partial [pclu 39522 lblk 0 state 2])
        - ext4_remove_blocks (extent [49507(158047), 40], from 49544 to 49546, partial [pclu 39522 lblk 0 state 2]
          - ext4_free_blocks: (block 158084 count 4)
            - ext4_mballoc_free (extent 1/6753/1)
    
    5.4. Ext4 error message in dmesg:
    EXT4-fs error (device vdb): mb_free_blocks:1457: group 1, block 158084:freeing already freed block (bit 6753); block bitmap corrupt.
    EXT4-fs error (device vdb): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 19550 vs 19551 free clusters
    
    In this case, the whole cluster 39521 is freed mistakenly when freeing
    pblock 158084~158086 (i.e., the first three blocks of this cluster),
    although pblock 158087 (the last remaining block of this cluster) has
    not been freed yet.
    
    The root cause of this isuue is that, the pclu of the partial cluster is
    calculated mistakenly in ext4_ext_remove_space(). The correct
    partial_cluster.pclu (i.e., the cluster number of the first block in the
    next extent, that is, lblock 49597 (pblock 158086)) should be 39521 rather
    than 39522.
    
    Fixes: f4226d9e
    
     ("ext4: fix partial cluster initialization")
    Signed-off-by: default avatarJeffle Xu <jefflexu@linux.alibaba.com>
    Reviewed-by: default avatarEric Whitney <enwlinux@gmail.com>
    Cc: stable@kernel.org # v3.19+
    Link: https://lore.kernel.org/r/1590121124-37096-1-git-send-email-jefflexu@linux.alibaba.com
    
    
    Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
    cfb3c85a