Commit c15471f7 authored by Ryan Ding's avatar Ryan Ding Committed by Linus Torvalds

ocfs2: fix sparse file & data ordering issue in direct io

There are mainly three issues in the direct io code path after commit
24c40b32 ("ocfs2: implement ocfs2_direct_IO_write"):

  * Does not support sparse file.
  * Does not support data ordering.  eg: when write to a file hole, it
    will alloc extent first.  If system crashed before io finished, data
    will corrupt.
  * Potential risk when doing aio+dio.  The -EIOCBQUEUED return value is
    likely to be ignored by ocfs2_direct_IO_write().

To resolve above problems, re-design direct io code with following ideas:
  * Use buffer io to fill in holes.  And this will make better
    performance also.
  * Clear unwritten after direct write finished.  So we can make sure
    meta data changes after data write to disk.  (Unwritten extent is
    invisible to user, from user's view, meta data is not changed when
    allocate an unwritten extent.)
  * Clear ocfs2_direct_IO_write().  Do all ending work in end_io.

This patch has passed fs,dio,ltp-aiodio.part1,ltp-aiodio.part2,ltp-aiodio.part4
test cases of ltp.

For performance improvement, see following test result:
ocfs2 cluster size 1MB, ocfs2 volume is mounted on /mnt/.
The original way:
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=4K count=1048576 oflag=direct
  1048576+0 records in
  1048576+0 records out
  4294967296 bytes (4.3 GB) copied, 1707.83 s, 2.5 MB/s
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=256K count=16384 oflag=direct
  16384+0 records in
  16384+0 records out
  4294967296 bytes (4.3 GB) copied, 582.705 s, 7.4 MB/s

After this patch:
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=4K count=1048576 oflag=direct
  1048576+0 records in
  1048576+0 records out
  4294967296 bytes (4.3 GB) copied, 64.6412 s, 66.4 MB/s
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=256K count=16384 oflag=direct
  16384+0 records in
  16384+0 records out
  4294967296 bytes (4.3 GB) copied, 34.7611 s, 124 MB/s
Signed-off-by: default avatarRyan Ding <ryan.ding@oracle.com>
Reviewed-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent 4506cfb6
This diff is collapsed.
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment