    mm: get_user_pages(write,force) refuse to COW in shared areas · cda540ac
    Hugh Dickins authored
    
    
    get_user_pages(write=1, force=1) has always had odd behaviour on write-
    protected shared mappings: although it demands FMODE_WRITE-access to the
    underlying object (do_mmap_pgoff sets neither VM_SHARED nor VM_MAYWRITE
    without that), it ends up with do_wp_page substituting private anonymous
    Copied-On-Write pages for the shared file pages in the area.
    
    That was long ago intentional, as a safety measure to prevent ptrace
    setting a breakpoint (or POKETEXT or POKEDATA) from inadvertently
    corrupting the underlying executable.  Yet exec and dynamic loaders open
    the file read-only, and use MAP_PRIVATE rather than MAP_SHARED.
    
    The traditional odd behaviour still causes surprises and bugs in mm, and
    is probably not what any caller wants - even the comment on the flag
    says "You do not want this" (although it's undoubtedly necessary for
    overriding userspace protections in some contexts, and good when !write).
    
    Let's stop doing that.  But it would be dangerous to remove the long-
    standing safety at this stage, so just make get_user_pages(write,force)
    fail with EFAULT when applied to a write-protected shared area.
    InfiniBand may in future want to force the write through to the
    underlying object: we can add another FOLL_flag later to enable that
    if required.
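
A minimal userspace sketch of the decision described above, assuming
simplified flag values and a hypothetical helper name
(check_write_access); it models only the write side of the check and is
not the code from the patch itself:

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag values for this sketch; assumptions, not from the commit. */
#define VM_WRITE    0x02UL
#define VM_SHARED   0x08UL
#define VM_MAYWRITE 0x20UL

#define FOLL_WRITE  0x01U
#define FOLL_FORCE  0x10U

/* A mapping can COW when it is private (not VM_SHARED) yet may be written. */
static bool is_cow_mapping(unsigned long vm_flags)
{
	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical helper: returns 0 if the access may proceed, -EFAULT if it
 * is refused.  Read-side permission checks are deliberately not modelled.
 */
static int check_write_access(unsigned long vm_flags, unsigned int gup_flags)
{
	if (!(gup_flags & FOLL_WRITE))
		return 0;		/* not a write request */
	if (vm_flags & VM_WRITE)
		return 0;		/* area is already writable */
	if (!(gup_flags & FOLL_FORCE))
		return -EFAULT;		/* write-protected and not forced */
	if (!is_cow_mapping(vm_flags))
		return -EFAULT;		/* forced write to shared area: refuse */
	return 0;			/* private area: COW as before */
}
```

In this model a forced write to a write-protected MAP_PRIVATE area still
succeeds through COW, while the same request on a write-protected
MAP_SHARED area now yields -EFAULT.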
    
    Odd though the old behaviour was, there is no doubt that we may turn out
    to break userspace with this change, and have to revert it quickly.
    Issue a WARN_ON_ONCE to help debug the changed case (easily triggered by
    userspace, so only once to prevent spamming the logs); and delay a few
    associated cleanups until this change is proved.
    
    get_user_pages callers who might see trouble from this change:
      ptrace poking, or writing to /proc/<pid>/mem
      drivers/infiniband/
      drivers/media/v4l2-core/
      drivers/gpu/drm/exynos/exynos_drm_gem.c
      drivers/staging/tidspbridge/core/tiomap3430.c
    if they ever apply get_user_pages to write-protected shared mappings
    of an object which was opened for writing.
    
    I went to apply the same change to mm/nommu.c, but retreated.  NOMMU has
    no place for COW, and its VM_flags conventions are not the same: I'd be
    more likely to screw up NOMMU than make an improvement there.
    
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Hugh Dickins <hughd@google.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>