1. 08 May, 2017 1 commit
    • Xunlei Pang's avatar
      x86/mm: Add support for gbpages to kernel_ident_mapping_init() · 66aad4fd
      Xunlei Pang authored
      
      
      Kernel identity mappings on x86-64 kernels are created in two
      ways: by the early x86 boot code, or by kernel_ident_mapping_init().
      
      Native kernels (which is the dominant usecase) use the former,
      but the kexec and the hibernation code uses kernel_ident_mapping_init().
      
      There's a subtle difference between these two ways of how identity
      mappings are created, the current kernel_ident_mapping_init() code
      creates identity mappings always using 2MB page(PMD level) - while
      the native kernel boot path also utilizes gbpages where available.
      
      This difference is suboptimal both for performance and for memory
      usage: kernel_ident_mapping_init() needs to allocate pages for the
      page tables when creating the new identity mappings.
      
      This patch adds 1GB page(PUD level) support to kernel_ident_mapping_init()
      to address these concerns.
      
      The primary advantage would be better TLB coverage/performance,
      because we'd utilize 1GB TLBs instead of 2MB ones.
      
      It is also useful for machines with large number of memory to
      save paging structure allocations(around 4MB/TB using 2MB page)
      when setting identity mappings for all the memory, after using
      1GB page it will consume only 8KB/TB.
      
      ( Note that this change alone does not activate gbpages in kexec,
        we are doing that in a separate patch. )
      Signed-off-by: default avatarXunlei Pang <xlpang@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: akpm@linux-foundation.org
      Cc: kexec@lists.infradead.org
      Link: http://lkml.kernel.org/r/1493862171-8799-1-git-send-email-xlpang@redhat.com
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      66aad4fd
  2. 08 Jul, 2016 1 commit
    • Thomas Garnier's avatar
      x86/mm: Enable KASLR for physical mapping memory regions · 021182e5
      Thomas Garnier authored
      Add the physical mapping in the list of randomized memory regions.
      
      The physical memory mapping holds most allocations from boot and heap
      allocators. Knowing the base address and physical memory size, an attacker
      can deduce the PDE virtual address for the vDSO memory page. This attack
      was demonstrated at CanSecWest 2016, in the following presentation:
      
        "Getting Physical: Extreme Abuse of Intel Based Paged Systems":
        https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/blob/master/Presentation/CanSec2016_Presentation.pdf
      
      (See second part of the presentation).
      
      The exploits used against Linux worked successfully against 4.6+ but
      fail with KASLR memory enabled:
      
        https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/tree/master/Demos/Linux/exploits
      
      
      
      Similar research was done at Google leading to this patch proposal.
      
      Variants exists to overwrite /proc or /sys objects ACLs leading to
      elevation of privileges. These variants were tested against 4.6+.
      
      The page offset used by the compressed kernel retains the static value
      since it is not yet randomized during this boot stage.
      Signed-off-by: default avatarThomas Garnier <thgarnie@google.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-7-git-send-email-keescook@chromium.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      021182e5
  3. 26 Jun, 2016 1 commit
    • Kees Cook's avatar
      x86/KASLR: Clarify identity map interface · 11fdf97a
      Kees Cook authored
      
      
      This extracts the call to prepare_level4() into a top-level function
      that the user of the pagetable.c interface must call to initialize
      the new page tables. For clarity and to match the "finalize" function,
      it has been renamed to initialize_identity_maps(). This function also
      gains the initialization of mapping_info so we don't have to do it each
      time in add_identity_map().
      
      Additionally add copyright notice to the top, to make it clear that the
      bulk of the pagetable.c code was written by Yinghai, and that I just
      added bugs later. :)
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: H.J. Lu <hjl.tools@gmail.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1464216334-17200-3-git-send-email-keescook@chromium.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      11fdf97a
  4. 10 May, 2016 1 commit
    • Kees Cook's avatar
      x86/KASLR: Initialize mapping_info every time · 434a6c9f
      Kees Cook authored
      
      
      As it turns out, mapping_info DOES need to be initialized every
      time, because pgt_data address could be changed during kernel
      relocation. So it can not be build time assigned.
      
      Without this, page tables were not being corrected updated, which
      could cause reboots when a physical address beyond 2G was chosen.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462825332-10505-2-git-send-email-keescook@chromium.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      434a6c9f
  5. 07 May, 2016 1 commit
    • Kees Cook's avatar
      x86/KASLR: Build identity mappings on demand · 3a94707d
      Kees Cook authored
      
      
      Currently KASLR only supports relocation in a small physical range (from
      16M to 1G), due to using the initial kernel page table identity mapping.
      To support ranges above this, we need to have an identity mapping for the
      desired memory range before we can decompress (and later run) the kernel.
      
      32-bit kernels already have the needed identity mapping. This patch adds
      identity mappings for the needed memory ranges on 64-bit kernels. This
      happens in two possible boot paths:
      
      If loaded via startup_32(), we need to set up the needed identity map.
      
      If loaded from a 64-bit bootloader, the bootloader will have already
      set up an identity mapping, and we'll start via the compressed kernel's
      startup_64(). In this case, the bootloader's page tables need to be
      avoided while selecting the new uncompressed kernel location. If not,
      the decompressor could overwrite them during decompression.
      
      To accomplish this, we could walk the pagetable and find every page
      that is used, and add them to mem_avoid, but this needs extra code and
      will require increasing the size of the mem_avoid array.
      
      Instead, we can create a new set of page tables for our own identity
      mapping instead. The pages for the new page table will come from the
      _pagetable section of the compressed kernel, which means they are
      already contained by in mem_avoid array. To do this, we reuse the code
      from the uncompressed kernel's identity mapping routines.
      
      The _pgtable will be shared by both the 32-bit and 64-bit paths to reduce
      init_size, as now the compressed kernel's _rodata to _end will contribute
      to init_size.
      
      To handle the possible mappings, we need to increase the existing page
      table buffer size:
      
      When booting via startup_64(), we need to cover the old VO, params,
      cmdline and uncompressed kernel. In an extreme case we could have them
      all beyond the 512G boundary, which needs (2+2)*4 pages with 2M mappings.
      And we'll need 2 for first 2M for VGA RAM. One more is needed for level4.
      This gets us to 19 pages total.
      
      When booting via startup_32(), KASLR could move the uncompressed kernel
      above 4G, so we need to create extra identity mappings, which should only
      need (2+2) pages at most when it is beyond the 512G boundary. So 19
      pages is sufficient for this case as well.
      
      The resulting BOOT_*PGT_SIZE defines use the "_SIZE" suffix on their
      names to maintain logical consistency with the existing BOOT_HEAP_SIZE
      and BOOT_STACK_SIZE defines.
      
      This patch is based on earlier patches from Yinghai Lu and Baoquan He.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462572095-11754-4-git-send-email-keescook@chromium.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      3a94707d