1. 12 Jul, 2019 1 commit
  2. 15 May, 2019 1 commit
    • powerpc/mm: Fix crashes with hugepages & 4K pages · 7338874c
      Michael Ellerman authored
      The recent commit to clean up ifdefs in the hugepage initialisation led
      to crashes when using 4K pages, as reported by Sachin:
      
        BUG: Kernel NULL pointer dereference at 0x0000001c
        Faulting instruction address: 0xc000000001d1e58c
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
        ...
        CPU: 3 PID: 4635 Comm: futex_wake04 Tainted: G        W  O      5.1.0-next-20190507-autotest #1
        NIP:  c000000001d1e58c LR: c000000001d1e54c CTR: 0000000000000000
        REGS: c000000004937890 TRAP: 0300
        MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22424822  XER: 00000000
        CFAR: c00000000183e9e0 DAR: 000000000000001c DSISR: 40000000 IRQMASK: 0
        ...
        NIP kmem_cache_alloc+0xbc/0x5a0
        LR  kmem_cache_alloc+0x7c/0x5a0
        Call Trace:
          huge_pte_alloc+0x580/0x950
          hugetlb_fault+0x9a0/0x1250
          handle_mm_fault+0x490/0x4a0
          __do_page_fault+0x77c/0x1f00
          do_page_fault+0x28/0x50
          handle_page_fault+0x18/0x38
      
      This is caused by us trying to allocate from a NULL kmem cache in
      __hugepte_alloc(). The kmem cache is NULL because it was never allocated
      in hugetlbpage_init(): add_huge_page_size() returned an error.
      
      The reason add_huge_page_size() returned an error is a simple typo: we
      are calling check_and_get_huge_psize(size) when we should be passing
      shift instead.
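      A sketch of the implied fix (illustrative rather than the exact diff;
      it assumes add_huge_page_size() derives shift from size and that
      check_and_get_huge_psize() returns a negative value on failure):

        shift = ilog2(size);
        mmu_psize = check_and_get_huge_psize(shift); /* was: ...(size) */
        if (mmu_psize < 0)
                return -EINVAL;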
      
      The fact that we're able to trigger this path when the kmem caches are
      NULL is a separate bug, i.e. we should not advertise any hugepage sizes
      if we haven't set up the required caches for them.
      
      This was only seen with 4K pages, with 64K pages we don't need to
      allocate any extra kmem caches because the 16M hugepage just occupies
      a single entry at the PMD level.
      
      Fixes: 723f268f ("powerpc/mm: cleanup ifdef mess in add_huge_page_size()")
      Reported-by: Sachin Sant <sachinp@linux.ibm.com>
      Tested-by: Sachin Sant <sachinp@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
  3. 06 May, 2019 1 commit
  4. 02 May, 2019 8 commits
  5. 04 Dec, 2018 3 commits
    • powerpc/8xx: Enable 512k hugepage support with HW assistance · 3fb69c6a
      Christophe Leroy authored
      To use 512k pages with hardware assistance, the PTEs have to be spread
      every 128 bytes in the L2 table.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: fix a warning when a cache is common to PGD and hugepages · 1e03c7e2
      Christophe Leroy authored
      While implementing TLB miss HW assistance on the 8xx, the following
      warning was encountered:
      
      [  423.732965] WARNING: CPU: 0 PID: 345 at mm/slub.c:2412 ___slab_alloc.constprop.30+0x26c/0x46c
      [  423.733033] CPU: 0 PID: 345 Comm: mmap Not tainted 4.18.0-rc8-00664-g2dfff9121c55 #671
      [  423.733075] NIP:  c0108f90 LR: c0109ad0 CTR: 00000004
      [  423.733121] REGS: c455bba0 TRAP: 0700   Not tainted  (4.18.0-rc8-00664-g2dfff9121c55)
      [  423.733147] MSR:  00021032 <ME,IR,DR,RI>  CR: 24224848  XER: 20000000
      [  423.733319]
      [  423.733319] GPR00: c0109ad0 c455bc50 c4521910 c60053c0 007080c0 c0011b34 c7fa41e0 c455be30
      [  423.733319] GPR08: 00000001 c00103a0 c7fa41e0 c49afcc4 24282842 10018840 c079b37c 00000040
      [  423.733319] GPR16: 73f00000 00210d00 00000000 00000001 c455a000 00000100 00000200 c455a000
      [  423.733319] GPR24: c60053c0 c0011b34 007080c0 c455a000 c455a000 c7fa41e0 00000000 00009032
      [  423.734190] NIP [c0108f90] ___slab_alloc.constprop.30+0x26c/0x46c
      [  423.734257] LR [c0109ad0] kmem_cache_alloc+0x210/0x23c
      [  423.734283] Call Trace:
      [  423.734326] [c455bc50] [00000100] 0x100 (unreliable)
      [  423.734430] [c455bcc0] [c0109ad0] kmem_cache_alloc+0x210/0x23c
      [  423.734543] [c455bcf0] [c0011b34] huge_pte_alloc+0xc0/0x1dc
      [  423.734633] [c455bd20] [c01044dc] hugetlb_fault+0x408/0x48c
      [  423.734720] [c455bdb0] [c0104b20] follow_hugetlb_page+0x14c/0x44c
      [  423.734826] [c455be10] [c00e8e54] __get_user_pages+0x1c4/0x3dc
      [  423.734919] [c455be80] [c00e9924] __mm_populate+0xac/0x140
      [  423.735020] [c455bec0] [c00db14c] vm_mmap_pgoff+0xb4/0xb8
      [  423.735127] [c455bf00] [c00f27c0] ksys_mmap_pgoff+0xcc/0x1fc
      [  423.735222] [c455bf40] [c000e0f8] ret_from_syscall+0x0/0x38
      [  423.735271] Instruction dump:
      [  423.735321] 7cbf482e 38fd0008 7fa6eb78 7fc4f378 4bfff5dd 7fe3fb78 4bfffe24 81370010
      [  423.735536] 71280004 41a2ff88 4840c571 4bffff80 <0fe00000> 4bfffeb8 81340010 712a0004
      [  423.735757] ---[ end trace e9b222919a470790 ]---
      
      This warning occurs when calling kmem_cache_zalloc() on a cache having
      a constructor.

      In this case it happens because the PGD cache and the 512k hugepte cache
      are the same size (4k). While a cache with a constructor is created for
      the PGD, the hugepage code creates its cache without a constructor and
      uses kmem_cache_zalloc(). As both request a cache of the same size, the
      hugepage code reuses the cache created for the PGD, hence the conflict.
      
      In order to avoid this conflict, this patch:
      - modifies pgtable_cache_add() so that a zeroising constructor is
      added for any cache size.
      - replaces calls to kmem_cache_zalloc() with kmem_cache_alloc()
      (see the sketch below).
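      A rough sketch of the approach (pgtable_ctor and PGT_TABLE_SIZE are
      illustrative names, standing in for the per-size constructor and the
      size the cache was created with):

        /* every pgtable cache now gets a zeroising constructor... */
        static void pgtable_ctor(void *addr)
        {
                memset(addr, 0, PGT_TABLE_SIZE);
        }

        /* ...so callers use kmem_cache_alloc(); kmem_cache_zalloc() is what
         * warns on a cache that has a constructor */
        new = kmem_cache_alloc(PGT_CACHE(shift), GFP_KERNEL);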
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: replace hugetlb_cache by PGT_CACHE(PTE_T_ORDER) · 03566562
      Christophe Leroy authored
      Instead of open-coding cache handling for the special case of hugepage
      tables having a single pte_t element, this patch makes use of the common
      pgtable_cache helpers.
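      A sketch of what the allocation site becomes (assuming PTE_T_ORDER
      indexes the common cache whose objects hold a single pte_t):

        /* allocate the one-element hugepage table from the shared caches */
        new = kmem_cache_alloc(PGT_CACHE(PTE_T_ORDER), GFP_KERNEL);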
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  6. 09 Nov, 2018 1 commit
    • powerpc: Convert hugepd_free() to use call_rcu() · 04229110
      Paul E. McKenney authored
      Now that call_rcu()'s callback is not invoked until after all
      preempt-disable regions of code have completed (in addition to explicitly
      marked RCU read-side critical sections), call_rcu() can be used in place
      of call_rcu_sched().  This commit therefore makes that change.
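      The conversion itself is one line; a sketch, assuming the existing
      rcu_head field and callback are named batch->rcu and
      hugepd_free_rcu_callback():

        /* was: call_rcu_sched(&batch->rcu, hugepd_free_rcu_callback); */
        call_rcu(&batch->rcu, hugepd_free_rcu_callback);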
      Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: <linuxppc-dev@lists.ozlabs.org>
  7. 31 Oct, 2018 1 commit
  8. 03 Oct, 2018 3 commits
    • powerpc/mm: Don't report hugepage tables as memory leaks when using kmemleak · 803d690e
      Christophe Leroy authored
      When a process allocates a hugepage, the following leak is reported by
      kmemleak. This is a false positive caused by the pointer to the table
      being stored in the PGD as a physical address rather than a virtual
      pointer.
      
      unreferenced object 0xc30f8200 (size 512):
        comm "mmap", pid 374, jiffies 4872494 (age 627.630s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<e32b68da>] huge_pte_alloc+0xdc/0x1f8
          [<9e0df1e1>] hugetlb_fault+0x560/0x8f8
          [<7938ec6c>] follow_hugetlb_page+0x14c/0x44c
          [<afbdb405>] __get_user_pages+0x1c4/0x3dc
          [<b8fd7cd9>] __mm_populate+0xac/0x140
          [<3215421e>] vm_mmap_pgoff+0xb4/0xb8
          [<c148db69>] ksys_mmap_pgoff+0xcc/0x1fc
          [<4fcd760f>] ret_from_syscall+0x0/0x38
      
      See commit a984506c ("powerpc/mm: Don't report PUDs as memory leaks
      when using kmemleak") for a detailed explanation.
      
      To fix that, this patch tells kmemleak to ignore the allocated
      hugepage table.
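      A minimal sketch of the fix at the allocation site (cachep and new
      stand in for the local variable names):

        new = kmem_cache_alloc(cachep, GFP_KERNEL);

        /* Only the physical address of this table will be stored (in the
         * PGD), and kmemleak cannot follow physical addresses, so tell it
         * not to track this object. */
        if (new)
                kmemleak_ignore(new);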
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/book3s: Check for pmd_large instead of pmd_trans_huge · ae28f17b
      Aneesh Kumar K.V authored
      Update a few code paths to check for pmd_large().
      
      set_pmd_at:
      We want to use this to store a swap pte at the pmd level. For swap ptes
      we don't want to set H_PAGE_THP_HUGE. Hence check for pmd_large() in
      set_pmd_at(). This removes the false WARN_ON when using this with a
      swap pmd entry.
      
      pmd_page:
      We don't really use it on pmd migration entries, but it can also work
      with migration entries, and we don't differentiate at the pte level.
      Hence update pmd_page() to work with pmd migration entries too.
      
      __find_linux_pte:
      The lockless page table walk needs to handle pmd migration entries; the
      pmd_trans_huge() check returns false on them. We don't set thp = 1 for
      such entries, but we do update hpage_shift correctly. Without this we
      would walk pmd migration entries as a pte page pointer, which is wrong.
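      A sketch of the resulting check in the lockless walk (a pmd migration
      entry is pmd_large() but not pmd_trans_huge(), so the huge-entry
      handling keys off pmd_large(); pmdp/is_thp/hpage_shift are the
      walker's existing variables):

        if (pmd_large(pmd)) {
                if (is_thp)
                        *is_thp = pmd_trans_huge(pmd); /* false for migration entries */
                if (hpage_shift)
                        *hpage_shift = PMD_SHIFT;      /* still reported correctly */
                return (pte_t *)pmdp;
        }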
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/hugetlb/book3s: add _PAGE_PRESENT to hugepd pointer. · f1981b5b
      Aneesh Kumar K.V authored
      This makes the hugetlb directory pointer similar to other page table
      entries. A hugepd entry is identified by the lack of the _PAGE_PTE bit
      and by the directory size stored in HUGEPD_SHIFT_MASK. We update that to
      also look at _PAGE_PRESENT.
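      In other words, hugepd_ok() roughly becomes (a sketch, not the exact
      code):

        static inline int hugepd_ok(hugepd_t hpd)
        {
                /* a hugepd pointer: present, but not a leaf PTE */
                return (hpd_val(hpd) & _PAGE_PRESENT) &&
                       !(hpd_val(hpd) & _PAGE_PTE);
        }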
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  9. 19 Jul, 2018 1 commit
  10. 16 Jul, 2018 1 commit
  11. 19 Jun, 2018 1 commit
  12. 03 Jun, 2018 1 commit
    • powerpc/mm/hugetlb: Update hugetlb related locks · ed515b68
      Aneesh Kumar K.V authored
      With split pmd page table lock enabled, we don't use mm->page_table_lock
      when updating pmd entries. This patch updates the hugetlb path to use the
      right lock when inserting huge page directory entries into the page table.

      For example, if we are using hugepd and inserting a hugepd entry at the
      pmd level, we use pmd_lockptr(), which depending on config can be the
      split pmd lock.

      For updating the huge page directory entries themselves we use
      mm->page_table_lock. We do have a helper huge_pte_lockptr() for that.
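      A sketch of the helper's logic (close to, though not necessarily
      identical to, the in-tree version):

        /* pick the lock matching the level the huge page entry lives at */
        static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
                                                   struct mm_struct *mm,
                                                   pte_t *pte)
        {
                if (huge_page_size(h) == PMD_SIZE)
                        return pmd_lockptr(mm, (pmd_t *)pte); /* split pmd lock */
                return &mm->page_table_lock;
        }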
      
      Fixes: 675d9952 ("powerpc/book3s64: Enable split pmd ptlock")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  13. 03 May, 2018 1 commit
    • powerpc/fadump: Do not use hugepages when fadump is active · 85975387
      Hari Bathini authored
      The FADump capture kernel boots in a restricted memory environment,
      preserving the context of the previous kernel in order to save the
      vmcore. Supporting hugepages in such an environment makes things
      unnecessarily complicated, as hugepages need memory set aside for them,
      meaning most of the capture kernel's memory would go to supporting
      hugepages. In most cases this results in out-of-memory issues while
      booting the FADump capture kernel. But hugepages are not of much use in
      the capture kernel, whose only job is to save the vmcore. So disabling
      hugepage support when fadump is active is a reliable solution to the
      out-of-memory issues. Introduce a flag variable to disable HugeTLB
      support when fadump is active.
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  14. 06 Apr, 2018 1 commit
    • mm, powerpc: use vma_kernel_pagesize() in vma_mmu_pagesize() · 09135cc5
      Dan Williams authored
      Patch series "mm, smaps: MMUPageSize for device-dax", v3.
      
      Similar to commit 31383c68 ("mm, hugetlbfs: introduce ->split() to
      vm_operations_struct") here is another occasion where we want
      special-case hugetlbfs/hstate enabling to also apply to device-dax.
      
      This prompts the question what other hstate conversions we might do
      beyond ->split() and ->pagesize(), but this appears to be the last of
      the usages of hstate_vma() in generic/non-hugetlbfs specific code paths.
      
      This patch (of 3):
      
      The current powerpc definition of vma_mmu_pagesize() open codes looking
      up the page size via hstate.  It is identical to the generic
      vma_kernel_pagesize() implementation.
      
      Now, vma_kernel_pagesize() is growing support for determining the page
      size of Device-DAX vmas in addition to the existing Hugetlbfs page size
      determination.
      
      Ideally, if the powerpc vma_mmu_pagesize() used vma_kernel_pagesize() it
      would automatically benefit from any new vma-type support that is added
      to vma_kernel_pagesize().  However, the powerpc vma_mmu_pagesize() is
      prevented from calling vma_kernel_pagesize() due to a circular header
      dependency that requires vma_mmu_pagesize() to be defined before
      including <linux/hugetlb.h>.
      
      Break this circular dependency by defining the default vma_mmu_pagesize()
      as a __weak symbol to be overridden by the powerpc version.
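      Concretely, the generic side looks roughly like this, leaving powerpc
      free to provide a strong definition:

        /* mm/hugetlb.c: default implementation, overridable per-arch */
        unsigned long __weak vma_mmu_pagesize(struct vm_area_struct *vma)
        {
                return vma_kernel_pagesize(vma);
        }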
      
      Link: http://lkml.kernel.org/r/151996254179.27922.2213728278535578744.stgit@dwillia2-desk3.amr.corp.intel.com
      
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Jane Chu <jane.chu@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 04 Apr, 2018 1 commit
  16. 13 Mar, 2018 1 commit
  17. 05 Mar, 2018 1 commit
    • powerpc/mm/slice: Fix hugepage allocation at hint address on 8xx · aa0ab02b
      Christophe Leroy authored
      On the 8xx, the page size is set in the PMD entry and applies to
      all pages of the page table pointed to by the said PMD entry.

      When an app has some regular pages allocated (e.g. see below) and tries
      to mmap() a huge page at a hint address covered by the same PMD entry,
      the kernel accepts the hint although the 8xx cannot handle different
      page sizes in the same PMD entry.
      
      10000000-10001000 r-xp 00000000 00:0f 2597 /root/malloc
      10010000-10011000 rwxp 00000000 00:0f 2597 /root/malloc
      
      mmap(0x10080000, 524288, PROT_READ|PROT_WRITE,
           MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x10080000
      
      This results in the app remaining forever in do_page_fault()/hugetlb_fault()
      and, when interrupting that app, we get the following warning:
      
      [162980.035629] WARNING: CPU: 0 PID: 2777 at arch/powerpc/mm/hugetlbpage.c:354 hugetlb_free_pgd_range+0xc8/0x1e4
      [162980.035699] CPU: 0 PID: 2777 Comm: malloc Tainted: G W       4.14.6 #85
      [162980.035744] task: c67e2c00 task.stack: c668e000
      [162980.035783] NIP:  c000fe18 LR: c00e1eec CTR: c00f90c0
      [162980.035830] REGS: c668fc20 TRAP: 0700   Tainted: G W        (4.14.6)
      [162980.035854] MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 24044224 XER: 20000000
      [162980.036003]
      [162980.036003] GPR00: c00e1eec c668fcd0 c67e2c00 00000010 c6869410 10080000 00000000 77fb4000
      [162980.036003] GPR08: ffff0001 0683c001 00000000 ffffff80 44028228 10018a34 00004008 418004fc
      [162980.036003] GPR16: c668e000 00040100 c668e000 c06c0000 c668fe78 c668e000 c6835ba0 c668fd48
      [162980.036003] GPR24: 00000000 73ffffff 74000000 00000001 77fb4000 100fffff 10100000 10100000
      [162980.036743] NIP [c000fe18] hugetlb_free_pgd_range+0xc8/0x1e4
      [162980.036839] LR [c00e1eec] free_pgtables+0x12c/0x150
      [162980.036861] Call Trace:
      [162980.036939] [c668fcd0] [c00f0774] unlink_anon_vmas+0x1c4/0x214 (unreliable)
      [162980.037040] [c668fd10] [c00e1eec] free_pgtables+0x12c/0x150
      [162980.037118] [c668fd40] [c00eabac] exit_mmap+0xe8/0x1b4
      [162980.037210] [c668fda0] [c0019710] mmput.part.9+0x20/0xd8
      [162980.037301] [c668fdb0] [c001ecb0] do_exit+0x1f0/0x93c
      [162980.037386] [c668fe00] [c001f478] do_group_exit+0x40/0xcc
      [162980.037479] [c668fe10] [c002a76c] get_signal+0x47c/0x614
      [162980.037570] [c668fe70] [c0007840] do_signal+0x54/0x244
      [162980.037654] [c668ff30] [c0007ae8] do_notify_resume+0x34/0x88
      [162980.037744] [c668ff40] [c000dae8] do_user_signal+0x74/0xc4
      [162980.037781] Instruction dump:
      [162980.037821] 7fdff378 81370000 54a3463a 80890020 7d24182e 7c841a14 712a0004 4082ff94
      [162980.038014] 2f890000 419e0010 712a0ff0 408200e0 <0fe00000> 54a9000a 7f984840 419d0094
      [162980.038216] ---[ end trace c0ceeca8e7a5800a ]---
      [162980.038754] BUG: non-zero nr_ptes on freeing mm: 1
      [162985.363322] BUG: non-zero nr_ptes on freeing mm: -1
      
      In order to fix this, this patch uses the address space "slices"
      implemented for BOOK3S/64 and enhanced to support PPC32 by the
      preceding patch.

      This patch modifies the context.id on the 8xx to be in the range
      [1:16] instead of [0:15] in order to identify context.id == 0 as
      an uninitialised context, as done on BOOK3S.

      This patch activates CONFIG_PPC_MM_SLICES when CONFIG_HUGETLB_PAGE is
      selected for the 8xx.

      Although we could in theory have as many slices as PMD entries, the
      current slice implementation limits the number of low slices to 16.
      This limitation does not prevent us from fixing the initial issue,
      although it is suboptimal. It will be cured in a subsequent patch.
      
      Fixes: 4b914286 ("powerpc/8xx: Implement support of hugepages")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  18. 19 Jan, 2018 2 commits
  19. 16 Jan, 2018 1 commit
    • powerpc/8xx: Remove _PAGE_USER and handle user access at PMD level · de0f9387
      Christophe Leroy authored
      As the Linux kernel separates the KERNEL and USER address spaces, there
      is no need to flag USER access at the page level.

      Today the 8xx TLB handlers already handle user access in the L1 entry
      through Access Protection Groups, so it is natural to move user access
      handling to the PMD level, now that _PAGE_NA allows handling PAGE_NONE
      protection without _PAGE_USER.

      In the meantime, as we free up one bit in the PTE, we can use it to
      include SPS (the page size flag) in the PTE and avoid handling it at
      every TLB miss, hence removing the special handling based on compiled
      page size.

      For _PAGE_EXEC, we rework it to use the PP PTE bits, avoiding the copy
      of the _PAGE_EXEC bit into the L1 entry. Unfortunately we are not able
      to put it at the correct location as it conflicts with the NA/RO/RW
      bits for data entries.

      The upper bits of APG in the L1 entry overlap with the PMD base address.
      In order to avoid having to filter that out, we set up all groups so
      that the upper bits can have any value.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  20. 22 Dec, 2017 1 commit
  21. 16 Nov, 2017 1 commit
  22. 23 Aug, 2017 1 commit
  23. 17 Aug, 2017 1 commit
    • powerpc/mm: Rename find_linux_pte_or_hugepte() · 94171b19
      Aneesh Kumar K.V authored
      Add newer helpers to make the function usage simpler. It is always
      recommended to use find_current_mm_pte() for walking the page table.
      If we cannot use find_current_mm_pte(), it should be documented why
      the said usage of __find_linux_pte() is safe against a parallel THP
      split.
      
      For now we have KVM code using __find_linux_pte(). This is because kvm
      code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but
      with PACA soft_enabled = 1. We may want to fix that later and make
      sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that
      we can switch kvm to use find_linux_pte().
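      A sketch of the new helper (the exact warning text is illustrative):

        static inline pte_t *find_current_mm_pte(pgd_t *pgdir, unsigned long ea,
                                                 bool *is_thp, unsigned *hshift)
        {
                /* lockless walks are only safe with interrupts disabled,
                 * and only against the current mm */
                VM_WARN(!arch_irqs_disabled(), "irqs enabled in lockless pte walk\n");
                VM_WARN(pgdir != current->mm->pgd, "lockless pte walk on wrong mm\n");
                return __find_linux_pte(pgdir, ea, is_thp, hshift);
        }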
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  24. 16 Aug, 2017 1 commit
    • powerpc/mm/hugetlb: Add support for reserving gigantic huge pages via kernel command line · 79cc38de
      Aneesh Kumar K.V authored
      With commit aa888a74 ("hugetlb: support larger than MAX_ORDER") we added
      support for allocating gigantic hugepages via kernel command line. Switch
      ppc64 arch specific code to use that.
      
      W.r.t. FSL support, we now limit our allocation range using BOOTMEM_ALLOC_ACCESSIBLE.
      
      We use the kernel command line to do the reservation of hugetlb pages on
      powernv platforms. In pseries hash MMU mode the supported gigantic huge
      page size is 16GB, and it can only be allocated with hypervisor
      assistance. For pseries the command line option doesn't do the
      allocation; instead, pseries does gigantic hugepage allocation based on
      a hypervisor hint that is specified via the "ibm,expected#pages"
      property of the memory node.
      
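      For example, on a powernv platform the reservation could be requested
      at boot with something like (size and count illustrative):

        hugepagesz=16G hugepages=2
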
      Cc: Scott Wood <oss@buserror.net>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  25. 15 Aug, 2017 1 commit
    • powerpc/hugetlb: fix page rights verification in gup_hugepte() · ca8afd40
      Christophe Leroy authored
      gup_hugepte() checks if pages are present and readable, and when 'write'
      is set, also checks if the pages are writable.
      
      Initially this was done by checking if _PAGE_PRESENT and
      _PAGE_READ were set. In addition, _PAGE_WRITE was verified for write
      accesses.
      
      The problem is that we have to handle the three following cases:
      1/ The target defines _PAGE_READ and _PAGE_WRITE
      2/ The target defines _PAGE_RW
      3/ The target defines _PAGE_RO

      In case 1/, this is obvious.
      In case 2/, _PAGE_READ is defined as 0 and _PAGE_WRITE as _PAGE_RW,
      so it works as well.
      But in case 3/, _PAGE_RW is defined as 0, which means _PAGE_WRITE is 0,
      and then the test returns true (page writable) in all cases.
      
      A first correction was attempted in commit 6b8cb66a ("powerpc: Fix
      usage of _PAGE_RO in hugepage"), but that fix is wrong:
      instead of checking that the page is writable when write is requested,
      it checks that the page is NOT writable when write is NOT requested.
      
      This patch adds a new pte_read() helper to check whether a page is
      readable or not. This avoids handling all possible cases in
      gup_hugepte().
      
      Then gup_hugepte() is modified to use pte_present(), pte_read()
      and pte_write() instead of the raw flags.
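      The reworked test then reads roughly as follows:

        pte = READ_ONCE(*ptep);

        if (!pte_present(pte) || !pte_read(pte))
                return 0;
        if (write && !pte_write(pte))
                return 0;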
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  26. 06 Jul, 2017 3 commits