- Sep 14, 2017
-
-
Wanpeng Li authored
qemu-system-x86-8600 [004] d..1 7205.687530: kvm_entry: vcpu 2
qemu-system-x86-8600 [004] .... 7205.687532: kvm_exit: reason EXCEPTION_NMI rip 0xffffffffa921297d info ffffeb2c0e44e018 80000b0e
qemu-system-x86-8600 [004] .... 7205.687532: kvm_page_fault: address ffffeb2c0e44e018 error_code 0
qemu-system-x86-8600 [004] .... 7205.687620: kvm_try_async_get_page: gva = 0xffffeb2c0e44e018, gfn = 0x427e4e
qemu-system-x86-8600 [004] .N.. 7205.687628: kvm_async_pf_not_present: token 0x8b002 gva 0xffffeb2c0e44e018
kworker/4:2-7814 [004] .... 7205.687655: kvm_async_pf_completed: gva 0xffffeb2c0e44e018 address 0x7fcc30c4e000
qemu-system-x86-8600 [004] .... 7205.687703: kvm_async_pf_ready: token 0x8b002 gva 0xffffeb2c0e44e018
qemu-system-x86-8600 [004] d..1 7205.687711: kvm_entry: vcpu 2

After running a memory-intensive workload in the guest, I caught the kworker completing the GUP too quickly and queuing a "Page Ready" #PF exception after the "Page not Present" exception but before the next vmentry, as the trace above shows. This results in a #DF being injected into the guest.

This patch fixes it by clearing the queued "Page not Present" exception if "Page Ready" occurs before the next vmentry, since the GUP has already obtained the required page and the shadow page table has already been fixed by the "Page Ready" handler.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by:
Wanpeng Li <wanpeng.li@hotmail.com> Fixes: 7c90705b ("KVM: Inject asynchronous page fault into a PV guest if page is swapped out.") [Changed indentation and added clearing of injected. - Radim] Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Wanpeng Li authored
Don't block the vCPU if there is a pending exception. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by:
Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Suravee Suthikulpanit authored
SVM AVIC hardware accelerates guest writes to the APIC_EOI register (for edge-triggered interrupts), which means they do not trap to KVM. So, enable SVM AVIC only in split irqchip mode (e.g. when launching QEMU with the option '-machine kernel_irqchip=split'). Suggested-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Fixes: 44a95dae ("KVM: x86: Detect and Initialize AVIC support") [Removed pr_debug - Radim.] Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
- Sep 13, 2017
-
-
Suravee Suthikulpanit authored
Modify struct kvm_x86_ops.arch.apicv_active() to take a struct kvm_vcpu pointer as a parameter in preparation for subsequent changes. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Suravee Suthikulpanit authored
Preparing the base code for subsequent changes. This does not change existing logic. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Radim Krčmář authored
Clang resolves __builtin_constant_p() to false even if the expression is constant in the end. The only purpose of that expression was to differentiate a case where the following expression couldn't be checked at compile-time, so we can just remove the check. Clang handles the following two correctly. Turn it into BUG_ON if there are any more problems with this. Fixes: d6321d49 ("KVM: x86: generalize guest_cpuid_has_ helpers") Reported-by:
Dmitry Vyukov <dvyukov@google.com> Reviewed-by:
David Hildenbrand <david@redhat.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Jan H. Schönherr authored
When user space sets kvm_run->immediate_exit, KVM is supposed to return quickly. However, when a vCPU is in KVM_MP_STATE_UNINITIALIZED, the value is not considered and the vCPU blocks. Fix that oversight. Fixes: 460df4c1 ("KVM: race-free exit from KVM_RUN without POSIX signals") Signed-off-by:
Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Jan H. Schönherr authored
KVM API says that KVM_RUN will return with -EINTR when a signal is pending. However, if a vCPU is in KVM_MP_STATE_UNINITIALIZED, then the return value is unconditionally -EAGAIN. Copy over some code from vcpu_run(), so that the case of a pending signal results in the expected return value. Signed-off-by:
Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Jan H. Schönherr authored
Signed-off-by:
Jan H. Schönherr <jschoenh@amazon.de> Fixes: f6511935 ("KVM: SVM: Add checks for IO instructions") Reviewed-by:
David Hildenbrand <david@redhat.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Joerg Roedel authored
The commit 9dd21e104bc ('KVM: x86: simplify handling of PKRU') removed all users and providers of that call-back, but didn't remove it. Remove it now. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
- Sep 12, 2017
-
-
Paul Mackerras authored
Aneesh Kumar reported seeing host crashes when running recent kernels on POWER8. The symptom was an oops like this:

  Unable to handle kernel paging request for data at address 0xf00000000786c620
  Faulting instruction address: 0xc00000000030e1e4
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: powernv_op_panel
  CPU: 24 PID: 6663 Comm: qemu-system-ppc Tainted: G W 4.13.0-rc7-43932-gfc36c59 #2
  task: c000000fdeadfe80 task.stack: c000000fdeb68000
  NIP: c00000000030e1e4 LR: c00000000030de6c CTR: c000000000103620
  REGS: c000000fdeb6b450 TRAP: 0300 Tainted: G W (4.13.0-rc7-43932-gfc36c59)
  MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24044428 XER: 20000000
  CFAR: c00000000030e134 DAR: f00000000786c620 DSISR: 40000000 SOFTE: 0
  GPR00: 0000000000000000 c000000fdeb6b6d0 c0000000010bd000 000000000000e1b0
  GPR04: c00000000115e168 c000001fffa6e4b0 c00000000115d000 c000001e1b180386
  GPR08: f000000000000000 c000000f9a8913e0 f00000000786c600 00007fff587d0000
  GPR12: c000000fdeb68000 c00000000fb0f000 0000000000000001 00007fff587cffff
  GPR16: 0000000000000000 c000000000000000 00000000003fffff c000000fdebfe1f8
  GPR20: 0000000000000004 c000000fdeb6b8a8 0000000000000001 0008000000000040
  GPR24: 07000000000000c0 00007fff587cffff c000000fdec20bf8 00007fff587d0000
  GPR28: c000000fdeca9ac0 00007fff587d0000 00007fff587c0000 00007fff587d0000
  NIP [c00000000030e1e4] __get_user_pages_fast+0x434/0x1070
  LR [c00000000030de6c] __get_user_pages_fast+0xbc/0x1070
  Call Trace:
  [c000000fdeb6b6d0] [c00000000139dab8] lock_classes+0x0/0x35fe50 (unreliable)
  [c000000fdeb6b7e0] [c00000000030ef38] get_user_pages_fast+0xf8/0x120
  [c000000fdeb6b830] [c000000000112318] kvmppc_book3s_hv_page_fault+0x308/0xf30
  [c000000fdeb6b960] [c00000000010e10c] kvmppc_vcpu_run_hv+0xfdc/0x1f00
  [c000000fdeb6bb20] [c0000000000e915c] kvmppc_vcpu_run+0x2c/0x40
  [c000000fdeb6bb40] [c0000000000e5650] kvm_arch_vcpu_ioctl_run+0x110/0x300
  [c000000fdeb6bbe0] [c0000000000d6468] kvm_vcpu_ioctl+0x528/0x900
  [c000000fdeb6bd40] [c0000000003bc04c] do_vfs_ioctl+0xcc/0x950
  [c000000fdeb6bde0] [c0000000003bc930] SyS_ioctl+0x60/0x100
  [c000000fdeb6be30] [c00000000000b96c] system_call+0x58/0x6c
  Instruction dump:
  7ca81a14 2fa50000 41de0010 7cc8182a 68c60002 78c6ffe2 0b060000 3cc2000a
  794a3664 390610d8 e9080000 7d485214 <e90a0020> 7d435378 790507e1 408202f0
  ---[ end trace fad4a342d0414aa2 ]---

It turns out that what has happened is that the SLB entry for the vmemmap region hasn't been reloaded on exit from a guest, and it has the wrong page size. Then, when the host next accesses the vmemmap region, it gets a page fault.

Commit a25bd72b ("powerpc/mm/radix: Workaround prefetch issue with KVM", 2017-07-24) modified the guest exit code so that it now only clears out the SLB for hash guests. The code tests the radix flag and puts the result in a non-volatile CR field, CR2, and later branches based on CR2.

Unfortunately, the kvmppc_save_tm function, which gets called between those two points, modifies all the user-visible registers in the case where the guest was in transactional or suspended state, except for a few which it restores (namely r1, r2, r9 and r13). Thus the hash/radix indication in CR2 gets corrupted.

This fixes the problem by re-doing the comparison just before the result is needed. For good measure, this also adds comments next to the call sites of kvmppc_save_tm and kvmppc_restore_tm pointing out that non-volatile register state will be lost.

Cc: stable@vger.kernel.org # v4.13
Fixes: a25bd72b ("powerpc/mm/radix: Workaround prefetch issue with KVM")
Tested-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
Commit 468808bd ("KVM: PPC: Book3S HV: Set process table for HPT guests on POWER9", 2017-01-30) added a call to kvmppc_update_lpcr() which doesn't hold the kvm->lock mutex around the call, as required. This adds the lock/unlock pair, and for good measure, includes the kvmppc_setup_partition_table() call in the locked region, since it is altering global state of the VM. This error appears not to have any fatal consequences for the host; the consequences would be that the VCPUs could end up running with different LPCR values, or an update to the LPCR value by userspace using the one_reg interface could get overwritten, or the update done by kvmhv_configure_mmu() could get overwritten. Cc: stable@vger.kernel.org # v4.10+ Fixes: 468808bd ("KVM: PPC: Book3S HV: Set process table for HPT guests on POWER9") Signed-off-by:
Paul Mackerras <paulus@ozlabs.org>
-
Benjamin Herrenschmidt authored
The XIVE interrupt controller on POWER9 machines doesn't support byte accesses to any register in the thread management area other than the CPPR (current processor priority register). In particular, when reading the PIPR (pending interrupt priority register), we need to do a 32-bit or 64-bit load. Cc: stable@vger.kernel.org # v4.13 Fixes: 2c4fb78f ("KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss") Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Paul Mackerras <paulus@ozlabs.org>
-
- Sep 07, 2017
-
-
Andy Lutomirski authored
While debugging a problem, I thought that using cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be helpful. It turns out to be counterproductive. Add a comment documenting how this works. Signed-off-by:
Andy Lutomirski <luto@kernel.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Andy Lutomirski authored
When Linux brings a CPU down and back up, it switches to init_mm and then loads swapper_pg_dir into CR3. With PCID enabled, this has the side effect of masking off the ASID bits in CR3. This can result in some confusion in the TLB handling code. If we bring a CPU down and back up with any ASID other than 0, we end up with the wrong ASID active on the CPU after resume. This could cause our internal state to become corrupt, although major corruption is unlikely because init_mm doesn't have any user pages. More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion in the next context switch. The result of *that* is a failure to resume from suspend with probability 1 - 1/6^(cpus-1). Fix it by reinitializing cpu_tlbstate on resume and CPU bringup. Reported-by:
Linus Torvalds <torvalds@linux-foundation.org> Reported-by:
Jiri Kosina <jikos@kernel.org> Fixes: 10af6235 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID") Signed-off-by:
Andy Lutomirski <luto@kernel.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
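
To get a feel for the probability quoted in the message above, 1 - 1/6^(cpus-1), here is a small sketch. It assumes, as the formula suggests, that each of the (cpus - 1) non-boot CPUs independently has a 1-in-6 chance of happening to hold ASID 0 when it goes down. The function name and the `nr_asids` parameter are illustrative and are not kernel code.

```c
#include <assert.h>

/*
 * Hypothetical helper: probability that resume fails, following the
 * formula from the commit message, 1 - 1/nr_asids^(cpus-1).
 * Passing nr_asids = 6 reproduces the "6" used in the message.
 */
double resume_failure_probability(int cpus, int nr_asids)
{
    assert(cpus >= 1 && nr_asids >= 1);

    /* Chance that every non-boot CPU happens to be on ASID 0. */
    double all_ok = 1.0;
    for (int i = 1; i < cpus; i++)
        all_ok /= (double)nr_asids;

    return 1.0 - all_ok;
}
```

Under these assumptions a single-CPU machine never hits the problem, a two-CPU machine fails about 83% of the time, and a four-CPU machine fails about 99.5% of the time.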
-
Rik van Riel authored
Introduce MADV_WIPEONFORK semantics, which result in a VMA being empty in the child process after fork. This differs from MADV_DONTFORK in one important way.

If a child process accesses memory that was MADV_WIPEONFORK, it will get zeroes. The address ranges are still valid, they are just empty.

If a child process accesses memory that was MADV_DONTFORK, it will get a segmentation fault, since those address ranges are no longer valid in the child after fork.

Since MADV_DONTFORK also seems to be used to allow very large programs to fork in systems with strict memory overcommit restrictions, changing the semantics of MADV_DONTFORK might break existing programs.

MADV_WIPEONFORK only works on private, anonymous VMAs.

The use case is libraries that store or cache information, and want to know that they need to regenerate it in the child process after fork.

Examples of this would be:
- systemd/pulseaudio API checks (fail after fork) (replacing a getpid check, which is too slow without a PID cache)
- PKCS#11 API reinitialization check (mandated by specification)
- glibc's upcoming PRNG (reseed after fork)
- OpenSSL PRNG (reseed after fork)

The security benefits of a forking server having a re-initialized PRNG in every child process are pretty obvious. However, due to libraries having all kinds of internal state, and programs getting compiled with many different versions of each library, it is unreasonable to expect calling programs to re-initialize everything manually after fork.

A further complication is the proliferation of clone flags, programs bypassing glibc's functions to call clone directly, and programs calling unshare, causing the glibc pthread_atfork hook to not get called.

It would be better to have the kernel take care of this automatically.

The patch also adds MADV_KEEPONFORK, to undo the effects of a prior MADV_WIPEONFORK.

This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO: https://man.openbsd.org/minherit.2

[akpm@linux-foundation.org: numerically order arch/parisc/include/uapi/asm/mman.h #defines]
Link: http://lkml.kernel.org/r/20170811212829.29186-3-riel@redhat.com
Signed-off-by:
Rik van Riel <riel@redhat.com> Reported-by:
Florian Weimer <fweimer@redhat.com> Reported-by:
Colm MacCártaigh <colm@allcosts.net> Reviewed-by:
Mike Kravetz <mike.kravetz@oracle.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Kees Cook <keescook@chromium.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Cc: <linux-api@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
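
The semantics described in this message can be observed from user space with a short program. The following is only a sketch: `wipeonfork_demo()` is a made-up name, and the fallback value 18 for MADV_WIPEONFORK is the generic uapi value as far as I know (some architectures may use a different number), supplied only for systems whose libc headers predate the flag.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18 /* assumed generic value; some arches may differ */
#endif

/*
 * Hypothetical demo. Returns 1 if the child saw zeroes while the parent
 * kept its data, 0 if the child still saw the parent's data, and -1 if
 * the kernel rejected MADV_WIPEONFORK or another call failed.
 */
int wipeonfork_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    /* MADV_WIPEONFORK only works on private, anonymous mappings. */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);
    if (madvise(p, len, MADV_WIPEONFORK) != 0) {
        munmap(p, len); /* e.g. EINVAL on kernels without the flag */
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, len);
        return -1;
    }
    if (pid == 0) /* child: the range is still mapped, but should read as zero */
        _exit(p[0] == 0 && p[len - 1] == 0 ? 0 : 1);

    int status = 0;
    int waited = (waitpid(pid, &status, 0) == pid);
    int parent_intact = (p[0] == 0xAA);
    munmap(p, len);

    if (!waited || !WIFEXITED(status) || !parent_intact)
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

A library that caches secrets or PRNG state could mark its buffer this way and treat an all-zero buffer as the signal to reinitialize in the child.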
-
Rik van Riel authored
Patch series "mm,fork,security: introduce MADV_WIPEONFORK", v4.

If a child process accesses memory that was MADV_WIPEONFORK, it will get zeroes. The address ranges are still valid, they are just empty.

If a child process accesses memory that was MADV_DONTFORK, it will get a segmentation fault, since those address ranges are no longer valid in the child after fork.

Since MADV_DONTFORK also seems to be used to allow very large programs to fork in systems with strict memory overcommit restrictions, changing the semantics of MADV_DONTFORK might break existing programs.

The use case is libraries that store or cache information, and want to know that they need to regenerate it in the child process after fork.

Examples of this would be:
- systemd/pulseaudio API checks (fail after fork) (replacing a getpid check, which is too slow without a PID cache)
- PKCS#11 API reinitialization check (mandated by specification)
- glibc's upcoming PRNG (reseed after fork)
- OpenSSL PRNG (reseed after fork)

The security benefits of a forking server having a re-initialized PRNG in every child process are pretty obvious. However, due to libraries having all kinds of internal state, and programs getting compiled with many different versions of each library, it is unreasonable to expect calling programs to re-initialize everything manually after fork.

A further complication is the proliferation of clone flags, programs bypassing glibc's functions to call clone directly, and programs calling unshare, causing the glibc pthread_atfork hook to not get called.

It would be better to have the kernel take care of this automatically.

The patchset also adds MADV_KEEPONFORK, to undo the effects of a prior MADV_WIPEONFORK.

This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO: https://man.openbsd.org/minherit.2

This patch (of 2):

MPX only seems to be available on 64 bit CPUs, starting with Skylake and Goldmont. Move VM_MPX into the 64 bit only portion of vma->vm_flags, in order to free up a VMA flag.

Link: http://lkml.kernel.org/r/20170811212829.29186-2-riel@redhat.com
Signed-off-by:
Rik van Riel <riel@redhat.com> Acked-by:
Dave Hansen <dave.hansen@intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Matthew Wilcox <willy@infradead.org> Cc: Colm MacCártaigh <colm@allcosts.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Mike Kravetz authored
A non-default huge page size can be encoded in the flags argument of the mmap system call. The definitions for these encodings are in arch specific header files. However, all architectures use the same values. Consolidate all the definitions in the primary user header file (uapi/linux/mman.h). Include definitions for all known huge page sizes. Use the generic encoding definitions in hugetlb_encode.h as the basis for these definitions. Link: http://lkml.kernel.org/r/1501527386-10736-3-git-send-email-mike.kravetz@oracle.com Signed-off-by:
Mike Kravetz <mike.kravetz@oracle.com> Acked-by:
Michal Hocko <mhocko@suse.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
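
As an illustration of the encoding this message refers to: to my understanding, the huge page size is passed as log2(size) in a 6-bit field starting at bit 26 of the mmap flags, which is what the MAP_HUGE_SHIFT and MAP_HUGE_MASK definitions in the uapi headers describe. The sketch below uses its own `EX_`/`ex_` names, which are hypothetical, so it does not depend on any particular header version.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; the real definitions live in the uapi headers. */
#define EX_HUGE_SHIFT 26u
#define EX_HUGE_MASK  0x3fu

/* Encode log2(page size) into the flag bits passed to mmap(). */
uint32_t ex_huge_encode(unsigned int log2_size)
{
    assert(log2_size <= EX_HUGE_MASK);
    return (uint32_t)(log2_size & EX_HUGE_MASK) << EX_HUGE_SHIFT;
}

/* Recover log2(page size) from a flags word. */
unsigned int ex_huge_decode(uint32_t flags)
{
    return (flags >> EX_HUGE_SHIFT) & EX_HUGE_MASK;
}
```

With this scheme a 2 MB page (2^21 bytes) encodes as 21 << 26 and a 1 GB page (2^30 bytes) as 30 << 26. Because the value depends only on the size, every architecture can share one set of definitions.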
-
Dou Liyang authored
Commit a7be6e5a ("mm: drop useless local parameters of __register_one_node()") removes the last user of parent_node(). The parent_node() macro in METAG architecture is unnecessary. Remove it for cleanup. Link: http://lkml.kernel.org/r/1501076076-1974-4-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by:
Dou Liyang <douly.fnst@cn.fujitsu.com> Reported-by:
Michael Ellerman <mpe@ellerman.id.au> Cc: James Hogan <james.hogan@imgtec.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Sep 05, 2017
-
-
Christoffer Dall authored
As we are about to access the APRs from the GICv2 uaccess interface, make this logic generally available. Reviewed-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <cdall@linaro.org>
-
James Morse authored
The ARM-ARM has two bits in the ESR/HSR relevant to external aborts. A range of {I,D}FSC values (of which bit 5 is always set) and bit 9 'EA' which provides:

> an IMPLEMENTATION DEFINED classification of External Aborts.

This bit is in addition to the {I,D}FSC range, and has an implementation defined meaning. KVM should always ignore this bit when handling external aborts from a guest.

Remove the ESR_ELx_EA definition and rewrite its helper kvm_vcpu_dabt_isextabt() to check the {I,D}FSC range. This merges kvm_vcpu_dabt_isextabt() and the recently added is_abort_sea() helper.

CC: Tyler Baicar <tbaicar@codeaurora.org>
Reported-by:
gengdongjiu <gengdj.1984@gmail.com> Signed-off-by:
James Morse <james.morse@arm.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <cdall@linaro.org>
-
- Sep 04, 2017
-
-
Ben Hutchings authored
Commit 00fc0e0d ("alpha: move exports to actual definitions") also removed the exports of the math emulator hooks, which are defined in C code. In case anyone cares about the option of CONFIG_MATHEMU=m, add exports next to those definitions. Also add a MODULE_LICENSE. Fixes: 00fc0e0d ("alpha: move exports to actual definitions") Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Ben Hutchings authored
Add <asm/asm-prototypes.h> so that genksyms knows the types of these symbols and can generate CRCs for them. Fixes: 00fc0e0d ("alpha: move exports to actual definitions") Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Krzysztof Kozlowski authored
Remove old, dead Kconfig options (in order appearing in this commit):
- IP_NF_QUEUE: commit 3dd6664f ("netfilter: remove unused "config IP_NF_QUEUE"");
- AUTOFS_FS: commit 561c5cf9 ("staging: Remove autofs3");

Signed-off-by:
Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Geliang Tang authored
Use kobj_to_dev() instead of open-coding it. Signed-off-by:
Geliang Tang <geliangtang@163.com> Signed-off-by:
Matt Turner <mattst88@gmail.com>
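
For readers unfamiliar with the helper: kobj_to_dev() wraps the container_of() pattern, which recovers a pointer to an enclosing structure from a pointer to one of its members. The user-space sketch below mimics that pattern with made-up `ex_` types. It is not the kernel's definition, only an illustration of what "open-coding it" means.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types, for illustration only. */
struct ex_kobject { const char *name; };
struct ex_device  { int id; struct ex_kobject kobj; };

/* Step back from a member pointer to the start of the enclosing struct. */
#define ex_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The helper: one named function instead of repeating the macro. */
static inline struct ex_device *ex_kobj_to_dev(struct ex_kobject *kobj)
{
    return ex_container_of(kobj, struct ex_device, kobj);
}
```

Call sites then read `ex_kobj_to_dev(kobj)` rather than spelling out the container_of() expression each time.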
-
Masahiro Yamada authored
Remove unneeded variables and assignments. While we are here, fix the coding style of SMC37c669_read_config():
- replace whitespaces at the start of lines with tabs
- remove unneeded whitespaces around parentheses

Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Shyam Saini authored
Replace explicit computation of vma page count by a call to vma_pages() Signed-off-by:
Shyam Saini <mayhs11saini@gmail.com> Signed-off-by:
Matt Turner <mattst88@gmail.com>
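
The explicit computation that vma_pages() replaces is the usual "length in bytes shifted down by the page shift". A minimal user-space sketch follows, with a made-up `ex_vma` type and an assumed page shift of 12 (4 KiB pages) chosen purely for the example; the real value is architecture-specific.

```c
#include <assert.h>

#define EX_PAGE_SHIFT 12 /* assumption for this example: 4 KiB pages */

/* Stand-in for the two vm_area_struct fields that matter here. */
struct ex_vma {
    unsigned long vm_start; /* first byte of the mapping */
    unsigned long vm_end;   /* first byte after the mapping */
};

/* Number of pages spanned by the mapping. */
static inline unsigned long ex_vma_pages(const struct ex_vma *vma)
{
    assert(vma->vm_end >= vma->vm_start);
    return (vma->vm_end - vma->vm_start) >> EX_PAGE_SHIFT;
}
```

Using the helper keeps the shift in one place and makes the intent of the call site obvious.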
-
Dan Carpenter authored
We check that "member" is in bounds for the first line, but we also use it on the next line without checking which is a mistake. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Julia Cartwright authored
The alpha/marvel code currently implements an irq_chip for handling interrupts; due to how irq_chip handling is done, it's necessary for the irq_chip methods to be invoked from hardirq context, even on a real-time kernel. Because the spinlock_t type becomes a "sleeping" spinlock w/ RT kernels, it is not suitable to be used with irq_chips. A quick audit of the operations under the lock reveals that they do only minimal, bounded work, and are therefore safe to do under a raw spinlock. Signed-off-by:
Julia Cartwright <julia@ni.com> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Sergei Trofimovich authored
__NR_sys_epoll_create and friends are alpha-specific while __NR_epoll_create is a generic name for other arches. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: linux-alpha@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Tobias Klauser authored
The arch uses a verbatim copy of the asm-generic version and does not add any implementations of its own to the header, so use asm-generic/fb.h instead of duplicating code. Signed-off-by:
Tobias Klauser <tklauser@distanz.ch> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Wolfram Sang authored
include/linux/i2c is not for client devices. Move the header file to a more appropriate location. Signed-off-by:
Wolfram Sang <wsa@the-dreams.de> Acked-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by:
Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by:
Mark Brown <broonie@kernel.org> Acked-by:
Sebastian Reichel <sebastian.reichel@collabora.co.uk> Acked-by:
Jonathan Cameron <jic23@kernel.org> Acked-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by:
Kishon Vijay Abraham I <kishon@ti.com> Acked-by:
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Acked-by:
Thierry Reding <thierry.reding@gmail.com> Acked-by:
Tony Lindgren <tony@atomide.com> Acked-by:
Daniel Thompson <daniel.thompson@linaro.org> Acked-by:
Linus Walleij <linus.walleij@linaro.org> Acked-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Lee Jones <lee.jones@linaro.org>
-
Varsha Rao authored
This patch removes CONFIG_NETFILTER_DEBUG and _ASSERT() macros as they are no longer required. Replace _ASSERT() macros with WARN_ON(). Signed-off-by:
Varsha Rao <rvarsha016@gmail.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Cédric Le Goater authored
xive_spapr_init() is called from a __init routine and calls __init routines. Signed-off-by:
Cédric Le Goater <clg@kaod.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Paul Mackerras authored
Commit 350779a2 ("powerpc: Handle most loads and stores in instruction emulation code", 2017-08-30) changed the register usage in get_vr and put_vr with the aim of leaving the register number in r3 untouched on return. Unfortunately, r6 was not a good choice, as the callers as of 350779a2 store a MSR value in r6. Then, in commit c22435a5 ("powerpc: Emulate FP/vector/VSX loads/stores correctly when regs not live", 2017-08-30), the saving and restoring of the MSR got moved into get_vr and put_vr. Either way, the effect is that we put a value in MSR that only has the 0x3f8 bits non-zero, meaning that we are switching to 32-bit mode. That leads to a crash like this:

  Unable to handle kernel paging request for instruction fetch
  Faulting instruction address: 0x0007bea0
  Oops: Kernel access of bad area, sig: 11 [#12]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: vmx_crypto binfmt_misc ip_tables x_tables autofs4 crc32c_vpmsum
  CPU: 6 PID: 32659 Comm: trashy_testcase Tainted: G D 4.13.0-rc2-00313-gf3026f57e6ed-dirty #23
  task: c000000f1bb9e780 task.stack: c000000f1ba98000
  NIP: 000000000007bea0 LR: c00000000007b054 CTR: c00000000007be70
  REGS: c000000f1ba9b960 TRAP: 0400 Tainted: G D (4.13.0-rc2-00313-gf3026f57e6ed-dirty)
  MSR: 10000000400010a1 <HV,ME,IR,LE> CR: 48000228 XER: 00000000
  CFAR: c00000000007be74 SOFTE: 1
  GPR00: c00000000007b054 c000000f1ba9bbe0 c000000000e6e000 000000000000001d
  GPR04: c000000f1ba9bc00 c00000000007be70 00000000000000e8 9000000002009033
  GPR08: 0000000002000000 100000000282f033 000000000b0a0900 0000000000001009
  GPR12: 0000000000000000 c00000000fd42100 0706050303020100 a5a5a5a5a5a5a5a5
  GPR16: 2e2e2e2e2e2de70c 2e2e2e2e2e2e2e2d 0000000000ff00ff 0606040202020000
  GPR20: 000000000000005b ffffffffffffffff 0000000003020100 0000000000000000
  GPR24: c000000f1ab90020 c000000f1ba9bc00 0000000000000001 0000000000000001
  GPR28: c000000f1ba9bc90 c000000f1ba9bea0 000000000b0a0908 0000000000000001
  NIP [000000000007bea0] 0x7bea0
  LR [c00000000007b054] emulate_loadstore+0x1044/0x1280
  Call Trace:
  [c000000f1ba9bbe0] [c000000000076b80] analyse_instr+0x60/0x34f0 (unreliable)
  [c000000f1ba9bc70] [c00000000007b7ec] emulate_step+0x23c/0x544
  [c000000f1ba9bce0] [c000000000053424] arch_uprobe_skip_sstep+0x24/0x40
  [c000000f1ba9bd00] [c00000000024b2f8] uprobe_notify_resume+0x598/0xba0
  [c000000f1ba9be00] [c00000000001c284] do_notify_resume+0xd4/0xf0
  [c000000f1ba9be30] [c00000000000bd44] ret_from_except_lite+0x70/0x74
  Instruction dump:
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  ---[ end trace a7ae7a7f3e0256b5 ]---

To fix this, we just revert to using r3 as before, since the callers don't rely on r3 being left unmodified.

Fortunately, this can't be triggered by a misaligned load or store, because vector loads and stores truncate misaligned addresses rather than taking an alignment interrupt. It can be triggered using uprobes.

Fixes: 350779a2 ("powerpc: Handle most loads and stores in instruction emulation code")
Reported-by:
Anton Blanchard <anton@ozlabs.org> Signed-off-by:
Paul Mackerras <paulus@ozlabs.org> Tested-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
- Sep 02, 2017
-
-
Cédric Le Goater authored
Having the CPU identifier in the debug logs is helpful when tracking issues. Also add some more logging and fix a compile issue in xive_do_source_eoi(). Signed-off-by:
Cédric Le Goater <clg@kaod.org> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Cédric Le Goater authored
On POWER9, the Client Architecture Support (CAS) negotiation process determines whether the guest operates in XIVE Legacy compatibility or in XIVE exploitation mode. Now that we have initial guest support for the XIVE interrupt controller, let's inform the hypervisor what we can do.

The platform advertises the XIVE Exploitation Mode support using the property "ibm,arch-vec-5-platform-support-vec-5", byte 23 bits 0-1:
- 0b00 XIVE legacy mode Only
- 0b01 XIVE exploitation mode Only
- 0b10 XIVE legacy or exploitation mode

The OS asks for XIVE Exploitation Mode support using the property "ibm,architecture-vec-5", byte 23 bits 0-1:
- 0b00 XIVE legacy mode Only
- 0b01 XIVE exploitation mode Only

Signed-off-by:
Cédric Le Goater <clg@kaod.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
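
The two-bit field described in this message can be decoded with a couple of shifts. The sketch below assumes IBM (MSB-0) bit numbering, in which "bits 0-1" are the two most significant bits of byte 23. That numbering and all `ex_`/`EX_` names are assumptions made for illustration, not code taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values of the two-bit field as listed in the commit message. */
enum ex_xive_mode {
    EX_XIVE_LEGACY  = 0, /* 0b00: XIVE legacy mode only */
    EX_XIVE_EXPLOIT = 1, /* 0b01: XIVE exploitation mode only */
    EX_XIVE_EITHER  = 2  /* 0b10: legacy or exploitation (platform side) */
};

/* Assumes MSB-0 numbering: bits 0-1 are the top two bits of the byte. */
unsigned int ex_xive_mode_from_byte23(uint8_t byte23)
{
    return (byte23 >> 6) & 0x3u;
}

/* Inverse: place a mode value into the top two bits of byte 23. */
uint8_t ex_xive_byte23_from_mode(unsigned int mode)
{
    assert(mode <= EX_XIVE_EITHER);
    return (uint8_t)((mode & 0x3u) << 6);
}
```

Under that assumption, a byte value of 0x40 means "exploitation mode only" and 0x80 means "legacy or exploitation".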
-
Cédric Le Goater authored
The H_INT_ESB hcall() is used to issue a load or store to the ESB page instead of using the MMIO pages. This can be used as a workaround on some HW issues. The OS knows that this hcall should be used on an interrupt source when the ESB hcall flag is set to 1 in the hcall H_INT_GET_SOURCE_INFO. To maintain the frontier between the xive frontend and backend, we introduce a new xive operation 'esb_rw' to be used in the routines doing memory accesses on the ESBs. Signed-off-by:
Cédric Le Goater <clg@kaod.org> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Cédric Le Goater authored
It will be required later by the H_INT_ESB hcall. Signed-off-by:
Cédric Le Goater <clg@kaod.org> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Cédric Le Goater authored
Some sources support MMIO stores on the ESB page to perform EOI. Let's introduce a specific routine for this case, even if this should be the only use of it. Signed-off-by:
Cédric Le Goater <clg@kaod.org> Reviewed-by:
David Gibson <david@gibson.dropbear.id.au> Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-