- 23 Sep, 2017 1 commit
-
-
Josh Poimboeuf authored
For inline asm statements which have a CALL instruction, we list the stack pointer as a constraint to convince GCC to ensure the frame pointer is set up first: static inline void foo() { register void *__sp asm(_ASM_SP); asm("call bar" : "+r" (__sp)) } Unfortunately, that pattern causes Clang to corrupt the stack pointer. The fix is easy: convert the stack pointer register variable to a global variable. It should be noted that the end result is different based on the GCC version. With GCC 6.4, this patch has exactly the same result as before: defconfig defconfig-nofp distro distro-nofp before 9820389 9491555 8816046 8516940 after 9820389 9491555 8816046 8516940 With GCC 7.2, however, GCC's behavior has changed. It now changes its behavior based on the conversion of the register variable to a global. That somehow convinces it to *always* set up the frame pointer before inserting *any* inline asm. (Therefore, listing the variable as an output constraint is a no-op and is no longer necessary.) It's a bit overkill, but the performance impact should be negligible. And in fact, there's a nice improvement with frame pointers disabled: defconfig defconfig-nofp distro distro-nofp before 9796316 9468236 9076191 8790305 after 9796957 9464267 9076381 8785949 So in summary, while listing the stack pointer as an output constraint is no longer necessary for newer versions of GCC, it's still needed for older versions. Suggested-by:
Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by:
Matthias Kaehlcke <mka@chromium.org> Signed-off-by:
Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.comSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 29 Aug, 2017 1 commit
-
-
Thomas Gleixner authored
The union inside of desc_struct which allows access to the raw u32 parts of the descriptors. This raw access part is about to go away. Replace the few code parts which access those fields. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064958.120214366@linutronix.deSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 03 Jul, 2017 1 commit
-
-
Marek Marczykowski-Górecki authored
Userspace application can do a hypercall through /dev/xen/privcmd, and some for some hypercalls argument is a pointers to user-provided structure. When SMAP is supported and enabled, hypervisor can't access. So, lets allow it. The same applies to HYPERVISOR_dm_op, where additionally privcmd driver carefully verify buffer addresses. Cc: stable@vger.kernel.org Signed-off-by:
Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com>
-
- 08 Jun, 2017 1 commit
-
-
Sergey Dyasli authored
Change the third parameter to be the required struct xen_dm_op_buf * instead of a generic void * (which blindly accepts any pointer). Signed-off-by:
Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
Stefano Stabellini <sstabellini@kernel.org> Signed-off-by:
Juergen Gross <jgross@suse.com>
-
- 14 Feb, 2017 1 commit
-
-
Paul Durrant authored
Recently a new dm_op[1] hypercall was added to Xen to provide a mechanism for restricting device emulators (such as QEMU) to a limited set of hypervisor operations, and being able to audit those operations in the kernel of the domain in which they run. This patch adds IOCTL_PRIVCMD_DM_OP as gateway for __HYPERVISOR_dm_op. NOTE: There is no requirement for user-space code to bounce data through locked memory buffers (as with IOCTL_PRIVCMD_HYPERCALL) since privcmd has enough information to lock the original buffers directly. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=524a98c2Signed-off-by:
Paul Durrant <paul.durrant@citrix.com> Acked-by:
Stefano Stabellini <sstabellini@kernel.org> Signed-off-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com>
-
- 24 Feb, 2016 1 commit
-
-
Josh Poimboeuf authored
If a hypercall is inlined at the beginning of a function, gcc can insert the call instruction before setting up a stack frame, which breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled and can result in a bad stack trace. Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by listing the stack pointer as an output operand for the hypercall inline asm statements. Signed-off-by:
Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/c6face5a46713108bded9c4c103637222abc4528.1453405861.git.jpoimboe@redhat.comSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 21 Dec, 2015 1 commit
-
-
Stefano Stabellini authored
The dom0_op hypercall has been renamed to platform_op since Xen 3.2, which is ancient, and modern upstream Linux kernels cannot run as dom0 and it anymore anyway. Signed-off-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com>
-
- 28 Sep, 2015 1 commit
-
-
Juergen Gross authored
HYPERVISOR_memory_op() is defined to return an "int" value. This is wrong, as the Xen hypervisor will return "long". The sub-function XENMEM_maximum_reservation returns the maximum number of pages for the current domain. An int will overflow for a domain configured with 8TB of memory or more. Correct this by using the correct type. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com>
-
- 20 Aug, 2015 1 commit
-
-
Boris Ostrovsky authored
Set Xen's PMU mode via /sys/hypervisor/pmu/pmu_mode. Add XENPMU hypercall. Signed-off-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com>
-
- 24 Apr, 2014 1 commit
-
-
Ian Campbell authored
As part of this make the usual change to xen_ulong_t in place of unsigned long. This change has no impact on x86. The Linux definition of struct multicall_entry.result differs from the Xen definition, I think for good reasons, and used a long rather than an unsigned long. Therefore introduce a xen_long_t, which is a long on x86 architectures and a signed 64-bit integer on ARM. Use uint32_t nr_calls on x86 for consistency with the ARM definition. Build tested on amd64 and i386 builds. Runtime tested on ARM. Signed-off-by:
Ian Campbell <ian.campbell@citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com>
-
- 22 Mar, 2013 1 commit
-
-
Jan Beulich authored
For MSI-X capable devices the hypervisor wants to write protect the MSI-X table and PBA, yet it can't assume that resources have been assigned to their final values at device enumeration time. Thus have pciback do that notification, as having the device controlled by it is a prerequisite to assigning the device to guests anyway. This is the kernel part of hypervisor side commit 4245d33 ("x86/MSI: add mechanism to fully protect MSI-X table from PV guest accesses") on the master branch of git://xenbits.xen.org/xen.git. CC: stable@vger.kernel.org Signed-off-by:
Jan Beulich <jbeulich@suse.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 04 Nov, 2012 1 commit
-
-
Jan Beulich authored
While copying the argument structures in HYPERVISOR_event_channel_op() and HYPERVISOR_physdev_op() into the local variable is sufficiently safe even if the actual structure is smaller than the container one, copying back eventual output values the same way isn't: This may collide with on-stack variables (particularly "rc") which may change between the first and second memcpy() (i.e. the second memcpy() could discard that change). Move the fallback code into out-of-line functions, and handle all of the operations known by this old a hypervisor individually: Some don't require copying back anything at all, and for the rest use the individual argument structures' sizes rather than the container's. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Jan Beulich <jbeulich@suse.com> [v2: Reduce #define/#undef usage in HYPERVISOR_physdev_op_compat().] [v3: Fix compile errors when modules use said hypercalls] [v4: Add xen_ prefix to the HYPERCALL_..] [v5: Alter the name and only EXPORT_SYMBOL_GPL one of them] Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 19 Jul, 2012 1 commit
-
-
Liu, Jinsong authored
When MCA error occurs, it would be handled by Xen hypervisor first, and then the error information would be sent to initial domain for logging. This patch gets error information from Xen hypervisor and convert Xen format error into Linux format mcelog. This logic is basically self-contained, not touching other kernel components. By using tools like mcelog tool users could read specific error information, like what they did under native Linux. To test follow directions outlined in Documentation/acpi/apei/einj.txt Acked-and-tested-by:
Borislav Petkov <borislav.petkov@amd.com> Signed-off-by:
Ke, Liping <liping.ke@intel.com> Signed-off-by:
Jiang, Yunhong <yunhong.jiang@intel.com> Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 26 Sep, 2011 1 commit
-
-
Jeremy Fitzhardinge authored
Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
-
- 18 Jul, 2011 1 commit
-
-
Jeremy Fitzhardinge authored
Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
-
- 26 May, 2011 1 commit
-
-
Dan Magenheimer authored
This patch provides a shim between the kernel-internal cleancache API (see Documentation/mm/cleancache.txt) and the Xen Transcendent Memory ABI (see http://oss.oracle.com/projects/tmem). Xen tmem provides "hypervisor RAM" as an ephemeral page-oriented pseudo-RAM store for cleancache pages, shared cleancache pages, and frontswap pages. Tmem provides enterprise-quality concurrency, full save/restore and live migration support, compression and deduplication. A presentation showing up to 8% faster performance and up to 52% reduction in sectors read on a kernel compile workload, despite aggressive in-kernel page reclamation ("self-ballooning") can be found at: http://oss.oracle.com/projects/tmem/dist/documentation/presentations/TranscendentMemoryXenSummit2010.pdfSigned-off-by:
Dan Magenheimer <dan.magenheimer@oracle.com> Reviewed-by:
Jeremy Fitzhardinge <jeremy@goop.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik Van Riel <riel@redhat.com> Cc: Jan Beulich <JBeulich@novell.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Andreas Dilger <adilger@sun.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Nitin Gupta <ngupta@vflare.org>
-
- 25 Feb, 2011 2 commits
-
-
Ian Campbell authored
Rename old interface to sched_op_compat and rename sched_op_new to simply sched_op. Signed-off-by:
Ian Campbell <ian.campbell@citrix.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Ian Campbell authored
Take the opportunity to comment on the semantics of the PV guest suspend hypercall arguments. Signed-off-by:
Ian Campbell <ian.campbell@citrix.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 20 Oct, 2010 1 commit
-
-
Jeremy Fitzhardinge authored
Allow non-constant hypercall to be called, for privcmd. [ Impact: make arbitrary hypercalls; needed for privcmd ] Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
-
- 22 Jul, 2010 1 commit
-
-
Jeremy Fitzhardinge authored
Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Sheng Yang <sheng@linux.intel.com> Signed-off-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com>
-
- 12 Mar, 2009 1 commit
-
-
Jan Beulich authored
Impact: fix potential oops during app-initiated LDT manipulation The underlying hypercall has differing argument requirements on 32- and 64-bit. Signed-off-by:
Jan Beulich <jbeulich@novell.com> Acked-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> LKML-Reference: <49B9061E.76E4.0078.0@novell.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- 16 Dec, 2008 1 commit
-
-
Jeremy Fitzhardinge authored
Impact: cleanup hypervisor.h had accumulated a lot of crud, including lots of spurious #includes. Clean it all up, and go around fixing up everything else accordingly. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- 23 Oct, 2008 3 commits
-
-
H. Peter Anvin authored
Drop double underscores from header guards in arch/x86/include. They are used inconsistently, and are not necessary. Signed-off-by:
H. Peter Anvin <hpa@zytor.com>
-
H. Peter Anvin authored
Change header guards named "ASM_X86__*" to "_ASM_X86_*" since: a. the double underscore is ugly and pointless. b. no leading underscore violates namespace constraints. Signed-off-by:
H. Peter Anvin <hpa@zytor.com>
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
H. Peter Anvin <hpa@zytor.com>
-
- 22 Jul, 2008 1 commit
-
-
Vegard Nossum authored
This patch is the result of an automatic script that consolidates the format of all the headers in include/asm-x86/. The format: 1. No leading underscore. Names with leading underscores are reserved. 2. Pathname components are separated by two underscores. So we can distinguish between mm_types.h and mm/types.h. 3. Everything except letters and numbers are turned into single underscores. Signed-off-by:
Vegard Nossum <vegard.nossum@gmail.com>
-
- 16 Jul, 2008 5 commits
-
-
Jeremy Fitzhardinge authored
64-bit hypercall interface can pass a maddr in one argument rather than splitting it. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Eduardo Habkost authored
Signed-off-by:
Eduardo Habkost <ehabkost@redhat.com> Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Jeremy Fitzhardinge authored
Use callback_op hypercall to register callbacks in a 32/64-bit independent way (64-bit doesn't need a code segment, but that detail is hidden in XEN_CALLBACK). Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Jeremy Fitzhardinge authored
The 64-bit calling convention for hypercalls uses different registers from 32-bit. Annoyingly, gcc's asm syntax doesn't have a way to specify one of the extra numeric reigisters in a constraint, so we must use explicitly placed register variables. Given that we have to do it for some args, may as well do it for all. Also fix syntax gcc generates for the call instruction itself. We need a plain direct call, but the asm expansion which works on 32-bit generates a rip-relative addressing mode in 64-bit, which is treated as an indirect call. The alternative is to pass the hypercall page offset into the asm, and have it add it to the hypercall page start address to generate the call. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Jeremy Fitzhardinge authored
64-bit guests can pass 64-bit quantities in a single argument, so fix up the hypercalls. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- 27 May, 2008 2 commits
-
-
Jeremy Fitzhardinge authored
Use the new sched_op hypercall, mainly because xenner doesn't support the old one. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
Jeremy Fitzhardinge authored
Xen will trap and emulate clts, but its better to use a hypercall. Also, xenner doesn't handle clts. Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
- 24 Apr, 2008 1 commit
-
-
Jeremy Fitzhardinge authored
Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
- 11 Oct, 2007 1 commit
-
-
Thomas Gleixner authored
Move the headers to include/asm-x86 and fixup the header install make rules Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- 18 Jul, 2007 2 commits
-
-
Jeremy Fitzhardinge authored
This patch is a rollup of all the core pieces of the Xen implementation, including: - booting and setup - pagetable setup - privileged instructions - segmentation - interrupt flags - upcalls - multicall batching BOOTING AND SETUP The vmlinux image is decorated with ELF notes which tell the Xen domain builder what the kernel's requirements are; the domain builder then constructs the address space accordingly and starts the kernel. Xen has its own entrypoint for the kernel (contained in an ELF note). The ELF notes are set up by xen-head.S, which is included into head.S. In principle it could be linked separately, but it seems to provoke lots of binutils bugs. Because the domain builder starts the kernel in a fairly sane state (32-bit protected mode, paging enabled, flat segments set up), there's not a lot of setup needed before starting the kernel proper. The main steps are: 1. Install the Xen paravirt_ops, which is simply a matter of a structure assignment. 2. Set init_mm to use the Xen-supplied pagetables (analogous to the head.S generated pagetables in a native boot). 3. Reserve address space for Xen, since it takes a chunk at the top of the address space for its own use. 4. Call start_kernel() PAGETABLE SETUP Once we hit the main kernel boot sequence, it will end up calling back via paravirt_ops to set up various pieces of Xen specific state. One of the critical things which requires a bit of extra care is the construction of the initial init_mm pagetable. Because Xen places tight constraints on pagetables (an active pagetable must always be valid, and must always be mapped read-only to the guest domain), we need to be careful when constructing the new pagetable to keep these constraints in mind. It turns out that the easiest way to do this is use the initial Xen-provided pagetable as a template, and then just insert new mappings for memory where a mapping doesn't already exist. This means that during pagetable setup, it uses a special version of xen_set_pte which ignores any attempt to remap a read-only page as read-write (since Xen will map its own initial pagetable as RO), but lets other changes to the ptes happen, so that things like NX are set properly. PRIVILEGED INSTRUCTIONS AND SEGMENTATION When the kernel runs under Xen, it runs in ring 1 rather than ring 0. This means that it is more privileged than user-mode in ring 3, but it still can't run privileged instructions directly. Non-performance critical instructions are dealt with by taking a privilege exception and trapping into the hypervisor and emulating the instruction, but more performance-critical instructions have their own specific paravirt_ops. In many cases we can avoid having to do any hypercalls for these instructions, or the Xen implementation is quite different from the normal native version. The privileged instructions fall into the broad classes of: Segmentation: setting up the GDT and the GDT entries, LDT, TLS and so on. Xen doesn't allow the GDT to be directly modified; all GDT updates are done via hypercalls where the new entries can be validated. This is important because Xen uses segment limits to prevent the guest kernel from damaging the hypervisor itself. Traps and exceptions: Xen uses a special format for trap entrypoints, so when the kernel wants to set an IDT entry, it needs to be converted to the form Xen expects. Xen sets int 0x80 up specially so that the trap goes straight from userspace into the guest kernel without going via the hypervisor. sysenter isn't supported. Kernel stack: The esp0 entry is extracted from the tss and provided to Xen. TLB operations: the various TLB calls are mapped into corresponding Xen hypercalls. Control registers: all the control registers are privileged. The most important is cr3, which points to the base of the current pagetable, and we handle it specially. Another instruction we treat specially is CPUID, even though its not privileged. We want to control what CPU features are visible to the rest of the kernel, and so CPUID ends up going into a paravirt_op. Xen implements this mainly to disable the ACPI and APIC subsystems. INTERRUPT FLAGS Xen maintains its own separate flag for masking events, which is contained within the per-cpu vcpu_info structure. Because the guest kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely ignored (and must be, because even if a guest domain disables interrupts for itself, it can't disable them overall). (A note on terminology: "events" and interrupts are effectively synonymous. However, rather than using an "enable flag", Xen uses a "mask flag", which blocks event delivery when it is non-zero.) There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which are implemented to manage the Xen event mask state. The only thing worth noting is that when events are unmasked, we need to explicitly see if there's a pending event and call into the hypervisor to make sure it gets delivered. UPCALLS Xen needs a couple of upcall (or callback) functions to be implemented by each guest. One is the event upcalls, which is how events (interrupts, effectively) are delivered to the guests. The other is the failsafe callback, which is used to report errors in either reloading a segment register, or caused by iret. These are implemented in i386/kernel/entry.S so they can jump into the normal iret_exc path when necessary. MULTICALL BATCHING Xen provides a multicall mechanism, which allows multiple hypercalls to be issued at once in order to mitigate the cost of trapping into the hypervisor. This is particularly useful for context switches, since the 4-5 hypercalls they would normally need (reload cr3, update TLS, maybe update LDT) can be reduced to one. This patch implements a generic batching mechanism for hypercalls, which gets used in many places in the Xen code. Signed-off-by:
Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by:
Chris Wright <chrisw@sous-sol.org> Cc: Ian Pratt <ian.pratt@xensource.com> Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Cc: Adrian Bunk <bunk@stusta.de>
-
Jeremy Fitzhardinge authored
Add Xen interface header files. These are taken fairly directly from the Xen tree, but somewhat rearranged to suit the kernel's conventions. Define macros and inline functions for doing hypercalls into the hypervisor. Signed-off-by:
Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by:
Ian Pratt <ian.pratt@xensource.com> Signed-off-by:
Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by:
Chris Wright <chrisw@sous-sol.org>
-