1. 23 Sep, 2017 1 commit
    • x86/asm: Fix inline asm call constraints for Clang · f5caf621
      Josh Poimboeuf authored
      For inline asm statements which have a CALL instruction, we list the
      stack pointer as a constraint to convince GCC to ensure the frame
      pointer is set up first:
      
        static inline void foo()
        {
      	register void *__sp asm(_ASM_SP);
      	asm("call bar" : "+r" (__sp))
        }
      
      Unfortunately, that pattern causes Clang to corrupt the stack pointer.
      
      The fix is easy: convert the stack pointer register variable to a global
      variable.
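      
      The converted pattern looks roughly like this (a sketch of the fix;
      the current_stack_pointer variable and the ASM_CALL_CONSTRAINT
      helper follow the names used by the patch, though details may
      differ from the actual diff):
      
        /* arch/x86/include/asm/asm.h */
        register unsigned long current_stack_pointer asm(_ASM_SP);
        #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)
      
        static inline void foo(void)
        {
      	asm("call bar" : ASM_CALL_CONSTRAINT);
        }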
      
      It should be noted that the end result differs based on the GCC
      version.  With GCC 6.4, this patch produces exactly the same result
      as before:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9820389		9491555		8816046		8516940
       after	9820389		9491555		8816046		8516940
      
      With GCC 7.2, however, the behavior changes: converting the register
      variable to a global somehow convinces GCC to *always* set up the
      frame pointer before inserting *any* inline asm.  (Listing the
      variable as an output constraint is therefore a no-op and no longer
      necessary.)  That is a bit of overkill, but the performance impact
      should be negligible.  And in fact, there's a nice improvement with
      frame pointers disabled:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9796316		9468236		9076191		8790305
       after	9796957		9464267		9076381		8785949
      
      So in summary, while listing the stack pointer as an output constraint
      is no longer necessary for newer versions of GCC, it's still needed for
      older versions.
      Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: Matthias Kaehlcke <mka@chromium.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 24 Feb, 2016 1 commit
    • sched/x86: Add stack frame dependency to __preempt_schedule[_notrace]() · 821eae7d
      Josh Poimboeuf authored
      If __preempt_schedule() or __preempt_schedule_notrace() is referenced at
      the beginning of a function, gcc can insert the inline asm "call
      ___preempt_schedule[_notrace]" instruction before setting up a stack
      frame, which breaks the frame pointer convention if CONFIG_FRAME_POINTER
      is enabled and can result in bad stack traces.
      
      Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
      listing the stack pointer as an output operand for the inline asm
      statements.
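      
      The resulting macro looks roughly like this (a sketch of the
      arch/x86/include/asm/preempt.h change; the formatting of the real
      diff may differ):
      
        #define __preempt_schedule() \
        ({ \
      	register void *__sp asm(_ASM_SP); \
      	asm volatile ("call ___preempt_schedule" : "+r" (__sp)); \
        })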
      
      Specifically this fixes the following stacktool warnings:
      
        stacktool: drivers/scsi/hpsa.o: hpsa_scsi_do_simple_cmd.constprop.106()+0x79: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x70: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_find_first()+0x92: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_free()+0xff: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_free()+0xf5: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_free()+0x11a: call without frame pointer save/setup
        stacktool: fs/mbcache.o: mb_cache_entry_get()+0x225: call without frame pointer save/setup
        stacktool: kernel/locking/percpu-rwsem.o: percpu_up_read()+0x27: call without frame pointer save/setup
        stacktool: kernel/profile.o: do_profile_hits.isra.5()+0x139: call without frame pointer save/setup
        stacktool: lib/nmi_backtrace.o: nmi_trigger_all_cpu_backtrace()+0x2b6: call without frame pointer save/setup
        stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_recv()+0x58: call without frame pointer save/setup
        stacktool: net/rds/ib_cm.o: rds_ib_cq_comp_handler_send()+0x58: call without frame pointer save/setup
        stacktool: net/rds/ib_recv.o: rds_ib_attempt_ack()+0xc1: call without frame pointer save/setup
        stacktool: net/rds/iw_recv.o: rds_iw_attempt_ack()+0xc1: call without frame pointer save/setup
        stacktool: net/rds/iw_recv.o: rds_iw_recv_cq_comp_handler()+0x55: call without frame pointer save/setup
      
      So it only adds a stack frame to 15 call sites out of ~5000 calls to
      ___preempt_schedule[_notrace]().  All the others already had stack frames.
      
      Oddly, this change actually seems to make things faster in a lot of
      cases.  For many smaller functions it causes the stack frame creation to
      get moved out of the common path and into the unlikely path.
      
      For example, here's the original cyc2ns_read_end():
      
        ffffffff8101f8c0 <cyc2ns_read_end>:
        ffffffff8101f8c0:	55                   	push   %rbp
        ffffffff8101f8c1:	48 89 e5             	mov    %rsp,%rbp
        ffffffff8101f8c4:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
        ffffffff8101f8c8:	75 08                	jne    ffffffff8101f8d2 <cyc2ns_read_end+0x12>
        ffffffff8101f8ca:	65 48 89 3d e6 5a ff 	mov    %rdi,%gs:0x7eff5ae6(%rip)        # 153b8 <cyc2ns+0x38>
        ffffffff8101f8d1:	7e
        ffffffff8101f8d2:	65 ff 0d 77 c4 fe 7e 	decl   %gs:0x7efec477(%rip)        # bd50 <__preempt_count>
        ffffffff8101f8d9:	74 02                	je     ffffffff8101f8dd <cyc2ns_read_end+0x1d>
        ffffffff8101f8db:	5d                   	pop    %rbp
        ffffffff8101f8dc:	c3                   	retq
        ffffffff8101f8dd:	e8 1e 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
        ffffffff8101f8e2:	5d                   	pop    %rbp
        ffffffff8101f8e3:	c3                   	retq
        ffffffff8101f8e4:	66 66 66 2e 0f 1f 84 	data16 data16 nopw %cs:0x0(%rax,%rax,1)
        ffffffff8101f8eb:	00 00 00 00 00
      
      And here's the same function with the patch:
      
        ffffffff8101f8c0 <cyc2ns_read_end>:
        ffffffff8101f8c0:	83 6f 10 01          	subl   $0x1,0x10(%rdi)
        ffffffff8101f8c4:	75 08                	jne    ffffffff8101f8ce <cyc2ns_read_end+0xe>
        ffffffff8101f8c6:	65 48 89 3d ea 5a ff 	mov    %rdi,%gs:0x7eff5aea(%rip)        # 153b8 <cyc2ns+0x38>
        ffffffff8101f8cd:	7e
        ffffffff8101f8ce:	65 ff 0d 7b c4 fe 7e 	decl   %gs:0x7efec47b(%rip)        # bd50 <__preempt_count>
        ffffffff8101f8d5:	74 01                	je     ffffffff8101f8d8 <cyc2ns_read_end+0x18>
        ffffffff8101f8d7:	c3                   	retq
        ffffffff8101f8d8:	55                   	push   %rbp
        ffffffff8101f8d9:	48 89 e5             	mov    %rsp,%rbp
        ffffffff8101f8dc:	e8 1f 37 fe ff       	callq  ffffffff81003000 <___preempt_schedule>
        ffffffff8101f8e1:	5d                   	pop    %rbp
        ffffffff8101f8e2:	c3                   	retq
        ffffffff8101f8e3:	66 66 66 66 2e 0f 1f 	data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
        ffffffff8101f8ea:	84 00 00 00 00 00
      
      Notice that it moved the frame pointer setup code to the unlikely
      ___preempt_schedule() call path.  Going through a sampling of the
      differences in the asm, that's the most common change I see.
      
      Otherwise it has no real effect on callers which already have stack
      frames (though it does result in the reordering of some 'mov's).
      Reported-by: Jiri Slaby <jslaby@suse.cz>
      Tested-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160218174158.GA28230@treble.redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 06 Oct, 2015 2 commits
    • sched/core, sched/x86: Kill thread_info::saved_preempt_count · d87b7a33
      Peter Zijlstra authored
      With the introduction of the context switch preempt_count invariant,
      and the demise of PREEMPT_ACTIVE, it's pointless to save/restore the
      per-cpu preemption count: it must always be 2.
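      
      A sketch of the sanity check that can replace the save/restore (the
      exact placement and warning text are assumptions, in the spirit of
      finish_task_switch(); FORK_PREEMPT_COUNT comes from the invariant
      commit below):
      
        /* preempt_count() must be 2 at every context switch */
        if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
      		"corrupted preempt_count: %s/%d/0x%x\n",
      		current->comm, current->pid, preempt_count()))
      	preempt_count_set(FORK_PREEMPT_COUNT);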
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/core: Create preempt_count invariant · 609ca066
      Peter Zijlstra authored
      Assuming units of PREEMPT_DISABLE_OFFSET for preempt_count() numbers.
      
      Now that TASK_DEAD no longer results in preempt_count() == 3 during
      scheduling, we will always call context_switch() with preempt_count()
      == 2.
      
      However, we don't always end up with preempt_count() == 2 in
      finish_task_switch() because new tasks get created with
      preempt_count() == 1.
      
      Create FORK_PREEMPT_COUNT and set it to 2 and use that in the right
      places. Note that we cannot use INIT_PREEMPT_COUNT as that serves
      another purpose (boot).
      
      After this, preempt_count() is invariant across the context switch,
      with the exception of PREEMPT_ACTIVE.
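      
      A minimal sketch of the new constant (its value follows directly
      from the message: two units of PREEMPT_DISABLE_OFFSET; the exact
      definition site is assumed):
      
        /* preempt_count() value a freshly forked task starts with,
         * matching the context-switch invariant above. */
        #define FORK_PREEMPT_COUNT	(2*PREEMPT_DISABLE_OFFSET)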
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 28 Oct, 2014 2 commits
    • sched: Kill task_preempt_count() · e2336f6e
      Oleg Nesterov authored
      task_preempt_count() is pointless if the preemption counter is
      per-cpu (currently x86 only). It is only valid if the task is not
      running, and even in that case the only information it can provide
      is the state of the PREEMPT_ACTIVE bit.
      
      Change its single caller to check p->on_rq instead, which should be
      equivalent when p->state != TASK_RUNNING, and kill this helper.
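      
      In sketch form (only the shape of the test is taken from the
      message; the surrounding caller code and the preempted local are
      purely illustrative):
      
        /* before: */
        preempted = task_preempt_count(p) & PREEMPT_ACTIVE;
        /* after: equivalent when p->state != TASK_RUNNING */
        preempted = p->on_rq;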
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Kirill Tkhai <tkhai@yandex.ru>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/20141008183348.GC17495@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: stop the unbound recursion in preempt_schedule_context() · 009f60e2
      Oleg Nesterov authored
      preempt_schedule_context() does preempt_enable_notrace() at the end
      and this can call the same function again; exception_exit() is heavy
      and it is quite possible that need-resched is true again.
      
      1. Change this code to dec preempt_count() and check need_resched()
         by hand.
      
      2. As Linus suggested, we can use the PREEMPT_ACTIVE bit and avoid
         the enable/disable dance around __schedule(). But in this case
         we need to move it into sched/core.c (see the sketch after this
         list).
      
      3. Cosmetic, but x86 forgets to declare this function. This doesn't
         really matter because it is only called by asm helpers; still, it
         makes sense to add the declaration into asm/preempt.h to match
         preempt_schedule().
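      
      Put together, the reworked function looks roughly like this (a
      sketch of the post-patch kernel/sched/core.c version; comments and
      details may differ from the actual code):
      
        asmlinkage __visible void __sched notrace preempt_schedule_context(void)
        {
      	enum ctx_state prev_ctx;
      
      	if (likely(!preemptible()))
      		return;
      
      	do {
      		__preempt_count_add(PREEMPT_ACTIVE);
      		/* keep preemption disabled across exception_enter/exit
      		 * so a traced user_exit() can't recurse into us */
      		prev_ctx = exception_enter();
      		__schedule();
      		exception_exit(prev_ctx);
      		__preempt_count_sub(PREEMPT_ACTIVE);
      		barrier();
      	} while (need_resched());
        }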
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Chuck Ebbert <cebbert.lkml@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/r/20141005202322.GB27962@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 07 Apr, 2014 1 commit
    • percpu: add raw_cpu_ops · b3ca1c10
      Christoph Lameter authored
      The kernel has never been audited to ensure that this_cpu operations are
      consistently used throughout the kernel.  The code generated in many
      places can be improved through the use of this_cpu operations (which
      use a segment register for relocation of per cpu offsets instead of
      performing address calculations).
      
      The patch set also addresses various consistency issues in general with
      the per cpu macros.
      
      A. The semantics of __this_cpu_ptr() differ from this_cpu_ptr() only
         in that checks are skipped. Check-skipping is conventionally
         indicated by a raw_ prefix, so this patch set changes the places
         where __this_cpu_ptr() is used to raw_cpu_ptr().
      
      B. There has been the long term wish by some that __this_cpu operations
         would check for preemption. However, there are cases where preemption
         checks need to be skipped. This patch set adds raw_cpu operations that
         do not check for preemption and then adds preemption checks to the
         __this_cpu operations.
      
      C. The use of __get_cpu_var is always a reference to a percpu variable
         that can also be handled via a this_cpu operation. This patch set
         replaces all uses of __get_cpu_var with this_cpu operations.
      
      D. We can then use this_cpu RMW operations in various places replacing
         sequences of instructions by a single one.
      
      E. The use of this_cpu operations throughout will allow arches other
         than x86 to implement optimized references and RMW operations to
         work with per cpu local data.
      
      F. The use of this_cpu operations opens up the possibility to
         further optimize code that relies on synchronization through
         per cpu data.
      
      The patch set works in a couple of stages:
      
      I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
          Also converts the existing __this_cpu_xx_# primitive in the x86
          code to raw_cpu_xx_#.
      
      II. Patch 2-4 use the raw_cpu operations in places that would give
           us false positives once they are enabled.
      
      III. Patch 5 adds preemption checks to __this_cpu operations to allow
          checking if preemption is properly disabled when these functions
          are used.
      
      IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var
         with this_cpu_ptr. They do not depend on any changes to the percpu
         code. No preemption tests are skipped if they are applied.
      
      V. Patches 21-46 are conversion patches that use this_cpu operations
         in various kernel subsystems/drivers or arch code.
      
      VI.  Patches 47/48 (not included in this series) remove no longer used
          functions (__this_cpu_ptr and __get_cpu_var).  These should only be
          applied after all the conversion patches have made it and after we
          have done additional passes through the kernel to ensure that none of
          the uses of these functions remain.
      
      This patch (of 46):
      
      The patches following this one will add preemption checks to __this_cpu
      ops so we need to have an alternative way to use this_cpu operations
      without preemption checks.
      
      raw_cpu_ops will be the basis for all other ops since these will be the
      operations that do not implement any checks.
      
      Primitive operations are renamed by this patch from __this_cpu_xxx to
      raw_cpu_xxxx.
      
      Also change the uses of the x86 percpu primitives in preempt.h.
      These depend directly on asm/percpu.h (header #include nesting issue).
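      
      The resulting layering, in sketch form (macro bodies simplified; the
      real definitions in include/linux/percpu.h are more elaborate):
      
        /* baseline op: no checks at all */
        #define raw_cpu_add(pcp, val)	__pcpu_size_call(raw_cpu_add_, pcp, val)
      
        /* later patches in the series make this verify that preemption
         * is disabled before delegating to the raw op */
        #define __this_cpu_add(pcp, val) \
        do { \
      	__this_cpu_preempt_check("add"); \
      	raw_cpu_add(pcp, val); \
        } while (0)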
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Alex Shi <alex.shi@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bryan Wu <cooloney@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: David Daney <david.daney@cavium.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Hedi Berriche <hedi@sgi.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>