    smp_call_function_many: handle concurrent clearing of mask · 723aae25
    Milton Miller authored
    
    
    Mike Galbraith reported finding a lockup ("perma-spin bug") where the
    cpumask passed to smp_call_function_many was cleared by other cpu(s)
    while a cpu was preparing its call_data block, resulting in no cpu to
    clear the last ref and unlock the block.
    
    Having cpus clear their bit asynchronously could be useful on a mask of
    cpus that might have a translation context, or cpus that need a push to
    complete an rcu window.
    
    Instead of adding a BUG_ON and requiring yet another cpumask copy, just
    detect the race and handle it.
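
The "detect the race and handle it" approach can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct call_data_sketch`, `call_many_sketch`, `race_hook_t` and `clear_all_bits` are hypothetical names, a plain `unsigned long` stands in for a cpumask, a `bool` stands in for the block lock, and the hook simulates another cpu clearing the caller's mask while the block is being prepared.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu call_data block. */
struct call_data_sketch {
    bool locked;            /* models the block being locked */
    unsigned long cpumask;  /* private snapshot of the target cpus */
    unsigned int refs;      /* cpus that still have to run the function */
};

/* Hook that simulates another cpu editing the caller's mask mid-call. */
typedef void (*race_hook_t)(volatile unsigned long *mask);

/* Count the set bits in a mask. */
static unsigned int mask_weight(unsigned long m)
{
    unsigned int n = 0;
    for (; m; m &= m - 1)
        n++;
    return n;
}

/* Returns the number of cpus that would be notified. */
static unsigned int call_many_sketch(struct call_data_sketch *data,
                                     volatile unsigned long *mask,
                                     unsigned long online,
                                     unsigned int this_cpu,
                                     race_hook_t hook)
{
    assert(this_cpu < sizeof(unsigned long) * CHAR_BIT);
    unsigned long self = 1UL << this_cpu;

    /* Early check: no other online cpu in the mask, nothing to do. */
    if (!(*mask & online & ~self))
        return 0;

    data->locked = true;        /* block is now owned by this call */

    if (hook)
        hook(mask);             /* the mask may change right here */

    /* Snapshot the mask once and count from the snapshot only. */
    data->cpumask = *mask & online & ~self;
    data->refs = mask_weight(data->cpumask);

    /* Race detected: no cpu is left to drop the last ref, so unlock here. */
    if (data->refs == 0) {
        data->locked = false;
        return 0;
    }
    return data->refs;          /* block stays locked until refs drop to 0 */
}

/* Simulated racing cpu(s): clear every bit in the caller's mask. */
static void clear_all_bits(volatile unsigned long *mask)
{
    *mask = 0;
}
```

In this model, calling `call_many_sketch` with `clear_all_bits` as the hook leaves the block unlocked with zero refs, whereas dropping the `refs == 0` check would leave `locked` set forever, which corresponds to the reported perma-spin.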
    
    Note: arch_send_call_function_ipi_mask must still handle an empty
    cpumask because the data block is globally visible before that arch
    callback is made.  And (obviously) there are no guarantees as to which
    cpus are notified if the mask is changed during the call; only cpus
    that were online and had their mask bit set during the whole call are
    guaranteed to be called.
    
    Reported-by: Mike Galbraith <efault@gmx.de>
    Reported-by: Jan Beulich <JBeulich@novell.com>
    Acked-by: Jan Beulich <jbeulich@novell.com>
    Cc: stable@kernel.org
    Signed-off-by: Milton Miller <miltonm@bga.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>