1. 16 Oct, 2017 1 commit
  2. 13 Jul, 2017 1 commit
  3. 01 Mar, 2017 1 commit
    • Eric Dumazet's avatar
      tcp/dccp: block BH for SYN processing · 449809a6
      Eric Dumazet authored
      SYN processing really was meant to be handled from BH.
      
      When I got rid of BH blocking while processing socket backlog
      in commit 5413d1ba ("net: do not block BH while processing socket
      backlog"), I forgot that a malicious user could transition to TCP_LISTEN
      from a state that allowed (SYN) packets to be parked in the socket
      backlog while socket is owned by the thread doing the listen() call.
      
      Sure enough syzkaller found this and reported the bug ;)
      
      =================================
      [ INFO: inconsistent lock state ]
      4.10.0+ #60 Not tainted
      ---------------------------------
      inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
      syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
      [<ffffffff83a6a370>] spin_lock include/linux/spinlock.h:299 [inline]
       (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
      [<ffffffff83a6a370>] inet_ehash_insert+0x240/0xad0
      net/ipv4/inet_hashtables.c:407
      {IN-SOFTIRQ-W} state was registered at:
        mark_irqflags kernel/locking/lockdep.c:2923 [inline]
        __lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
        lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
        __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
        _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
        spin_lock include/linux/spinlock.h:299 [inline]
        inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
        reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
        inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
        tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
        tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
        tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
        tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
        tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
        ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
        NF_HOOK include/linux/netfilter.h:257 [inline]
        ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
        dst_input include/net/dst.h:492 [inline]
        ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
        NF_HOOK include/linux/netfilter.h:257 [inline]
        ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
        __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
        __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
        netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
        napi_skb_finish net/core/dev.c:4602 [inline]
        napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
        e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
        e1000_clean_rx_irq+0x5e0/0x1490
      drivers/net/ethernet/intel/e1000/e1000_main.c:4489
        e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
        napi_poll net/core/dev.c:5171 [inline]
        net_rx_action+0xe70/0x1900 net/core/dev.c:5236
        __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
        invoke_softirq kernel/softirq.c:364 [inline]
        irq_exit+0x19e/0x1d0 kernel/softirq.c:405
        exiting_irq arch/x86/include/asm/apic.h:658 [inline]
        do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
        ret_from_intr+0x0/0x20
        native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
        arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
        default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
        arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
        default_idle_call+0x36/0x60 kernel/sched/idle.c:96
        cpuidle_idle_call kernel/sched/idle.c:154 [inline]
        do_idle+0x348/0x440 kernel/sched/idle.c:243
        cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
        start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
        verify_cpu+0x0/0xfc
      irq event stamp: 1741
      hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
      __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
      [inline]
      hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
      _raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
      hardirqs last disabled at (1740): [<ffffffff84d4a732>]
      __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
      hardirqs last disabled at (1740): [<ffffffff84d4a732>]
      _raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
      softirqs last  enabled at (1738): [<ffffffff84d4deff>]
      __do_softirq+0x7cf/0xb7d kernel/softirq.c:310
      softirqs last disabled at (1571): [<ffffffff84d4b92c>]
      do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&hashinfo->ehash_locks[i])->rlock);
        <Interrupt>
          lock(&(&hashinfo->ehash_locks[i])->rlock);
      
       *** DEADLOCK ***
      
      1 lock held by syz-executor0/5090:
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>] lock_sock
      include/net/sock.h:1460 [inline]
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>]
      sock_setsockopt+0x233/0x1e40 net/core/sock.c:683
      
      stack backtrace:
      CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x292/0x398 lib/dump_stack.c:51
       print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
       valid_state kernel/locking/lockdep.c:2400 [inline]
       mark_lock_irq kernel/locking/lockdep.c:2602 [inline]
       mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3065
       mark_irqflags kernel/locking/lockdep.c:2941 [inline]
       __lock_acquire+0x6dc/0x3270 kernel/locking/lockdep.c:3295
       lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
       spin_lock include/linux/spinlock.h:299 [inline]
       inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
       reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
       inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
       dccp_v6_conn_request+0xada/0x11b0 net/dccp/ipv6.c:380
       dccp_rcv_state_process+0x51e/0x1660 net/dccp/input.c:606
       dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
       sk_backlog_rcv include/net/sock.h:896 [inline]
       __release_sock+0x127/0x3a0 net/core/sock.c:2052
       release_sock+0xa5/0x2b0 net/core/sock.c:2539
       sock_setsockopt+0x60f/0x1e40 net/core/sock.c:1016
       SYSC_setsockopt net/socket.c:1782 [inline]
       SyS_setsockopt+0x2fb/0x3a0 net/socket.c:1765
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      RIP: 0033:0x4458b9
      RSP: 002b:00007fe8b26c2b58 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00000000004458b9
      RDX: 000000000000001a RSI: 0000000000000001 RDI: 0000000000000006
      RBP: 00000000006e2110 R08: 0000000000000010 R09: 0000000000000000
      R10: 00000000208c3000 R11: 0000000000000292 R12: 0000000000708000
      R13: 0000000020000000 R14: 0000000000001000 R15: 0000000000000000
      
      Fixes: 5413d1ba ("net: do not block BH while processing socket backlog")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      449809a6
  4. 17 Feb, 2017 1 commit
  5. 02 May, 2016 1 commit
  6. 28 Apr, 2016 1 commit
  7. 18 Nov, 2014 1 commit
  8. 11 Apr, 2014 1 commit
    • David S. Miller's avatar
      net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      David S. Miller authored
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->s_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But at the moment we place the SKB onto the socket receive queue it
      can be consumed and freed up.  So this skb->len access is potentially
      to freed up memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no actual implementation of this callback actually uses
      the length argument.  And since nobody actually cared about it's
      value, lots of call sites pass arbitrary values in such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      676d2369
  9. 11 Jul, 2012 1 commit
  10. 15 Apr, 2012 1 commit
  11. 04 Jul, 2011 1 commit
    • Gerrit Renker's avatar
      dccp: Clean up slow-path input processing · c0c20150
      Gerrit Renker authored
      This patch rearranges the order of statements of the slow-path input processing
      (i.e. any other state than OPEN), to resolve the following issues.
      
       1. Dependencies: the order of statements now better matches RFC 4340, 8.5, i.e.
          step 7 is before step 9 (previously 9 was before 7), and parsing options in
          step 8 (which may consume resources) now comes after step 7.
       2. Sequence number checks are omitted if in state LISTEN/REQUEST, due to the
          note underneath the table in RFC 4340, 7.5.3.
          As a result, CCID processing is now indeed confined to OPEN/PARTOPEN states,
          i.e. congestion control is performed only on the flow of data packets. This
          avoids pathological cases of doing congestion control on those messages
          which set up and terminate the connection.
       3. Packets are now passed on to Ack Vector / CCID processing only after
          - step 7  (receive unexpected packets),
          - step 9  (receive Reset),
          - step 13 (receive CloseReq),
          - step 14 (receive Close)
          and only if the state is PARTOPEN. This simplifies CCID processing:
          - in LISTEN/CLOSED the CCIDs are non-existent;
          - in RESPOND/REQUEST the CCIDs have not yet been negotiated;
          - in CLOSEREQ and active-CLOSING the node has already closed this socket;
          - in passive-CLOSING the client is waiting for its Reset.
          In the last case, RFC 4340, 8.3 leaves it open to ignore further incoming
          data, which is the approach taken here.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      c0c20150
  12. 02 Mar, 2011 1 commit
    • Gerrit Renker's avatar
      dccp: fix oops on Reset after close · 720dc34b
      Gerrit Renker authored
      This fixes a bug in the order of dccp_rcv_state_process() that still permitted
      reception even after closing the socket. A Reset after close thus causes a NULL
      pointer dereference by not preventing operations on an already torn-down socket.
      
       dccp_v4_do_rcv() 
      	|
      	| state other than OPEN
      	v
       dccp_rcv_state_process()
      	|
      	| DCCP_PKT_RESET
      	v
       dccp_rcv_reset()
      	|
      	v
       dccp_time_wait()
      
       WARNING: at net/ipv4/inet_timewait_sock.c:141 __inet_twsk_hashdance+0x48/0x128()
       Modules linked in: arc4 ecb carl9170 rt2870sta(C) mac80211 r8712u(C) crc_ccitt ah
       [<c0038850>] (unwind_backtrace+0x0/0xec) from [<c0055364>] (warn_slowpath_common)
       [<c0055364>] (warn_slowpath_common+0x4c/0x64) from [<c0055398>] (warn_slowpath_n)
       [<c0055398>] (warn_slowpath_null+0x1c/0x24) from [<c02b72d0>] (__inet_twsk_hashd)
       [<c02b72d0>] (__inet_twsk_hashdance+0x48/0x128) from [<c031caa0>] (dccp_time_wai)
       [<c031caa0>] (dccp_time_wait+0x40/0xc8) from [<c031c15c>] (dccp_rcv_state_proces)
       [<c031c15c>] (dccp_rcv_state_process+0x120/0x538) from [<c032609c>] (dccp_v4_do_)
       [<c032609c>] (dccp_v4_do_rcv+0x11c/0x14c) from [<c0286594>] (release_sock+0xac/0)
       [<c0286594>] (release_sock+0xac/0x110) from [<c031fd34>] (dccp_close+0x28c/0x380)
       [<c031fd34>] (dccp_close+0x28c/0x380) from [<c02d9a78>] (inet_release+0x64/0x70)
      
      The fix is by testing the socket state first. Receiving a packet in Closed state
      now also produces the required "No connection" Reset reply of RFC 4340, 8.3.1.
      Reported-and-tested-by: default avatarJohan Hovold <jhovold@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      720dc34b
  13. 07 Jan, 2011 1 commit
  14. 28 Nov, 2010 1 commit
  15. 15 Nov, 2010 2 commits
    • Gerrit Renker's avatar
      dccp ccid-2: Consolidate Ack-Vector processing within main DCCP module · 18219463
      Gerrit Renker authored
      This aggregates Ack Vector processing (handling input and clearing old state)
      into one function, for the following reasons and benefits:
       * all Ack Vector-specific processing is now in one place;
       * duplicated code is removed;
       * ensuring sanity: from an Ack Vector point of view, it is better to clear the
                          old state first before entering new state;
       * Ack Event handling happens mostly within the CCIDs, not the main DCCP module.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      18219463
    • Gerrit Renker's avatar
      dccp ccid-2: Algorithm to update buffer state · 5753fdfe
      Gerrit Renker authored
      This provides a routine to consistently update the buffer state when the
      peer acknowledges receipt of Ack Vectors; updating state in the list of Ack
      Vectors as well as in the circular buffer.
      
      While based on RFC 4340, several additional (and necessary) precautions were
      added to protect the consistency of the buffer state. These additions are
      essential, since analysis and experience showed that the basic algorithm was
      insufficient for this task (which lead to problems that were hard to debug).
      
      The algorithm now
       * deals with HC-sender acknowledging to HC-receiver and vice versa,
       * keeps track of the last unacknowledged but received seqno in tail_ackno,
       * has special cases to reset the overflow condition when appropriate,
       * is protected against receiving older information (would mess up buffer state).
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      5753fdfe
  16. 10 Nov, 2010 1 commit
    • Gerrit Renker's avatar
      dccp ccid-2: Ack Vector interface clean-up · f17a37c9
      Gerrit Renker authored
      This patch brings the Ack Vector interface up to date. Its main purpose is
      to lay the basis for the subsequent patches of this set, which will use the
      new data structure fields and routines.
      
      There are no real algorithmic changes, rather an adaptation:
      
       (1) Replaced the static Ack Vector size (2) with a #define so that it can
           be adapted (with low loss / Ack Ratio, a value of 1 works, so 2 seems
           to be sufficient for the moment) and added a solution so that computing
           the ECN nonce will continue to work - even with larger Ack Vectors.
      
       (2) Replaced the #defines for Ack Vector states with a complete enum.
      
       (3) Replaced #defines to compute Ack Vector length and state with general
           purpose routines (inlines), and updated code to use these.
      
       (4) Added a `tail' field (conversion to circular buffer in subsequent patch).
      
       (5) Updated the (outdated) documentation for Ack Vector struct.
      
       (6) All sequence number containers now trimmed to 48 bits.
      
       (7) Removal of unused bits:
           * removed dccpav_ack_nonce from struct dccp_ackvec, since this is already
             redundantly stored in the `dccpavr_ack_nonce' (of Ack Vector record);
           * removed Elapsed Time for Ack Vectors (it was nowhere used);
           * replaced semantics of dccpavr_sent_len with dccpavr_ack_runlen, since
             the code needs to be able to remember the old run length;
           * reduced the de-/allocation routines (redundant / duplicate tests).
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      f17a37c9
  17. 12 Oct, 2010 2 commits
    • Gerrit Renker's avatar
      dccp: cosmetics - warning format · 2f34b329
      Gerrit Renker authored
      This  omits the redundant "DCCP:" in warning messages, since DCCP_WARN() already
      echoes the function name, avoiding messages like
      
         kernel: [10988.766503] dccp_close: DCCP: ABORT -- 209 bytes unread
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      2f34b329
    • Gerrit Renker's avatar
      dccp: fix the adjustments to AWL and SWL · 0b53d460
      Gerrit Renker authored
      This fixes a problem and a potential loophole with regard to seqno/ackno
      validity: currently the initial adjustments to AWL/SWL are only performed
      once at the begin of the connection, during the handshake.
      
      Since the Sequence Window feature is always greater than Wmin=32 (7.5.2),
      it is however necessary to perform these adjustments at least for the first
      W/W' (variables as per 7.5.1) packets in the lifetime of a connection.
      
      This requirement is complicated by the fact that W/W' can change at any time
      during the lifetime of a connection.
      
      Therefore it is better to perform that safety check each time SWL/AWL are
      updated, as implemented by the patch.
      
      A second problem solved by this patch is that the remote/local Sequence Window
      feature values (which set the bounds for AWL/SWL/SWH) are undefined until the
      feature negotiation has completed.
      
      During the initial handshake we have more stringent sequence number protection;
      the changes added by this patch effect that {A,S}W{L,H} are within the correct
      bounds at the instant that feature negotiation completes (since the SeqWin
      feature activation handlers call dccp_update_gsr/gss()).
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      0b53d460
  18. 26 Jun, 2010 1 commit
    • Gerrit Renker's avatar
      dccp: make implementation of Syn-RTT symmetric · 59b80802
      Gerrit Renker authored
      This patch is thanks to Andre Noll who reported the issue and helped testing.
      
      The Syn-RTT sampled during the initial handshake currently only works for
      the client sending the DCCP-Request. TFRC penalizes the absence of an RTT
      sample with a very slow initial speed (1 packet per second), which delays
      slow-start significantly, resulting in sluggish performance.
      
      This patch mirrors the "Syn RTT" principle by adding a timestamp also onto
      the DCCP-Response, producing an RTT sample  when the (Data)Ack completing
      the handshake arrives.
      
      Also changed the documentation to 'TFRC' since Syn RTTs are also used by CCID-4.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      59b80802
  19. 25 May, 2010 1 commit
  20. 30 Mar, 2010 1 commit
    • Tejun Heo's avatar
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo authored
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Guess-its-ok-by: default avatarChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  21. 24 Mar, 2010 1 commit
  22. 05 Jan, 2009 1 commit
  23. 08 Dec, 2008 2 commits
    • Gerrit Renker's avatar
      dccp ccid-2: Phase out the use of boolean Ack Vector sysctl · 6fdd34d4
      Gerrit Renker authored
      This removes the use of the sysctl and the minisock variable for the Send Ack
      Vector feature, as it now is handled fully dynamically via feature negotiation
      (i.e. when CCID-2 is enabled, Ack Vectors are automatically enabled as per
       RFC 4341, 4.).
      
      Using a sysctl in parallel to this implementation would open the door to
      crashes, since much of the code relies on tests of the boolean minisock /
      sysctl variable. Thus, this patch replaces all tests of type
      
      	if (dccp_msk(sk)->dccpms_send_ack_vector)
      		/* ... */
      with
      	if (dp->dccps_hc_rx_ackvec != NULL)
      		/* ... */
      
      The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
      negotiation concluded that Ack Vectors are to be used on the half-connection.
      Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
      so that the test is a valid one.
      
      The activation handler for Ack Vectors is called as soon as the feature
      negotiation has concluded at the
       * server when the Ack marking the transition RESPOND => OPEN arrives;
       * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.
      
      Adding the sequence number of the Response packet to the Ack Vector has been
      removed, since
       (a) connection establishment implies that the Response has been received;
       (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
           this entry will always be ignored;
       (c) it can not be used for anything useful - to detect loss for instance, only
           packets received after the loss can serve as pseudo-dupacks.
      
      There was a FIXME to change the error code when dccp_ackvec_add() fails.
      I removed this after finding out that:
       * the check whether ackno < ISN is already made earlier,
       * this Response is likely the 1st packet with an Ackno that the client gets,
       * so when dccp_ackvec_add() fails, the reason is likely not a packet error.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: default avatarIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6fdd34d4
    • Gerrit Renker's avatar
      dccp: Integration of dynamic feature activation - part 3 (client side) · 991d927c
      Gerrit Renker authored
      This integrates feature-activation in the client:
      
       1. When dccp_parse_options() fails, the reset code is already set; request_sent\
          _state_process() currently overrides this with `Packet Error', which is not
          intended - changed to use the reset code supplied by dccp_parse_options().
      
       2. When feature negotiation fails, the socket should be marked as not usable,
          so that the application is notified that an error occurred. This is achieved
          by a new label 'unable_to_proceed': generating an error code of `Aborted',
          setting the socket state to CLOSED, returning with ECOMM in sk_err.
      
       3. Avoids parsing the Ack twice in Respond state by not doing option processing
          again in dccp_rcv_respond_partopen_state_process (as option processing has
          already been done on the request_sock in dccp_check_req).
      
      Since this addresses congestion-control initialisation, a corresponding
      FIXME has been removed.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: default avatarIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      991d927c
  24. 05 Nov, 2008 1 commit
  25. 09 Sep, 2008 1 commit
  26. 04 Sep, 2008 10 commits
    • Gerrit Renker's avatar
      dccp: Clamping RTT values · 49ffc29a
      Gerrit Renker authored
      This extracts the clamping part of dccp_sample_rtt() and makes it available
      to other parts of the code (as e.g. used in the next patch).
      
      Note: The function dccp_sample_rtt() now reduces to subtracting the elapsed
      time. This could be eliminated but would require shorter prefixes and thus
      is not done by this patch - maybe an idea for later.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      49ffc29a
    • Gerrit Renker's avatar
      dccp: Clean up slow-path input processing · ddab0556
      Gerrit Renker authored
      This patch rearranges the order of statements of the slow-path input processing
      (i.e. any other state than OPEN), to resolve the following issues.
      
       1. Dependencies: the order of statements now better matches RFC 4340, 8.5, i.e.
          step 7 is before step 9 (previously 9 was before 7), and parsing options in
          step 8 (which can consume resources) now comes after step 7.
       2. Bug-fix: in state CLOSED, there should not be any sequence number checking
          or option processing. This is why the test for CLOSED has been moved after
          the test for LISTEN.
       3. As before sequence number checks are omitted if in state LISTEN/REQUEST, due
          to the note underneath the table in RFC 4340, 7.5.3.
       4. Packets are now passed on to Ack Vector / CCID processing only after
          - step 7  (receive unexpected packets), 
          - step 9  (receive Reset),
          - step 13 (receive CloseReq),
          - step 14 (receive Close)
          and only if the state is PARTOPEN. This simplifies CCID processing:
          - in LISTEN/CLOSED the CCIDs are non-existent;
          - in RESPOND/REQUEST the CCIDs have not yet been negotiated;
          - in CLOSEREQ and active-CLOSING the node has already closed this socket;
          - in passive-CLOSING the client is waiting for its Reset.
          In the last case, RFC 4340, 8.3 leaves it open to ignore further incoming
          data, which is the approach taken here.
      
      As a result of (3), CCID processing is now indeed confined to OPEN/PARTOPEN
      states, i.e. congestion control is performed only on the flow of data packets. 
      
      This avoids pathological cases of doing congestion control on those messages
      which set up and terminate the connection. 
      
      I have done a few checks to see if this creates a problem in other parts of
      the code. This seems not to be the case; even if there were one, it would be
      better to fix it than to perform congestion control on Close/Request/Response
      messages. Similarly for Ack Vectors (as they depend on the negotiated CCID).
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      ddab0556
    • Gerrit Renker's avatar
      dccp ccid-2: Consolidate Ack-Vector processing within main DCCP module · 283fb4a5
      Gerrit Renker authored
      This aggregates Ack Vector processing (handling input and clearing old state)
      into one function, for the following reasons and benefits:
       * all Ack Vector-specific processing is now in one place;
       * duplicated code is removed;
       * ensuring sanity: from an Ack Vector point of view, it is better to clear the
                          old state first before entering new state;
       * Ack Event handling happens mostly within the CCIDs, not the main DCCP module.
      
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>  
      283fb4a5
    • Gerrit Renker's avatar
      dccp ccid-2: Algorithm to update buffer state · 68b1de15
      Gerrit Renker authored
      This provides a routine to consistently update the buffer state when the
      peer acknowledges receipt of Ack Vectors; updating state in the list of Ack
      Vectors as well as in the circular buffer.
      
      While based on RFC 4340, several additional (and necessary) precautions were
      added to protect the consistency of the buffer state. These additions are
      essential, since analysis and experience showed that the basic algorithm was
      insufficient for this task (which lead to problems that were hard to debug).
      
      The algorithm now
       * deals with HC-sender acknowledging to HC-receiver and vice versa,
       * keeps track of the last unacknowledged but received seqno in tail_ackno,
       * has special cases to reset the overflow condition when appropriate,
       * is protected against receiving older information (would mess up buffer state).
      
      Note: The older code performed an unnecessary step, where the sender cleared
      Ack Vector state by parsing the Ack Vector received by the HC-receiver. Doing
      this was entirely redundant, since
       * the receiver always puts the full acknowledgment window (groups 2,3 in 11.4.2)
         into the Ack Vectors it sends; hence the HC-receiver is only interested in the
         highest state that the HC-sender received;
       * this means that the acknowledgment number on the (Data)Ack from the HC-sender
         is sufficient; and work done in parsing earlier state is not necessary, since
         the later state subsumes the  earlier one (see also RFC 4340, A.4).
      This older interface (dccp_ackvec_parse()) is therefore removed.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      68b1de15
    • Gerrit Renker's avatar
      dccp ccid-2: Ack Vector interface clean-up · ff49e270
      Gerrit Renker authored
      This patch brings the Ack Vector interface up to date. Its main purpose is
      to lay the basis for the subsequent patches of this set, which will use the
      new data structure fields and routines.
      
      There are no real algorithmic changes, rather an adaptation:
      
       (1) Replaced the static Ack Vector size (2) with a #define so that it can
           be adapted (with low loss / Ack Ratio, a value of 1 works, so 2 seems
           to be sufficient for the moment) and added a solution so that computing
           the ECN nonce will continue to work - even with larger Ack Vectors.
      
       (2) Replaced the #defines for Ack Vector states with a complete enum.
      
       (3) Replaced #defines to compute Ack Vector length and state with general
           purpose routines (inlines), and updated code to use these.
      
       (4) Added a `tail' field (conversion to circular buffer in subsequent patch).
      
       (5) Updated the (outdated) documentation for Ack Vector struct.
      
       (6) All sequence number containers now trimmed to 48 bits.
      
       (7) Removal of unused bits:
           * removed dccpav_ack_nonce from struct dccp_ackvec, since this is already
             redundantly stored in the `dccpavr_ack_nonce' (of Ack Vector record);
           * removed Elapsed Time for Ack Vectors (it was nowhere used);
           * replaced semantics of dccpavr_sent_len with dccpavr_ack_runlen, since
             the code needs to be able to remember the old run length; 
           * reduced the de-/allocation routines (redundant / duplicate tests).
      
      
      Justification for removing Elapsed Time information [can be removed]:
      ---------------------------------------------------------------------
       1. The Elapsed Time information for Ack Vectors was nowhere used in the code.
       2. DCCP does not implement rate-based pacing of acknowledgments. The only
          recommendation for always including Elapsed Time is in section 11.3 of
          RFC 4340: "Receivers that rate-pace acknowledgements SHOULD [...]
          include Elapsed Time options". But such is not the case here.
       3. It does not really improve estimation accuracy. The Elapsed Time field only
          records the time between the arrival of the last acknowledgeable packet and
          the time the Ack Vector is sent out. Since Linux does not (yet) implement
          delayed Acks, the time difference will typically be small, since often the
          arrival of a data packet triggers sending feedback at the HC-receiver.
      
      
      Justification for changes in de-/allocation routines [can be removed]:
      ----------------------------------------------------------------------
        * INIT_LIST_HEAD in dccp_ackvec_record_new was redundant, since the list
          pointers were later overwritten when the node was added via list_add();
        * dccp_ackvec_record_new() was called in a single place only;
        * calls to list_del_init() before calling dccp_ackvec_record_delete() were
          redundant, since subsequently the entire element was k-freed;
        * since all calls to dccp_ackvec_record_delete() were preceded to a call to
          list_del_init(), the WARN_ON test would never evaluate to true;
        * since all calls to dccp_ackvec_record_delete() were made from within
          list_for_each_entry_safe(), the test for avr == NULL was redundant;
        * list_empty() in ackvec_free was redundant, since the same condition is
          embedded in the loop condition of the subsequent list_for_each_entry_safe().
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      ff49e270
    • Gerrit Renker's avatar
      dccp: Fix the adjustments to AWL and SWL · bfbddd08
      Gerrit Renker authored
      This fixes a problem and a potential loophole with regard to seqno/ackno
      validity: the problem is that the initial adjustments to AWL/SWL were
      only performed at the begin of the connection, during the handshake.
      
      Since the Sequence Window feature is always greater than Wmin=32 (7.5.2), 
      it is however necessary to perform these adjustments at least for the first
      W/W' (variables as per 7.5.1) packets in the lifetime of a connection.
      
      This requirement is complicated by the fact that W/W' can change at any time
      during the lifetime of a connection.
      
      Therefore the consequence is to perform this safety check each time SWL/AWL
      are updated.
      
      A second problem solved by this patch is that the remote/local Sequence Window
      feature values (which set the bounds for AWL/SWL/SWH) are undefined until the
      feature negotiation has completed.
      
      During the initial handshake we have more stringent sequence number protection,
      the changes added by this patch effect that {A,S}W{L,H} are within the correct
      bounds at the instant that feature negotiation completes (since the SeqWin
      feature activation handlers call dccp_update_gsr/gss()). 
      
      A detailed rationale is below -- can be removed from the commit message.
      
      
      1. Server sequence number checks during initial handshake
      ---------------------------------------------------------
      The server can not use the fields of the listening socket for seqno/ackno checks
      and thus needs to store all relevant information on a per-connection basis on
      the dccp_request socket. This is a size-constrained structure and has currently
      only ISS (dreq_iss) and ISR (dreq_isr) defined.
      Adding further fields (SW{L,H}, AW{L,H}) would increase the size of the struct
      and it is questionable whether this will have any practical gain. The currently
      implemented solution is as follows.
       * receiving first Request: dccp_v{4,6}_conn_request sets 
                                  ISR := P.seqno, ISS := dccp_v{4,6}_init_sequence()
      
       * sending first Response:  dccp_v{4,6}_send_response via dccp_make_response()	
                                  sets P.seqno := ISS, sets P.ackno := ISR
      
       * receiving retransmitted Request: dccp_check_req() overrides ISR := P.seqno
      
       * answering retransmitted Request: dccp_make_response() sets ISS += 1,
                                          otherwise as per first Response
      
       * completing the handshake: succeeds in dccp_check_req() for the first Ack
                                   where P.ackno == ISS (P.seqno is not tested)
      
       * creating child socket: ISS, ISR are copied from the request_sock
      
      This solution will succeed whenever the server can receive the Request and the
      subsequent Ack in succession, without retransmissions. If there is packet loss,
      the client needs to retransmit until this condition succeeds; it will otherwise
      eventually give up. Adding further fields to the request_sock could increase
      the robustness a bit, in that it would make possible to let a reordered Ack
      (from a retransmitted Response) pass. The argument against such a solution is
      that if the packet loss is not persistent and an Ack gets through, why not
      wait for the one answering the original response: if the loss is persistent, it
      is probably better to not start the connection in the first place.
      
      Long story short: the present design (by Arnaldo) is simple and will likely work
      just as well as a more complicated solution. As a consequence, {A,S}W{L,H} are
      not needed until the moment the request_sock is cloned into the accept queue.
      
      At that stage feature negotiation has completed, so that the values for the local
      and remote Sequence Window feature (7.5.2) are known, i.e. we are now in a better
      position to compute {A,S}W{L,H}.
      
      
      2. Client sequence number checks during initial handshake
      ---------------------------------------------------------
      Until entering PARTOPEN the client does not need the adjustments, since it 
      constrains the Ack window to the packet it sent.
      
       * sending first Request: dccp_v{4,6}_connect() choose ISS, 
                                dccp_connect() then sets GAR := ISS (as per 8.5),
      			  dccp_transmit_skb() (with the previous bug fix) sets
      			         GSS := ISS, AWL := ISS, AWH := GSS
       * n-th retransmitted Request (with previous patch):
      	                  dccp_retransmit_skb() via timer calls
      			  dccp_transmit_skb(), which sets GSS := ISS+n
                                and then AWL := ISS, AWH := ISS+n
      	                  
       * receiving any Response: dccp_rcv_request_sent_state_process() 
      	                   -- accepts packet if AWL <= P.ackno <= AWH;
      			   -- sets GSR = ISR = P.seqno
      
       * sending the Ack completing the handshake: dccp_send_ack() calls 
                                 dccp_transmit_skb(), which sets GSS += 1
      			   and AWL := ISS, AWH := GSS
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      bfbddd08
    • Gerrit Renker's avatar
      dccp ccid-2: Phase out the use of boolean Ack Vector sysctl · b235dc4a
      Gerrit Renker authored
      This removes the use of the sysctl and the minisock variable for the Send Ack
      Vector feature, which is now handled fully dynamically via feature negotiation;
      i.e. when CCID2 is enabled, Ack Vectors are automatically enabled (as per
      RFC 4341, 4.).
      
      Using a sysctl in parallel to this implementation would open the door to
      crashes, since much of the code relies on tests of the boolean minisock /
      sysctl variable. Thus, this patch replaces all tests of type
      
      	if (dccp_msk(sk)->dccpms_send_ack_vector)
      		/* ... */
      with
      	if (dp->dccps_hc_rx_ackvec != NULL)
      		/* ... */
      
      The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
      negotiation concluded that Ack Vectors are to be used on the half-connection.
      Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
      so that the test is a valid one.
      
      The activation handler for Ack Vectors is called as soon as the feature
      negotiation has concluded at the
       * server when the Ack marking the transition RESPOND => OPEN arrives;
       * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.
      
      Adding the sequence number of the Response packet to the Ack Vector has been 
      removed, since
       (a) connection establishment implies that the Response has been received;
       (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
           this entry will always be ignored;
       (c) it can not be used for anything useful - to detect loss for instance, only
           packets received after the loss can serve as pseudo-dupacks.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: default avatarIan McDonald <ian.mcdonald@jandi.co.nz>
      b235dc4a
    • Gerrit Renker's avatar
      dccp: Integration of dynamic feature activation - part 3 (client side) · c49b2272
      Gerrit Renker authored
      This integrates feature-activation in the client, with these details:
      
       1. When dccp_parse_options() fails, the reset code is already set, request_sent
          _state_process() currently overrides this with `Packet Error', which is not
          intended - so changed to use the reset code set in dccp_parse_options();
      
       2. There was a FIXME to change the error code when dccp_ackvec_add() fails.
          I have looked this up and found that: 
          * the check whether ackno < ISN is already made earlier,
          * this Response is likely the 1st packet with an Ackno that the client gets,
          * so when dccp_ackvec_add() fails, the reason is likely not a packet error.
      
       3. When feature negotiation fails, the socket should be marked as not usable,
          so that the application is notified that an error occurs. This is achieved
          by a new label, which uses an error code of `Aborted' and which sets the
          socket state to CLOSED, as well as sk_err.
      
       4. Avoids parsing the Ack twice in Respond state by not doing option processing
          again in dccp_rcv_respond_partopen_state_process (as option processing has
          already been done on the request_sock in dccp_check_req).    
      
      Since this addresses congestion-control initialisation, a corresponding
      FIXME has been removed.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: default avatarIan McDonald <ian.mcdonald@jandi.co.nz>
      c49b2272
    • Gerrit Renker's avatar
      dccp: Per-socket initialisation of feature negotiation · 828755ce
      Gerrit Renker authored
      This provides feature-negotiation initialisation for both DCCP sockets and
      DCCP request_sockets, to support feature negotiation during connection setup.
      
      It also resolves a FIXME regarding the congestion control initialisation.
      
      Thanks to Wei Yongjun for help with the IPv6 side of this patch.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: default avatarIan McDonald <ian.mcdonald@jandi.co.nz>
      828755ce
    • Wei Yongjun's avatar
      dccp: Always generate a Reset in response to option errors · ba1a6c7b
      Wei Yongjun authored
      RFC4340 states that if a packet is received with an option error (such as a
      Mandatory Option as the last byte of the option list), the endpoint should
      repond with a Reset.
      
      In the LISTEN and RESPOND states, the endpoint correctly reponds with Reset,
      while in the REQUEST/OPEN states, packets with option errors are just ignored.
      
      The packet sequence is as follows:
      
      Case 1:
      
        Endpoint A                           Endpoint B
        (CLOSED)                             (CLOSED)
      
                     <----------------       REQUEST
      
        RESPONSE     ----------------->      (*1)
        (with invalid option)
                     <----------------       RESET
                                             (with Reset Code 5, "Option Error")
      
        (*1) currently just ignored, no Reset is sent
      
      Case 2:
      
        Endpoint A                           Endpoint B
        (OPEN)                               (OPEN)
      
        DATA-ACK     ----------------->      (*2)
        (with invalid option)
                     <----------------       RESET
                                             (with Reset Code 5, "Option Error")
      
        (*2) currently just ignored, no Reset is sent
      
      This patch fixes the problem, by generating a Reset instead of silently
      ignoring option errors.
      Signed-off-by: default avatarWei Yongjun <yjwei@cn.fujitsu.com>
      Acked-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      ba1a6c7b
  27. 27 Aug, 2008 1 commit
    • Wei Yongjun's avatar
      dccp: Always generate a Reset in response to option errors · 33c44967
      Wei Yongjun authored
      RFC4340 states that if a packet is received with an option error (such as a
      Mandatory Option as the last byte of the option list), the endpoint should
      repond with a Reset.
      
      In the LISTEN and RESPOND states, the endpoint correctly reponds with Reset,
      while in the REQUEST/OPEN states, packets with option errors are just ignored.
      
      The packet sequence is as follows:
      
      Case 1:
      
        Endpoint A                           Endpoint B
        (CLOSED)                             (CLOSED)
      
                     <----------------       REQUEST
      
        RESPONSE     ----------------->      (*1)
        (with invalid option)
                     <----------------       RESET
                                             (with Reset Code 5, "Option Error")
      
        (*1) currently just ignored, no Reset is sent
      
      Case 2:
      
        Endpoint A                           Endpoint B
        (OPEN)                               (OPEN)
      
        DATA-ACK     ----------------->      (*2)
        (with invalid option)
                     <----------------       RESET
                                             (with Reset Code 5, "Option Error")
      
        (*2) currently just ignored, no Reset is sent
      
      This patch fixes the problem, by generating a Reset instead of silently
      ignoring option errors.
      Signed-off-by: default avatarWei Yongjun <yjwei@cn.fujitsu.com>
      Acked-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      33c44967
  28. 19 Aug, 2008 1 commit
    • Gerrit Renker's avatar
      dccp: Fix panic caused by too early termination of retransmission mechanism · d28934ad
      Gerrit Renker authored
      Thanks is due to Wei Yongjun for the detailed analysis and description of this
      bug at http://marc.info/?l=dccp&m=121739364909199&w=2
      
      The problem is that invalid packets received by a client in state REQUEST cause
      the retransmission timer for the DCCP-Request to be reset. This includes freeing
      the Request-skb ( in dccp_rcv_request_sent_state_process() ). As a consequence,
       * the arrival of further packets cause a double-free, triggering a panic(),
       * the connection then may hang, since further retransmissions are blocked.
      
      This patch changes the order of statements so that the retransmission timer is
      reset, and the pending Request freed, only if a valid Response has arrived (or
      the number of sysctl-retries has been exhausted).
      
      Further changes:
      ----------------
      To be on the safe side, replaced __kfree_skb with kfree_skb so that if due to
      unexpected circumstances the sk_send_head is NULL the WARN_ON is used instead.
      Signed-off-by: default avatarGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d28934ad