Skip to content
Snippets Groups Projects
  1. Sep 05, 2017
  2. Jul 20, 2017
  3. Jun 22, 2017
  4. Jun 16, 2017
  5. Jan 02, 2017
  6. Nov 17, 2016
    • Sowmini Varadhan's avatar
      RDS: TCP: Track peer's connection generation number · 905dd418
      Sowmini Varadhan authored
      
      The RDS transport has to be able to distinguish between
      two types of failure events:
      (a) when the transport fails (e.g., TCP connection reset)
          but the RDS socket/connection layer on both sides stays
          the same
      (b) when the peer's RDS layer itself resets (e.g., due to module
          reload or machine reboot at the peer)
      In case (a) both sides must reconnect and continue the RDS messaging
      without any message loss or disruption to the message sequence numbers,
      and this is achieved by rds_send_path_reset().
      
      In case (b) we should reset all rds_connection state to the
      new incarnation of the peer. Examples of state that needs to
      be reset are next expected rx sequence number from, or messages to be
      retransmitted to, the new incarnation of the peer.
      
      To achieve this, the RDS handshake probe added as part of
      commit 5916e2c1 ("RDS: TCP: Enable multipath RDS for TCP")
      is enhanced so that sender and receiver of the RDS ping-probe
      will add a generation number as part of the RDS_EXTHDR_GEN_NUM
      extension header. Each peer stores local and remote generation
      numbers as part of each rds_connection. Changes in generation
      number will be detected via incoming handshake probe ping
      request or response and will allow the receiver to reset rds_connection
      state.
      
      Signed-off-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      905dd418
  7. Jul 15, 2016
  8. Jul 01, 2016
  9. Jun 15, 2016
  10. Jun 07, 2016
  11. Nov 24, 2015
  12. Oct 19, 2015
    • santosh.shilimkar@oracle.com's avatar
      RDS: fix rds-ping deadlock over TCP transport · 7b4b0009
      santosh.shilimkar@oracle.com authored
      
      Sowmini found hang with rds-ping while testing RDS over TCP. Its
      a corner case and doesn't happen always. The issue is not reproducible
      with IB transport. Its clear from below dump why we see it with RDS TCP.
      
       [<ffffffff8153b7e5>] do_tcp_setsockopt+0xb5/0x740
       [<ffffffff8153bec4>] tcp_setsockopt+0x24/0x30
       [<ffffffff814d57d4>] sock_common_setsockopt+0x14/0x20
       [<ffffffffa096071d>] rds_tcp_xmit_prepare+0x5d/0x70 [rds_tcp]
       [<ffffffffa093b5f7>] rds_send_xmit+0xd7/0x740 [rds]
       [<ffffffffa093bda2>] rds_send_pong+0x142/0x180 [rds]
       [<ffffffffa0939d34>] rds_recv_incoming+0x274/0x330 [rds]
       [<ffffffff810815ae>] ? ttwu_queue+0x11e/0x130
       [<ffffffff814dcacd>] ? skb_copy_bits+0x6d/0x2c0
       [<ffffffffa0960350>] rds_tcp_data_recv+0x2f0/0x3d0 [rds_tcp]
       [<ffffffff8153d836>] tcp_read_sock+0x96/0x1c0
       [<ffffffffa0960060>] ? rds_tcp_recv_init+0x40/0x40 [rds_tcp]
       [<ffffffff814d6a90>] ? sock_def_write_space+0xa0/0xa0
       [<ffffffffa09604d1>] rds_tcp_data_ready+0xa1/0xf0 [rds_tcp]
       [<ffffffff81545249>] tcp_data_queue+0x379/0x5b0
       [<ffffffffa0960cdb>] ? rds_tcp_write_space+0xbb/0x110 [rds_tcp]
       [<ffffffff81547fd2>] tcp_rcv_established+0x2e2/0x6e0
       [<ffffffff81552602>] tcp_v4_do_rcv+0x122/0x220
       [<ffffffff81553627>] tcp_v4_rcv+0x867/0x880
       [<ffffffff8152e0b3>] ip_local_deliver_finish+0xa3/0x220
      
      This happens because rds_send_xmit() chain wants to take
      sock_lock which is already taken by tcp_v4_rcv() on its
      way to rds_tcp_data_ready(). Commit db6526dc ("RDS: use
      rds_send_xmit() state instead of RDS_LL_SEND_FULL") which
      was trying to opportunistically finish the send request
      in same thread context.
      
      But because of above recursive lock hang with RDS TCP,
      the send work from rds_send_pong() needs to deferred to
      worker to avoid lock up. Given RDS ping is more of connectivity
      test than performance critical path, its should be ok even
      for transport like IB.
      
      Reported-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarSantosh Shilimkar <ssantosh@kernel.org>
      Signed-off-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Acked-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7b4b0009
  13. Oct 05, 2015
  14. Aug 25, 2015
  15. Aug 07, 2015
  16. Apr 08, 2015
  17. Mar 02, 2015
  18. Dec 11, 2014
  19. Dec 09, 2014
    • Al Viro's avatar
      put iov_iter into msghdr · c0371da6
      Al Viro authored
      
      Note that the code _using_ ->msg_iter at that point will be very
      unhappy with anything other than unshifted iovec-backed iov_iter.
      We still need to convert users to proper primitives.
      
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      c0371da6
  20. Nov 24, 2014
  21. Oct 03, 2014
    • Herton R. Krzesinski's avatar
      net/rds: fix possible double free on sock tear down · 593cbb3e
      Herton R. Krzesinski authored
      
      I got a report of a double free happening at RDS slab cache. One
      suspicion was that may be somewhere we were doing a sock_hold/sock_put
      on an already freed sock. Thus after providing a kernel with the
      following change:
      
       static inline void sock_hold(struct sock *sk)
       {
      -       atomic_inc(&sk->sk_refcnt);
      +       if (!atomic_inc_not_zero(&sk->sk_refcnt))
      +               WARN(1, "Trying to hold sock already gone: %p (family: %hd)\n",
      +                       sk, sk->sk_family);
       }
      
      The warning successfuly triggered:
      
      Trying to hold sock already gone: ffff81f6dda61280 (family: 21)
      WARNING: at include/net/sock.h:350 sock_hold()
      Call Trace:
      <IRQ>  [<ffffffff8adac135>] :rds:rds_send_remove_from_sock+0xf0/0x21b
      [<ffffffff8adad35c>] :rds:rds_send_drop_acked+0xbf/0xcf
      [<ffffffff8addf546>] :rds_rdma:rds_ib_recv_tasklet_fn+0x256/0x2dc
      [<ffffffff8009899a>] tasklet_action+0x8f/0x12b
      [<ffffffff800125a2>] __do_softirq+0x89/0x133
      [<ffffffff8005f30c>] call_softirq+0x1c/0x28
      [<ffffffff8006e644>] do_softirq+0x2c/0x7d
      [<ffffffff8006e4d4>] do_IRQ+0xee/0xf7
      [<ffffffff8005e625>] ret_from_intr+0x0/0xa
      <EOI>
      
      Looking at the call chain above, the only way I think this would be
      possible is if somewhere we already released the same socket->sock which
      is assigned to the rds_message at rds_send_remove_from_sock. Which seems
      only possible to happen after the tear down done on rds_release.
      
      rds_release properly calls rds_send_drop_to to drop the socket from any
      rds_message, and some proper synchronization is in place to avoid race
      with rds_send_drop_acked/rds_send_remove_from_sock. However, I still see
      a very narrow window where it may be possible we touch a sock already
      released: when rds_release races with rds_send_drop_acked, we check
      RDS_MSG_ON_CONN to avoid cleanup on the same rds_message, but in this
      specific case we don't clear rm->m_rs. In this case, it seems we could
      then go on at rds_send_drop_to and after it returns, the sock is freed
      by last sock_put on rds_release, with concurrently we being at
      rds_send_remove_from_sock; then at some point in the loop at
      rds_send_remove_from_sock we process an rds_message which didn't have
      rm->m_rs unset for a freed sock, and a possible sock_hold on an sock
      already gone at rds_release happens.
      
      This hopefully address the described condition above and avoids a double
      free on "second last" sock_put. In addition, I removed the comment about
      socket destruction on top of rds_send_drop_acked: we call rds_send_drop_to
      in rds_release and we should have things properly serialized there, thus
      I can't see the comment being accurate there.
      
      Signed-off-by: default avatarHerton R. Krzesinski <herton@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      593cbb3e
  22. Apr 18, 2014
  23. Jan 19, 2014
Loading