Skip to content
Snippets Groups Projects
  1. Sep 05, 2017
    • Chuck Lever's avatar
      svcrdma: Estimate Send Queue depth properly · 26fb2254
      Chuck Lever authored
      
      The rdma_rw API adjusts max_send_wr upwards during the
      rdma_create_qp() call. If the ULP actually wants to take advantage
      of these extra resources, it must increase the size of its send
      completion queue (created before rdma_create_qp is called) and
      increase its send queue accounting limit.
      
      Use the new rdma_rw_mr_factor API to figure out the correct value
      to use for the Send Queue and Send Completion Queue depths.
      
      And, ensure that the chosen Send Queue depth for a newly created
      transport does not overrun the QP WR limit of the underlying device.
      
      Lastly, there's no longer a need to carry the Send Queue depth in
      struct svcxprt_rdma, since the value is used only in the
      svc_rdma_accept() path.
      
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      26fb2254
    • Chuck Lever's avatar
      svcrdma: Limit RQ depth · 5a25bfd2
      Chuck Lever authored
      
      Ensure that the chosen Receive Queue depth for a newly created
      transport does not overrun the QP WR limit of the underlying device.
      
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      5a25bfd2
    • Chuck Lever's avatar
      svcrdma: Populate tail iovec when receiving · 193bcb7b
      Chuck Lever authored
      
      So that NFS WRITE payloads can eventually be placed directly into a
      file's page cache, enable the RPC-over-RDMA transport to present
      these payloads in the xdr_buf's page list, while placing trailing
      content (such as a GETATTR operation) in the xdr_buf's tail.
      
      After this change, the RPC-over-RDMA's "copy tail" hack, added by
      commit a97c331f ("svcrdma: Handle additional inline content"),
      is no longer needed and can be removed.
      
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      193bcb7b
  2. Aug 25, 2017
  3. Aug 24, 2017
    • Vadim Lomovtsev's avatar
      net: sunrpc: svcsock: fix NULL-pointer exception · eebe53e8
      Vadim Lomovtsev authored
      
      While running nfs/connectathon tests kernel NULL-pointer exception
      has been observed due to races in svcsock.c.
      
      Race is appear when kernel accepts connection by kernel_accept
      (which creates new socket) and start queuing ingress packets
      to new socket. This happens in ksoftirq context which could run
      concurrently on a different core while new socket setup is not done yet.
      
      The fix is to re-order socket user data init sequence and add
      write/read barrier calls to be sure that we got proper values
      for callback pointers before actually calling them.
      
      Test results: nfs/connectathon reports '0' failed tests for about 200+ iterations.
      
      Crash log:
      ---<-snip->---
      [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual address 00000000
      [ 6708.647093] pgd = ffff0000094e0000
      [ 6708.650497] [00000000] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000010ffff80003, *pte=0000000000000000
      [ 6708.660761] Internal error: Oops: 86000005 [#1] SMP
      [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
      [ 6708.736446]  ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_3c300909c5b3f46dcacd49aab3334af_87021]
      [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: G        W  OE   4.11.0-4.el7.aarch64 #1
      [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 2017
      [ 6708.767910] task: ffff810006842e80 task.stack: ffff81000689c000
      [ 6708.773822] PC is at 0x0
      [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc]
      [ 6708.781866] pc : [<0000000000000000>] lr : [<ffff0000029d7378>] pstate: 60000145
      [ 6708.789248] sp : ffff810ffbad3900
      [ 6708.792551] x29: ffff810ffbad3900 x28: ffff000008c73d58
      [ 6708.797853] x27: 0000000000000000 x26: ffff81000bbe1e00
      [ 6708.803156] x25: 0000000000000020 x24: ffff800f7410bf28
      [ 6708.808458] x23: ffff000008c63000 x22: ffff000008c63000
      [ 6708.813760] x21: ffff800f7410bf28 x20: ffff81000bbe1e00
      [ 6708.819063] x19: ffff810012412400 x18: 00000000d82a9df2
      [ 6708.824365] x17: 0000000000000000 x16: 0000000000000000
      [ 6708.829667] x15: 0000000000000000 x14: 0000000000000001
      [ 6708.834969] x13: 0000000000000000 x12: 722e736f622e676e
      [ 6708.840271] x11: 00000000f814dd99 x10: 0000000000000000
      [ 6708.845573] x9 : 7374687225000000 x8 : 0000000000000000
      [ 6708.850875] x7 : 0000000000000000 x6 : 0000000000000000
      [ 6708.856177] x5 : 0000000000000028 x4 : 0000000000000000
      [ 6708.861479] x3 : 0000000000000000 x2 : 00000000e5000000
      [ 6708.866781] x1 : 0000000000000000 x0 : ffff81000bbe1e00
      [ 6708.872084]
      [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0xffff81000689c000)
      [ 6708.880341] Stack: (0xffff810ffbad3900 to 0xffff8100068a0000)
      [ 6708.886075] Call trace:
      [ 6708.888513] Exception stack(0xffff810ffbad3710 to 0xffff810ffbad3840)
      [ 6708.894942] 3700:                                   ffff810012412400 0001000000000000
      [ 6708.902759] 3720: ffff810ffbad3900 0000000000000000 0000000060000145 ffff800f79300000
      [ 6708.910577] 3740: ffff000009274d00 00000000000003ea 0000000000000015 ffff000008c63000
      [ 6708.918395] 3760: ffff810ffbad3830 ffff800f79300000 000000000000004d 0000000000000000
      [ 6708.926212] 3780: ffff810ffbad3890 ffff0000080f88dc ffff800f79300000 000000000000004d
      [ 6708.934030] 37a0: ffff800f7930093c ffff000008c63000 0000000000000000 0000000000000140
      [ 6708.941848] 37c0: ffff000008c2c000 0000000000040b00 ffff81000bbe1e00 0000000000000000
      [ 6708.949665] 37e0: 00000000e5000000 0000000000000000 0000000000000000 0000000000000028
      [ 6708.957483] 3800: 0000000000000000 0000000000000000 0000000000000000 7374687225000000
      [ 6708.965300] 3820: 0000000000000000 00000000f814dd99 722e736f622e676e 0000000000000000
      [ 6708.973117] [<          (null)>]           (null)
      [ 6708.977824] [<ffff0000086f9fa4>] tcp_data_queue+0x754/0xc5c
      [ 6708.983386] [<ffff0000086fa64c>] tcp_rcv_established+0x1a0/0x67c
      [ 6708.989384] [<ffff000008704120>] tcp_v4_do_rcv+0x15c/0x22c
      [ 6708.994858] [<ffff000008707418>] tcp_v4_rcv+0xaf0/0xb58
      [ 6709.000077] [<ffff0000086df784>] ip_local_deliver_finish+0x10c/0x254
      [ 6709.006419] [<ffff0000086dfea4>] ip_local_deliver+0xf0/0xfc
      [ 6709.011980] [<ffff0000086dfad4>] ip_rcv_finish+0x208/0x3a4
      [ 6709.017454] [<ffff0000086e018c>] ip_rcv+0x2dc/0x3c8
      [ 6709.022328] [<ffff000008692fc8>] __netif_receive_skb_core+0x2f8/0xa0c
      [ 6709.028758] [<ffff000008696068>] __netif_receive_skb+0x38/0x84
      [ 6709.034580] [<ffff00000869611c>] netif_receive_skb_internal+0x68/0xdc
      [ 6709.041010] [<ffff000008696bc0>] napi_gro_receive+0xcc/0x1a8
      [ 6709.046690] [<ffff0000014b0fc4>] nicvf_cq_intr_handler+0x59c/0x730 [nicvf]
      [ 6709.053559] [<ffff0000014b1380>] nicvf_poll+0x38/0xb8 [nicvf]
      [ 6709.059295] [<ffff000008697a6c>] net_rx_action+0x2f8/0x464
      [ 6709.064771] [<ffff000008081824>] __do_softirq+0x11c/0x308
      [ 6709.070164] [<ffff0000080d14e4>] irq_exit+0x12c/0x174
      [ 6709.075206] [<ffff00000813101c>] __handle_domain_irq+0x78/0xc4
      [ 6709.081027] [<ffff000008081608>] gic_handle_irq+0x94/0x190
      [ 6709.086501] Exception stack(0xffff81000689fdf0 to 0xffff81000689ff20)
      [ 6709.092929] fde0:                                   0000810ff2ec0000 ffff000008c10000
      [ 6709.100747] fe00: ffff000008c70ef4 0000000000000001 0000000000000000 ffff810ffbad9b18
      [ 6709.108565] fe20: ffff810ffbad9c70 ffff8100169d3800 ffff810006843ab0 ffff81000689fe80
      [ 6709.116382] fe40: 0000000000000bd0 0000ffffdf979cd0 183f5913da192500 0000ffff8a254ce4
      [ 6709.124200] fe60: 0000ffff8a254b78 0000aaab10339808 0000000000000000 0000ffff8a0c2a50
      [ 6709.132018] fe80: 0000ffffdf979b10 ffff000008d6d450 ffff000008c10000 ffff000008d6d000
      [ 6709.139836] fea0: 0000000000000054 ffff000008cd3dbc 0000000000000000 0000000000000000
      [ 6709.147653] fec0: 0000000000000000 0000000000000000 0000000000000000 ffff81000689ff20
      [ 6709.155471] fee0: ffff000008085240 ffff81000689ff20 ffff000008085244 0000000060000145
      [ 6709.163289] ff00: ffff81000689ff10 ffff00000813f1e4 ffffffffffffffff ffff00000813f238
      [ 6709.171107] [<ffff000008082eb4>] el1_irq+0xb4/0x140
      [ 6709.175976] [<ffff000008085244>] arch_cpu_idle+0x44/0x11c
      [ 6709.181368] [<ffff0000087bf3b8>] default_idle_call+0x20/0x30
      [ 6709.187020] [<ffff000008116d50>] do_idle+0x158/0x1e4
      [ 6709.191973] [<ffff000008116ff4>] cpu_startup_entry+0x2c/0x30
      [ 6709.197624] [<ffff00000808e7cc>] secondary_start_kernel+0x13c/0x160
      [ 6709.203878] [<0000000001bc71c4>] 0x1bc71c4
      [ 6709.207967] Code: bad PC value
      [ 6709.211061] SMP: stopping secondary CPUs
      [ 6709.218830] Starting crashdump kernel...
      [ 6709.222749] Bye!
      ---<-snip>---
      
      Signed-off-by: default avatarVadim Lomovtsev <vlomovts@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      eebe53e8
  4. Jul 21, 2017
    • NeilBrown's avatar
      net/sunrpc/xprt_sock: fix regression in connection error reporting. · 3ffbc1d6
      NeilBrown authored
      
      Commit 3d476263 ("tcp: remove poll() flakes when receiving
      RST") in v4.12 changed the order in which ->sk_state_change()
      and ->sk_error_report() are called when a socket is shut
      down - sk_state_change() is now called first.
      
      This causes xs_tcp_state_change() -> xs_sock_mark_closed() ->
      xprt_disconnect_done() to wake all pending tasked with -EAGAIN.
      When the ->sk_error_report() callback arrives, it is too late to
      pass the error on, and it is lost.
      
      As easy way to demonstrate the problem caused is to try to start
      rpc.nfsd while rcpbind isn't running.
      nfsd will attempt a tcp connection to rpcbind.  A ECONNREFUSED
      error is returned, but sunrpc code loses the error and keeps
      retrying.  If it saw the ECONNREFUSED, it would abort.
      
      To fix this, handle the sk->sk_err in the TCP_CLOSE branch of
      xs_tcp_state_change().
      
      Fixes: 3d476263 ("tcp: remove poll() flakes when receiving RST")
      Cc: stable@vger.kernel.org (v4.12)
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      3ffbc1d6
  5. Jul 20, 2017
  6. Jul 19, 2017
  7. Jul 18, 2017
    • Dan Carpenter's avatar
      netfilter: fix netfilter_net_init() return · 073dd5ad
      Dan Carpenter authored
      
      We accidentally return an uninitialized variable.
      
      Fixes: cf56c2f8 ("netfilter: remove old pre-netns era hook api")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      073dd5ad
    • Paolo Abeni's avatar
      udp: preserve skb->dst if required for IP options processing · 0ddf3fb2
      Paolo Abeni authored
      
      Eric noticed that in udp_recvmsg() we still need to access
      skb->dst while processing the IP options.
      Since commit 0a463c78 ("udp: avoid a cache miss on dequeue")
      skb->dst is no more available at recvmsg() time and bad things
      will happen if we enter the relevant code path.
      
      This commit address the issue, avoid clearing skb->dst if
      any IP options are present into the relevant skb.
      Since the IP CB is contained in the first skb cacheline, we can
      test it to decide to leverage the consume_stateless_skb()
      optimization, without measurable additional cost in the faster
      path.
      
      v1 -> v2: updated commit message tags
      
      Fixes: 0a463c78 ("udp: avoid a cache miss on dequeue")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0ddf3fb2
    • Alexander Potapenko's avatar
      ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check() · 18bcf290
      Alexander Potapenko authored
      
      KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(),
      which originated from the TCP request socket created in
      cookie_v6_check():
      
       ==================================================================
       BUG: KMSAN: use of uninitialized memory in tcp_transmit_skb+0xf77/0x3ec0
       CPU: 1 PID: 2949 Comm: syz-execprog Not tainted 4.11.0-rc5+ #2931
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       TCP: request_sock_TCPv6: Possible SYN flooding on port 20028. Sending cookies.  Check SNMP counters.
       Call Trace:
        <IRQ>
        __dump_stack lib/dump_stack.c:16
        dump_stack+0x172/0x1c0 lib/dump_stack.c:52
        kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927
        __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469
        skb_set_hash_from_sk ./include/net/sock.h:2011
        tcp_transmit_skb+0xf77/0x3ec0 net/ipv4/tcp_output.c:983
        tcp_send_ack+0x75b/0x830 net/ipv4/tcp_output.c:3493
        tcp_delack_timer_handler+0x9a6/0xb90 net/ipv4/tcp_timer.c:284
        tcp_delack_timer+0x1b0/0x310 net/ipv4/tcp_timer.c:309
        call_timer_fn+0x240/0x520 kernel/time/timer.c:1268
        expire_timers kernel/time/timer.c:1307
        __run_timers+0xc13/0xf10 kernel/time/timer.c:1601
        run_timer_softirq+0x36/0xa0 kernel/time/timer.c:1614
        __do_softirq+0x485/0x942 kernel/softirq.c:284
        invoke_softirq kernel/softirq.c:364
        irq_exit+0x1fa/0x230 kernel/softirq.c:405
        exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:657
        smp_apic_timer_interrupt+0x5a/0x80 arch/x86/kernel/apic/apic.c:966
        apic_timer_interrupt+0x86/0x90 arch/x86/entry/entry_64.S:489
       RIP: 0010:native_restore_fl ./arch/x86/include/asm/irqflags.h:36
       RIP: 0010:arch_local_irq_restore ./arch/x86/include/asm/irqflags.h:77
       RIP: 0010:__msan_poison_alloca+0xed/0x120 mm/kmsan/kmsan_instr.c:440
       RSP: 0018:ffff880024917cd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
       RAX: 0000000000000246 RBX: ffff8800224c0000 RCX: 0000000000000005
       RDX: 0000000000000004 RSI: ffff880000000000 RDI: ffffea0000b6d770
       RBP: ffff880024917d58 R08: 0000000000000dd8 R09: 0000000000000004
       R10: 0000160000000000 R11: 0000000000000000 R12: ffffffff85abf810
       R13: ffff880024917dd8 R14: 0000000000000010 R15: ffffffff81cabde4
        </IRQ>
        poll_select_copy_remaining+0xac/0x6b0 fs/select.c:293
        SYSC_select+0x4b4/0x4e0 fs/select.c:653
        SyS_select+0x76/0xa0 fs/select.c:634
        entry_SYSCALL_64_fastpath+0x13/0x94 arch/x86/entry/entry_64.S:204
       RIP: 0033:0x4597e7
       RSP: 002b:000000c420037ee0 EFLAGS: 00000246 ORIG_RAX: 0000000000000017
       RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004597e7
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
       RBP: 000000c420037ef0 R08: 000000c420037ee0 R09: 0000000000000059
       R10: 0000000000000000 R11: 0000000000000246 R12: 000000000042dc20
       R13: 00000000000000f3 R14: 0000000000000030 R15: 0000000000000003
       chained origin:
        save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
        kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
        kmsan_save_stack mm/kmsan/kmsan.c:317
        kmsan_internal_chain_origin+0x12a/0x1f0 mm/kmsan/kmsan.c:547
        __msan_store_shadow_origin_4+0xac/0x110 mm/kmsan/kmsan_instr.c:259
        tcp_create_openreq_child+0x709/0x1ae0 net/ipv4/tcp_minisocks.c:472
        tcp_v6_syn_recv_sock+0x7eb/0x2a30 net/ipv6/tcp_ipv6.c:1103
        tcp_get_cookie_sock+0x136/0x5f0 net/ipv4/syncookies.c:212
        cookie_v6_check+0x17a9/0x1b50 net/ipv6/syncookies.c:245
        tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989
        tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298
        tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487
        ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
        NF_HOOK ./include/linux/netfilter.h:257
        ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
        dst_input ./include/net/dst.h:492
        ip6_rcv_finish net/ipv6/ip6_input.c:69
        NF_HOOK ./include/linux/netfilter.h:257
        ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
        __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
        __netif_receive_skb net/core/dev.c:4246
        process_backlog+0x667/0xba0 net/core/dev.c:4866
        napi_poll net/core/dev.c:5268
        net_rx_action+0xc95/0x1590 net/core/dev.c:5333
        __do_softirq+0x485/0x942 kernel/softirq.c:284
       origin:
        save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
        kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
        kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198
        kmsan_kmalloc+0x7f/0xe0 mm/kmsan/kmsan.c:337
        kmem_cache_alloc+0x1c2/0x1e0 mm/slub.c:2766
        reqsk_alloc ./include/net/request_sock.h:87
        inet_reqsk_alloc+0xa4/0x5b0 net/ipv4/tcp_input.c:6200
        cookie_v6_check+0x4f4/0x1b50 net/ipv6/syncookies.c:169
        tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989
        tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298
        tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487
        ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
        NF_HOOK ./include/linux/netfilter.h:257
        ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
        dst_input ./include/net/dst.h:492
        ip6_rcv_finish net/ipv6/ip6_input.c:69
        NF_HOOK ./include/linux/netfilter.h:257
        ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
        __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
        __netif_receive_skb net/core/dev.c:4246
        process_backlog+0x667/0xba0 net/core/dev.c:4866
        napi_poll net/core/dev.c:5268
        net_rx_action+0xc95/0x1590 net/core/dev.c:5333
        __do_softirq+0x485/0x942 kernel/softirq.c:284
       ==================================================================
      
      Similar error is reported for cookie_v4_check().
      
      Fixes: 58d607d3 ("tcp: provide skb->hash to synack packets")
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      18bcf290
  8. Jul 17, 2017
  9. Jul 15, 2017
  10. Jul 14, 2017
Loading