1. 19 Jun, 2019 1 commit
  2. 04 Jun, 2019 1 commit
  3. 21 May, 2019 1 commit
  4. 23 Apr, 2019 1 commit
    • net: strparser: make it explicitly non-modular · 15253b4a
      Paul Gortmaker authored
      The Kconfig currently controlling compilation of this code is:
      
      net/strparser/Kconfig:config STREAM_PARSER
      net/strparser/Kconfig:  def_bool n
      
      ...meaning that it currently is not being built as a module by anyone.
      
      Let's remove the modular code that is essentially orphaned, so that
      when reading the driver there is no doubt it is builtin-only.
      
      Since module_init translates to device_initcall in the non-modular
      case, the init ordering remains unchanged with this commit.  For
      clarity, we change the fcn name mod_init to dev_init at the same time.
      
      We replace module.h with init.h and export.h; the latter is needed
      because this file exports some symbols.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 10 Apr, 2019 2 commits
    • net: strparser: fix comment · 93e21254
      Jakub Kicinski authored
      Fix comment.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: strparser: partially revert "strparser: Call skb_unclone conditionally" · 4a9c2e37
      Jakub Kicinski authored
      This reverts the first part of commit 4e485d06 ("strparser: Call
      skb_unclone conditionally").  To build a message with multiple
      fragments we need our own root of frag_list.  We can't simply
      use the frag_list of orig_skb, because it will lead to linking
      all orig_skbs together creating very long frag chains, and causing
      stack overflow on kfree_skb() (which is called recursively on
      the frag_lists).
      
      BUG: stack guard page was hit at 00000000d40fad41 (stack is 0000000029dde9f4..000000008cce03d5)
      kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP
      RIP: 0010:free_one_page+0x2b/0x490
      
      Call Trace:
        __free_pages_ok+0x143/0x2c0
        skb_release_data+0x8e/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
      
        [...]
      
        skb_release_data+0xad/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
        skb_release_data+0xad/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
        skb_release_data+0xad/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
        skb_release_data+0xad/0x140
        ? skb_release_data+0xad/0x140
        kfree_skb+0x32/0xb0
        skb_release_data+0xad/0x140
        __kfree_skb+0xe/0x20
        tcp_disconnect+0xd6/0x4d0
        tcp_close+0xf4/0x430
        ? tcp_check_oom+0xf0/0xf0
        tls_sk_proto_close+0xe4/0x1e0 [tls]
        inet_release+0x36/0x60
        __sock_release+0x37/0xa0
        sock_close+0x11/0x20
        __fput+0xa2/0x1d0
        task_work_run+0x89/0xb0
        exit_to_usermode_loop+0x9a/0xa0
        do_syscall_64+0xc0/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
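
The failure mode described above can be illustrated with a small userspace model. This is only a sketch: `struct toy_buf`, `toy_free`, `depth_when_nested` and `depth_with_own_root` are hypothetical names, not kernel code, and the model keeps just the two links that matter (`next` and `frag_list`). It compares the recursion depth of freeing buffers that are nested through each other's `frag_list` with the depth when a dedicated root owns the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an sk_buff: only the two links that matter here. */
struct toy_buf {
    struct toy_buf *next;      /* sibling in a fragment list */
    struct toy_buf *frag_list; /* first child fragment */
};

static struct toy_buf *toy_alloc(void)
{
    struct toy_buf *b = calloc(1, sizeof(*b));
    assert(b != NULL);
    return b;
}

/* Frees b and everything below it; returns the deepest recursion level reached. */
static int toy_free(struct toy_buf *b, int depth)
{
    int deepest = depth;
    struct toy_buf *frag = b->frag_list;

    while (frag) {
        struct toy_buf *next = frag->next;
        int d = toy_free(frag, depth + 1); /* recursion happens per frag_list level */

        if (d > deepest)
            deepest = d;
        frag = next;
    }
    free(b);
    return deepest;
}

/* Each buffer hangs off the previous buffer's frag_list: depth grows with n. */
static int depth_when_nested(int n)
{
    assert(n > 0);
    struct toy_buf *head = toy_alloc();
    struct toy_buf *cur = head;

    for (int i = 1; i < n; i++) {
        cur->frag_list = toy_alloc();
        cur = cur->frag_list;
    }
    return toy_free(head, 1);
}

/* A dedicated root owns the frag_list; the n buffers are siblings via next. */
static int depth_with_own_root(int n)
{
    assert(n > 0);
    struct toy_buf *root = toy_alloc();
    struct toy_buf **tail = &root->frag_list;

    for (int i = 0; i < n; i++) {
        *tail = toy_alloc();
        tail = &(*tail)->next;
    }
    return toy_free(root, 1);
}
```

In this model, `depth_when_nested(1000)` recurses 1000 levels deep, while `depth_with_own_root(1000)` never goes deeper than 2, which is the property the partial revert restores.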
      
      Let's leave the second unclone conditional, as I'm not entirely
      sure what its purpose is :)
      
      Fixes: 4e485d06 ("strparser: Call skb_unclone conditionally")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 15 Mar, 2019 1 commit
  7. 15 Oct, 2018 1 commit
    • bpf, sockmap: convert to generic sk_msg interface · 604326b4
      Daniel Borkmann authored
      Add a generic sk_msg layer, and convert current sockmap and later
      kTLS over to make use of it. While sk_buff handles network packet
      representation from netdevice up to socket, sk_msg handles data
      representation from application to socket layer.
      
      This means that sk_msg framework spans across ULP users in the
      kernel, and enables features such as introspection or filtering
      of data with the help of BPF programs that operate on this data
      structure.
      
      The latter becomes particularly useful for kTLS, where data encryption
      is deferred into the kernel, enabling the kernel to perform L7
      introspection and policy based on BPF for TLS connections, where the
      record is encrypted after BPF has run and come to a verdict. To get
      there, the first step is to transform the open coding of
      scatter-gather list handling into a common core framework that
      subsystems can use.
      
      The code itself has been split and refactored into three bigger
      pieces: i) the generic sk_msg API which deals with managing the
      scatter gather ring, providing helpers for walking and mangling,
      transferring application data from user space into it, and preparing
      it for BPF pre/post-processing, ii) the plain sock map itself
      where sockets can be attached to or detached from; these bits
      are independent of i) which can now be used also without sock
      map, and iii) the integration with plain TCP as one protocol
      to be used for processing L7 application data (later this could
      e.g. also be extended to other protocols like UDP). The semantics
      are the same as in the old sock map code, so there is no change
      in user-facing behavior or APIs. This work also helped find a
      number of bugs in the old sockmap code, which we have already
      fixed in earlier commits. The test_sockmap kselftest suite passes
      as well.
      
      Joint work with John.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  8. 01 Aug, 2018 1 commit
  9. 30 Jun, 2018 1 commit
  10. 28 Jun, 2018 1 commit
    • strparser: Remove early eaten to fix full tcp receive buffer stall · 977c7114
      Doron Roberts-Kedes authored
      On receiving an incomplete message, the existing code stores the
      remaining length of the cloned skb in the early_eaten field instead of
      incrementing the value returned by __strp_recv. This defers invocation
      of sock_rfree for the current skb until the next invocation of
      __strp_recv, which returns early_eaten if early_eaten is non-zero.
      
      This behavior causes a stall when the current message occupies the very
      tail end of a massive skb, and strp_peek/need_bytes indicates that the
      remainder of the current message has yet to arrive on the socket. The
      TCP receive buffer is totally full, causing the TCP window to go to
      zero, so the remainder of the message will never arrive.
      
      Incrementing the value returned by __strp_recv by the amount otherwise
      stored in early_eaten prevents stalls of this nature.
      Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 21 Jun, 2018 1 commit
  12. 06 Jun, 2018 1 commit
    • strparser: Add __strp_unpause and use it in ktls. · 7170e604
      Doron Roberts-Kedes authored
      strp_unpause queues strp_work in order to parse any messages that
      arrived while the strparser was paused. However, the process invoking
      strp_unpause could eagerly parse a buffered message itself if it held
      the sock lock.
      
      __strp_unpause is an alternative to strp_unpause that avoids the scheduling
      overhead that results when a receiving thread unpauses the strparser
      and waits for the next message to be delivered by the workqueue thread.
      
      This patch more than doubled the IOPS achieved in a benchmark of NBD
      traffic encrypted using ktls.
      Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 23 Apr, 2018 1 commit
  14. 13 Apr, 2018 1 commit
    • strparser: Fix incorrect strp->need_bytes value. · 9d0c75bf
      Doron Roberts-Kedes authored
      strp_data_ready resets strp->need_bytes to 0 if strp_peek_len indicates
      that the remainder of the message has been received. However,
      do_strp_work does not reset strp->need_bytes to 0. If do_strp_work
      completes a partial message, the value of strp->need_bytes will continue
      to reflect the needed bytes of the previous message, causing
      future invocations of strp_data_ready to return early if
      strp_peek_len is less than strp->need_bytes. Resetting strp->need_bytes
      to 0 in __strp_recv on handing a full message to the upper layer solves
      this problem.
      
      __strp_recv also calculates strp->need_bytes using stm->accum_len before
      stm->accum_len has been incremented by cand_len. This can cause
      strp->need_bytes to be equal to the full length of the message instead
      of the full length minus the accumulated length. This, in turn, causes
      strp_data_ready to return early, even when there is sufficient data to
      complete the partial message. Incrementing stm->accum_len before using
      it to calculate strp->need_bytes solves this problem.
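
The ordering problem in the second paragraph can be checked with simple arithmetic. The following is a minimal sketch, not the kernel code: `struct toy_state`, `consume_stale`, `consume_fixed` and `need_after` are hypothetical names, and only the field names (`full_len`, `accum_len`, `need_bytes`, `cand_len`) follow the text above.

```c
#include <assert.h>

/* Hypothetical, trimmed-down parser state; field names follow the text. */
struct toy_state {
    unsigned int full_len;   /* total length of the current message */
    unsigned int accum_len;  /* bytes of that message gathered so far */
    unsigned int need_bytes; /* bytes still missing */
};

/* Ordering described as buggy: need_bytes is computed from the stale accum_len. */
static void consume_stale(struct toy_state *s, unsigned int cand_len)
{
    assert(s->accum_len + cand_len <= s->full_len);
    s->need_bytes = s->full_len - s->accum_len;
    s->accum_len += cand_len;
}

/* Ordering described as the fix: account for cand_len first, then subtract. */
static void consume_fixed(struct toy_state *s, unsigned int cand_len)
{
    assert(s->accum_len + cand_len <= s->full_len);
    s->accum_len += cand_len;
    s->need_bytes = s->full_len - s->accum_len;
}

/* Runs one step on a fresh message and reports the resulting need_bytes. */
static unsigned int need_after(void (*step)(struct toy_state *, unsigned int),
                               unsigned int full_len, unsigned int cand_len)
{
    struct toy_state s = { .full_len = full_len, .accum_len = 0, .need_bytes = 0 };

    step(&s, cand_len);
    return s.need_bytes;
}
```

For a 100-byte message of which 40 bytes have just arrived, `need_after(consume_stale, 100, 40)` yields 100 (the full length), whereas `need_after(consume_fixed, 100, 40)` yields the 60 bytes that are actually outstanding.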
      
      Found while testing net/tls_sw recv path.
      
      Fixes: 43a0c675 ("strparser: Stream parser for messages")
      Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 27 Mar, 2018 1 commit
    • strparser: Fix sign of err codes · cd00edc1
      Dave Watson authored
      strp_parser_err is called with a negative code everywhere, which then
      calls abort_parser with a negative code.  strp_msg_timeout calls
      abort_parser directly with a positive code.  Negate ETIMEDOUT
      to match signed-ness of other calls.
      
      The default abort_parser callback, strp_abort_strp, sets
      sk->sk_err to err.  Also negate the error here so sk_err always
      holds a positive value, as the rest of the net code expects.  Currently
      a negative sk_err can result in endless loops, or user code that
      thinks it actually sent/received err bytes.
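
The sign convention can be sketched in a few lines of userspace C. This is an illustrative model under stated assumptions: `struct toy_sock`, `toy_abort_parser` and `toy_timeout` are hypothetical stand-ins, not the kernel functions; only the convention (callers pass a negative errno, `sk_err` stores a positive one) comes from the text above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the socket: only the error field is modelled. */
struct toy_sock {
    int sk_err; /* expected to hold a positive errno value */
};

/* Callers pass a negative errno; negate it so sk_err ends up positive. */
static void toy_abort_parser(struct toy_sock *sk, int err)
{
    assert(err < 0);
    sk->sk_err = -err;
}

/* Timeout path: passes -ETIMEDOUT, matching the sign used by other callers. */
static int toy_timeout(struct toy_sock *sk)
{
    toy_abort_parser(sk, -ETIMEDOUT);
    return sk->sk_err;
}
```

With this convention every caller uses the same sign, and `sk_err` is always a positive errno such as `ETIMEDOUT`.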
      
      Found while testing net/tls_sw recv path.
      
      Fixes: 43a0c675 ("strparser: Stream parser for messages")
      Signed-off-by: Dave Watson <davejwatson@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 28 Dec, 2017 1 commit
  17. 25 Oct, 2017 1 commit
  18. 25 Aug, 2017 1 commit
    • strparser: initialize all callbacks · 3fd87127
      Eric Biggers authored
      commit bbb03029 ("strparser: Generalize strparser") added more
      function pointers to 'struct strp_callbacks'; however, kcm_attach() was
      not updated to initialize them.  This could cause the ->lock() and/or
      ->unlock() function pointers to be set to garbage values, causing a
      crash in strp_work().
      
      Fix the bug by moving the callback structs into static memory, so
      unspecified members are zeroed.  Also constify them while we're at it.
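
Why static storage helps can be shown with a small standalone example. This is a sketch with hypothetical names (`struct toy_callbacks`, `toy_cb`, `toy_parse`, `toy_work`), not the kernel's `struct strp_callbacks`. It relies on the C rule that objects with static storage duration have every member not named in the initializer set to zero, so unset function pointers are NULL rather than garbage. The NULL checks in `toy_work` are an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table with optional lock/unlock hooks. */
struct toy_callbacks {
    int (*parse_msg)(const void *data, size_t len);
    void (*lock)(void);
    void (*unlock)(void);
};

/* Trivial parser used for illustration: treats the whole input as one message. */
static int toy_parse(const void *data, size_t len)
{
    (void)data;
    return (int)len;
}

/* Static storage: members not named here are guaranteed to be NULL. */
static const struct toy_callbacks toy_cb = {
    .parse_msg = toy_parse,
};

/* Calls the optional hooks only when they are set (sketch-level assumption). */
static int toy_work(const struct toy_callbacks *cb, const void *data, size_t len)
{
    assert(cb->parse_msg != NULL);
    if (cb->lock)
        cb->lock();
    int ret = cb->parse_msg(data, len);
    if (cb->unlock)
        cb->unlock();
    return ret;
}
```

Making the table `const` as well means it cannot be modified after initialization, which matches the "constify" remark above.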
      
      This bug was found by syzkaller, which encountered the following splat:
      
          IP: 0x55
          PGD 3b1ca067
          P4D 3b1ca067
          PUD 3b12f067
          PMD 0
      
          Oops: 0010 [#1] SMP KASAN
          Dumping ftrace buffer:
             (ftrace buffer empty)
          Modules linked in:
          CPU: 2 PID: 1194 Comm: kworker/u8:1 Not tainted 4.13.0-rc4-next-20170811 #2
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
          Workqueue: kstrp strp_work
          task: ffff88006bb0e480 task.stack: ffff88006bb10000
          RIP: 0010:0x55
          RSP: 0018:ffff88006bb17540 EFLAGS: 00010246
          RAX: dffffc0000000000 RBX: ffff88006ce4bd60 RCX: 0000000000000000
          RDX: 1ffff1000d9c97bd RSI: 0000000000000000 RDI: ffff88006ce4bc48
          RBP: ffff88006bb17558 R08: ffffffff81467ab2 R09: 0000000000000000
          R10: ffff88006bb17438 R11: ffff88006bb17940 R12: ffff88006ce4bc48
          R13: ffff88003c683018 R14: ffff88006bb17980 R15: ffff88003c683000
          FS:  0000000000000000(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000000055 CR3: 000000003c145000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           process_one_work+0xbf3/0x1bc0 kernel/workqueue.c:2098
           worker_thread+0x223/0x1860 kernel/workqueue.c:2233
           kthread+0x35e/0x430 kernel/kthread.c:231
           ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
          Code:  Bad RIP value.
          RIP: 0x55 RSP: ffff88006bb17540
          CR2: 0000000000000055
          ---[ end trace f0e4920047069cee ]---
      
      Here is a C reproducer (requires CONFIG_BPF_SYSCALL=y and
      CONFIG_AF_KCM=y):
      
          #include <linux/bpf.h>
          #include <linux/kcm.h>
          #include <linux/types.h>
          #include <stdint.h>
          #include <sys/ioctl.h>
          #include <sys/socket.h>
          #include <sys/syscall.h>
          #include <unistd.h>
      
          static const struct bpf_insn bpf_insns[3] = {
              { .code = 0xb7 }, /* BPF_MOV64_IMM(0, 0) */
              { .code = 0x95 }, /* BPF_EXIT_INSN() */
          };
      
          static const union bpf_attr bpf_attr = {
              .prog_type = 1,
              .insn_cnt = 2,
              .insns = (uintptr_t)&bpf_insns,
              .license = (uintptr_t)"",
          };
      
          int main(void)
          {
              int bpf_fd = syscall(__NR_bpf, BPF_PROG_LOAD,
                                   &bpf_attr, sizeof(bpf_attr));
              int inet_fd = socket(AF_INET, SOCK_STREAM, 0);
              int kcm_fd = socket(AF_KCM, SOCK_DGRAM, 0);
      
              ioctl(kcm_fd, SIOCKCMATTACH,
                    &(struct kcm_attach) { .fd = inet_fd, .bpf_fd = bpf_fd });
          }
      
      Fixes: bbb03029 ("strparser: Generalize strparser")
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Tom Herbert <tom@quantonium.net>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 16 Aug, 2017 1 commit
  20. 01 Aug, 2017 1 commit
  21. 04 Mar, 2017 1 commit
  22. 12 Oct, 2016 1 commit
  23. 29 Aug, 2016 1 commit
  24. 23 Aug, 2016 2 commits
  25. 17 Aug, 2016 1 commit
    • strparser: Stream parser for messages · 43a0c675
      Tom Herbert authored
      This patch introduces a utility for parsing application layer protocol
      messages in a TCP stream. This is a generalization of the mechanism
      implemented for the Kernel Connection Multiplexor (KCM).
      
      The API includes a context structure, a set of callbacks, utility
      functions, and a data ready function.
      
      A stream parser instance is defined by a strparser structure that
      is bound to a TCP socket. The function to initialize the structure
      is:
      
      int strp_init(struct strparser *strp, struct sock *csk,
                    struct strp_callbacks *cb);
      
      csk is the TCP socket being bound to and cb are the parser callbacks.
      
      The upper layer calls strp_tcp_data_ready when data is ready on the lower
      socket for strparser to process. This should be called from a data_ready
      callback that is set on the socket:
      
      void strp_tcp_data_ready(struct strparser *strp);
      
      A parser is bound to a TCP socket by setting data_ready function to
      strp_tcp_data_ready so that all receive indications on the socket
      go through the parser. This assumes that sk_user_data is set to
      the strparser structure.
      
      There are four callbacks.
       - parse_msg is called to parse the message (returns length or error).
       - rcv_msg is called when a complete message has been received
       - read_sock_done is called when data_ready function exits
       - abort_parser is called to abort the parser
      
      The input to parse_msg is an skbuff which contains next message under
      construction. The backend processing of parse_msg will parse the
      application layer protocol headers to determine the length of
      the message in the stream. The possible return values are:
      
         >0 : indicates length of successfully parsed message
         0  : indicates more data must be received to parse the message
         -ESTRPIPE : current message should not be processed by the
            kernel, return control of the socket to userspace which
            can proceed to read the messages itself
         other < 0 : Error in parsing, give control back to userspace
            assuming that synchronization is lost and the stream
            is unrecoverable (application expected to close TCP socket)
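
The return-value contract listed above can be sketched for a made-up framing format. Everything protocol-specific here is an assumption for illustration: a 2-byte big-endian length header that counts itself, a hypothetical escape value `0xFFFF` meaning "let userspace handle this", and a hypothetical size limit. `toy_parse_msg` works on a plain byte buffer in userspace rather than on an skbuff, so it is not the kernel callback signature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ESTRPIPE
#define ESTRPIPE 86 /* Linux value; fallback for platforms that lack it */
#endif

#define TOY_HDR_LEN 2u      /* assumed header: 2-byte big-endian total length */
#define TOY_ESCAPE  0xFFFFu /* assumed marker: hand the stream to userspace */
#define TOY_MAX_MSG 4096u   /* assumed upper bound on a message */

/*
 * Returns >0 (message length), 0 (need more data), -ESTRPIPE (userspace
 * should take over) or another negative errno (stream unrecoverable).
 */
static int toy_parse_msg(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len < TOY_HDR_LEN)
        return 0; /* header incomplete: wait for more bytes */

    unsigned int msg_len = ((unsigned int)buf[0] << 8) | buf[1];

    if (msg_len == TOY_ESCAPE)
        return -ESTRPIPE; /* recoverable: userspace reads the stream itself */
    if (msg_len < TOY_HDR_LEN || msg_len > TOY_MAX_MSG)
        return -EPROTO; /* nonsensical length: synchronization is lost */

    return (int)msg_len;
}
```

Note that the function only reports the length of the message; it does not need the whole message to be present, since gathering the remaining bytes is the parser's job.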
      
      In the case of an error return (< 0), strparser will stop the parser
      and report an error to userspace. The application must deal
      with the error. To handle the error, the strparser is unbound
      from the TCP socket. If the error indicates that the TCP
      stream is at a recoverable point (ESTRPIPE), then the application
      can read the TCP socket to process the stream. Once the application
      has dealt with the exceptions in the stream, it may again bind the
      socket to a strparser to continue data operations.
      
      Note that ENODATA may be returned to the application. In this case
      parse_msg returned -ESTRPIPE, however strparser was unable to maintain
      synchronization of the stream (i.e. some of the message in question
      was already read by the parser).
      
      strp_pause and strp_unpause are used to provide flow control. For
      instance, if rcv_msg is called but the upper layer can't immediately
      consume the message it can hold the message and pause strparser.
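
The flow-control idea can be modelled in userspace as follows. This is a loose sketch, not the strparser implementation: `struct toy_parser`, `toy_pause`, `toy_unpause`, `toy_rcv_msg` and `toy_deliver` are hypothetical names, and the `budget` field (how many messages the upper layer can hold before pausing) is an assumption made only to give the upper layer a reason to pause.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical parser state for illustrating pause/unpause. */
struct toy_parser {
    bool paused;
    int delivered; /* messages handed to the upper layer so far */
    int budget;    /* assumed: messages the upper layer can still hold */
};

static void toy_pause(struct toy_parser *p)
{
    p->paused = true;
}

/* Upper layer has drained its backlog and can accept new_budget more messages. */
static void toy_unpause(struct toy_parser *p, int new_budget)
{
    assert(new_budget > 0);
    p->budget = new_budget;
    p->paused = false;
}

/* Upper-layer callback: accepts the message and pauses once it is full. */
static void toy_rcv_msg(struct toy_parser *p)
{
    p->delivered++;
    if (--p->budget == 0)
        toy_pause(p);
}

/* Delivers complete messages until none are pending or the parser is paused. */
static int toy_deliver(struct toy_parser *p, int pending)
{
    assert(pending >= 0);
    while (pending > 0 && !p->paused) {
        toy_rcv_msg(p);
        pending--;
    }
    return pending; /* messages left waiting in the stream */
}
```

With a budget of 2 and 5 pending messages, the first `toy_deliver` call hands over 2 messages and leaves 3 waiting; after `toy_unpause`, a second call delivers the rest.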
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>