      Merge branch 'master' of git://1984.lsi.us.es/nf-next · 82f437b9
      Pablo says:
      This is the second batch of Netfilter updates for net-next. It contains the
      kernel changes for the new user-space connection tracking helper infrastructure.
      More details on this infrastructure are provided here:
      Still, I plan to provide some official documentation through the
      conntrack-tools user manual on how to set up user-space utilities for this.
      So far, it provides two helpers in user-space, one for NFSv3 and another for
      Oracle/SQLnet/TNS. Yet in my TODO list.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      include/net/dst.h: neaten asterisk placement · 7f95e188
      Fix code style - place the asterisk where it belongs.
      Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      netfilter: add user-space connection tracking helper infrastructure · 12f7a505
      There are good reasons to support helpers in user-space instead:
      * Rapid connection tracking helper development, as developing code
        in user-space is usually faster.
      * Reliability: A buggy helper does not crash the kernel. Moreover,
        we can monitor the helper process and restart it in case of problems.
      * Security: Avoid complex string matching and mangling in kernel-space
        running in privileged mode. Going further, we can even think about
        running user-space helpers as a non-root process.
      * Extensibility: It allows the development of very specific helpers (most
        likely non-standard proprietary protocols) that are very likely not to be
        accepted for mainline inclusion in the form of kernel-space connection
        tracking helpers.
      This patch adds the infrastructure to allow the implementation of
      user-space conntrack helpers by means of the new nfnetlink subsystem
      `nfnetlink_cthelper' and the existing queueing infrastructure
      I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
      ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
      two pieces. This change is required not to break NAT sequence
      adjustment and conntrack confirmation for traffic that is enqueued
      to our user-space conntrack helpers.
      Basic operation, in a few steps:
      1) Register user-space helper by means of `nfct':
       nfct helper add ftp inet tcp
       [ It must be a valid existing helper supported by conntrack-tools ]
      2) Add rules to enable the FTP user-space helper which is
         used to track traffic going to TCP port 21.
      For locally generated packets:
       iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp
      For non-locally generated packets:
       iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp
      3) Run the test conntrackd in helper mode (see example files under
      4) Generate FTP traffic. If everything is OK, conntrackd
         should create expectations (you can check that with `conntrack'):
       conntrack -E expect
          [NEW] 301 proto=6 src= dst= sport=0 dport=54037 mask-src= mask-dst= sport=0 dport=65535 master-src= master-dst= sport=57127 dport=21 class=0 helper=ftp
      [DESTROY] 301 proto=6 src= dst= sport=0 dport=54037 mask-src= mask-dst= sport=0 dport=65535 master-src= master-dst= sport=57127 dport=21 class=0 helper=ftp
      This confirms that our test helper is receiving packets including the
      conntrack information, and adding expectations in kernel-space.
      The user-space helper can also store its private tracking information
      in the conntrack structure in the kernel via the CTA_HELP_INFO. The
      kernel will consider this a binary blob whose layout is unknown. This
      information will be included in the information that is transferred
      to user-space via glue code that integrates nfnetlink_queue and
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: ctnetlink: add CTA_HELP_INFO attribute · ae243bee
      This attribute can be used to modify and to dump the internal
      protocol information.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled · 8c88f87c
      User-space programs that receive traffic via NFQUEUE may mangle packets.
      If NAT is enabled, this usually puzzles sequence tracking, leading to
      traffic disruptions.
      With this patch, nfnl_queue will make the corresponding NAT TCP sequence
      adjustment if:
      1) The packet has been mangled,
      2) the NFQA_CFG_F_CONNTRACK flag has been set, and
      3) NAT is detected.
      There are some records on the Internet complaining about this issue:
      For now, we only support TCP since we have no helpers for DCCP or SCTP.
      Better to add this if we ever have some helper over those layer 4 protocols.
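      The three conditions listed above can be sketched as a single predicate.
      This is only an illustration and not the kernel code: the structure, its
      field names and the flag value are hypothetical stand-ins chosen for the
      example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the queue flag named in the text; the real
 * definition lives in the kernel headers and is not reproduced here. */
#define EXAMPLE_CFG_F_CONNTRACK (1u << 1)

/* Hypothetical summary of the state a verdict handler would inspect. */
struct example_verdict_ctx {
    uint32_t queue_flags;  /* flags configured on the queue */
    bool     pkt_mangled;  /* user space changed the packet payload */
    bool     nat_active;   /* the connection is subject to NAT */
};

/* Returns true only when all three conditions from the text hold. */
static bool example_needs_seqadj(const struct example_verdict_ctx *ctx)
{
    return ctx->pkt_mangled &&
           (ctx->queue_flags & EXAMPLE_CFG_F_CONNTRACK) != 0 &&
           ctx->nat_active;
}
```

      If any one of the three inputs is false, the sketch leaves the packet
      alone, which mirrors the "1) ... 2) ... and 3)" wording above.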
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: add glue code to integrate nfnetlink_queue and ctnetlink · 9cb01766
      This patch allows you to include the conntrack information together
      with the packet that is sent to user-space via NFQUEUE.
      Previously, there was no integration between ctnetlink and
      nfnetlink_queue. If you wanted to access conntrack information
      from your libnetfilter_queue program, you were required to query
      ctnetlink from user-space to obtain it, thus delaying the packet
      processing even more.
      Including the conntrack information is optional; you can enable it
      via the NFQA_CFG_F_CONNTRACK flag with the new NFQA_CFG_FLAGS attribute.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: nf_ct_helper: implement variable length helper private data · 1afc5679
      This patch uses the new variable length conntrack extensions.
      Instead of using union nf_conntrack_help that contains all the
      helper private data information, we allocate variable length
      area to store the private helper data.
      This patch includes the modification of all existing helpers.
      It also includes a couple of include headers to avoid compilation
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: nf_ct_ext: support variable length extensions · 3cf4c7e3
      We can now define conntrack extensions of variable size. This
      patch is useful to get rid of these unions:
      union nf_conntrack_help
      union nf_conntrack_proto
      union nf_conntrack_nat_help
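      To illustrate why a variable-size area can replace such unions, here is
      a minimal user-space sketch. It is not the nf_ct_ext implementation:
      every type and function name below is hypothetical, and calloc() stands
      in for whatever allocator the kernel uses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical private data of two helpers with very different sizes. */
struct example_small_priv { int state; };
struct example_large_priv { unsigned char buf[256]; };

/* Union approach: every instance pays for the largest member. */
union example_help_union {
    struct example_small_priv small_priv;
    struct example_large_priv large_priv;
};

/* Variable-length approach: a small header followed by exactly as many
 * bytes as the helper asked for. */
struct example_help_ext {
    size_t data_len;
    unsigned char data[];  /* C99 flexible array member */
};

/* Allocates a zeroed extension with room for data_len private bytes. */
static struct example_help_ext *example_ext_alloc(size_t data_len)
{
    struct example_help_ext *ext = calloc(1, sizeof(*ext) + data_len);

    if (ext)
        ext->data_len = data_len;
    return ext;
}

/* Total number of bytes occupied by one extension instance. */
static size_t example_ext_size(const struct example_help_ext *ext)
{
    return sizeof(*ext) + ext->data_len;
}
```

      With the union, a helper that needs a single int still occupies the
      size of the largest member; with the flexible array member, each
      instance is sized for the helper that owns it.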
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names · 3a8fc53a
      This patch modifies the struct nf_conntrack_helper to allocate
      the room for the helper name. The maximum length is 16 bytes
      (this was already introduced in 2.6.24).
      For the maximum length for expectation policy names, I have
      also selected 16 bytes.
      This patch is required by the follow-up patch to support
      user-space connection tracking helpers.
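      As a rough sketch of what a fixed 16-byte name field implies for
      callers, consider the following. The structure and function are
      hypothetical, and the sketch assumes the 16 bytes include the
      terminating NUL, which the text above does not state.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_NAME_LEN 16  /* hypothetical constant for the 16-byte limit */

struct example_helper {
    char name[EXAMPLE_NAME_LEN];  /* storage is part of the structure */
};

/* Copies name into the helper. Returns 0 on success, or -1 when the
 * name plus its terminating NUL would not fit into the field. */
static int example_helper_set_name(struct example_helper *h, const char *name)
{
    size_t len = strlen(name);

    if (len >= sizeof(h->name))
        return -1;
    memcpy(h->name, name, len + 1);
    return 0;
}
```

      Because the storage lives inside the structure, a name supplied at run
      time can be copied in without a separate allocation.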
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · aee289ba
      Pull in 'net' again to get the revert of Thomas's change
      which introduced regressions.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" · e8803b6c
      This reverts commit 2a0c451a
      It causes crashes, because now ip6_null_entry is used before
      it is initialized.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ipv6: Fix types of ip6_update_pmtu(). · 42ae66c8
      The mtu should be a __be32, not the mark.
      Reported-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · 424d54d2
      Pull kvm fix from Marcelo Tosatti:
       "Fix a spurious warning on CPU offline path"
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        x86: kvmclock: remove check_and_clear_guest_paused warning
      Merge tag 'pinctrl-fixes-for-v3.5' of... · 09531359
      Merge tag 'pinctrl-fixes-for-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
      Pull pinctrl fixes from Linus Walleij:
       - section markup fixes
       - clk_prepare() fix to conform to the clk API
       - memory leaks
       - incorrect debug messages
       - bad errorpaths
       - typos
      * tag 'pinctrl-fixes-for-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: pinctrl-mxs: set platform driver data to NULL at errpath and at unregister
        pinctrl: pinctrl-mxs: Take care of frees if the kzalloc fails
        pinctrl: pinctrl-imx: fix incorrect debug message of maps
        pinctrl: pinctrl-imx: free if of_get_parent fails to get the parent node
        pinctrl: pinctrl-imx: free allocated pinctrl_map structure only once and use kernel facilities for IMX_PMX_DUMP
        pinctrl: nomadik: fix up typo
        pinctrl: nomadik: add clk_prepare() call
        pinctrl: fix a minor harmless typo
        pinctrl: sirf: mark of_device_id match table as __devinitconst
      Merge tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · b532ff20
      Pull sound fixes from Takashi Iwai:
       - Fix a regression of USB-audio PCM assignment since 3.4
       - A few VGA-switcheroo-related fixes for proper HDMI audio enablement
       - Fixed the missing initializations of HD-audio verbs, which may have
         resulted in various breakage
       - Some driver-specific ASoC updates
       - A few fixes for the dynamic PCM code
       - The addition of pinctrl support for the i.MX audmux which didn't make
         it into -rc1 due to cross tree dependency issues
       - A few minor fixes in compress API codes
      * tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Don't forget to call init verbs added by fixup list
        ALSA: HDA: Pin fixup for Zotac Z68 motherboard
        ALSA: compress_core: cleanup pointers on stop
        ALSA: compress_core: don't wake up on pause
        ALSA: hda - Fix detection of Creative SoundCore3D controllers
        vga_switcheroo: Enable/disable audio clients at the right time
        ALSA: hda - HDMI Audio init all connectors when VGA-switcheroo is off
        vga_switcheroo: Fix error without CONFIG_VGA_SWITCHEROO
        ALSA: hda - Fix uninitialized HDMI controllers with VGA-switcheroo
        vga_switcheroo: Add a helper function to get the client state
        ALSA: usb-audio: Fix substream assignments
        ASoC: tegra: add MODULE_DEVICE_TABLE to tegra30_ahub
        ASoC: wm2000: Always use a 4s timeout for the firmware
        ASoC: dapm: Fix input list to use source widgets
        ASoC: dpcm: Fix dpcm_get_be() to check that DAI is BE
        ASoC: wm8994: Apply volume updates with clocks enabled
        ASoC: wm8994: Ensure all AIFnCLK events are run from the _late variants
        ASoC: imx-audmux: add pinctrl support
        ASoC: dapm: Fix connected widget capture path query.
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · fea7c783
      Pull networking fixes from David S. Miller:
      This has the fix for the wireless issues I ran into the other week as
      well as:
       1) Fix CAN c_can driver transmit handling resulting in BUG check
          triggers, from AnilKumar Ch.
       2) Fix packet drop monitor sleeping in atomic context, from Eric
       3) Fix mv643xx_eth driver build regression, from Andrew Lunn.
       4) Inetpeer freeing needs an RCU grace period in order to avoid races
          during tree invalidation.  From Eric Dumazet.
       5) Fix endianness bugs in xt_HMARK netfilter module, from Hans
       6) Add proper module refcounting to l2tp_eth to avoid crash on module
          unload, from Eric Dumazet.
       7) Fix truncation of neighbour entry dumps due to logic errors in
          neigh_dump_info() and friends, from Eric Dumazet.
       8) The conversion of fib6_age() to dst_neigh_lookup() accidentally
          reversed the logic of a flags test, fix from Thomas Graf.
       9) Fix checksum configuration in newer sky2 chips, from Stephen
      10) Revert BQL support in NIU driver, doesn't work.
      11) l2tp_ip_sendmsg() illegally uses a route without a proper reference.
          From Eric Dumazet.
      12) be2net driver references an SKB after it's potentially been freed,
          also from Eric Dumazet.
      13) Fix RCU stalls in dummy net driver init.  Also from Eric Dumazet.
      14) lpc_eth has several bugs in its transmit engine leading to packet
          leaks and improper queue wakes, from Eric Dumazet.
      15) Apply short DMA workaround to more tg3 chips, from Matt Carlson.
      16) Add tilegx network driver.
      17) Bonding queue mapping for a packet can get corrupted, fix from Eric
      18) Fix bug in netpoll_send_udp() SKB management that can leave garbage
          in the payload in certain situations.  From Eric Dumazet.
      19) bnx2x driver interprets chip RX checksum offload incorrectly in
          encapsulation situations.  Fix from Eric Dumazet.
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
        bnx2x: fix checksum validation
        netpoll: fix netpoll_send_udp() bugs
        bonding: Fix corrupted queue_mapping
        bonding:record primary when modify it via sysfs
        tilegx network driver: initial support
        tg3: Apply short DMA frag workaround to 5906
        net: stmmac: Fix clock en-/disable calls
        lpc_eth: fix tx completion
        lpc_eth: add missing ndo_change_mtu()
        dummy: fix rcu_sched self-detected stalls
        net: Reorder initialization in ip_route_output to fix gcc warning
        virtio-net: fix a race on 32bit arches
        r8169: avoid NAPI scheduling delay.
        net: Make linux/tcp.h C++ friendly (trivial)
        netdev: fix drivers/net/phy/ kernel-doc warnings
        net/core: fix kernel-doc warnings
        be2net: fix a race in be_xmit()
        l2tp: fix a race in l2tp_ip_sendmsg()
        mac80211: add back channel change flag
        NFC: Fix possible NULL ptr deref when getting the name of a socket
      ixgbe: Check PTP Rx timestamps via BPF filter · 1d1a79b5
      This patch fixes a potential Rx timestamp deadlock that causes the Rx
      timestamping to stall indefinitely. The issue could occur when a PTP packet is
      timestamped by hardware but never reaches the Rx queue. In order to prevent a
      permanent loss of timestamping, the RXSTMP(L/H) registers have to be read to
      unlock them. (This used to only occur when a packet that was timestamped
      reached the software.) However, the registers can't be read early; otherwise
      there is no way to correlate them to the packet.
      This patch introduces a filter function which can be used to determine if a
      packet should have been timestamped. Supplied with the filter setup by the
      hwtstamp ioctl, check to make sure the PTP protocol and message type match the
      expected values. If so, then read the timestamp registers (to free them.) At
      this point check the descriptor bit; if the bit is set, then we know this
      packet correlates to the timestamp stored in the RXTSTAMP registers.
      Otherwise, assume that packet was dropped by the hardware, and ignore this
      timestamp value. However, we have at least unlocked the rxtstamp registers for
      future timestamping.
      Due to the way the driver handles skb data, it cannot be directly accessed. In
      order to work around this, a copy of the skb data into a linear buffer is
      made. From this buffer it becomes possible to read the data correctly
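      The decision flow described above can be sketched as follows. This is
      not the driver code: the enum, the function names and the bitmask
      representation of the configured filter are hypothetical. The only
      protocol detail relied on is that the PTPv2 message type is carried in
      the low four bits of the first header byte.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of inspecting one received packet. */
enum example_ts_action {
    EXAMPLE_TS_SKIP,           /* hardware would not have timestamped it */
    EXAMPLE_TS_READ_AND_USE,   /* read registers, attach the timestamp */
    EXAMPLE_TS_READ_AND_DROP,  /* read registers only to unlock them */
};

/* ptp_hdr points at the PTP header inside a linear copy of the packet.
 * type_mask has bit N set when message type N should be timestamped. */
static bool example_matches_filter(const uint8_t *ptp_hdr, size_t len,
                                   uint16_t type_mask)
{
    if (len < 1)
        return false;
    return ((type_mask >> (ptp_hdr[0] & 0x0f)) & 1u) != 0;
}

/* Combines the filter result with the descriptor's timestamp bit. */
static enum example_ts_action
example_rx_tstamp_action(bool matches_filter, bool desc_ts_bit)
{
    if (!matches_filter)
        return EXAMPLE_TS_SKIP;
    return desc_ts_bit ? EXAMPLE_TS_READ_AND_USE : EXAMPLE_TS_READ_AND_DROP;
}
```

      The third outcome is the case this patch is about: the registers are
      read even though the value is discarded, so that later packets can be
      timestamped again.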
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Richard Cochran <richardcochran@gmail.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      ixgbe: PTP Fix hwtstamp mode settings · c19197a7
      When enabling the hwtstamp mode for Rx timestamping the V2 ptp event type
      specific modes (Delay Request and Sync) have been rolled into the V2 all event
      packet modes, in order to more accurately represent what hardware is doing.
      Hardware always timestamps the Path delay packets when a V2 mode is selected,
      regardless of what type was selected (in order to always support Path delay
      mode). However, this means the user-selected modes of timestamping only Sync or
      Delay Request are not truly supported. This patch correctly sets the mode for
      the hwtstamp config and returns to the user that all V2 event packets will be
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      ixgbe: ptp code cleanup · 0ede4a60
      This patch fixes two minor nits from Richard Cochran. The first is a case of
      ambitious line wrapping that wasn't necessary. The second is to re-order the
      flag checks for PPS support. Previously, the hardware test was done first, and
      the interrupt flag test was done second. Now, test the interrupt flag and use
      the unlikely macro.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      ixgbe: do not compile ixgbe_sysfs.c when CONFIG_IXGBE_HWMON is not set · 6cbc52ef
      ixgbe_sysfs.c is only needed when CONFIG_IXGBE_HWMON is configured in the
      Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
      Acked-by: Don Skidmore <Donald.c.skidmore@intel.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      ixgbe: align flow control DV macros with datasheet · 4f8a91ad
      The flow control DV macros are used to calculate the flow control
      high and low thresholds. This patch annotates these macros slightly
      better and fixes the issues below.
      The macro variables are renamed LINK to _max_frame_link and TC to
      _max_frame_tc. This was to avoid confusion and make them more
      readable. It was found that people auditing the code read TC to be
      'traffic class' in the 802.1Q definition instead of the max frame
      size of the tc. Hopefully it is clear now.
      This audit also found the following real deviations from the
      theoretical values. Fixed in this patch.
        * I multiplied the DV calculations by (36/25) which always
          evaluates to 1. This does not match the intended theoretical
          value of 1.44.
        * IXGBE_BT2KB added 1023 to account for rounding; however, this
          really should be 8 * 1024 - 1 to account for division by 8k.
        * x2 multiplication of max frame in DV calculations to account
          for updated hardware recommendations.
      With this patch the DV values are in line with the recommendations
      in the 82599 and 82598 data sheets. It's worth noting I did not
      see any dropped frames with flow control on in my experiments without
      this patch. However aligning with the hardware specs and
      recommendations seems like a good idea here to account for worst
      case scenarios.
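      The first deviation listed above is a plain C integer-division effect,
      which the following sketch demonstrates. The function names are
      hypothetical and this is not the driver's macro, only a small
      illustration of the arithmetic.

```c
#include <stdint.h>

/* Integer division happens first: 36 / 25 == 1, so the intended 1.44
 * factor silently becomes 1. */
static uint32_t example_scale_truncating(uint32_t dv)
{
    return dv * (36 / 25);
}

/* Multiplying before dividing keeps the ratio, giving dv * 1.44 rounded
 * down. The 64-bit intermediate avoids overflow for large inputs. */
static uint32_t example_scale_144(uint32_t dv)
{
    return (uint32_t)(((uint64_t)dv * 36) / 25);
}
```

      For an input of 1000, the first function returns 1000 and the second
      returns 1440.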
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      e1000e: use more informative logging macros when netdev not yet registered · 185095fb
      Based on a report from Ethan Zhao, before calling register_netdev() the
      driver should be using logging macros that do not display the potentially
      confusing "(unregistered net_device)" yet still display the useful driver
      name and PCI bus/device/function.
      Reported-by: Ethan Zhao <ethan.kernel@gmail.com>
      Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Thomas Graf authored
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bonding: drop_monitor aware · 04502430
      When packets are dropped in the TX path, it's better to use kfree_skb()
      instead of dev_kfree_skb() to give proper drop_monitor events.
      Also move the kfree_skb() call after read_unlock() in bond_alb_xmit()
      and bond_xmit_activebackup()
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bnx2x: fix checksum validation · d6cb3e41
      bnx2x driver incorrectly sets ip_summed to CHECKSUM_UNNECESSARY on
      encapsulated segments. TCP stack happily accepts frames with bad
      checksums, if they are inside a GRE or IPIP encapsulation.
      Our understanding is that if no IP or L4 csum validation was done by the
      hardware, we should leave ip_summed as is (CHECKSUM_NONE), since
      hardware doesn't provide CHECKSUM_COMPLETE support in its cqe.
      Then, if IP/L4 checksumming was done by the hardware, set
      CHECKSUM_UNNECESSARY if no error was flagged.
      Patch based on findings and analysis from Robert Evans
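      The rule described above can be sketched as a small decision function.
      The enum and structure below are local stand-ins, not the kernel's
      definitions or the bnx2x completion queue layout, and treating a
      flagged error as "leave the value untouched" is an assumption made for
      this example.

```c
#include <stdbool.h>

/* Local stand-ins for the two ip_summed values the text mentions. */
enum example_ip_summed {
    EXAMPLE_CHECKSUM_NONE,
    EXAMPLE_CHECKSUM_UNNECESSARY,
};

/* Hypothetical view of what a completion queue entry reports. */
struct example_cqe {
    bool hw_validated_csum;  /* hardware checked the IP and/or L4 checksum */
    bool csum_error;         /* hardware flagged a checksum error */
};

/* No hardware validation: keep CHECKSUM_NONE so the stack verifies the
 * checksum itself. Validation without error: CHECKSUM_UNNECESSARY. */
static enum example_ip_summed example_rx_csum(const struct example_cqe *cqe)
{
    if (!cqe->hw_validated_csum)
        return EXAMPLE_CHECKSUM_NONE;
    if (cqe->csum_error)
        return EXAMPLE_CHECKSUM_NONE;  /* assumption: let software decide */
    return EXAMPLE_CHECKSUM_UNNECESSARY;
}
```

      An encapsulated segment that the hardware did not validate therefore
      reaches the stack as CHECKSUM_NONE and gets its inner checksum verified
      in software.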
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Yaniv Rosner <yanivr@broadcom.com>
      Cc: Merav Sicron <meravs@broadcom.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Robert Evans <evansr@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Acked-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>