summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'net-6.5-rc1' of ↵Linus Torvalds2023-07-056-26/+28
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, bpf and wireguard. Current release - regressions: - nvme-tcp: fix comma-related oops after sendpage changes Current release - new code bugs: - ptp: make max_phase_adjustment sysfs device attribute invisible when not supported Previous releases - regressions: - sctp: fix potential deadlock on &net->sctp.addr_wq_lock - mptcp: - ensure subflow is unhashed before cleaning the backlog - do not rely on implicit state check in mptcp_listen() Previous releases - always broken: - net: fix net_dev_start_xmit trace event vs skb_transport_offset() - Bluetooth: - fix use-bdaddr-property quirk - L2CAP: fix multiple UaFs - ISO: use hci_sync for setting CIG parameters - hci_event: fix Set CIG Parameters error status handling - hci_event: fix parsing of CIS Established Event - MGMT: fix marking SCAN_RSP as not connectable - wireguard: queuing: use saner cpu selection wrapping - sched: act_ipt: various bug fixes for iptables <> TC interactions - sched: act_pedit: add size check for TCA_PEDIT_PARMS_EX - dsa: fixes for receiving PTP packets with 8021q and sja1105 tagging - eth: sfc: fix null-deref in devlink port without MAE access - eth: ibmvnic: do not reset dql stats on NON_FATAL err Misc: - xsk: honor SO_BINDTODEVICE on bind" * tag 'net-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits) nfp: clean mc addresses in application firmware when closing port selftests: mptcp: pm_nl_ctl: fix 32-bit support selftests: mptcp: depend on SYN_COOKIES selftests: mptcp: userspace_pm: report errors with 'remove' tests selftests: mptcp: userspace_pm: use correct server port selftests: mptcp: sockopt: return error if wrong mark selftests: mptcp: sockopt: use 'iptables-legacy' if available selftests: mptcp: connect: fail if nft supposed to work mptcp: do not rely on implicit state check in mptcp_listen() mptcp: ensure subflow is unhashed before cleaning the backlog s390/qeth: Fix vipa deletion octeontx-af: fix hardware timestamp configuration net: dsa: sja1105: always enable the send_meta options net: dsa: tag_sja1105: fix MAC DA patching from meta frames net: Replace strlcpy with strscpy pptp: Fix fib lookup calls. mlxsw: spectrum_router: Fix an IS_ERR() vs NULL check net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX xsk: Honor SO_BINDTODEVICE on bind ptp: Make max_phase_adjustment sysfs device attribute invisible when not supported ...
| * Merge branch 'wireguard-fixes'David S. Miller2023-07-036-26/+28
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason A. Donenfeld says: ==================== wireguard fixes for 6.4.2/6.5-rc1 Sorry to send these patches during the merge window, but they're net fixes, not netdev enhancements, and while I'd ordinarily wait anyway, I just got a first bug report for one of these fixes, which I originally had thought was mostly unlikely. So please apply the following three patches to net: 1) Make proper use of nr_cpu_ids with cpumask_next(), rather than awkwardly using modulo, to handle dynamic CPU topology changes. Linus noticed this a while ago and pointed it out, and today a user actually got hit by it. 2) Respect persistent keepalive and other staged packets when setting the private key after the interface is already up. 3) Use timer_delete_sync() instead of del_timer_sync(), per the documentation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * wireguard: timers: move to using timer_delete_syncJason A. Donenfeld2023-07-031-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | The documentation says that del_timer_sync is obsolete, and code should use the equivalent timer_delete_sync instead, so switch to it. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * wireguard: netlink: send staged packets when setting initial private keyJason A. Donenfeld2023-07-031-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Packets bound for peers can queue up prior to the device private key being set. For example, if persistent keepalive is set, a packet is queued up to be sent as soon as the device comes up. However, if the private key hasn't been set yet, the handshake message never sends, and no timer is armed to retry, since that would be pointless. But, if a user later sets a private key, the expectation is that those queued packets, such as a persistent keepalive, are actually sent. So adjust the configuration logic to account for this edge case, and add a test case to make sure this works. Maxim noticed this with a wg-quick(8) config to the tune of: [Interface] PostUp = wg set %i private-key somefile [Peer] PublicKey = ... Endpoint = ... PersistentKeepalive = 25 Here, the private key gets set after the device comes up using a PostUp script, triggering the bug. Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Reported-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Tested-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Link: https://lore.kernel.org/wireguard/87fs7xtqrv.fsf@gmail.com/ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * wireguard: queueing: use saner cpu selection wrappingJason A. Donenfeld2023-07-034-16/+14
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using `% nr_cpumask_bits` is slow and complicated, and not totally robust toward dynamic changes to CPU topologies. Rather than storing the next CPU in the round-robin, just store the last one, and also return that value. This simplifies the loop drastically into a much more common pattern. Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Manuel Leiner <manuel.leiner@gmx.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2023-06-290-0/+0
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "There is one set of patches to misc for a i915 gsc/mei proxy driver. Otherwise it's mostly amdgpu/i915/msm, lots of hw enablement and lots of refactoring. core: - replace strlcpy with strscpy - EDID changes to support further conversion to struct drm_edid - Move i915 DSC parameter code to common DRM helpers - Add Colorspace functionality aperture: - ignore framebuffers with non-primary devices fbdev: - use fbdev i/o helpers - add Kconfig options for fb_ops helpers - use new fb io helpers directly in drivers sysfs: - export DRM connector ID scheduler: - Avoid an infinite loop ttm: - store function table in .rodata - Add query for TTM mem limit - Add NUMA awareness to pools - Export ttm_pool_fini() bridge: - fsl-ldb: support i.MX6SX - lt9211, lt9611: remove blanking packets - tc358768: implement input bus formats, devm cleanups - ti-snd65dsi86: implement wait_hpd_asserted - analogix: fix endless probe loop - samsung-dsim: support swapped clock, fix enabling, support var clock - display-connector: Add support for external power supply - imx: Fix module linking - tc358762: Support reset GPIO panel: - nt36523: Support Lenovo J606F - st7703: Support Anbernic RG353V-V2 - InnoLux G070ACE-L01 support - boe-tv101wum-nl6: Improve initialization - sharp-ls043t1le001: Mode fixes - simple: BOE EV121WXM-N10-1850, S6D7AA0 - Ampire AM-800480L1TMQW-T00H - Rocktech RK043FN48H - Starry himax83102-j02 - Starry ili9882t amdgpu: - add new ctx query flag to handle reset better - add new query/set shadow buffer for rdna3 - DCN 3.2/3.1.x/3.0.x updates - Enable DC_FP on loongarch - PCIe fix for RDNA2 - improve DC FAMS/SubVP support for better power management - partition support for lots of engines - Take NUMA into account when allocating memory - Add new DRM_AMDGPU_WERROR config parameter to help with CI - Initial SMU13 overdrive support - Add support for new colorspace KMS API - W=1 fixes amdkfd: - Query TTM mem limit rather than hardcoding it - GC 9.4.3 partition support - Handle NUMA for partitions - Add debugger interface for enabling gdb - Add KFD event age tracking radeon: - Fix possible UAF i915: - new getparam for PXP support - GSC/MEI proxy driver - Meteorlake display enablement - avoid clearing preallocated framebuffers with TTM - implement framebuffer mmap support - Disable sampler indirect state in bindless heap - Enable fdinfo for GuC backends - GuC loading and firmware table handling fixes - Various refactors for multi-tile enablement - Define MOCS and PAT tables for MTL - GSC/MEI support for Meteorlake - PMU multi-tile support - Large driver kernel doc cleanup - Allow VRR toggling and arbitrary refresh rates - Support async flips on linear buffers on display ver 12+ - Expose CRTC CTM property on ILK/SNB/VLV - New debugfs for display clock frequencies - Hotplug refactoring - Display refactoring - I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake - Use large rings for compute contexts - HuC loading for MTL - Allow user to set cache at BO creation - MTL powermanagement enhancements - Switch to dedicated workqueues to stop using flush_scheduled_work() - Move display runtime init under display/ - Remove 10bit gamma on desktop gen3 parts, they don't support it habanalabs: - uapi: return 0 for user queries if there was a h/w or f/w error - Add pci health check when we lose connection with the firmware. This can be used to distinguish between pci link down and firmware getting stuck. - Add more info to the error print when TPC interrupt occur. - Firmware fixes msm: - Adreno A660 bindings - SM8350 MDSS bindings fix - Added support for DPU on sm6350 and sm6375 platforms - Implemented tearcheck support to support vsync on SM150 and newer platforms - Enabled missing features (DSPP, DSC, split display) on sc8180x, sc8280xp, sm8450 - Added support for DSI and 28nm DSI PHY on MSM8226 platform - Added support for DSI on sm6350 and sm6375 platforms - Added support for display controller on MSM8226 platform - A690 GPU support - Move cmdstream dumping out of fence signaling path - a610 support - Support for a6xx devices without GMU nouveau: - NULL ptr before deref fixes armada: - implement fbdev emulation as client sun4i: - fix mipi-dsi dotclock - release clocks vc4: - rgb range toggle property - BT601 / BT2020 HDMI support vkms: - convert to drmm helpers - add reflection and rotation support - fix rgb565 conversion gma500: - fix iomem access shmobile: - support renesas soc platform - enable fbdev mxsfb: - Add support for i.MX93 LCDIF stm: - dsi: Use devm_ helper - ltdc: Fix potential invalid pointer deref renesas: - Group drivers in renesas subdirectory to prepare for new platform - Drop deprecated R-Car H3 ES1.x support meson: - Add support for MIPI DSI displays virtio: - add sync object support mediatek: - Add display binding document for MT6795" * tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drm: (1791 commits) drm/i915: Fix a NULL vs IS_ERR() bug drm/i915: make i915_drm_client_fdinfo() reference conditional again drm/i915/huc: Fix missing error code in intel_huc_init() drm/i915/gsc: take a wakeref for the proxy-init-completion check drm/msm/a6xx: Add A610 speedbin support drm/msm/a6xx: Add A619_holi speedbin support drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching drm/msm/a6xx: Use "else if" in GPU speedbin rev matching drm/msm/a6xx: Fix some A619 tunables drm/msm/a6xx: Add A610 support drm/msm/a6xx: Add support for A619_holi drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations drm/msm/a6xx: Introduce GMU wrapper support drm/msm/a6xx: Move CX GMU power counter enablement to hw_init drm/msm/a6xx: Extend and explain UBWC config drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init drm/msm/a6xx: Add a helper for software-resetting the GPU drm/msm/a6xx: Improve a6xx_bus_clear_pending_transactions() drm/msm/a6xx: Move a6xx_bus_clear_pending_transactions to a6xx_gpu drm/msm/a6xx: Move force keepalive vote removal to a6xx_gmu_force_off() ...
| * Merge drm/drm-next into drm-misc-next-fixesThomas Zimmermann2023-06-150-0/+0
| |\ | | | | | | | | | | | | | | | Backmerging to sync drm-misc-next-fixes with drm-misc-next. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
* | | net: move gso declarations and functions to their own filesEric Dumazet2023-06-101-0/+1
| |/ |/| | | | | | | | | | | | | | | | | | | Move declarations into include/net/gso.h and code into net/core/gso.c Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stanislav Fomichev <sdf@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230608191738.3947077-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge branch 'for-6.4/amd-sfh' into for-linusJiri Kosina2023-04-261-1/+1
|\| | | | | | | - assorted functional fixes for amd-sfh driver (Basavaraj Natikar)
| * cpumask: fix incorrect cpumask scanning result checksLinus Torvalds2023-03-061-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It turns out that commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask optimizations") exposed a number of cases of drivers not checking the result of "cpumask_next()" and friends correctly. The documented correct check for "no more cpus in the cpumask" is to check for the result being equal or larger than the number of possible CPU ids, exactly _because_ we've always done those constant-sized cpumask scans using a widened type before. So the return value of a cpumask scan should be checked with if (cpu >= nr_cpu_ids) ... because the cpumask scan did not necessarily stop exactly *at* that maximum CPU id. But a few cases ended up instead using checks like if (cpu == nr_cpumask_bits) ... which used that internal "widened" number of bits. And that used to work pretty much by accident (ok, in this case "by accident" is simply because it matched the historical internal implementation of the cpumask scanning, so it was more of a "intentionally using implementation details rather than an accident"). But the extended constant-sized optimizations then did that internal implementation differently, and now that code that did things wrong but matched the old implementation no longer worked at all. Which then causes subsequent odd problems due to using what ends up being an invalid CPU ID. Most of these cases require either unusual hardware or special uses to hit, but the random.c one triggers quite easily. All you really need is to have a sufficiently small CONFIG_NR_CPUS value for the bit scanning optimization to be triggered, but not enough CPUs to then actually fill that widened cpumask. At that point, the cpumask scanning will return the NR_CPUS constant, which is _not_ the same as nr_cpumask_bits. This just does the mindless fix with sed -i 's/== nr_cpumask_bits/>= nr_cpu_ids/' to fix the incorrect uses. The ones in the SCSI lpfc driver in particular could probably be fixed more cleanly by just removing that repeated pattern entirely, but I am not emptionally invested enough in that driver to care. Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/lkml/481b19b5-83a0-4793-b4fd-194ad7b978c3@roeck-us.net/ Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/lkml/CAMuHMdUKo_Sf7TjKzcNDa8Ve+6QrK+P8nSQrSQ=6LTRmcBKNww@mail.gmail.com/ Reported-by: Vernon Yang <vernon2gm@gmail.com> Link: https://lore.kernel.org/lkml/20230306160651.2016767-1-vernon2gm@gmail.com/ Cc: Yury Norov <yury.norov@gmail.com> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* wireguard: timers: cast enum limits members to int in printsJiri Slaby (SUSE)2022-12-131-4/+4
| | | | | | | | | | | | | | | | | | Since gcc13, each member of an enum has the same type as the enum. And that is inherited from its members. Provided "REKEY_AFTER_MESSAGES = 1ULL << 60", the named type is unsigned long. This generates warnings with gcc-13: error: format '%d' expects argument of type 'int', but argument 6 has type 'long unsigned int' Cast those particular enum members to int when printing them. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36113 Cc: Martin Liska <mliska@suse.cz> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://lore.kernel.org/all/20221213225208.3343692-2-Jason@zx2c4.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2022-12-130-0/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "The biggest highlight is that the accel subsystem framework is merged. Hopefully for 6.3 we will be able to line up a driver to use it. In drivers land, i915 enables DG2 support by default now, and nouveau has a big stability refactoring and initial ampere support, AMD includes new hw IP support and should build on ARM again. There is also an ofdrm driver to take over offb on platforms it's used. Stuff outside my tree, the dma-buf patches hit a few places, the vc4 firmware changes also do, and i915 has some interactions with MEI for discrete GPUs. I think all of those should have been acked/reviewed by relevant parties. New driver: - ofdrm - replacement for offb fbdev: - add support for nomodeset fourcc: - add Vivante tiled modifier core: - atomic-helpers: CRTC primary plane test fixes, fb access hooks - connector: TV API consistency, cmdline parser improvements - send connector hotplug on cleanup - sort makefile objects tests: - sort kunit tests - improve DP-MST tests - add kunit helpers to create a device sched: - module param for scheduling policy - refcounting fix buddy: - add back random seed log ttm: - convert ttm_resource to size_t - optimize pool allocations edid: - HFVSDB parsing support fixes - logging/debug improvements - DSC quirks dma-buf: - Add unlocked vmap and attachment mapping - move drivers to common locking convention - locking improvements firmware: - new API for rPI firmware and vc4 xilinx: - zynqmp: displayport bridge support - dpsub fix bridge: - adv7533: Remove dynamic lane switching - it6505: Runtime PM support, sync improvements - ps8640: Handle AUX defer messages - tc358775: Drop soft-reset over I2C panel: - panel-edp: Add INX N116BGE-EA2 C2 and C4 support. - Jadard JD9365DA-H3 - NewVision NV3051D amdgpu: - DCN support on ARM - DCN 2.1 secure display - Sienna Cichlid mode2 reset fixes - new GC 11.x firmware versions - drop AMD specific DSC workarounds in favour of drm code - clang warning fixes - scheduler rework - SR-IOV fixes - GPUVM locking fixes - fix memory leak in CS IOCTL error path - flexible array updates - enable new GC/PSP/SMU/NBIO IP - GFX preemption support for gfx9 amdkfd: - cache size fixes - userptr fixes - enable cooperative launch on gfx 10.3 - enable GC 11.0.4 KFD support radeon: - replace kmap with kmap_local_page - ACPI ref count fix - HDA audio notifier support i915: - DG2 enabled by default - MTL enablement work - hotplug refactoring - VBT improvements - Display and watermark refactoring - ADL-P workaround - temp disable runtime_pm for discrete- - fix for A380 as a secondary GPU - Wa_18017747507 for DG2 - CS timestamp support fixes for gen5 and earlier - never purge busy TTM objects - use i915_sg_dma_sizes for all backends - demote GuC kernel contexts to normal priority - gvt: refactor for new MDEV interface - enable DC power states on eDP ports - fix gen 2/3 workarounds nouveau: - fix page fault handling - Ampere acceleration support - driver stability improvements - nva3 backlight support msm: - MSM_INFO_GET_FLAGS support - DPU: XR30 and P010 image formats - Qualcomm SM6115 support - DSI PHY support for QCM2290 - HDMI: refactored dev init path - remove exclusive-fence hack - fix speed-bin detection - enable clamp to idle on 7c3 - improved hangcheck detection vmwgfx: - fb and cursor refactoring - convert to generic hashtable - cursor improvements etnaviv: - hw workarounds - softpin MMU fixes ast: - atomic gamma LUT support - convert to SHMEM lcdif: - support YUV planes - Increase DMA burst size - FIFO threshold tuning meson: - fix return type of cvbs mode_valid mgag200: - fix PLL setup on some revisions sun4i: - A100 and D1 support udl: - modesetting improvements - hot unplug support vc4: - support PAL-M - fix regression preventing 4K @ 60Hz - fix NULL ptr deref v3d: - switch to drm managed resources renesas: - RZ/G2L DSI support - DU Kconfig cleanup mediatek: - fixup dpi and hdmi - MT8188 dpi support - MT8195 AFBC support tegra: - NVDEC hardware on Tegra234 SoC hdlcd: - switch to drm managed resources ingenic: - fix registration error path hisilicon: - convert to drm_mode_init maildp: - use managed resources mtk: - use drm_mode_init rockchip: - use drm_mode_copy" * tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm: (1397 commits) drm/amdgpu: fix mmhub register base coding error drm/amdgpu: add tmz support for GC IP v11.0.4 drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4 drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4 drm/amdgpu: enable GFX IP v11.0.4 CG support drm/amdgpu: Make amdgpu_ring_mux functions as static drm/amdgpu: generally allow over-commit during BO allocation drm/amd/display: fix array index out of bound error in DCN32 DML drm/amd/display: 3.2.215 drm/amd/display: set optimized required for comp buf changes drm/amd/display: Add debug option to skip PSR CRTC disable drm/amd/display: correct DML calc error of UrgentLatency drm/amd/display: correct static_screen_event_mask drm/amd/display: Ensure commit_streams returns the DC return code drm/amd/display: read invalid ddc pin status cause engine busy drm/amd/display: Bypass DET swath fill check for max clocks drm/amd/display: Disable uclk pstate for subvp pipes drm/amd/display: Fix DCN2.1 default DSC clocks drm/amd/display: Enable dp_hdmi21_pcon support drm/amd/display: prevent seamless boot on displays that don't have the preferred dig ...
| * Merge drm/drm-next into drm-misc-next-fixesThomas Zimmermann2022-11-214-32/+26
| |\ | | | | | | | | | | | | | | | | | | Backmerging to update drm-misc-next-fixes for the final phase of the release cycle. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
* | | treewide: use get_random_u32_inclusive() when possibleJason A. Donenfeld2022-11-181-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These cases were done with this Coccinelle: @@ expression H; expression L; @@ - (get_random_u32_below(H) + L) + get_random_u32_inclusive(L, H + L - 1) @@ expression H; expression L; expression E; @@ get_random_u32_inclusive(L, H - + E - - E ) @@ expression H; expression L; expression E; @@ get_random_u32_inclusive(L, H - - E - + E ) @@ expression H; expression L; expression E; expression F; @@ get_random_u32_inclusive(L, H - - E + F - + E ) @@ expression H; expression L; expression E; expression F; @@ get_random_u32_inclusive(L, H - + E + F - - E ) And then subsequently cleaned up by hand, with several automatic cases rejected if it didn't make sense contextually. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* | | treewide: use get_random_u32_below() instead of deprecated functionJason A. Donenfeld2022-11-182-12/+12
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a simple mechanical transformation done by: @@ expression E; @@ - prandom_u32_max + get_random_u32_below (E) Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* | treewide: use get_random_bytes() when possibledeveldavem/net-nextJason A. Donenfeld2022-10-111-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The prandom_bytes() function has been a deprecated inline wrapper around get_random_bytes() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. This was done as a basic find and replace. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> # powerpc Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* | treewide: use get_random_{u8,u16}() when possible, part 1Jason A. Donenfeld2022-10-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value, simply use the get_random_{u8,u16}() functions, which are faster than wasting the additional bytes from a 32-bit value. This was done mechanically with this coccinelle script: @@ expression E; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; typedef __be16; typedef __le16; typedef u8; @@ ( - (get_random_u32() & 0xffff) + get_random_u16() | - (get_random_u32() & 0xff) + get_random_u8() | - (get_random_u32() % 65536) + get_random_u16() | - (get_random_u32() % 256) + get_random_u8() | - (get_random_u32() >> 16) + get_random_u16() | - (get_random_u32() >> 24) + get_random_u8() | - (u16)get_random_u32() + get_random_u16() | - (u8)get_random_u32() + get_random_u8() | - (__be16)get_random_u32() + (__be16)get_random_u16() | - (__le16)get_random_u32() + (__le16)get_random_u16() | - prandom_u32_max(65536) + get_random_u16() | - prandom_u32_max(256) + get_random_u8() | - E->inet_id = get_random_u32() + E->inet_id = get_random_u16() ) @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; identifier v; @@ - u16 v = get_random_u32(); + u16 v = get_random_u16(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u8; identifier v; @@ - u8 v = get_random_u32(); + u8 v = get_random_u8(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; u16 v; @@ - v = get_random_u32(); + v = get_random_u16(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u8; u8 v; @@ - v = get_random_u32(); + v = get_random_u8(); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Examine limits @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value < 256: coccinelle.RESULT = cocci.make_ident("get_random_u8") elif value < 65536: coccinelle.RESULT = cocci.make_ident("get_random_u16") else: print("Skipping large mask of %s" % (literal)) cocci.include_match(False) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; identifier add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + (RESULT() & LITERAL) Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* | net: drop the weight argument from netif_napi_addJakub Kicinski2022-09-281-2/+1
| | | | | | | | | | | | | | | | | | | | | | We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, drop the weight argument, those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2022-09-222-22/+16
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drivers/net/ethernet/freescale/fec.h 7b15515fc1ca ("Revert "fec: Restart PPS after link state change"") 40c79ce13b03 ("net: fec: add stop mode support for imx8 platform") https://lore.kernel.org/all/20220921105337.62b41047@canb.auug.org.au/ drivers/pinctrl/pinctrl-ocelot.c c297561bc98a ("pinctrl: ocelot: Fix interrupt controller") 181f604b33cd ("pinctrl: ocelot: add ability to be used in a non-mmio configuration") https://lore.kernel.org/all/20220921110032.7cd28114@canb.auug.org.au/ tools/testing/selftests/drivers/net/bonding/Makefile bbb774d921e2 ("net: Add tests for bonding and team address list management") 152e8ec77640 ("selftests/bonding: add a test for bonding lladdr target") https://lore.kernel.org/all/20220921110437.5b7dbd82@canb.auug.org.au/ drivers/net/can/usb/gs_usb.c 5440428b3da6 ("can: gs_usb: gs_can_open(): fix race dev->can.state condition") 45dfa45f52e6 ("can: gs_usb: add RX and TX hardware timestamp support") https://lore.kernel.org/all/84f45a7d-92b6-4dc5-d7a1-072152fab6ff@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * \ Merge branch 'wireguard-patches-for-6-0-rc6'Jakub Kicinski2022-09-202-22/+16
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason A. Donenfeld says: ==================== wireguard patches for 6.0-rc6 1) The ratelimiter timing test doesn't help outside of development, yet it is currently preventing the module from being inserted on some kernels when it flakes at insertion time. So we disable it. 2) A fix for a build error on UML, caused by a recent change in a different tree. 3) A WARN_ON() is triggered by Kees' new fortified memcpy() patch, due to memcpy()ing over a sockaddr pointer with the size of a sockaddr_in[6]. The type safe fix is pretty simple. Given how classic of a thing sockaddr punning is, I suspect this may be the first in a few patches like this throughout the net tree, once Kees' fortify series is more widely deployed (current it's just in next). ==================== Link: https://lore.kernel.org/r/20220916143740.831881-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * wireguard: netlink: avoid variable-sized memcpy on sockaddrJason A. Donenfeld2022-09-201-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Doing a variable-sized memcpy is slower, and the compiler isn't smart enough to turn this into a constant-size assignment. Further, Kees' latest fortified memcpy will actually bark, because the destination pointer is type sockaddr, not explicitly sockaddr_in or sockaddr_in6, so it thinks there's an overflow: memcpy: detected field-spanning write (size 28) of single field "&endpoint.addr" at drivers/net/wireguard/netlink.c:446 (size 16) Fix this by just assigning by using explicit casts for each checked case. Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reported-by: syzbot+a448cda4dba2dac50de5@syzkaller.appspotmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * wireguard: ratelimiter: disable timings test by defaultJason A. Donenfeld2022-09-201-15/+10
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A previous commit tried to make the ratelimiter timings test more reliable but in the process made it less reliable on other configurations. This is an impossible problem to solve without increasingly ridiculous heuristics. And it's not even a problem that actually needs to be solved in any comprehensive way, since this is only ever used during development. So just cordon this off with a DEBUG_ ifdef, just like we do for the trie's randomized tests, so it can be enabled while hacking on the code, and otherwise disabled in CI. In the process we also revert b56cf61d29db. Fixes: b56cf61d29db ("wireguard: ratelimiter: use hrtimer in selftest") Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* / genetlink: start to validate reserved header bytesJakub Kicinski2022-08-291-0/+1
|/ | | | | | | | | | | | | | | | | We had historically not checked that genlmsghdr.reserved is 0 on input which prevents us from using those precious bytes in the future. One use case would be to extend the cmd field, which is currently just 8 bits wide and 256 is not a lot of commands for some core families. To make sure that new families do the right thing by default put the onus of opting out of validation on existing families. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel) Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'char-misc-6.0-rc1' of ↵Linus Torvalds2022-08-041-1/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc driver updates from Greg KH: "Here is the large set of char and misc and other driver subsystem changes for 6.0-rc1. Highlights include: - large set of IIO driver updates, additions, and cleanups - new habanalabs device support added (loads of register maps much like GPUs have) - soundwire driver updates - phy driver updates - slimbus driver updates - tiny virt driver fixes and updates - misc driver fixes and updates - interconnect driver updates - hwtracing driver updates - fpga driver updates - extcon driver updates - firmware driver updates - counter driver update - mhi driver fixes and updates - binder driver fixes and updates - speakup driver fixes All of these have been in linux-next for a while without any reported problems" * tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (634 commits) drivers: lkdtm: fix clang -Wformat warning char: remove VR41XX related char driver misc: Mark MICROCODE_MINOR unused spmi: trace: fix stack-out-of-bound access in SPMI tracing functions dt-bindings: iio: adc: Add compatible for MT8188 iio: light: isl29028: Fix the warning in isl29028_remove() iio: accel: sca3300: Extend the trigger buffer from 16 to 32 bytes iio: fix iio_format_avail_range() printing for none IIO_VAL_INT iio: adc: max1027: unlock on error path in max1027_read_single_value() iio: proximity: sx9324: add empty line in front of bullet list iio: magnetometer: hmc5843: Remove duplicate 'the' iio: magn: yas530: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros iio: magnetometer: ak8974: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros iio: light: veml6030: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros iio: light: vcnl4035: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros iio: light: vcnl4000: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros iio: light: tsl2591: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() iio: light: tsl2583: Use DEFINE_RUNTIME_DEV_PM_OPS and pm_ptr() iio: light: isl29028: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() iio: light: gp2ap002: Switch to DEFINE_RUNTIME_DEV_PM_OPS and pm_ptr() ...
| * pm/sleep: Add PM_USERSPACE_AUTOSLEEP KconfigKalesh Singh2022-07-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Systems that initiate frequent suspend/resume from userspace can make the kernel aware by enabling PM_USERSPACE_AUTOSLEEP config. This allows for certain sleep-sensitive code (wireguard/rng) to decide on what preparatory work should be performed (or not) in their pm_notification callbacks. This patch was prompted by the discussion at [1] which attempts to remove CONFIG_ANDROID that currently guards these code paths. [1] https://lore.kernel.org/r/20220629150102.1582425-1-hch@lst.de/ Suggested-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Link: https://lore.kernel.org/r/20220630191230.235306-1-kaleshsingh@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'wireguard-patches-for-5-20-rc1'Jakub Kicinski2022-08-023-17/+23
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason A. Donenfeld says: ==================== wireguard patches for 5.20-rc1 I had planned to send these out eventually as net.git patches, but as you emailed earlier, I figure there's no harm in just doing this now for net-next.git. Please apply the following small fixes: 1) Rather than using msleep() in order to approximate ktime_get_coarse_ boottime_ns(), instead use an hrtimer, rounded heuristically. 2) An update in selftest config fragments, from Lukas. 3) Linus noticed that a debugging WARN_ON() to detect (impossible) stack corruption would still allow the corruption to happen, making it harder to get the report about the corruption subsequently. 4) Support for User Mode Linux in the test suite. This depends on some UML patches that are slated for 5.20. Richard hasn't sent his pull in, but they're in his tree, so I assume it'll happen. ==================== Link: https://lore.kernel.org/r/20220802125613.340848-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | wireguard: allowedips: don't corrupt stack when detecting overflowJason A. Donenfeld2022-08-022-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case push_rcu() and related functions are buggy, there's a WARN_ON(len >= 128), which the selftest tries to hit by being tricky. In case it is hit, we shouldn't corrupt the kernel's stack, though; otherwise it may be hard to even receive the report that it's buggy. So conditionalize the stack write based on that WARN_ON()'s return value. Note that this never *actually* happens anyway. The WARN_ON() in the first place is bounded by IS_ENABLED(DEBUG), and isn't expected to ever actually hit. This is just a debugging sanity check. Additionally, hoist the constant 128 into a named enum, MAX_ALLOWEDIPS_BITS, so that it's clear why this value is chosen. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/all/CAHk-=wjJZGA6w_DxA+k7Ejbqsq+uGK==koPai3sqdsfJqemvag@mail.gmail.com/ Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | wireguard: ratelimiter: use hrtimer in selftestJason A. Donenfeld2022-08-021-11/+14
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using msleep() is problematic because it's compared against ratelimiter.c's ktime_get_coarse_boottime_ns(), which means on systems with slow jiffies (such as UML's forced HZ=100), the result is inaccurate. So switch to using schedule_hrtimeout(). However, hrtimer gives us access only to the traditional posix timers, and none of the _COARSE variants. So now, rather than being too imprecise like jiffies, it's too precise. One solution would be to give it a large "range" value, but this will still fire early on a loaded system. A better solution is to align the timeout to the actual coarse timer, and then round up to the nearest tick, plus change. So add the timeout to the current coarse time, and then schedule_hrtimer() until the absolute computed time. This should hopefully reduce flakes in CI as well. Note that we keep the retry loop in case the entire function is running behind, because the test could still be scheduled out, by either the kernel or by the hypervisor's kernel, in which case restarting the test and hoping to not be scheduled out still helps. Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge branch 'net-adopt-u64_stats_t-type'Jakub Kicinski2022-06-091-8/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eric Dumazet says: ==================== net: adopt u64_stats_t type While KCSAN has not raised any reports yet, we should address the potential load/store tearing problem happening with per cpu stats. This series is not exhaustive, but hopefully a step in the right direction. ==================== Link: https://lore.kernel.org/r/20220608154640.1235958-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: receive: use dev_sw_netstats_rx_add()Eric Dumazet2022-06-091-8/+1
|/ | | | | | | | | We have a convenient helper, let's use it. This will make the following patch easier to review and smaller. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'wireguard-patches-for-5-18-rc4'Jakub Kicinski2022-04-221-1/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason A. Donenfeld says: ==================== wireguard patches for 5.18-rc4 Here are two small wireguard fixes for 5.18-rc4: 1) We enable ACPI in the QEMU test harness, so that multiple CPUs are actually used on x86 for testing for races. 2) Sending skbs with metadata dsts attached resulted in a null pointer dereference, triggerable from executing eBPF programs. The fix is a oneliner, changing a skb_dst() null check into a skb_valid_dst() boolean check. ==================== Link: https://lore.kernel.org/r/20220421134805.279118-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: device: check for metadata_dst with skb_valid_dst()Nikolay Aleksandrov2022-04-221-1/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we try to transmit an skb with md_dst attached through wireguard we hit a null pointer dereference in wg_xmit() due to the use of dst_mtu() which calls into dst_blackhole_mtu() which in turn tries to dereference dst->dev. Since wireguard doesn't use md_dsts we should use skb_valid_dst(), which checks for DST_METADATA flag, and if it's set, then falls back to wireguard's device mtu. That gives us the best chance of transmitting the packet; otherwise if the blackhole netdev is used we'd get ETH_MIN_MTU. [ 263.693506] BUG: kernel NULL pointer dereference, address: 00000000000000e0 [ 263.693908] #PF: supervisor read access in kernel mode [ 263.694174] #PF: error_code(0x0000) - not-present page [ 263.694424] PGD 0 P4D 0 [ 263.694653] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 263.694876] CPU: 5 PID: 951 Comm: mausezahn Kdump: loaded Not tainted 5.18.0-rc1+ #522 [ 263.695190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014 [ 263.695529] RIP: 0010:dst_blackhole_mtu+0x17/0x20 [ 263.695770] Code: 00 00 00 0f 1f 44 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 10 48 83 e0 fc 8b 40 04 85 c0 75 09 48 8b 07 <8b> 80 e0 00 00 00 c3 66 90 0f 1f 44 00 00 48 89 d7 be 01 00 00 00 [ 263.696339] RSP: 0018:ffffa4a4422fbb28 EFLAGS: 00010246 [ 263.696600] RAX: 0000000000000000 RBX: ffff8ac9c3553000 RCX: 0000000000000000 [ 263.696891] RDX: 0000000000000401 RSI: 00000000fffffe01 RDI: ffffc4a43fb48900 [ 263.697178] RBP: ffffa4a4422fbb90 R08: ffffffff9622635e R09: 0000000000000002 [ 263.697469] R10: ffffffff9b69a6c0 R11: ffffa4a4422fbd0c R12: ffff8ac9d18b1a00 [ 263.697766] R13: ffff8ac9d0ce1840 R14: ffff8ac9d18b1a00 R15: ffff8ac9c3553000 [ 263.698054] FS: 00007f3704c337c0(0000) GS:ffff8acaebf40000(0000) knlGS:0000000000000000 [ 263.698470] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 263.698826] CR2: 00000000000000e0 CR3: 0000000117a5c000 CR4: 00000000000006e0 [ 263.699214] Call Trace: [ 263.699505] <TASK> [ 263.699759] wg_xmit+0x411/0x450 [ 263.700059] ? bpf_skb_set_tunnel_key+0x46/0x2d0 [ 263.700382] ? dev_queue_xmit_nit+0x31/0x2b0 [ 263.700719] dev_hard_start_xmit+0xd9/0x220 [ 263.701047] __dev_queue_xmit+0x8b9/0xd30 [ 263.701344] __bpf_redirect+0x1a4/0x380 [ 263.701664] __dev_queue_xmit+0x83b/0xd30 [ 263.701961] ? packet_parse_headers+0xb4/0xf0 [ 263.702275] packet_sendmsg+0x9a8/0x16a0 [ 263.702596] ? _raw_spin_unlock_irqrestore+0x23/0x40 [ 263.702933] sock_sendmsg+0x5e/0x60 [ 263.703239] __sys_sendto+0xf0/0x160 [ 263.703549] __x64_sys_sendto+0x20/0x30 [ 263.703853] do_syscall_64+0x3b/0x90 [ 263.704162] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 263.704494] RIP: 0033:0x7f3704d50506 [ 263.704789] Code: 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 c3 90 55 48 83 ec 30 44 89 4c 24 2c 4c 89 [ 263.705652] RSP: 002b:00007ffe954b0b88 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 263.706141] RAX: ffffffffffffffda RBX: 0000558bb259b490 RCX: 00007f3704d50506 [ 263.706544] RDX: 000000000000004a RSI: 0000558bb259b7b2 RDI: 0000000000000003 [ 263.706952] RBP: 0000000000000000 R08: 00007ffe954b0b90 R09: 0000000000000014 [ 263.707339] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe954b0b90 [ 263.707735] R13: 000000000000004a R14: 0000558bb259b7b2 R15: 0000000000000001 [ 263.708132] </TASK> [ 263.708398] Modules linked in: bridge netconsole bonding [last unloaded: bridge] [ 263.708942] CR2: 00000000000000e0 Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Link: https://github.com/cilium/cilium/issues/19428 Reported-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'wireguard-patches-for-5-18-rc1'Jakub Kicinski2022-03-302-3/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason A. Donenfeld says: ==================== wireguard patches for 5.18-rc1 Here's a small set of fixes for the next net push: 1) Pipacs reported a CFI violation in a cleanup routine, which he triggered using grsec's RAP. I haven't seen reports of this yet from the Android/CFI world yet, but it's only a matter of time there. 2) A small rng cleanup to the self test harness to make it initialize faster on 5.18. 3) Wang reported and fixed a skb leak for CONFIG_IPV6=n. 4) After Wang's fix for the direct leak, I investigated how that code path even could be hit, and found that the netlink layer still handles IPv6 endpoints, when it probably shouldn't. ==================== Link: https://lore.kernel.org/r/20220330013127.426620-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: socket: ignore v6 endpoints when ipv6 is disabledJason A. Donenfeld2022-03-301-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous commit fixed a memory leak on the send path in the event that IPv6 is disabled at compile time, but how did a packet even arrive there to begin with? It turns out we have previously allowed IPv6 endpoints even when IPv6 support is disabled at compile time. This is awkward and inconsistent. Instead, let's just ignore all things IPv6, the same way we do other malformed endpoints, in the case where IPv6 is disabled. Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: socket: free skb in send6 when ipv6 is disabledWang Hai2022-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I got a memory leak report: unreferenced object 0xffff8881191fc040 (size 232): comm "kworker/u17:0", pid 23193, jiffies 4295238848 (age 3464.870s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff814c3ef4>] slab_post_alloc_hook+0x84/0x3b0 [<ffffffff814c8977>] kmem_cache_alloc_node+0x167/0x340 [<ffffffff832974fb>] __alloc_skb+0x1db/0x200 [<ffffffff82612b5d>] wg_socket_send_buffer_to_peer+0x3d/0xc0 [<ffffffff8260e94a>] wg_packet_send_handshake_initiation+0xfa/0x110 [<ffffffff8260ec81>] wg_packet_handshake_send_worker+0x21/0x30 [<ffffffff8119c558>] process_one_work+0x2e8/0x770 [<ffffffff8119ca2a>] worker_thread+0x4a/0x4b0 [<ffffffff811a88e0>] kthread+0x120/0x160 [<ffffffff8100242f>] ret_from_fork+0x1f/0x30 In function wg_socket_send_buffer_as_reply_to_skb() or wg_socket_send_ buffer_to_peer(), the semantics of send6() is required to free skb. But when CONFIG_IPV6 is disable, kfree_skb() is missing. This patch adds it to fix this bug. Signed-off-by: Wang Hai <wanghai38@huawei.com> Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: queueing: use CFI-safe ptr_ring cleanup functionJason A. Donenfeld2022-03-301-1/+2
|/ | | | | | | | | | | | | | We make too nuanced use of ptr_ring to entirely move to the skb_array wrappers, but we at least should avoid the naughty function pointer cast when cleaning up skbs. Otherwise RAP/CFI will honk at us. This patch uses the __skb_array_destroy_skb wrapper for the cleanup, rather than directly providing kfree_skb, which is what other drivers in the same situation do too. Reported-by: PaX Team <pageexec@freemail.hu> Fixes: 0188f76c8d48 ("wireguard: receive: use ring buffer for incoming handshakes") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* wireguard: device: clear keys on VM forkJason A. Donenfeld2022-03-121-11/+27
| | | | | | | | | | | | | | | | | | | When a virtual machine forks, it's important that WireGuard clear existing sessions so that different plaintexts are not transmitted using the same key+nonce, which can result in catastrophic cryptographic failure. To accomplish this, we simply hook into the newly added vmfork notifier. As a bonus, it turns out that, like the vmfork registration function, the PM registration function is stubbed out when CONFIG_PM_SLEEP is not set, so we can actually just remove the maze of ifdefs, which makes it really quite clean to support both notifiers at once. Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Theodore Ts'o <tytso@mit.edu> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* lib/crypto: blake2s: move hmac construction into wireguardJason A. Donenfeld2022-01-181-6/+39
| | | | | | | | | | | | | | | Basically nobody should use blake2s in an HMAC construction; it already has a keyed variant. But unfortunately for historical reasons, Noise, used by WireGuard, uses HKDF quite strictly, which means we have to use this. Because this really shouldn't be used by others, this commit moves it into wireguard's noise.c locally, so that kernels that aren't using WireGuard don't get this superfluous code baked in. On m68k systems, this shaves off ~314 bytes. Cc: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* Merge tag '5.17-net-next' of ↵Linus Torvalds2022-01-101-3/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Defer freeing TCP skbs to the BH handler, whenever possible, or at least perform the freeing outside of the socket lock section to decrease cross-CPU allocator work and improve latency. - Add netdevice refcount tracking to locate sources of netdevice and net namespace refcount leaks. - Make Tx watchdog less intrusive - avoid pausing Tx and restarting all queues from a single CPU removing latency spikes. - Various small optimizations throughout the stack from Eric Dumazet. - Make netdev->dev_addr[] constant, force modifications to go via appropriate helpers to allow us to keep addresses in ordered data structures. - Replace unix_table_lock with per-hash locks, improving performance of bind() calls. - Extend skb drop tracepoint with a drop reason. - Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW. BPF --- - New helpers: - bpf_find_vma(), find and inspect VMAs for profiling use cases - bpf_loop(), runtime-bounded loop helper trading some execution time for much faster (if at all converging) verification - bpf_strncmp(), improve performance, avoid compiler flakiness - bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt() for tracing programs, all inlined by the verifier - Support BPF relocations (CO-RE) in the kernel loader. - Further the support for BTF_TYPE_TAG annotations. - Allow access to local storage in sleepable helpers. - Convert verifier argument types to a composable form with different attributes which can be shared across types (ro, maybe-null). - Prepare libbpf for upcoming v1.0 release by cleaning up APIs, creating new, extensible ones where missing and deprecating those to be removed. Protocols --------- - WiFi (mac80211/cfg80211): - notify user space about long "come back in N" AP responses, allow it to react to such temporary rejections - allow non-standard VHT MCS 10/11 rates - use coarse time in airtime fairness code to save CPU cycles - Bluetooth: - rework of HCI command execution serialization to use a common queue and work struct, and improve handling errors reported in the middle of a batch of commands - rework HCI event handling to use skb_pull_data, avoiding packet parsing pitfalls - support AOSP Bluetooth Quality Report - SMC: - support net namespaces, following the RDMA model - improve connection establishment latency by pre-clearing buffers - introduce TCP ULP for automatic redirection to SMC - Multi-Path TCP: - support ioctls: SIOCINQ, OUTQ, and OUTQNSD - support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT, IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY - support cmsgs: TCP_INQ - improvements in the data scheduler (assigning data to subflows) - support fastclose option (quick shutdown of the full MPTCP connection, similar to TCP RST in regular TCP) - MCTP (Management Component Transport) over serial, as defined by DMTF spec DSP0253 - "MCTP Serial Transport Binding". Driver API ---------- - Support timestamping on bond interfaces in active/passive mode. - Introduce generic phylink link mode validation for drivers which don't have any quirks and where MAC capability bits fully express what's supported. Allow PCS layer to participate in the validation. Convert a number of drivers. - Add support to set/get size of buffers on the Rx rings and size of the tx copybreak buffer via ethtool. - Support offloading TC actions as first-class citizens rather than only as attributes of filters, improve sharing and device resource utilization. - WiFi (mac80211/cfg80211): - support forwarding offload (ndo_fill_forward_path) - support for background radar detection hardware - SA Query Procedures offload on the AP side New hardware / drivers ---------------------- - tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with real-time requirements for isochronous communication with protocols like OPC UA Pub/Sub. - Qualcomm BAM-DMUX WWAN - driver for data channels of modems integrated into many older Qualcomm SoCs, e.g. MSM8916 or MSM8974 (qcom_bam_dmux). - Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch driver with support for bridging, VLANs and multicast forwarding (lan966x). - iwlmei driver for co-operating between Intel's WiFi driver and Intel's Active Management Technology (AMT) devices. - mse102x - Vertexcom MSE102x Homeplug GreenPHY chips - Bluetooth: - MediaTek MT7921 SDIO devices - Foxconn MT7922A - Realtek RTL8852AE Drivers ------- - Significantly improve performance in the datapaths of: lan78xx, ax88179_178a, lantiq_xrx200, bnxt. - Intel Ethernet NICs: - igb: support PTP/time PEROUT and EXTTS SDP functions on 82580/i354/i350 adapters - ixgbevf: new PF -> VF mailbox API which avoids the risk of mailbox corruption with ESXi - iavf: support configuration of VLAN features of finer granularity, stacked tags and filtering - ice: PTP support for new E822 devices with sub-ns precision - ice: support firmware activation without reboot - Mellanox Ethernet NICs (mlx5): - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool - support TC forwarding when tunnel encap and decap happen between two ports of the same NIC - dynamically size and allow disabling various features to save resources for running in embedded / SmartNIC scenarios - Broadcom Ethernet NICs (bnxt): - use page frag allocator to improve Rx performance - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool - Other Ethernet NICs: - amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support - Microsoft cloud/virtual NIC (mana): - add XDP support (PASS, DROP, TX) - Mellanox Ethernet switches (mlxsw): - initial support for Spectrum-4 ASICs - VxLAN with IPv6 underlay - Marvell Ethernet switches (prestera): - support flower flow templates - add basic IP forwarding support - NXP embedded Ethernet switches (ocelot & felix): - support Per-Stream Filtering and Policing (PSFP) - enable cut-through forwarding between ports by default - support FDMA to improve packet Rx/Tx to CPU - Other embedded switches: - hellcreek: improve trapping management (STP and PTP) packets - qca8k: support link aggregation and port mirroring - Qualcomm 802.11ax WiFi (ath11k): - qca6390, wcn6855: enable 802.11 power save mode in station mode - BSS color change support - WCN6855 hw2.1 support - 11d scan offload support - scan MAC address randomization support - full monitor mode, only supported on QCN9074 - qca6390/wcn6855: report signal and tx bitrate - qca6390: rfkill support - qca6390/wcn6855: regdb.bin support - Intel WiFi (iwlwifi): - support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS) in cooperation with the BIOS - support for Optimized Connectivity Experience (OCE) scan - support firmware API version 68 - lots of preparatory work for the upcoming Bz device family - MediaTek WiFi (mt76): - Specific Absorption Rate (SAR) support - mt7921: 160 MHz channel support - RealTek WiFi (rtw88): - Specific Absorption Rate (SAR) support - scan offload - Other WiFi NICs - ath10k: support fetching (pre-)calibration data from nvmem - brcmfmac: configure keep-alive packet on suspend - wcn36xx: beacon filter support" * tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2048 commits) tcp: tcp_send_challenge_ack delete useless param `skb` net/qla3xxx: Remove useless DMA-32 fallback configuration rocker: Remove useless DMA-32 fallback configuration hinic: Remove useless DMA-32 fallback configuration lan743x: Remove useless DMA-32 fallback configuration net: enetc: Remove useless DMA-32 fallback configuration cxgb4vf: Remove useless DMA-32 fallback configuration cxgb4: Remove useless DMA-32 fallback configuration cxgb3: Remove useless DMA-32 fallback configuration bnx2x: Remove useless DMA-32 fallback configuration et131x: Remove useless DMA-32 fallback configuration be2net: Remove useless DMA-32 fallback configuration vmxnet3: Remove useless DMA-32 fallback configuration bna: Simplify DMA setting net: alteon: Simplify DMA setting myri10ge: Simplify DMA setting qlcnic: Simplify DMA setting net: allwinner: Fix print format page_pool: remove spinlock in page_pool_refill_alloc_cache() amt: fix wrong return type of amt_send_membership_update() ...
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-12-029-52/+59
| |\ | | | | | | | | | Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * \ Merge branch 'skbuff-struct-group'David S. Miller2021-11-221-3/+1
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kees Cook says: ==================== skbuff: Switch structure bounds to struct_group() This is a pair of patches to add struct_group() to struct sk_buff. The first is needed to work around sparse-specific complaints, and is new for v2. The second patch is the same as originally sent as v1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | skbuff: Switch structure bounds to struct_group()Kees Cook2021-11-221-3/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Replace the existing empty member position markers "headers_start" and "headers_end" with a struct_group(). This will allow memcpy() and sizeof() to more easily reason about sizes, and improve readability. "pahole" shows no size nor member offset changes to struct sk_buff. "objdump -d" shows no object code changes (outside of WARNs affected by source line number changes). Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> # drivers/net/wireguard/* Link: https://lore.kernel.org/lkml/20210728035006.GD35706@embeddedor Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge drm/drm-next into drm-misc-next-fixesThomas Zimmermann2021-12-179-52/+59
|\ \ \ | |/ / |/| / | |/ | | | | | | Backmerging to bring drm-misc-next-fixes up to the latest state for the current release cycle. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
| * Merge branch 'wireguard-siphash-patches-for-5-16-rc6'Jakub Kicinski2021-11-299-52/+59
|/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason A. Donenfeld says: ==================== wireguard/siphash patches for 5.16-rc Here's quite a largeish set of stable patches I've had queued up and testing for a number of months now: - Patch (1) squelches a sparse warning by fixing an annotation. - Patches (2), (3), and (5) are minor improvements and fixes to the test suite. - Patch (4) is part of a tree-wide cleanup to have module-specific init and exit functions. - Patch (6) fixes a an issue with dangling dst references, by having a function to release references immediately rather than deferring, and adds an associated test case to prevent this from regressing. - Patches (7) and (8) help mitigate somewhat a potential DoS on the ingress path due to the use of skb_list's locking hitting contention on multiple cores by switching to using a ring buffer and dropping packets on contention rather than locking up another core spinning. - Patch (9) switches kvzalloc to kvcalloc for better form. - Patch (10) fixes alignment traps in siphash with clang-13 (and maybe other compilers) on armv6, by switching to using the unaligned functions by default instead of the aligned functions by default. ==================== Link: https://lore.kernel.org/r/20211129153929.3457-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: ratelimiter: use kvcalloc() instead of kvzalloc()Gustavo A. R. Silva2021-11-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Use 2-factor argument form kvcalloc() instead of kvzalloc(). Link: https://github.com/KSPP/linux/issues/162 Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> [Jason: Gustavo's link above is for KSPP, but this isn't actually a security fix, as table_size is bounded to 8192 anyway, and gcc realizes this, so the codegen comes out to be about the same.] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: receive: drop handshakes if queue lock is contendedJason A. Donenfeld2021-11-291-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If we're being delivered packets from multiple CPUs so quickly that the ring lock is contended for CPU tries, then it's safe to assume that the queue is near capacity anyway, so just drop the packet rather than spinning. This helps deal with multicore DoS that can interfere with data path performance. It _still_ does not completely fix the issue, but it again chips away at it. Reported-by: Streun Fabio <fstreun@student.ethz.ch> Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: receive: use ring buffer for incoming handshakesJason A. Donenfeld2021-11-295-43/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apparently the spinlock on incoming_handshake's skb_queue is highly contended, and a torrent of handshake or cookie packets can bring the data plane to its knees, simply by virtue of enqueueing the handshake packets to be processed asynchronously. So, we try switching this to a ring buffer to hopefully have less lock contention. This alleviates the problem somewhat, though it still isn't perfect, so future patches will have to improve this further. However, it at least doesn't completely diminish the data plane. Reported-by: Streun Fabio <fstreun@student.ethz.ch> Reported-by: Joel Wanner <joel.wanner@inf.ethz.ch> Fixes: a8f1bc7bdea3 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: device: reset peer src endpoint when netns exitsJason A. Donenfeld2021-11-292-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each peer's endpoint contains a dst_cache entry that takes a reference to another netdev. When the containing namespace exits, we take down the socket and prevent future sockets from being created (by setting creating_net to NULL), which removes that potential reference on the netns. However, it doesn't release references to the netns that a netdev cached in dst_cache might be taking, so the netns still might fail to exit. Since the socket is gimped anyway, we can simply clear all the dst_caches (by way of clearing the endpoint src), which will release all references. However, the current dst_cache_reset function only releases those references lazily. But it turns out that all of our usages of wg_socket_clear_peer_endpoint_src are called from contexts that are not exactly high-speed or bottle-necked. For example, when there's connection difficulty, or when userspace is reconfiguring the interface. And in particular for this patch, when the netns is exiting. So for those cases, it makes more sense to call dst_release immediately. For that, we add a small helper function to dst_cache. This patch also adds a test to netns.sh from Hangbin Liu to ensure this doesn't regress. Tested-by: Hangbin Liu <liuhangbin@gmail.com> Reported-by: Xiumei Mu <xmu@redhat.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: Paolo Abeni <pabeni@redhat.com> Fixes: be0d977e56c7 ("wireguard: device: avoid circular netns references") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: main: rename 'mod_init' & 'mod_exit' functions to be module-specificRandy Dunlap2021-11-291-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename module_init & module_exit functions that are named "mod_init" and "mod_exit" so that they are unique in both the System.map file and in initcall_debug output instead of showing up as almost anonymous "mod_init". This is helpful for debugging and in determining how long certain module_init calls take to execute. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * wireguard: allowedips: add missing __rcu annotation to satisfy sparseJason A. Donenfeld2021-11-291-1/+1
|/ | | | | | | | | A __rcu annotation got lost during refactoring, which caused sparse to become enraged. Fixes: aa6fe5251561 ("wireguard: allowedips: free empty intermediate nodes when removing single node") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>