path: root/device/device.go
* conn: make binds replaceable
  Jason A. Donenfeld | 2021-02-23 | 1 file | -10/+3
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: return error from Up() and Down()
  Jason A. Donenfeld | 2021-02-10 | 1 file | -13/+19
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: handshake routine writes into encryption queue
  Jason A. Donenfeld | 2021-02-09 | 1 file | -0/+1
  Since RoutineHandshake calls peer.SendKeepalive(), it potentially is a
  writer into the encryption queue, so we need to bump the wg count.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: make RoutineReadFromTUN keep encryption queue alive
  Josh Bleecher Snyder | 2021-02-09 | 1 file | -1/+2
  RoutineReadFromTUN can trigger a call to SendStagedPackets.
  SendStagedPackets attempts to protect against sending on the encryption
  queue by checking peer.isRunning and device.isClosed. However, those
  checks are subject to TOCTOU bugs. If that happens, we get this:

      goroutine 1254 [running]:
      golang.zx2c4.com/wireguard/device.(*Peer).SendStagedPackets(0xc000798300)
          .../wireguard-go/device/send.go:321 +0x125
      golang.zx2c4.com/wireguard/device.(*Device).RoutineReadFromTUN(0xc000014780)
          .../wireguard-go/device/send.go:271 +0x21c
      created by golang.zx2c4.com/wireguard/device.NewDevice
          .../wireguard-go/device/device.go:315 +0x298

  Fix this with a simple, big hammer: keep the encryption queue alive as
  long as it might be written to.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
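  A minimal Go sketch of the "keep the queue alive while it might be
  written to" pattern (names assumed here, not the actual wireguard-go
  code): every goroutine that might send counts itself on a WaitGroup,
  and the channel is closed only once the count drops to zero, so no
  send can ever race the close.

      package main

      import "sync"

      type packet struct{ data []byte }

      type outboundQueue struct {
          c  chan *packet
          wg sync.WaitGroup // counts goroutines that may still write to c
      }

      func newOutboundQueue() *outboundQueue {
          q := &outboundQueue{c: make(chan *packet, 1024)}
          q.wg.Add(1) // reference held by the device itself
          go func() {
              q.wg.Wait() // close only after every writer has called Done
              close(q.c)
          }()
          return q
      }

      func main() {
          q := newOutboundQueue()
          q.wg.Add(1) // a writer registers itself before it may send
          go func() {
              defer q.wg.Done()
              q.c <- &packet{data: []byte("hello")}
          }()
          q.wg.Done() // device shutdown: drop the device's own reference
          for p := range q.c { // drain until close; nothing is stranded
              _ = p
          }
      }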
* device: clarify device.state.state docs (again)
  Josh Bleecher Snyder | 2021-02-09 | 1 file | -2/+4
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: rename unsafeRemovePeer to removePeerLocked
  Jason A. Donenfeld | 2021-02-09 | 1 file | -9/+5
  This matches the new naming scheme of upLocked and downLocked.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: remove deviceStateNew
  Jason A. Donenfeld | 2021-02-09 | 1 file | -8/+6
  It's never used and we won't have a use for it. Also, run stringer via
  go run, for those without GOPATHs.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: fix comment typo and shorten state.mu.Lock to state.Lock
  Jason A. Donenfeld | 2021-02-09 | 1 file | -8/+7
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: fix typo in comment
  Jason A. Donenfeld | 2021-02-09 | 1 file | -1/+1
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: fix alignment on 32-bit machines and test for it
  Jason A. Donenfeld | 2021-02-09 | 1 file | -6/+1
  The test previously checked the offset within a substruct, not the
  offset within the allocated struct, so this adds the two together. It
  then fixes an alignment crash on 32-bit machines.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
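  The corrected computation can be illustrated with a small test
  (hypothetical field names; the real test checks wireguard-go's own
  structs): unsafe.Offsetof on a nested selector yields the offset
  within the inner struct only, so the substruct's own offset must be
  added before asserting the 8-byte alignment that sync/atomic's 64-bit
  operations require on 32-bit platforms.

      package main

      import (
          "fmt"
          "unsafe"
      )

      type timersState struct {
          flag     uint32
          lastNano int64 // accessed with atomic.LoadInt64/StoreInt64
      }

      type peer struct {
          pad    uint32
          timers timersState
      }

      func main() {
          var p peer
          // unsafe.Offsetof(p.timers.lastNano) alone is the offset within
          // timersState; add the substruct's offset to get the offset
          // within the allocated struct.
          off := unsafe.Offsetof(p.timers) + unsafe.Offsetof(p.timers.lastNano)
          if off%8 != 0 {
              fmt.Printf("lastNano misaligned at offset %d\n", off)
          } else {
              fmt.Println("lastNano is 8-byte aligned")
          }
      }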
* device: do not log on idempotent device state change
  Jason A. Donenfeld | 2021-02-09 | 1 file | -1/+0
  Part of being actually idempotent is that we shouldn't penalize code
  that takes advantage of this property with a log splat.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: create channels.go
  Josh Bleecher Snyder | 2021-02-08 | 1 file | -61/+0
  We have a bunch of stupid channel tricks, and I'm about to add more.
  Give them their own file. This commit is 100% code movement.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: remove device.state.stopping from RoutineTUNEventReader
  Josh Bleecher Snyder | 2021-02-08 | 1 file | -1/+1
  The TUN event reader does three things: change MTU, device up, and
  device down. Changing the MTU after the device is closed does no harm.
  Device up and device down don't make sense after the device is closed,
  but we can check that condition before proceeding with changeState.
  There's thus no reason to block device.Close on RoutineTUNEventReader
  exiting.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: overhaul device state management
  Josh Bleecher Snyder | 2021-02-08 | 1 file | -128/+151
  This commit simplifies device state management. It creates a single
  unified state variable and documents its semantics.

  It also makes state changes more atomic. As an example of the sort of
  bug that occurred due to non-atomic state changes, the following
  sequence of events used to occur approximately every 2.5 million test
  runs:

  - RoutineTUNEventReader received an EventDown event.
  - It called device.Down, which called device.setUpDown.
  - That set device.state.changing, but did not yet attempt to lock
    device.state.Mutex.
  - Test completion called device.Close.
  - device.Close locked device.state.Mutex.
  - device.Close blocked on a call to device.state.stopping.Wait.
  - device.setUpDown then attempted to lock device.state.Mutex and
    blocked.

  Deadlock results. setUpDown cannot progress because device.state.Mutex
  is locked. Until setUpDown returns, RoutineTUNEventReader cannot call
  device.state.stopping.Done. Until device.state.stopping.Done gets
  called, device.state.stopping.Wait is blocked. As long as
  device.state.stopping.Wait is blocked, device.state.Mutex cannot be
  unlocked. This commit fixes that deadlock by holding device.state.mu
  when checking that the device is not closed.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
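  The fix amounts to performing the closed-check under the same mutex
  that Close takes. A minimal sketch (assumed names, heavily reduced
  from the real state machine):

      package main

      import (
          "errors"
          "sync"
      )

      type device struct {
          mu     sync.Mutex
          closed bool
          up     bool
      }

      func (d *device) changeState(up bool) error {
          d.mu.Lock()
          defer d.mu.Unlock()
          if d.closed { // checked under d.mu, so Close cannot interleave
              return errors.New("device is closed")
          }
          d.up = up
          return nil
      }

      func (d *device) Close() {
          d.mu.Lock()
          defer d.mu.Unlock()
          d.closed = true
      }

      func main() {
          d := new(device)
          _ = d.changeState(true)
          d.Close()
          if err := d.changeState(true); err != nil {
              println(err.Error()) // state change after Close is refused
          }
      }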
* device: remove device.state.stopping from RoutineHandshake
  Josh Bleecher Snyder | 2021-02-08 | 1 file | -1/+0
  It is no longer necessary.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: remove device.state.stopping from RoutineDecryption
  Josh Bleecher Snyder | 2021-02-08 | 1 file | -1/+1
  It is no longer necessary, as of 454de6f3e64abd2a7bf9201579cd92eea5280996
  (device: use channel close to shut down and drain decryption channel).
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: tie encryption queue lifetime to the peers that write to it
  Josh Bleecher Snyder | 2021-02-03 | 1 file | -2/+4
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: use a waiting sync.Pool instead of a channel
  Jason A. Donenfeld | 2021-02-02 | 1 file | -6/+3
  Channels are FIFO, which means we have guaranteed cache misses.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
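  The trade-off in miniature (a sketch under assumed sizes, not the
  actual element type): a sync.Pool tends to return a recently freed
  buffer that is still warm in cache, whereas a buffered channel of free
  buffers is strictly FIFO and always hands back the coldest one.

      package main

      import "sync"

      const elemSize = 2048 // hypothetical buffer size

      var elemPool = sync.Pool{
          New: func() interface{} { return new([elemSize]byte) },
      }

      func main() {
          buf := elemPool.Get().(*[elemSize]byte)
          // ... fill and transmit the buffer ...
          elemPool.Put(buf) // the next Get will likely reuse this warm buffer
      }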
* device: use int64 instead of atomic.Value for time stamp
  Jason A. Donenfeld | 2021-01-29 | 1 file | -13/+3
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
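  A minimal sketch of the replacement (assumed names): the timestamp is
  kept as unix nanoseconds in an int64 and accessed with sync/atomic,
  instead of boxing a time.Time in an atomic.Value, which costs an
  allocation and an interface indirection on every store.

      package main

      import (
          "sync/atomic"
          "time"
      )

      type handshake struct {
          lastSentNano int64 // unix nanoseconds; sync/atomic access only
      }

      func (h *handshake) markSent() {
          atomic.StoreInt64(&h.lastSentNano, time.Now().UnixNano())
      }

      func (h *handshake) lastSent() time.Time {
          return time.Unix(0, atomic.LoadInt64(&h.lastSentNano))
      }

      func main() {
          var h handshake
          h.markSent()
          _ = h.lastSent()
      }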
* device: use new model queues for handshakes
  Jason A. Donenfeld | 2021-01-29 | 1 file | -27/+28
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: simplify peer queue locking
  Jason A. Donenfeld | 2021-01-29 | 1 file | -14/+14
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* global: bump copyright
  Jason A. Donenfeld | 2021-01-28 | 1 file | -1/+1
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: do not allow get to run while set runs
  Jason A. Donenfeld | 2021-01-28 | 1 file | -1/+2
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: avoid deadlock when changing private key and removing self peers
  Jason A. Donenfeld | 2021-01-27 | 1 file | -0/+2
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: use linked list for per-peer allowed-ip traversal
  Jason A. Donenfeld | 2021-01-27 | 1 file | -1/+0
  This makes the IpcGet method much faster.

  We also refactor the traversal API to use a callback so that we don't
  need to allocate at all. To avoid allocations, we do self-masking on
  insertion, which in turn means that split intermediate nodes require a
  copy of the bits.

      benchmark              old ns/op   new ns/op   delta
      BenchmarkUAPIGet-16    3243        2659        -18.01%

      benchmark              old allocs  new allocs  delta
      BenchmarkUAPIGet-16    35          30          -14.29%

      benchmark              old bytes   new bytes   delta
      BenchmarkUAPIGet-16    1218        737         -39.49%

  This benchmark is good, though it's only for a pair of peers, each
  with only one allowed IP. As this grows, the delta expands
  considerably.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
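  The callback-style traversal can be sketched as follows (hypothetical
  node layout and names; the real trie also handles child-bit selection
  and locking): instead of returning a freshly allocated slice of
  entries, the walk hands each (ip, cidr) pair to a caller-supplied
  function, and returning false stops the walk early.

      package main

      import "fmt"

      type node struct {
          child [2]*node
          ip    []byte // non-nil only where an allowed IP terminates
          cidr  uint8
      }

      // forEach visits every allowed IP in the subtree rooted at n,
      // allocating nothing per entry.
      func (n *node) forEach(fn func(ip []byte, cidr uint8) bool) bool {
          if n == nil {
              return true
          }
          if n.ip != nil && !fn(n.ip, n.cidr) {
              return false
          }
          return n.child[0].forEach(fn) && n.child[1].forEach(fn)
      }

      func main() {
          root := &node{ip: []byte{10, 0, 0, 0}, cidr: 8}
          root.forEach(func(ip []byte, cidr uint8) bool {
              fmt.Printf("%d.%d.%d.%d/%d\n", ip[0], ip[1], ip[2], ip[3], cidr)
              return true
          })
      }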
* device: combine debug and info log levels into 'verbose'
  Jason A. Donenfeld | 2021-01-26 | 1 file | -5/+5
  There are very few cases, if any, in which a user only wants one of
  these levels, so combine it into a single level.

  While we're at it, reduce indirection on the loggers by using an empty
  function rather than a nil function pointer. It's not like we have
  retpolines anyway, and we were always calling through a function with
  a branch prior, so this seems like a net gain.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: change logging interface to use functions
  Josh Bleecher Snyder | 2021-01-26 | 1 file | -5/+5
  This commit overhauls wireguard-go's logging.

  The primary, motivating change is to use a function instead of a
  *log.Logger as the basic unit of logging. Using functions provides a
  lot more flexibility for people to bring their own logging system.

  It also introduces logging helper methods on Device. These reduce line
  noise at the call site. They also allow for log functions to be nil;
  when nil, instead of generating a log line and throwing it away, we
  don't bother generating it at all. This spares allocation and
  pointless work.

  This is a breaking change, although the fix required of clients is
  fairly straightforward.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
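  Taken together, the two logging commits above suggest an interface
  along these lines (a sketch with assumed field names): logging is a
  pair of printf-style functions the caller supplies, and a do-nothing
  function discards a level without any formatting work.

      package main

      import "log"

      type Logger struct {
          Verbosef func(format string, args ...interface{})
          Errorf   func(format string, args ...interface{})
      }

      // DiscardLogf drops the line before any formatting happens.
      func DiscardLogf(format string, args ...interface{}) {}

      func main() {
          quiet := &Logger{Verbosef: DiscardLogf, Errorf: log.Printf}
          quiet.Verbosef("dropped with no formatting cost: %d", 42)
          quiet.Errorf("this one reaches the log: %v", "boom")
      }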
* device: serialize access to IpcSetOperation
  Josh Bleecher Snyder | 2021-01-25 | 1 file | -0/+1
  Interleaved IpcSetOperations would spell trouble.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: remove unnecessary zeroing
  Josh Bleecher Snyder | 2021-01-20 | 1 file | -5/+0
  Newly allocated objects are already zeroed.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: put handshake buffer in pool in FlushPacketQueues
  Josh Bleecher Snyder | 2021-01-20 | 1 file | -1/+2
  This appears to have been an oversight.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: use channel close to shut down and drain decryption channel
  Josh Bleecher Snyder | 2021-01-20 | 1 file | -12/+25
  This is similar to commit e1fa1cc5560020e67d33aa7e74674853671cf0a0, but
  for the decryption channel. It is an alternative fix to
  f9f655567930a4cd78d40fa4ba0d58503335ae6a.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
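  The close-to-drain pattern in miniature (assumed names): the worker
  ranges over its queue, so closing the channel is simultaneously the
  stop signal and the guarantee that every queued element is consumed
  before the worker exits.

      package main

      import "fmt"

      type element struct{ payload string }

      func routineDecryption(queue <-chan *element, done chan<- struct{}) {
          defer close(done)
          for elem := range queue { // exits only when queue is closed and empty
              fmt.Println("decrypting", elem.payload)
          }
      }

      func main() {
          queue := make(chan *element, 8)
          done := make(chan struct{})
          go routineDecryption(queue, done)
          queue <- &element{"pkt1"}
          queue <- &element{"pkt2"}
          close(queue) // shut down; nothing is stranded in the queue
          <-done
      }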
* device: receive: drain decryption queue before exiting RoutineDecryption
  Jason A. Donenfeld | 2021-01-07 | 1 file | -1/+4
  It's possible for RoutineSequentialReceiver to try to lock an elem
  after RoutineDecryption has exited. Previously this meant we never
  unlocked the elem, so the whole program deadlocked. As well, it looks
  like the flush code (which is now potentially unnecessary?) wasn't
  properly dropping the buffers for the not-already-dropped case.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* all: use ++ to increment
  Josh Bleecher Snyder | 2021-01-07 | 1 file | -1/+1
  Make the code slightly more idiomatic. No functional changes.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: add missing colon to error line
  Jason A. Donenfeld | 2021-01-07 | 1 file | -1/+1
  People are actually hitting this condition, so make it uniform. Also,
  change a printf into a println, to match the other conventions.
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: fix data race in peer.timersActive
  Josh Bleecher Snyder | 2021-01-07 | 1 file | -2/+4
  Found by the race detector and existing tests. To avoid introducing a
  lock into this hot path, calculate and cache whether any peers exist.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
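  A minimal sketch of the caching approach (assumed names): the hot path
  needs only "does any peer exist?", so that one bit is cached in an
  atomic, updated under the peer map's lock, and read lock-free.

      package main

      import (
          "sync"
          "sync/atomic"
      )

      type device struct {
          peersMu   sync.Mutex
          peers     map[string]struct{}
          havePeers int32 // 1 if peers is non-empty; read atomically
      }

      func (d *device) addPeer(key string) {
          d.peersMu.Lock()
          defer d.peersMu.Unlock()
          if d.peers == nil {
              d.peers = make(map[string]struct{})
          }
          d.peers[key] = struct{}{}
          atomic.StoreInt32(&d.havePeers, 1)
      }

      func (d *device) anyPeers() bool {
          // hot path: a single atomic load, no lock acquisition
          return atomic.LoadInt32(&d.havePeers) == 1
      }

      func main() {
          d := new(device)
          _ = d.anyPeers() // false
          d.addPeer("peer-a")
          _ = d.anyPeers() // true
      }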
* device: fix persistent_keepalive_interval data races
  Josh Bleecher Snyder | 2021-01-07 | 1 file | -1/+1
  Co-authored-by: David Anderson <danderson@tailscale.com>
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: use channel close to shut down and drain encryption channel
  Josh Bleecher Snyder | 2021-01-07 | 1 file | -7/+32
  The new test introduced in this commit used to deadlock about 1% of
  the time. I believe that the deadlock occurs as follows:

  - The test completes, calling device.Close.
  - device.Close closes device.signals.stop.
  - RoutineEncryption stops.
  - The deferred function in RoutineEncryption drains
    device.queue.encryption.
  - RoutineEncryption exits.
  - A peer's RoutineNonce processes an element queued in
    peer.queue.nonce.
  - RoutineNonce puts that element into the outbound and encryption
    queues.
  - RoutineSequentialSender reads that element from the outbound queue.
  - It waits for that element to get unlocked by RoutineEncryption.
  - RoutineEncryption has already exited, so RoutineSequentialSender
    blocks forever.
  - device.RemoveAllPeers calls peer.Stop on all peers.
  - peer.Stop waits for peer.routines.stopping, which blocks forever.

  Rather than attempt to add even more ordering to the already complex
  centralized shutdown orchestration, this commit moves towards a
  data-flow-oriented shutdown. The device.queue.encryption gets closed
  when there will be no more writes to it. All device.queue.encryption
  readers always read until the channel is closed and then exit. We thus
  guarantee that any element that enters the encryption queue also exits
  it. This removes the need for central control of the lifetime of
  RoutineEncryption, removes the need to drain the encryption queue on
  shutdown, and simplifies RoutineEncryption.

  This commit also fixes a data race. When RoutineSequentialSender
  drains its queue on shutdown, it needs to lock the elem before
  operating on it, just as the main body does.

  The new test in this commit passed 50k iterations with the race
  detector enabled and 150k iterations with the race detector disabled,
  with no failures.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: remove starting waitgroups
  Josh Bleecher Snyder | 2021-01-07 | 1 file | -11/+0
  In each case, the starting waitgroup did nothing but ensure that the
  goroutine has launched. Nothing downstream depends on the order in
  which goroutines launch, and if the Go runtime scheduler is so broken
  that goroutines don't get launched reasonably promptly, we have much
  deeper problems. Given all that, simplify the code.

  Passed a race-enabled stress test 25,000 times without failure.
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: wait for routines to stop before removing peers
  Dmytro Shynkevych | 2020-07-04 | 1 file | -1/+1
  Peers are currently removed after Device's goroutines are signaled to
  stop, but without waiting for them to actually do so, which is racy.

  For example, RoutineHandshake may be in Peer.SendKeepalive when the
  corresponding peer is removed, which closes its nonce channel. This
  causes a send on a closed channel, as observed in
  tailscale/tailscale#487.

  This patch seems to be the correct synchronizing action: Peer's
  goroutines are receivers and handle channel closure gracefully, so
  Device's goroutines are the ones that should be fully stopped first.
  Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
* device: export Bind and remove socketfd shims for android
  David Crawshaw | 2020-06-22 | 1 file | -0/+6
  Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
* global: update header comments and modules
  Jason A. Donenfeld | 2020-05-02 | 1 file | -1/+1
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* conn: introduce new package that splits out the Bind and Endpoint types
  David Crawshaw | 2020-05-02 | 1 file | -10/+136
  The sticky socket code stays in the device package for now, as it
  reaches deeply into the peer list.

  This is the first step in an effort to split some code out of the very
  busy device package.
  Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
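  The shape of the split, as a sketch (these method sets are
  hypothetical; the real interfaces live in
  golang.zx2c4.com/wireguard/conn): Bind abstracts how packets reach the
  network, Endpoint abstracts a peer's remote address, and the device
  package programs against both instead of hard-coding UDP sockets.

      package conn

      // Endpoint represents a peer's remote address.
      type Endpoint interface {
          DstToString() string // destination address, e.g. for UAPI output
          ClearSrc()           // forget cached source/sticky-socket state
      }

      // Bind sends and receives packets on behalf of the device.
      type Bind interface {
          Send(data []byte, ep Endpoint) error
          SetMark(mark uint32) error
          Close() error
      }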
* noise: unify zero checking of ecdh
  Jason A. Donenfeld | 2020-03-17 | 1 file | -3/+0
* device: fix private key removal logic
  Jason A. Donenfeld | 2020-02-04 | 1 file | -13/+4
* device: drop lock before expiring keys
  Jason A. Donenfeld | 2019-08-05 | 1 file | -4/+11
* device: immediately rekey all peers after changing device private key
  Jason A. Donenfeld | 2019-07-11 | 1 file | -0/+6
  Reported-by: Derrick Pallas <derrick@pallas.us>
* tun: remove TUN prefix from types to reduce stutter elsewhere
  Matt Layher | 2019-06-14 | 1 file | -3/+2
  Signed-off-by: Matt Layher <mdlayher@gmail.com>
* device: add SendKeepalivesToPeersWithCurrentKeypair for handover
  Jason A. Donenfeld | 2019-05-30 | 1 file | -0/+17
* device: fail to give bind if it doesn't exist
  Jason A. Donenfeld | 2019-05-17 | 1 file | -0/+1
* global: regroup all imports
  Jason A. Donenfeld | 2019-05-14 | 1 file | -2/+3