After >year, WireGuard stops sharing usable data with clients
Hi folks. Unlike the case in https://www.linode.com/community/questions/21946/marketplace-wireguard-install-not-working, I manually installed wireguard, unbound & pi-hole on my Ubuntu 20.10 Linode a year ago, and have likewise manually upgraded it all the way to 21.10. Yes, version compatibility red flags, which I will address below.
That said, using it as a WireGuard server which serves DNS and ad-blocking to its clients has worked largely without issue for more than a solid year (apart from the occasional need to reboot after updates, and at one point adding a .conf file to /etc/dnsmasq.d containing interface=wg0, then removing it again after later updates). It's never been down more than a couple of hours at a time, for (always successful) troubleshooting. Until yesterday.
I updated, and the watchdog told me that no new kernel modules or services needed restarting, so I just exited & rebooted once I got to a point where I could use the Linode Cloud Manager and/or LISH console (and therefore not deal with, say, ssh needing to restart). After this reboot, only nominal amounts of data passed between the wg0 server and client: say, 92B, then 188B, then slowly growing amounts received by the client, while it sent several KB that went unanswered. Both the client in its interface and the server via sudo wg show acknowledged these amounts, which eventually got to the hundreds of KB, nowhere near the MBs or GBs of a usable connection. Accordingly, clients had no connection to web pages or any other usable network services. This has happened before, always with a resolution. So I went through my troubleshooting: rebooted from within the server with sudo reboot, rebooted from the Linode Cloud Manager, etc. I added and removed the interface=wg0 .conf file in /etc/dnsmasq.d over subsequent reboots. Nothing.
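For anyone comparing counters the same way, here is a minimal sketch of reading them per peer rather than eyeballing sudo wg show. It relies on wg show wg0 transfer, which prints one tab-separated line per peer (public key, bytes received, bytes sent). The function name flag_stalled_peers and the 10x threshold are assumptions made up for illustration, not anything WireGuard defines; the heuristic only flags the pattern described above, where the server receives far more from a client than it sends back.

```shell
#!/bin/sh
# Hypothetical helper (name and threshold are illustrative assumptions).
# Reads `wg show <iface> transfer` output on stdin: one line per peer with
# public key, bytes received, bytes sent, separated by tabs.
# Prints a short key prefix, both counters, and "STALLED?" when the server
# has received more than 10x what it has sent back to that peer.
flag_stalled_peers() {
    awk -F '\t' '
        NF == 3 {
            rx = $2; tx = $3
            status = (rx > 0 && tx * 10 < rx) ? "STALLED?" : "ok"
            printf "%s rx=%s tx=%s %s\n", substr($1, 1, 8), rx, tx, status
        }'
}

# Usage on the server (needs root):
#   wg show wg0 transfer | flag_stalled_peers
```

A peer that shows "ok" here can still be broken in other ways (DNS, for instance); this only makes the lopsided-counter symptom easier to spot across many peers.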
Since pihole's been nagging me for months that it isn't supported on anything newer than Ubuntu 20.04 LTS, and since other software packages may develop incompatibilities when installed on a non-LTS system (and since I have a physical server running the same services, plus some others, solidly on 20.04 LTS), I spun up a new 20.04 LTS Linode and set about installing wireguard, unbound & pihole (all manually via apt or pihole's own install script, none of it via MarketPlace).
Configuration for all three of these servers (old 21.10 Linode, new 20.04 LTS Linode, and physical 20.04 LTS server) is as follows (minus, of course, the individual keys and stuff I have to redact):
On both Linodes, I've made sure to edit sysctl.conf so that net.ipv4.ip_forward=1, and ufw is set up as follows:
Logging: on (low)
Default: deny (incoming), allow (outgoing), deny (routed)
New profiles: skip
To                          Action      From
--                          ------      ----
[wg0 ListenPort]/udp        ALLOW IN    Anywhere
53 on wg0                   ALLOW IN    10.[x.y].0/24
[ssh]/tcp                   ALLOW IN    Anywhere
80/tcp on wg0               ALLOW IN    10.[x.y].0/24
[wg0 ListenPort]/udp (v6)   ALLOW IN    Anywhere (v6)
[ssh]/tcp (v6)              ALLOW IN    Anywhere (v6)
Ports or sections of IPv4 addresses in brackets are those I've set to my own values (and don't worry, they're set properly in sshd_config and wg0.conf). PiHole/unbound are also set internally, within unbound's pi-hole.conf, to a different DNS port, which is then translated to the port 53 that clients expect for their queries; again, this system has worked on the 21.10 Linode and my 20.04 physical home server for more than a year.
Port 80 is open for wg0 clients so that they can log into the (insecure but internal) pihole web interface, which nobody else should be looking at anyway.
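One detail worth ruling out with a setup like the above: editing sysctl.conf only changes the running kernel after sysctl -p or a reboot, so it can help to check the live value rather than the file. Below is a minimal sketch, assuming the standard Linux location /proc/sys/net/ipv4/ip_forward; the function name ipv4_forwarding_enabled is hypothetical, and the optional path argument exists only so the check can be exercised against a test file.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the forwarding flag file contains "1".
# With no argument it reads the live kernel value from the standard proc path.
ipv4_forwarding_enabled() {
    flag_file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}

# Usage:
#   if ipv4_forwarding_enabled; then echo "forwarding on"; else echo "forwarding OFF"; fi
```

This says nothing about ufw's "deny (routed)" default or the iptables rules; it only confirms the kernel is willing to forward IPv4 packets at all.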
wg0.conf is set up on the server like this (with redactions & only one peer listed, showing the format used for all peers). You'll notice the # comments are copied/pasted in, which saved me some typing while retaining a functional reference for each line:
# Server configuration
[Interface]
PrivateKey = [notshared] # The server_private.key value.
Address = 10.[x.y].1/24 # Internal IP address of the VPN server.
ListenPort = [notshared] # Previously, we opened this port to listen for incoming connections to the firewall.
# Change "enp0s5" to the name of your network interface in the following two settings. This command configures iptables for WireGuard.
PostUp = iptables -A FORWARD -i wg0 -j ACCEPT; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
PostDown = iptables -D FORWARD -i wg0 -j ACCEPT; iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
# Configurations for the clients. You need to add a [Peer] section for each VPN client.
[Peer]
PublicKey = [notshared] # [client1's] client_public.key value.
AllowedIPs = 10.[x.y].2/32 # Internal address of the VPN client.
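Because the PostUp/PostDown lines hard-code eth0 as the outbound interface, one quick sanity check after a release upgrade is to confirm that the default route still leaves via that name. A minimal sketch follows, assuming the usual iproute2 output format ("default via ... dev NAME ..."); the function name default_route_iface is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip route show default` output on stdin and
# prints the interface name following the "dev" keyword on the first
# default route it finds. Prints nothing if there is no default route.
default_route_iface() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") { print $(i + 1); exit }
    }'
}

# Usage:
#   ip -4 route show default | default_route_iface
```

If the printed name differs from the one in the MASQUERADE rule, traffic from wg0 clients would be forwarded without NAT, which could look like a tunnel that handshakes but carries no useful traffic. That is only one possible cause, not a diagnosis of this case.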
landscape-sysinfo, both on login and when manually invoked, does show a wg0 interface, and ip a show wg0 lists it as expected. Again, sudo wg show and the client's WireGuard interface both show data going back and forth between the server and client peers, just in very small amounts, with no webpages/other useful internet traffic coming through.
I reached out to Linode support, who, on my suspicion that WireGuard might be generally, temporarily & intermittently having trouble sending data through their networks, set up an Ubuntu 20.10 server with wireguard, unbound & pihole, and it worked for them, including after reboot. I didn't hear whether they deployed WireGuard from the MarketPlace or installed it manually.
Any thoughts? Anything anyone can see that I'm missing? It worked fine (with troubleshooting & maintenance) until yesterday, and then a new LTS server with the same setup wouldn't work either.
UPDATE: about 30 hours after the update & the later reboot that somehow broke WireGuard functionality, it seems to be working again on the original Linode. I'd changed nothing, from software to configuration files. I had just rebooted it one final time, left it alone for a while, and later found myself idly trying to connect a Linux client.
After all that time banging my head against a wall, it worked. So did an iOS client. Problem apparently solved (for now). Without getting paranoid about the level of detail I shared above, all I can think is that either Linode support found something external to my Linode to reset, or there was a more general system-wide flush of network configuration caches with a cron job set for a longer-than-a-day cycle.
In any event, I'm apparently back to functionality. I'll cautiously take it. If Linode support had a hand, thanks!