Why do I keep losing connection to my Linode after 12 hours?

It's a bit weird, and it happens every time. After about 12 hours (I am not sure of the exact interval), I can no longer connect to my Linode.
SSH does not work, even though the SSH daemon is running (I checked from Weblish). Everything I host on the Linode is reported as unreachable. After a reboot, everything goes back to normal.
I've been rebooting manually ever since, but it gets tiresome.

networkctl
IDX LINK            TYPE     OPERATIONAL SETUP
  1 lo              loopback carrier     unmanaged
  2 eth0            ether    routable    failed
  3 br-03900297e9ec bridge   routable    unmanaged
  4 br-40f9222d63fd bridge   no-carrier  unmanaged
  5 br-e0e4fc002b2e bridge   routable    unmanaged
  6 br-efc86c063fc3 bridge   no-carrier  unmanaged
  7 docker0         bridge   no-carrier  unmanaged
  8 br-33360848ce2c bridge   routable    unmanaged
  9 br-4b6abeade984 bridge   routable    unmanaged
 10 br-685ede2cdfc3 bridge   no-carrier  unmanaged
 11 br-7557cc666f1f bridge   routable    unmanaged
 12 br-90fd378f3161 bridge   no-carrier  unmanaged
 13 br-d89d19903abd bridge   no-carrier  unmanaged
 14 br-d8b8a84a8134 bridge   routable    unmanaged
 16 vethc4da231     ether    degraded    unmanaged
 18 veth6512b84     ether    carrier     unmanaged
 20 vetha52ba75     ether    degraded    unmanaged
 22 veth8370c55     ether    degraded    unmanaged
 24 veth6e506bf     ether    degraded    unmanaged
 26 veth75f75e1     ether    degraded    unmanaged
 28 veth7f8f6a3     ether    carrier     unmanaged
 30 vethcffefd8     ether    degraded    unmanaged
 32 veth3a156a9     ether    degraded    unmanaged
 34 veth71c06c8     ether    degraded    unmanaged
 36 vethb4ad7cf     ether    degraded    unmanaged
 38 vethe5fb405     ether    degraded    unmanaged
 40 veth4705ecc     ether    degraded    unmanaged
 42 vetheb4df8b     ether    degraded    unmanaged
 44 vethd5c2a38     ether    degraded    unmanaged
 46 veth187c19f     ether    degraded    unmanaged
 48 veth6c38738     ether    degraded    unmanaged
 50 veth759293c     ether    degraded    unmanaged
 52 vethd5fde81     ether    degraded    unmanaged
 54 veth4b3dd0b     ether    degraded    unmanaged
 58 vetha4e1a34     ether    degraded    unmanaged
 60 veth05bb6df     ether    degraded    unmanaged
 62 veth4d7f78e     ether    degraded    unmanaged
 64 veth332e97e     ether    degraded    unmanaged
 66 veth1c832b0     ether    degraded    unmanaged
 68 vethb63929f     ether    degraded    unmanaged
 70 vetha370484     ether    degraded    unmanaged
 72 vethd15710f     ether    degraded    unmanaged
 74 veth2f0411c     ether    degraded    unmanaged
 76 veth6a215c5     ether    degraded    unmanaged

> systemctl status systemd-networkd
● systemd-networkd.service - Network Service
     Loaded: loaded (/usr/lib/systemd/system/systemd-networkd.service; enabled;>
     Active: active (running) since Tue 2020-05-19 22:11:33 JST; 1 weeks 0 days>
TriggeredBy: ● systemd-networkd.socket
       Docs: man:systemd-networkd.service(8)
   Main PID: 263 (systemd-network)
     Status: "Processing requests..."
      Tasks: 1 (limit: 2367)
     Memory: 2.4M
     CGroup: /system.slice/systemd-networkd.service
             └─263 /usr/lib/systemd/systemd-networkd

May 27 14:37:34 [HOSTNAME] systemd-networkd[263]: rtnl: received neighbor for link '21826' we don't know about, ignoring.
May 27 14:37:34 [HOSTNAME] systemd-networkd[263]: rtnl: received neighbor for link '21826' we don't know about, ignoring.
May 27 14:38:34 [HOSTNAME] systemd-networkd[263]: veth1ef0e81: rtnl: received neighbor message with invalid family, ignoring.
May 27 14:38:34 [HOSTNAME] systemd-networkd[263]: veth1ef0e81: rtnl: received neighbor message with invalid family, ignoring.
May 27 14:38:34 [HOSTNAME] systemd-networkd[263]: veth1ef0e81: Link UP
May 27 14:38:34 [HOSTNAME] systemd-networkd[263]: veth1ef0e81: Gained carrier
May 27 14:38:35 [HOSTNAME] systemd-networkd[263]: veth1ef0e81: Lost carrier
May 27 14:38:35 [HOSTNAME] systemd-networkd[263]: veth1ef0e81: Link DOWN
May 27 14:38:35 [HOSTNAME] systemd-networkd[263]: rtnl: received neighbor for link '21826' we don't know about, ignoring.
May 27 14:38:35 [HOSTNAME] systemd-networkd[263]: rtnl: received neighbor for link '21826' we don't know about, ignoring.

12 Replies

Hey there,

It looks like this may be caused by udev and systemd-networkd both trying to control the naming scheme of your network interfaces.

Can you try running echo 'GRUB_CMDLINE_LINUX="net.ifnames=0"' >>/etc/default/grub and then rebooting your Linode? This should prevent udev from applying predictable network interface names, which should hopefully resolve this for you.

Let us know if this helps!

Regards,
Ryan L.
Linode Support Staff

Done. The system booted fine.
I will wait a day and check back (unfortunately, that is the only way to see whether it's working).

(Also, if anyone has the same problem and is reading this thread, make sure to run grub-mkconfig -o /boot/grub/grub.cfg afterwards, though that might be an Arch-only thing.)
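
After the reboot, one way to confirm that the parameter actually reached the running kernel is to look at /proc/cmdline. The following is a minimal sketch, not part of the original advice; the helper name has_ifnames_off is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): succeeds if the kernel
# command line passed as $1 contains net.ifnames=0 as a whole word.
has_ifnames_off() {
    case " $1 " in
        *" net.ifnames=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# The command line of the running kernel is exposed in /proc/cmdline.
if [ -r /proc/cmdline ]; then
    if has_ifnames_off "$(cat /proc/cmdline)"; then
        echo "net.ifnames=0 is active"
    else
        echo "net.ifnames=0 is not active; regenerate grub.cfg and reboot"
    fi
fi
```

If the parameter is missing, the most likely explanation is that grub.cfg was not regenerated after /etc/default/grub was edited. Note also that /etc/default/grub is read as a shell file, so an appended GRUB_CMDLINE_LINUX line replaces any earlier assignment of the same variable.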

So far about 18 hours have passed, and it's working!
Thank you @rl0nergan and the Linode team very, very much.

Uh oh. It's happening again.
The MTR test gives me the "network unreachable" error.
After rebooting, I confirmed that MTR works again, and the networkctl output is different:

2 eth0 ether routable configured

Does this information help?

Hi @rl0nergan, could you help me?

Hey @Zaen,

I'm taking another look into it now and trying to recreate it on my end. Hopefully I can find something that helps. I'll follow up with you when I have more information.

Ryan L.

This may be unrelated, but…

On another VPS provider (not Linode) I lost my IPv4 connection every 12 hours.

The VPS was configured to use DHCP which gave a lifetime of - you guessed it - 12 hours.

The networkctl output resembled yours:

IDX LINK TYPE     OPERATIONAL SETUP
  1 lo   loopback carrier     unmanaged
  2 eth0 ether    routable    failed

Because eth0 was 'failed', systemd didn't issue the renewal and the IP address lease expired.

In my case, the cause was a conflicting route that it received from the DHCP server. I had to tell netplan to ignore DHCP routes. That prevented the error when the IP was first obtained, so systemd saw that all was OK with the adapter and renewed the lease happily.

Linodes by default (I believe) don't use DHCP, so this likely doesn't apply, but I thought I'd share just in case.
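
The symptom described above (a link whose SETUP column reads "failed") can be checked from a script rather than by eye. This is only a sketch, assuming the five-column networkctl list layout shown in this thread; failed_links is a made-up name.

```shell
#!/bin/sh
# Hypothetical filter (name is illustrative): reads `networkctl` list
# output on stdin and prints the LINK name of every row whose SETUP
# column is "failed". Assumes the columns IDX LINK TYPE OPERATIONAL SETUP.
failed_links() {
    awk '$1 ~ /^[0-9]+$/ && $5 == "failed" { print $2 }'
}

# Run against the live system only when networkctl is available.
if command -v networkctl >/dev/null 2>&1; then
    networkctl --no-pager --no-legend | failed_links
fi
```

If this prints eth0 shortly after boot, running networkctl status eth0 and checking the systemd-networkd journal from that boot is probably a better starting point than waiting 12 hours for the connection to drop.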

Hey @Zaen,

I've been trying to recreate this on my end, but unfortunately with no success. Did you check out Andy's suggestion to see if that might help?

Ryan L.

I didn't notice the replies, sorry to both @andysh and @rl0nergan.
Is there a way to subscribe to this thread? (email)

I don't know what netplan is, but I am in fact using dhcpcd. I installed and enabled it when I installed Arch, and left it that way ever since. Is that the problem?

I checked the systemd log, and there was some normal 'deleting address' and 'carrier acquired' stuff going on, but as I have just rebooted my Linode, the problem is not happening right now. I will post the dhcpcd logs after the problem occurs again.

> I don't know what netplan is

Netplan is an Ubuntu thing; it allows defining networking in YAML files, and writes out the relevant config files for different backends.

> I checked the systemd log, and there was some normal 'deleting address' and 'carrier acquired' stuff going on, but as I have just rebooted my linode, the problem is not happening right now. I will post the dhcpcd logs after the problem occurs again.

In my case, systemd didn't even try to renew the lease. It was a failure in the initial config of the adapter when the system was first booted. systemd was obtaining a route from the DHCP server that it believed was invalid.

Because of the error, systemd believed the adapter wasn't fully configured, so it wasn't watching for the lease to expire and didn't request a new IP from DHCP when it should have.

In my experience with Linode, you are much better off configuring addresses statically; they won't change.

https://www.linode.com/docs/networking/linux-static-ip-configuration/
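
For readers on systemd-networkd, a static setup might look roughly like the sketch below. Everything specific here is an assumption: the addresses are placeholders from the 192.0.2.0/24 documentation range, the target file name is illustrative, and static_network_unit is a made-up helper. The real values come from the Linode's Networking tab and the guide linked above.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): prints a systemd-networkd
# unit that configures eth0 statically. All addresses are placeholders
# and must be replaced with the values assigned to the Linode.
static_network_unit() {
    cat <<'EOF'
[Match]
Name=eth0

[Network]
DHCP=no
Address=192.0.2.10/24
Gateway=192.0.2.1
DNS=192.0.2.53
EOF
}

# Print the unit for review. To apply it (as root), one would redirect
# it into a file such as /etc/systemd/network/05-eth0.network (assumed
# name) and then run: systemctl restart systemd-networkd
static_network_unit
```

If another DHCP client such as dhcpcd is also enabled on the same interface, it would presumably need to be disabled so that only one service manages eth0.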

Hey @Zaen,

No worries! That's actually a great idea, I can see how that would be helpful. I've passed that feedback along to our developers to see if that's something we could implement in the future.

Ryan L.
Linode Support Staff

@andysh, @rl0nergan So sorry for going radio silent once again…
Weirdly (or not), the problem disappeared! Or rather, the network still fails sometimes, but then it just recovers. I'm guessing that from my Seafile client (a cloud server thingy): it says "last updated 4 days ago", but when I refresh it, it starts working again. So the problem must have happened 4 days ago, and Seafile tried reaching the server a bunch of times before giving up.

@andysh I have bookmarked the link, and I will use it in the future when I'm fixing this problem once and for all.
Because of the whole Covid thing I'm working outside less, so my motivation to fix this is very low. I know I asked this question, but… I'm lazy :( I really appreciate the help from both of you, please don't get me wrong.

I also looked everywhere for a way to enable email notifications for forum posts, but I can't find it.
