Recover from Unexpected Shutdowns with Lassie (Shutdown Watchdog)
Linode Compute Instances have a feature called Lassie (Linode Autonomous System Shutdown Intelligent rEbooter), also referred to as the Shutdown Watchdog. When this feature is enabled, a Compute Instance automatically reboots if it ever powers off unexpectedly.
Shutdown Recovery Behavior
The Shutdown Watchdog feature detects when a Compute Instance is powered off and checks if that directive came from the Linode platform (such as the Cloud Manager or Linode API). If the power off command did not originate from the Linode platform, the shutdown is considered unexpected and the Compute Instance is automatically powered back on.
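The rule described above can be modeled as a small decision function. This is only an illustrative sketch, not Linode's implementation: the function name `watchdog_action` and the labels `enabled`, `disabled`, `platform`, and `guest` are made up for the example.

```shell
# Illustrative model of the rule described above; NOT Linode's implementation.
# $1: watchdog setting, "enabled" or "disabled" (labels chosen for this sketch)
# $2: source of the power-off, "platform" for Cloud Manager or API requests,
#     anything else (for example "guest") for shutdowns the platform did not issue
watchdog_action() {
  if [ "$1" = "enabled" ] && [ "$2" != "platform" ]; then
    echo "power-on"    # unexpected shutdown: boot the instance again
  else
    echo "stay-off"    # requested shutdown, or watchdog turned off
  fi
}

watchdog_action enabled guest      # prints "power-on"
watchdog_action enabled platform   # prints "stay-off"
```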
Enable (or Disable) Shutdown Watchdog
By default, Shutdown Watchdog is enabled on all new Compute Instances. If you wish to disable or re-enable this feature, follow the instructions below:
Log in to the Cloud Manager and navigate to the Linodes link in the sidebar.
Select the Linode Compute Instance that you wish to modify.
Navigate to the Settings tab.
Scroll down to the section labeled Shutdown Watchdog.
From here, click the corresponding toggle button to update this setting to the desired state, either enabled or disabled.
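The same setting can likely be changed without the Cloud Manager. The sketch below assumes the Linode API v4 instance update endpoint and its `watchdog_enabled` field, a personal access token with write access stored in `LINODE_TOKEN`, and the instance ID stored in `LINODE_ID`. The variable and function names are placeholders chosen for this example, so check the current API reference before relying on it.

```shell
# Build the JSON body for the update request.
# $1 must be "true" (enable) or "false" (disable).
watchdog_payload() {
  case "$1" in
    true|false) printf '{"watchdog_enabled": %s}\n' "$1" ;;
    *) echo "usage: watchdog_payload true|false" >&2; return 1 ;;
  esac
}

# Send the update request.
# LINODE_TOKEN and LINODE_ID are placeholder variable names for this sketch.
set_watchdog() {
  local body
  body="$(watchdog_payload "$1")" || return 1
  curl --silent --show-error --fail \
    -X PUT "https://api.linode.com/v4/linode/instances/${LINODE_ID}" \
    -H "Authorization: Bearer ${LINODE_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$body"
}

# Example (not run automatically):
#   LINODE_TOKEN=your-token LINODE_ID=12345 set_watchdog false
```

Keeping the payload in its own function makes it possible to confirm what would be sent before making any request.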
Reasons for an Unexpected Shutdown
An unexpected shutdown occurs when a Compute Instance powers off without receiving a power off command from the Linode platform (such as one issued by a user in the Cloud Manager or API). In general, the cause lies within the Compute Instance’s own system or software configuration. The following list includes potential reasons for these unexpected shutdowns.
A user issues the shutdown command in the shell environment of a Compute Instance. In Linux, a system can be powered off by entering the shutdown command (or other similar commands) in the system’s terminal. Since Linode has no knowledge of internal commands issued on a Compute Instance, it is considered an unexpected shutdown.
Kernel panic: A kernel panic can occur when your system detects a fatal error and it isn’t able to safely recover. Here is an example of a console log entry that indicates a kernel panic has occurred:
Kernel panic - not syncing: No working init found.
Out of memory (OOM) error: When a Linux system runs out of memory, it can start killing processes to free up additional memory. In many cases, your system remains accessible but some of the software you use may stop functioning properly. Running out of memory can occasionally cause your system to become unresponsive or crash, resulting in an unexpected shutdown. Here is an example of a log entry that indicates a process was killed to free memory:
kernel: Out of memory: Kill process [...]
Other system crashes, such as a crash caused by the software installed on your system or a malicious process (such as malware).
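As a rough aid, a saved console or system log can be scanned for the two signatures shown in the list above. This is a minimal sketch: the function name `classify_shutdown_clues` and the output labels are invented for the example, and the pattern also accepts "Killed process" on the assumption that some kernels use that wording. Other crash causes will not match.

```shell
# Classify lines from a console or system log read on stdin.
# Prints one label per matching line: "kernel-panic" or "oom".
classify_shutdown_clues() {
  awk '
    /Kernel panic - not syncing/        { print "kernel-panic"; next }
    /Out of memory: Kill(ed)? process/  { print "oom"; next }
  '
}

# Example (console.log is a hypothetical file holding saved console output):
#   classify_shutdown_clues < console.log
```

No output means neither signature was found, which does not rule out other kinds of crashes.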
Investigate the Cause of a Shutdown
The underlying cause of these issues can vary. The most helpful course of action is to review your system logs.
Open the Lish console. This displays your system’s boot log and, if your system boot was normal, a login prompt appears. If you do not see a login prompt, look for any errors or unexpected output that indicates a kernel panic, file system corruption, or other type of system crash.
Log in to your system through either SSH or Lish and review the log files for your system using either journald or syslog. For systems using systemd-journald for logging, you can use the journalctl command to review system logs. See Use journalctl to View Your System’s Logs for instructions.
journalctl -b: Log entries for the current boot (journalctl -b -1 shows the previous boot, if the journal is stored persistently)
journalctl -k: Kernel messages
For systems using syslog, you should review the following log files using your preferred text editor (such as nano or vim) or file viewer (such as cat or less).
/var/log/syslog: Most logs as recorded by syslog.
/var/log/boot.log: Log entries for the last system boot
/var/log/kern.log: Kernel messages
/var/log/messages: Various system notifications and messages typically recorded at boot.
You may also want to review log files for any other software you have installed on your system that might be causing these issues.
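Because the Shutdown Watchdog has usually rebooted the system by the time you log in, the interesting entries tend to be from before the current boot. The sketch below shows one way to look for them. The helper names and the optional root-prefix argument are assumptions made for this example, the search terms cover only the two signatures mentioned earlier, and `grep -H` assumes a GNU or BSD grep.

```shell
# systemd-journald systems: kernel messages from the previous boot.
# Requires a persistent journal; otherwise only the current boot is kept.
previous_boot_kernel_log() {
  journalctl -k -b -1 --no-pager
}

# syslog systems: print whichever of the files listed above are readable.
# $1 is an optional root prefix (hypothetical, used to try the helper
# against a scratch directory or a disk mounted somewhere other than /).
readable_logs() {
  local root="${1:-}" f
  for f in /var/log/syslog /var/log/boot.log /var/log/kern.log /var/log/messages; do
    if [ -r "${root}${f}" ]; then
      echo "${root}${f}"
    fi
  done
}

# Search those files for crash signatures, printing file name and line number.
search_logs() {
  local files
  files="$(readable_logs "${1:-}")"
  [ -n "$files" ] || { echo "no syslog files found" >&2; return 1; }
  # Word splitting on $files is intentional: these paths contain no spaces.
  grep -H -n -i -E 'kernel panic|out of memory' $files
}

# Example (not run automatically):
#   previous_boot_kernel_log | less
#   search_logs
```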
File System Corruption
In some cases, unexpected shutdowns can cause file system corruption on a Compute Instance. If an error message (such as the one below) appears within your console logs, your file system may be corrupt or otherwise be in an inconsistent state.
/dev/sda: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
In cases like this, it is recommended that you attempt to correct the issue by running the fsck tool in Rescue Mode. See Using fsck to Find and Repair Disk Errors and Bad Sectors for instructions.