Journald Memory Leak on VPS



#1 HelioHost


Posted 27 September 2020 - 04:25 PM

We've just discovered that some of our VPS may be affected by a memory leak. Ubuntu 20 definitely has this issue, and other OS choices may as well. Here are the symptoms you can check to see whether this issue is affecting your VPS:
 
/var/log/journal/ is huge:
root@krydos5:/var/log/journal# du -sh /var/log/journal
2.5G    /var/log/journal
The journald process is using a large share of memory. This command shows the percentage of total system memory that journald is using. In this example journald is using 23.6%:
root@krydos5:/var/log/journal# ps -o %mem,command ax|grep -v grep|grep journald
23.6 /lib/systemd/systemd-journald
The journal log fills with multipathd messages about sda every few seconds:
root@krydos5:/home/krydos# journalctl -xe
Sep 27 15:42:24 krydos5.heliohost.org multipathd[685]: sda: add missing path
Sep 27 15:42:24 krydos5.heliohost.org multipathd[685]: sda: failed to get udev uid: Invalid argument
Sep 27 15:42:24 krydos5.heliohost.org multipathd[685]: sda: failed to get sysfs uid: Invalid argument
Sep 27 15:42:24 krydos5.heliohost.org multipathd[685]: sda: failed to get sgio uid: No such file or directory
The /dev/disk/by-id/ directory doesn't have any entries beginning with scsi:
root@krydos5:/home/krydos# ls -la /dev/disk/by-id/
total 0
drwxr-xr-x 2 root root  60 Sep 14 03:53 .
drwxr-xr-x 7 root root 140 Sep 14 03:53 ..
lrwxrwxrwx 1 root root   9 Sep 14 03:53 ata-VMware_Virtual_SATA_CDRW_Drive_00000000000000000001 -> ../../sr0
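The last two checks above can be scripted. This is only a rough sketch: the function names `count_uid_errors` and `has_scsi_id` are made up for illustration, and the pattern assumes your log lines look like the multipathd lines shown above.

```shell
# Hypothetical helpers; the names are illustrative and not part of any tool.

# Read journal text on stdin and print how many lines are multipathd
# "failed to get ... uid" errors like the ones shown above.
count_uid_errors() {
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c 'multipathd.*failed to get .* uid' || true
}

# Succeed if the directory (default /dev/disk/by-id) has an entry named scsi-*.
has_scsi_id() {
    scsi_dir="${1:-/dev/disk/by-id}"
    for scsi_entry in "$scsi_dir"/scsi-*; do
        # If nothing matches, the glob stays literal and this test fails.
        if [ -e "$scsi_entry" ] || [ -L "$scsi_entry" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `journalctl --since "1 hour ago" --no-pager | count_uid_errors` prints the number of matching errors in the last hour, and `has_scsi_id || echo "no scsi-* entry found"` reports the missing disk ID.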
If your VPS is showing these signs, it is affected by this memory leak. The temporary solution is to restart journald to recover the memory and delete old logs to recover the disk space.
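As a rough sketch of that temporary workaround, the commands could be wrapped like this. The function name `free_journald_resources`, the `DRY_RUN` switch, and the 200M default are assumptions made for illustration, not recommended values. By default it only prints the commands, so you can review them first.

```shell
# Hypothetical wrapper; the name, DRY_RUN switch, and 200M default are
# illustrative assumptions. Run as root with DRY_RUN=0 to apply the commands.
free_journald_resources() {
    journal_limit="${1:-200M}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"   # only print each command
    else
        run=""       # actually execute each command
    fi
    # Restarting journald releases the memory it has accumulated.
    $run systemctl restart systemd-journald
    # Rotate so the active journal files become archived and can be removed.
    $run journalctl --rotate
    # Delete archived journal files until they use less than the limit.
    $run journalctl --vacuum-size="$journal_limit"
}
```

Running `free_journald_resources 100M` prints the three commands. `DRY_RUN=0 free_journald_resources 100M` runs them. The logs will start growing again until the hardware configuration is fixed.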

The better, permanent solution is for us to shut down your VPS, edit the hardware configuration, and boot your VPS back up for you. This will result in less than 10 minutes of downtime.

This memory leak is happening because we've been using the default ESXi configuration to create new VPS, and that configuration apparently doesn't provide the SCSI UUID to the OS. The OS tries to check the UUID every few seconds, fails, and quickly fills the logs. Future VPS we sell shouldn't have this issue now that we know about it and can configure the hardware to expose the SCSI UUID to the OS.
This is an unmonitored account for announcements only.



