| Author |
Message |
sweh
Joined: 13 Apr 2004
Posts: 223
|
| Posted: Sun Feb 10, 2008 10:01 am Post subject: Kernel memory leak? |
|
|
"free" on my box is showing lots of memory in use. I shut down nearly all the applications I run (mainly lighttpd, postfix, named and stunnel), but it still reports about 230 MB of memory in use:
Code:
% free
             total       used       free     shared    buffers     cached
Mem:        356008     257692      98316          0       5316      21356
-/+ buffers/cache:     231020     124988
Swap:       263160        576     262584
But a complete "ps" listing shows much less than that:
Code:
% ps auxww
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 1696 608 ? Ss Jan31 0:01 init [3]
root 2 0.0 0.0 0 0 ? S< Jan31 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S< Jan31 0:00 [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S< Jan31 0:00 [events/0]
root 5 0.0 0.0 0 0 ? S< Jan31 0:00 [khelper]
root 43 0.0 0.0 0 0 ? S< Jan31 0:02 [kblockd/0]
root 62 0.0 0.0 0 0 ? S< Jan31 0:24 [kswapd0]
root 63 0.0 0.0 0 0 ? S< Jan31 0:00 [aio/0]
root 71 0.0 0.0 0 0 ? S< Jan31 0:00 [jfsIO]
root 72 0.0 0.0 0 0 ? S< Jan31 0:00 [jfsCommit]
root 73 0.0 0.0 0 0 ? S< Jan31 0:00 [jfsSync]
root 74 0.0 0.0 0 0 ? S< Jan31 0:00 [xfslogd/0]
root 75 0.0 0.0 0 0 ? S< Jan31 0:00 [xfsdatad/0]
root 76 0.0 0.0 0 0 ? S< Jan31 0:00 [xfs_mru_cache]
root 635 0.0 0.0 0 0 ? S< Jan31 0:00 [kcryptd/0]
root 636 0.0 0.0 0 0 ? S< Jan31 0:00 [ksnapd]
root 656 0.0 0.0 0 0 ? S< Jan31 0:00 [rpciod/0]
root 752 0.0 0.0 0 0 ? S< Jan31 0:00 [kjournald]
root 2479 0.0 0.1 1604 536 ? Ss Jan31 0:04 syslogd -m 0
root 2483 0.0 0.1 1552 372 ? Ss Jan31 0:00 klogd -x
root 2510 0.0 0.3 4096 1144 ? Ss Jan31 0:02 /usr/sbin/sshd
root 2540 0.0 0.1 1588 468 ? Ss Jan31 0:00 udevd
root 2610 0.0 0.2 4540 920 ? Ss Jan31 0:00 crond
root 2625 0.0 0.1 1780 424 ? Ss Jan31 0:00 /usr/sbin/atd
dbus 2639 0.0 0.2 2476 932 ? Ss Jan31 0:00 dbus-daemon-1 --system
root 2649 0.0 0.6 4084 2312 ? Ss Jan31 0:00 hald
root 2688 0.0 0.1 1540 460 tty0 Ss+ Jan31 0:00 /sbin/mingetty tty0
root 9361 0.0 0.0 0 0 ? S Feb09 0:00 [pdflush]
root 13551 0.0 0.0 0 0 ? S Feb09 0:00 [pdflush]
root 15613 0.0 0.6 6940 2224 ? Ss 09:15 0:00 sshd: sweh [priv]
sweh 15615 0.0 0.4 6940 1488 ? S 09:15 0:00 sshd: sweh@pts/0
sweh 15616 0.0 0.3 4680 1388 pts/0 Ss 09:15 0:00 -ksh
root 15636 0.0 0.4 4664 1456 pts/0 S 09:15 0:00 ksh
root 15926 0.0 0.2 2380 772 pts/0 R+ 09:51 0:00 ps auxww
There's no shared memory (ipcs -a) in use. So where has all the free memory gone?
Unfortunately I'll have to reboot because I can't leave my linode non-working, but any ideas that would assist in diagnosis next time around are welcome.
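To put a number on "much less than that", the RSS column of the ps listing can be added up and compared with what free reports. The following is a minimal sketch, not part of the original post; the function name `total_rss_kb` and the shortened `SAMPLE` listing are illustrative assumptions. RSS counts shared pages once per process, so the sum is an upper bound on userspace usage rather than an exact figure.

```python
# Sketch: sum the RSS column (kB) of `ps aux`-style output.
# SAMPLE is a shortened excerpt of the listing above, for illustration only.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 1696 608 ? Ss Jan31 0:01 init [3]
root 2 0.0 0.0 0 0 ? S< Jan31 0:00 [kthreadd]
root 2479 0.0 0.1 1604 536 ? Ss Jan31 0:04 syslogd -m 0
root 2510 0.0 0.3 4096 1144 ? Ss Jan31 0:02 /usr/sbin/sshd
"""


def total_rss_kb(ps_output: str) -> int:
    """Return the sum of the RSS column, in kB."""
    lines = ps_output.strip().splitlines()
    header = lines[0].split()
    rss_index = header.index("RSS")
    total = 0
    for line in lines[1:]:
        # Limit the split so a COMMAND containing spaces stays in one field.
        fields = line.split(None, len(header) - 1)
        if len(fields) <= rss_index:
            continue  # skip malformed or truncated lines
        total += int(fields[rss_index])
    return total


if __name__ == "__main__":
    print(f"total RSS: {total_rss_kb(SAMPLE)} kB")
```

If the total comes out far below the "used" figure on the "-/+ buffers/cache" line, the difference is being held somewhere other than process memory, for example by the kernel itself.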
ETA:
For what it's worth, after a reboot...
Code:
% ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.9 0.1 1696 608 ? Ss 10:02 0:00 init [3]
root 2 0.0 0.0 0 0 ? S< 10:02 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S< 10:02 0:00 [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S< 10:02 0:00 [events/0]
root 5 0.0 0.0 0 0 ? S< 10:02 0:00 [khelper]
root 48 0.0 0.0 0 0 ? S< 10:02 0:00 [kblockd/0]
root 66 0.0 0.0 0 0 ? S 10:02 0:00 [pdflush]
root 67 0.0 0.0 0 0 ? S 10:02 0:00 [pdflush]
root 68 0.0 0.0 0 0 ? S< 10:02 0:00 [kswapd0]
root 69 0.0 0.0 0 0 ? S< 10:02 0:00 [aio/0]
root 74 0.0 0.0 0 0 ? S< 10:02 0:00 [cifsoplockd]
root 75 0.0 0.0 0 0 ? S< 10:02 0:00 [cifsdnotifyd]
root 79 0.0 0.0 0 0 ? S< 10:02 0:00 [jfsIO]
root 80 0.0 0.0 0 0 ? S< 10:02 0:00 [jfsCommit]
root 81 0.0 0.0 0 0 ? S< 10:02 0:00 [jfsSync]
root 82 0.0 0.0 0 0 ? S< 10:02 0:00 [xfslogd/0]
root 83 0.0 0.0 0 0 ? S< 10:02 0:00 [xfsdatad/0]
root 84 0.0 0.0 0 0 ? S< 10:02 0:00 [xfs_mru_cache]
root 642 0.0 0.0 0 0 ? S< 10:02 0:00 [ksnapd]
root 662 0.0 0.0 0 0 ? S< 10:02 0:00 [rpciod/0]
root 758 0.0 0.0 0 0 ? S< 10:02 0:00 [kjournald]
root 2462 0.0 0.1 1604 536 ? Ss 10:03 0:00 syslogd -m 0
root 2467 0.0 0.1 1552 372 ? Ss 10:03 0:00 klogd -x
named 2481 0.0 0.7 30240 2744 ? Ssl 10:03 0:00 /usr/sbin/named -
root 2487 0.0 0.1 1588 460 ? Ss 10:03 0:00 udevd
root 2504 0.1 0.3 4096 1140 ? Ss 10:03 0:00 /usr/sbin/sshd
root 2573 0.0 0.4 5184 1640 ? Ss 10:03 0:00 /usr/libexec/post
postfix 2580 0.0 0.4 5232 1664 ? S 10:03 0:00 pickup -l -t fifo
postfix 2582 0.0 0.4 5280 1728 ? S 10:03 0:00 qmgr -l -t fifo -
postfix 2593 0.0 0.5 6060 2084 ? S 10:03 0:00 trivial-rewrite -
postfix 2594 0.0 0.5 5356 1800 ? S 10:03 0:00 smtp -t unix -u
lighttpd 2598 0.0 0.1 1916 624 ? S 10:03 0:00 /usr/sbin/lighttp
root 2613 0.0 0.2 4540 916 ? Ss 10:03 0:00 crond
root 2622 0.0 0.1 1780 424 ? Ss 10:03 0:00 /usr/sbin/atd
dbus 2631 0.0 0.2 2476 932 ? Ss 10:03 0:00 dbus-daemon-1 --s
postfix 2642 0.0 0.6 6408 2332 ? S 10:03 0:00 smtpd -n smtp -t
postfix 2646 0.0 0.4 5220 1636 ? S 10:03 0:00 proxymap -t unix
postfix 2650 0.0 0.4 5228 1648 ? S 10:03 0:00 anvil -l -t unix
root 2651 0.1 0.6 4084 2320 ? Ss 10:03 0:00 hald
root 2658 0.0 0.1 3744 656 ? Ss 10:03 0:00 /usr/sbin/stunnel
teamspk 2679 1.2 0.4 20688 1580 ? SN 10:03 0:00 ./server_linux
teamspk 2680 0.0 0.4 20688 1580 ? S 10:03 0:00 ./server_linux
teamspk 2681 0.0 0.4 20688 1580 ? S 10:03 0:00 ./server_linux
teamspk 2682 0.0 0.4 20688 1580 ? S 10:03 0:00 ./server_linux
teamspk 2683 0.0 0.4 20688 1580 ? S 10:03 0:00 ./server_linux
postfix 2684 0.0 0.6 6408 2336 ? S 10:03 0:00 smtpd -n smtp -t
teamspk 2685 0.0 0.4 20688 1580 ? S 10:03 0:00 ./server_linux
teamspk 2686 0.0 0.4 20688 1580 ? S 10:03 0:00 ./server_linux
teamspk 2687 0.0 0.4 20688 1580 ? S 10:03 0:00 ./server_linux
teamspk 2688 0.0 0.4 20688 1580 ? S 10:03 0:00 ./server_linux
root 2696 0.0 0.1 1540 456 tty0 Ss+ 10:03 0:00 /sbin/mingetty tt
postfix 2697 0.0 0.5 5636 2052 ? S 10:03 0:00 cleanup -z -t uni
postfix 2698 0.0 0.5 5356 1824 ? S 10:03 0:00 smtp -t unix -u
postfix 2721 0.0 0.6 6408 2328 ? S 10:03 0:00 smtpd -n smtp -t
postfix 2722 0.0 0.6 6408 2328 ? S 10:03 0:00 smtpd -n smtp -t
postfix 2723 0.0 0.5 5400 1984 ? Ss 10:03 0:00 verify -l -t unix
postfix 2724 0.0 0.5 5356 1824 ? S 10:03 0:00 smtp -t unix -u
postfix 2725 0.0 0.6 6408 2320 ? S 10:03 0:00 smtpd -n smtp -t
postfix 2726 0.0 0.6 6408 2332 ? S 10:03 0:00 smtpd -n smtp -t
postfix 2727 0.0 0.6 6408 2324 ? S 10:03 0:00 smtpd -n smtp -t
root 2736 0.0 0.6 6940 2228 ? Ss 10:03 0:00 sshd: sweh [priv]
sweh 2747 0.0 0.4 7084 1512 ? R 10:03 0:00 sshd: sweh@pts/0
sweh 2766 0.0 0.3 4680 1380 pts/0 Rs+ 10:03 0:00 -ksh
sweh 2786 0.0 0.2 2380 772 pts/0 R 10:04 0:00 ps -aux
Code:
% free
             total       used       free     shared    buffers     cached
Mem:        355800      57940     297860          0       4896      28812
-/+ buffers/cache:      24232     331568
Swap:       263160          0     263160 |
bdonlan
Joined: 22 Jan 2008
Posts: 67
|
| Posted: Sun Feb 10, 2008 10:20 am Post subject: |
|
|
| If this occurs again, save a copy of /proc/meminfo and /proc/slabinfo before rebooting - that'd help track down what was actually using that memory. |
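A small helper makes it easier to grab both files quickly before a reboot. This is only a sketch under stated assumptions: the function name `snapshot_proc_files` and the `memdebug-<timestamp>` directory layout are invented for illustration, and on many systems reading /proc/slabinfo requires root.

```python
import time
from pathlib import Path


def snapshot_proc_files(dest_root, sources=("/proc/meminfo", "/proc/slabinfo")):
    """Copy each readable source file into a timestamped directory.

    Returns the list of files that were written. The directory naming
    scheme is an arbitrary choice for this sketch.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_root) / f"memdebug-{stamp}"
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for src in sources:
        src_path = Path(src)
        try:
            # /proc files report size 0, so read the content explicitly.
            data = src_path.read_bytes()
        except OSError as exc:
            # For example: permission denied on /proc/slabinfo without root.
            print(f"skipping {src}: {exc}")
            continue
        target = dest / src_path.name
        target.write_bytes(data)
        saved.append(target)
    return saved


if __name__ == "__main__":
    for path in snapshot_proc_files("."):
        print(f"saved {path}")
```

Running it from cron every few hours would also give a series of snapshots to compare, which helps show which value is growing over time.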
pclissold
Joined: 24 Oct 2003
Posts: 470
Location: Netherlands
|
| Posted: Sun Feb 10, 2008 2:07 pm Post subject: |
|
|
Your free output looks normal to me. Here is the same data from a lightly loaded Xenode with an uptime of around 20 days:
Code:
fremont ~ # free
             total       used       free     shared    buffers     cached
Mem:        368848     358896       9952          0      74532      79976
-/+ buffers/cache:     204388     164460
Swap:       524280        148     524132
fremont ~ #
The Linux VM system uses otherwise idle memory for the page, buffer and swap caches. This memory is freed as soon as it is needed for 'normal' use. |
sweh
Joined: 13 Apr 2004
Posts: 223
|
| Posted: Sun Feb 10, 2008 3:22 pm Post subject: |
|
|
pclissold wrote: Your free output looks normal to me....The Linux vm system uses the available memory for page, buffer and swap caches
The 2nd line (-/+ buffers/cache) takes that into account. I had close to zero applications running but still had over 200 MB used after taking buffers and cache into account.
I have an equivalently configured Xen machine (same applications running, same cron jobs, etc.; it's my backup machine, it's just not doing live SMTP requests) at an alternate provider, and that's showing numbers close to 100 MB used after 33 days of uptime, which is what I expected. The linode was showing 250 MB used after 8 days.
Given that I had to reboot 8 days ago because of an out-of-memory condition, this worries me slightly! |
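The arithmetic behind the "-/+ buffers/cache" line can be checked against the numbers posted at the top of the thread. This is a minimal sketch; the function name `buffers_cache_line` is made up for illustration. The "used" figure it produces still includes memory the kernel has allocated for itself, which is why it can stay high with almost no processes running.

```python
def buffers_cache_line(used_kb, free_kb, buffers_kb, cached_kb):
    """Reproduce the two values on free's "-/+ buffers/cache" line (kB)."""
    used_excluding_cache = used_kb - buffers_kb - cached_kb
    free_including_cache = free_kb + buffers_kb + cached_kb
    return used_excluding_cache, free_including_cache


# Values from the "Mem:" lines posted before and after the reboot.
before_reboot = buffers_cache_line(257692, 98316, 5316, 21356)
after_reboot = buffers_cache_line(57940, 297860, 4896, 28812)

if __name__ == "__main__":
    print("before reboot:", before_reboot)  # (231020, 124988)
    print("after reboot: ", after_reboot)   # (24232, 331568)
```

Both results match the second line of the corresponding free output, so buffers and cache really are excluded from the 231020 kB figure.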
freelikegnu
Joined: 14 Mar 2008
Posts: 7
|
| Posted: Mon Mar 17, 2008 9:25 pm Post subject: |
|
|
I seem to be having a similar issue:
2.6.23.12-linode41 #1 Wed Feb 13 13:03:11 EST 2008 i686 GNU/Linux
# cat /proc/meminfo
MemTotal: 356116 kB
MemFree: 23052 kB
Buffers: 3640 kB
Cached: 22100 kB
SwapCached: 4000 kB
Active: 34580 kB
Inactive: 5464 kB
SwapTotal: 262136 kB
SwapFree: 239940 kB
Dirty: 1152 kB
Writeback: 0 kB
AnonPages: 14072 kB
Mapped: 8328 kB
Slab: 287928 kB
SReclaimable: 2040 kB
SUnreclaim: 285888 kB
PageTables: 696 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 440192 kB
Committed_AS: 86140 kB
VmallocTotal: 2632688 kB
VmallocUsed: 1908 kB
VmallocChunk: 2630780 kB
Should I use a different kernel? |
bdonlan
Joined: 22 Jan 2008
Posts: 67
|
| Posted: Mon Mar 17, 2008 9:29 pm Post subject: |
|
|
freelikegnu wrote:
Slab: 287928 kB
This definitely looks like a kernel memory leak. Can you pastebin the contents of /proc/slabinfo? That might help track down where the leak is. |
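To see how much of RAM the slab allocator is holding, the meminfo text can be parsed and SUnreclaim compared with MemTotal. This is a sketch only; `parse_meminfo`, `unreclaimable_slab_share` and the shortened `SAMPLE` (taken from the post above) are illustrative names, not an existing tool.

```python
# Shortened excerpt of the /proc/meminfo output quoted above.
SAMPLE = """\
MemTotal: 356116 kB
MemFree: 23052 kB
Slab: 287928 kB
SReclaimable: 2040 kB
SUnreclaim: 285888 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def unreclaimable_slab_share(info):
    """Fraction of total RAM held in unreclaimable slab memory."""
    return info["SUnreclaim"] / info["MemTotal"]


if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(f"SUnreclaim is {unreclaimable_slab_share(info):.1%} of MemTotal")
```

For the figures in the quoted post, unreclaimable slab memory comes to roughly 80% of total RAM, which the kernel cannot give back under memory pressure.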
freelikegnu
Joined: 14 Mar 2008
Posts: 7
|
| Posted: Mon Mar 17, 2008 10:22 pm Post subject: |
|
|
| http://pastebin.linode.com/630 |
bdonlan
Joined: 22 Jan 2008
Posts: 67
|
| Posted: Mon Mar 17, 2008 10:33 pm Post subject: |
|
|
Quote: #
size-128 2110787 2111070 128 30 1 : tunables 120 60 0 : slabdata 70369 70369 0
Well, that's not very helpful... some kind of random kmalloc leak I guess.
Is there anything suspicious in dmesg? If not I'd reboot and head to a different kernel version, unless anyone else has suggestions on data to collect for postmortem debugging... |
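The quoted size-128 line can be turned into a memory figure by multiplying the slab count by the pages per slab and the page size. The sketch below assumes the slabinfo 2.x column order (active_objs, num_objs, objsize, objperslab, pagesperslab, then tunables and slabdata) and a 4 KiB page size, which is typical for i686 but is an assumption here; the function names are illustrative.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as is usual on i686

LINE = ("size-128 2110787 2111070 128 30 1 : tunables 120 60 0 "
        ": slabdata 70369 70369 0")


def parse_slabinfo_line(line):
    """Parse one data line of /proc/slabinfo (version 2.x layout assumed)."""
    main, _tunables, slabdata = [part.split() for part in line.split(":")]
    active_objs, num_objs, objsize, objperslab, pagesperslab = map(int, main[1:6])
    return {
        "name": main[0],
        "active_objs": active_objs,
        "num_objs": num_objs,
        "objsize": objsize,
        "objperslab": objperslab,
        "pagesperslab": pagesperslab,
        # slabdata fields: label, active_slabs, num_slabs, sharedavail
        "active_slabs": int(slabdata[1]),
        "num_slabs": int(slabdata[2]),
    }


def cache_size_kb(entry, page_size=PAGE_SIZE):
    """Memory occupied by the cache's slabs, in kB."""
    return entry["num_slabs"] * entry["pagesperslab"] * page_size // 1024


if __name__ == "__main__":
    entry = parse_slabinfo_line(LINE)
    print(f"{entry['name']}: {cache_size_kb(entry)} kB in {entry['num_slabs']} slabs")
```

Under those assumptions the size-128 cache alone occupies about 281476 kB, which would account for nearly all of the 287928 kB Slab total reported in the earlier /proc/meminfo output.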
freelikegnu
Joined: 14 Mar 2008
Posts: 7
|
| Posted: Sat Apr 12, 2008 11:24 pm Post subject: |
|
|
This seems to be a persistent issue as of 2.6.23.17:
Code:
# uname -a
Linux freelikegnu.org 2.6.23.17-linode43 #1 Wed Mar 5 13:57:22 EST 2008 i686 GNU/Linux
# cat /proc/meminfo
MemTotal: 356116 kB
MemFree: 37960 kB
Buffers: 17552 kB
Cached: 69164 kB
SwapCached: 0 kB
Active: 143324 kB
Inactive: 27132 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 852 kB
Writeback: 0 kB
AnonPages: 83760 kB
Mapped: 12160 kB
Slab: 142720 kB
SReclaimable: 3648 kB
SUnreclaim: 139072 kB
PageTables: 748 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 178056 kB
Committed_AS: 135692 kB
VmallocTotal: 2636784 kB
VmallocUsed: 1776 kB
VmallocChunk: 2635008 kB
|