Where's My Memory Going?
(image: /content/07-blog/769-dtracing-hardware-cache-counters/ask-mr-bruning-logo.png)
While teaching a combined SDC, SmartOS Internals, ZFS Internals, and DTrace course over the last two weeks, a student noted that he thought he was seeing a memory leak while running imgadm import dataset_uuid. He was doing this on his laptop on top of VirtualBox. The command eventually failed, so the question he wanted to answer was: "What is using up the memory?" We'll also take a look at how the command failed (it was not because the system ran out of space).
I was able to reproduce the behavior, and if you like, you can follow along by doing the same steps on your SmartOS system; you can get the latest version here. Here are good instructions for getting SmartOS running on VMware Fusion, and a link for doing the same on VirtualBox. If you prefer to run it on bare metal, you can get the USB image from us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/smartos-latest-USB.img.bz2, copy the image to a USB key, and boot your system off of the USB key. There are also links for running on VMware, VirtualBox, or from a USB key on the SmartOS download wiki page.
Assuming you've logged into the global zone as root using ssh, we'll do the following:
# imgadm avail
UUID                                  NAME          VERSION  OS       PUBLISHED
febaa412-6417-11e0-bc56-535d219f2590  smartos       1.3.12   smartos  2011-04-11T08:45:00Z
7456f2b0-67ac-11e0-b5ec-832e6cf079d5  nodejs        1.1.3    smartos  2011-04-15T22:04:12Z
41da9c2e-7175-11e0-bb9f-536983f41cd8  nodejs        1.1.4    smartos  2011-04-28T17:33:45Z
63ce06d8-7ae7-11e0-b0df-1fcf8f45c5d5  smartos       1.3.13   smartos  2011-05-10T09:25:00Z
184c9b38-ad3d-11e0-bad6-1b7240aaa5fc  smartos       1.3.15   smartos  2011-07-13T11:39:22Z
70856d1a-b860-11e0-9bbf-bf2a7494dd82  smartos       1.4.1    smartos  2011-07-27T15:17:41Z
7abad50e-b865-11e0-8462-730bcfdde020  smartos64     1.4.1    smartos  2011-07-27T15:33:26Z
a0f8cf30-f2ea-11e1-8a51-5793736be67c  standard64    1.0.7    smartos  2012-08-30T21:37:30Z
...
6344c3f4-64d9-11e2-af3d-af797e5b19ca  hadoop        1.0.0    smartos  2013-01-23T05:10:08Z
1327ed36-5130-11e2-95a8-9b5a153adf3e  ubuntu-12.04  2.3.1    linux    2013-01-24T21:31:37Z
46ecf60e-52c8-11e2-b212-9b51fc749547  debian-6.0.6  2.3.1    linux    2013-01-24T21:36:28Z
1d4fd806-6ef8-11e2-93ab-4f22d6d207c9  mongodb       1.4.5    smartos  2013-02-04T18:43:44Z
d2fc0a96-61d1-11e2-8701-c7c137935369  chefserver    1.0.0    smartos  2013-02-04T19:30:16Z
4bf9530a-7ae5-11e2-bb4e-3bad5fbc3de9  sngl          0.99.0   smartos  2013-02-19T19:43:00Z
03d515f2-872c-11e2-8314-8387b5044570  base          1.9.0    smartos  2013-03-07T15:11:00Z
bad2face-8738-11e2-ac72-0378d02f84de  base64        1.9.0    smartos  2013-03-07T15:14:30Z
#
The imgadm(1M) command is explained here and in its manual page (currently available only on SmartOS).
There are currently 105 different images available. My student tried an Ubuntu image, but we'll use a SmartOS image.
To make this clear: the command does not always fail, but while it is running, memory usage always increases.
Here is the command to import the dataset.
# imgadm import a0f8cf30-f2ea-11e1-8a51-5793736be67c
Importing image a0f8cf30-f2ea-11e1-8a51-5793736be67c (standard64 1.0.7) from "https://images.joyent.com"
0% [                              ]  time 2.0s  eta 406.2s
...
While the above is running, we'll run vmstat(1M) in another window.
# vmstat 2
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr lf rm s0 s1   in   sy   cs us sy id
 0 0 0 3452628 1400208 724 1763 397 0 40 0 2052 36 -280 47 142 585 6864 2946 14 10 77
 0 0 0 3183736 1080148 1 21 0 0 0 0 0 0 0 0 0 734 2807 1687 21 7 72
 0 0 0 3180548 1076972 0 3 0 0 0 0 0 0 0 109 0 974 2198 2868 16 11 73
 0 0 0 3177216 1073668 0 1 0 0 0 0 0 0 0 0 0 908 2887 1473 19 7 74
 0 0 0 3174452 1070916 0 1 0 0 0 0 0 0 0 0 0 921 2572 1296 17 7 76
...
From the above output (ignoring the first line, which, as with most of the *stat commands, reports averages since boot), free memory on the machine is steadily dropping.
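If you would rather track that number from a script than eyeball vmstat, the same figure is exposed by the unix:0:system_pages:freemem kstat (reported in pages). Here is a minimal sketch; the loop and the conversion to megabytes are just for illustration:

pgsz=$(pagesize)
while :; do
    # freemem is in pages; multiply by the page size to get bytes, then convert to MB
    pages=$(kstat -p unix:0:system_pages:freemem | awk '{print $2}')
    echo "$(date +%T) free: $(( pages * pgsz / 1024 / 1024 )) MB"
    sleep 2
done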
Let's see if prstat(1M) tells us anything. Basically, the memory must be going to one or more processes, to files on tmpfs (i.e., files in /tmp), or to the kernel itself (quite possibly the ZFS ARC).
# prstat -s size -c
Please wait...
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  3553 root      102M   92M run     59    0   0:00:08 7.2% imgadm/4
  2125 root       35M   18M sleep    1    0   0:00:00 0.0% metadata/3
  2489 root       33M   20M sleep   59    0   0:00:04 0.0% vmadmd/3
  2164 root       19M   10M sleep   55    0   0:00:01 0.0% fmd/26
    10 root       10M 8312K sleep   29    0   0:00:07 0.0% svc.configd/33
     8 root     7800K 5088K sleep   55    0   0:00:01 0.0% svc.startd/12
  3335 root     6976K 4208K sleep   53    0   0:00:07 1.9% sshd/1
  2522 root     6868K 2256K sleep    1    0   0:00:01 0.0% sendmail/1
  2432 smmsp    6568K 1560K sleep    1    0   0:00:00 0.0% sendmail/1
  3370 root     6384K 3700K sleep   59    0   0:00:00 0.0% sshd/1
  3504 root     6384K 3704K run      1    0   0:00:00 0.0% sshd/1
  2450 root     6380K 5136K sleep    1    0   0:00:00 0.0% intrd/1
  2158 root     6344K 3876K sleep   55    0   0:00:01 0.0% nscd/24
    55 root     5988K 2696K sleep   29    0   0:00:00 0.0% syseventd/19
   290 root     5960K 2804K sleep   29    0   0:00:00 0.0% syseventd/19
Total: 60 processes, 365 lwps, load averages: 0.53, 0.18, 0.07
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  3553 root      102M   93M sleep   23    0   0:00:08 7.3% imgadm/4
  2125 root       35M   18M sleep    1    0   0:00:00 0.0% metadata/3
  2489 root       33M   20M sleep   59    0   0:00:04 0.0% vmadmd/3
  2164 root       19M   10M sleep   55    0   0:00:01 0.0% fmd/26
    10 root       10M 8312K sleep   29    0   0:00:07 0.0% svc.configd/33
     8 root     7800K 5088K sleep   55    0   0:00:01 0.0% svc.startd/12
  3335 root     6976K 4208K sleep   53    0   0:00:07 1.9% sshd/1
  2522 root     6868K 2256K sleep    1    0   0:00:01 0.0% sendmail/1
  2432 smmsp    6568K 1560K sleep    1    0   0:00:00 0.0% sendmail/1
  3370 root     6384K 3700K sleep   59    0   0:00:00 0.0% sshd/1
  3504 root     6384K 3704K sleep    1    0   0:00:00 0.0% sshd/1
  2450 root     6380K 5136K sleep    1    0   0:00:00 0.0% intrd/1
  2158 root     6344K 3876K sleep   55    0   0:00:01 0.0% nscd/24
    55 root     5988K 2696K sleep   29    0   0:00:00 0.0% syseventd/19
   290 root     5960K 2804K sleep   29    0   0:00:00 0.0% syseventd/19
Total: 60 processes, 365 lwps, load averages: 0.53, 0.18, 0.07
...
From the above output, process ID 3553 (imgadm) is by far the largest process, and its RSS (Resident Set Size, i.e., the amount of physical memory in use) is growing. A closer look at this process shows that it is running under the node.js engine:
# pgrep imgadm
3553
# pargs 3553
3553:   /usr/node/bin/node /usr/sbin/imgadm import a0f8cf30-f2ea-11e1-8a51-5793736be67c
argv[0]: /usr/node/bin/node
argv[1]: /usr/sbin/imgadm
argv[2]: import
argv[3]: a0f8cf30-f2ea-11e1-8a51-5793736be67c
# ptree 3553
2245  /usr/lib/ssh/sshd
  3334  /usr/lib/ssh/sshd
    3335  /usr/lib/ssh/sshd
      3338  -bash
        3553  /usr/node/bin/node /usr/sbin/imgadm import a0f8cf30-f2ea-11e1-8a5
          3557  /usr/bin/bzip2 -cdfq
          3558  /usr/sbin/zfs receive zones/a0f8cf30-f2ea-11e1-8a51-5793736be67
#
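To confirm that this one process really is the one growing, prstat can also be pointed directly at PID 3553 (from the pgrep above); every 5 seconds it prints a fresh line, so you can watch the RSS column climb:

# prstat -c -p 3553 5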
So, imgadm is a node.js script. Is there a problem with node? It turns out that node.js is linked with libumem.so, so we can take a core dump and check it for possible memory leaks.
# pldd 3553
3553:   /usr/node/bin/node /usr/sbin/imgadm import a0f8cf30-f2ea-11e1-8a51-579
/lib/libumem.so.1
/lib/libz.so.1
/lib/libssl.so.1.0.0
/lib/libcrypto.so.1.0.0
/lib/libkstat.so.1
/lib/libsocket.so.1
/lib/libnsl.so.1
/usr/lib/libstdc++.so.6.0.13
/lib/libm.so.2
/usr/lib/libgcc_s.so.1
/lib/libpthread.so.1
/usr/lib/libc/libc_hwcap1.so.1
/usr/node/0.8/node_modules/dtrace-provider.node
/lib/libumem_trampoline.so.1
# cd /var/tmp
# gcore 3553
gcore: core.3553 dumped
#
A quick look at the core file for possible memory leaks:
# mdb core.3553
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
> ::findleaks -f -v
findleaks: maximum buffers => 20613
findleaks:  actual buffers => 19722
findleaks:
findleaks: potential pointers => 9391153
findleaks:        dismissals => 3760417 (40.0%)
findleaks:            misses => 3942678 (41.9%)
findleaks:              dups => 1668382 (17.7%)
findleaks:           follows =>   19676 ( 0.2%)
findleaks:
findleaks: peak memory usage => 299 kB
findleaks: elapsed CPU time  => 0.5 seconds
findleaks: elapsed wall time => 0.8 seconds
findleaks:
CACHE     LEAKED   BUFFER CALLER
08619a10      35 08624b10 ?
08619e10       4 0862fce0 ?
0861a010       3 08623a08 ?
0861a410       2 0868dee8 ?
0861ba10       2 08698c40 ?
------------------------------------------------------------------------
   Total      46 buffers, 1688 bytes
>
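One caveat: the '?' entries in the CALLER column mean the core contains no allocation stack traces, because umem's debugging features were not enabled. If we really wanted that detail, one approach (a sketch, not something imgadm does out of the box) is to re-run the command with umem auditing turned on, take another core, and ask ::findleaks for details:

# UMEM_DEBUG=default UMEM_LOGGING=transaction imgadm import a0f8cf30-f2ea-11e1-8a51-5793736be67c &
# gcore $!
# mdb core.$!
> ::findleaks -d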
So, basically, node does not seem to be leaking memory (only 1688 bytes leaked). If node were growing and we did suspect a leak, we could use DTrace to try to determine where the leak is occurring.
# dtrace -q -n 'syscall::brk:entry/pid == 3363/{printf("brk(%lx) = ", arg0); self->in=1;} syscall::brk:return/self->in/{printf("%lx\n", arg1); self->in=0;} tick-5m{exit(0);}'
brk(cf02000) = 0
brk(d003000) = 0
brk(d104000) = 0
brk(d205000) = 0
brk(d306000) = 0
brk(d407000) = 0
#
# dtrace -q -n 'syscall::brk:entry/pid == 3553/{printf("brk(%lx) = ", arg0); self->in=1;} syscall::brk:return/self->in/{printf("%lx\n", arg1); ustack(); self->in=0;} tick-5m{exit(0);}'
brk(d8df000) = 0
              libc.so.1`_brk_unlocked+0x15
              libumem.so.1`vmem_sbrk_alloc+0xa1
              libumem.so.1`vmem_xalloc+0x45e
              libumem.so.1`vmem_alloc+0x19f
              libumem.so.1`vmem_xalloc+0x45e
              libumem.so.1`vmem_alloc+0x19f
              libumem.so.1`umem_alloc+0xf0
              libumem.so.1`malloc+0x36
              libstdc++.so.6.0.13`_Znwj+0x28
              libstdc++.so.6.0.13`_Znaj+0x1e
              node`_ZN4node6Buffer7ReplaceEPcjPFvS1_PvES2_+0xed
              node`_ZN4node6BufferC1EN2v86HandleINS1_6ObjectEEEj+0x97
              node`_ZN4node6Buffer3NewERKN2v89ArgumentsE+0x172
              node`_ZN2v88internalL30Builtin_HandleApiCallConstructENS0_12_GLOBAL__N_116BuiltinArgumentsILNS0_21BuiltinExtraArgumentsE1EEEPNS0_7IsolateE+0x1d6
              0xba00a336
              0xba021aef
              0x9f20b009
              0xba00db41
              0xba015e82
              0x9f20f940
#
# mdb /usr/node/bin/node
> ::dem node`_ZN4node6Buffer7ReplaceEPcjPFvS1_PvES2_
node`_ZN4node6Buffer7ReplaceEPcjPFvS1_PvES2_ == node`node::Buffer::Replace
>
So the node::Buffer::Replace method is calling malloc(), which in turn decided it needed more memory, so it called brk(2). The brk(2) system call grows the virtual size of the heap. Again: node is large, but it is most likely not leaking memory.
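If we did want to see which code paths account for most of the heap usage, a DTrace pid-provider one-liner can aggregate the bytes requested from libumem's malloc() by user stack; a minimal sketch against PID 3553 (press Ctrl-C to get the summary):

# dtrace -n 'pid$target:libumem.so.1:malloc:entry { @bytes[ustack()] = sum(arg0); }' -p 3553

Each distinct allocation stack is printed along with the total number of bytes it requested, which in this case would very likely point back at node::Buffer::Replace again.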
And the tmpfs file system does not appear to be using the memory:
# df -h | grep swap
swap                   2.1G   884K       2.1G     1%    /etc/svc/volatile
swap                   2.1G    12K       2.1G     1%    /tmp
swap                   2.1G    32K       2.1G     1%    /var/run
# swap -lh
swapfile                  dev    swaplo   blocks     free
/dev/zvol/dsk/zones/swap  90,1       4K     2.0G     2.0G
# swap -sh
total: 171M allocated + 37M reserved = 208M used, 2.1G available
#
Let's take a look at the kernel. Each of the following commands was run a few minutes apart.
# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     180073               703   35%
ZFS File Data              181992               710   35%
Anon                        49035               191    9%
Exec and libs                1282                 5    0%
Page cache                  16457                64    3%
Free (cachelist)             7591                29    1%
Free (freelist)             85440               333   16%
Total                      521870              2038
Physical                   521869              2038
# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     181625               709   35%
ZFS File Data              186546               728   36%
Anon                        49028               191    9%
Exec and libs                1282                 5    0%
Page cache                  16456                64    3%
Free (cachelist)             7593                29    1%
Free (freelist)             79340               309   15%
Total                      521870              2038
Physical                   521869              2038
# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     187335               731   36%
ZFS File Data              177882               694   34%
Anon                        49040               191    9%
Exec and libs                1284                 5    0%
Page cache                  16459                64    3%
Free (cachelist)             7326                28    1%
Free (freelist)             82544               322   16%
Total                      521870              2038
Physical                   521869              2038
#
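The "ZFS File Data" line is essentially the ZFS ARC. If you want to watch the ARC directly rather than re-running ::memstat, its current size and target size (both in bytes) are exported as kstats:

# kstat -p zfs:0:arcstats:size
# kstat -p zfs:0:arcstats:c

Keep in mind that the ARC will happily grow into free memory and shrink again under pressure, so a large number here is not, by itself, evidence of a leak.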
The kernel is growing, but it is not using an excessive amount of space. So: applications are growing, but not enough to cause problems, and the same can be said for files in /tmp and for the kernel. The conclusion is that there is no memory leak.
My student thought a memory leak was causing the imgadm import to fail. Here is the failure message:
...
74% [=====================         ]  time 372.5s  eta 130.9s
{"name":"imgadm","hostname":"00-0c-29-4e-0c-94","pid":2891,"level":50,"cmd":"/usr/sbin/zfs destroy -r zones/1327ed36-5130-11e2-95a8-9b5a153adf3e-partial","error":{"killed":false,"code":1,"signal":null},"stdout":"","stderr":"cannot destroy 'zones/1327ed36-5130-11e2-95a8-9b5a153adf3e-partial': dataset is busy\ncannot destroy 'zones/1327ed36-5130-11e2-95a8-9b5a153adf3e-partial': dataset is busy\n","cleanDsName":"zones/1327ed36-5130-11e2-95a8-9b5a153adf3e-partial","msg":"error destroying tmp dataset while cleaning up","time":"2013-03-13T13:08:05.784Z","v":0}
imgadm: error (InternalError): uncompression error while importing: exit code 2
#
This occurred on a system where network access was slow and not very reliable. Most likely, the failure is due not to a memory shortage, but to corruption of the data received over the network. We may come back to this in a future blog.
To summarize what to look for when you suspect a memory leak (a rough script that ties these steps together follows the list):
- First, does vmstat(1M) show free memory (and/or free swap) declining over time?
- If so, use prstat(1M) to check for large, growing processes.
- If no processes are large and growing, check tmpfs files (/tmp).
- If there are no large files under tmpfs, check the kernel.
- If the kernel is fine, there are no large (and growing) tmpfs files, and no large and growing processes, you probably don't have a leak.
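Here is a rough shell sketch of that checklist. It assumes you are in the global zone of a SmartOS/illumos system; the section labels are just for illustration:

#!/bin/bash
# Quick memory triage for a SmartOS/illumos global zone.

echo "--- free memory (KB, last vmstat sample) ---"
vmstat 1 2 | tail -1 | awk '{print $5}'

echo "--- five largest processes by RSS ---"
prstat -c -s rss -n 5 1 1

echo "--- tmpfs usage ---"
df -h /tmp /var/run /etc/svc/volatile

echo "--- kernel / ZFS / anon breakdown ---"
echo ::memstat | mdb -k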
Note that files that have been unlinked but are still open can use up memory if they are being written to and live on a tmpfs file system. How to deal with this will be covered in a future blog.
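A quick way to get a hint that this is happening (a rough heuristic only) is to compare what the file system reports as used with what the visible files add up to; a large gap between the two suggests space held by unlinked-but-still-open files:

# df -h /tmp
# du -sh /tmp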
We offer comprehensive training for Triton Developers, Operators and End Users.
Post written by Mr. Max Bruning