VMware | 330723

Smarts: How do I set the MALLOC_CHECK_ environment variable for heap memory debugging of a Smarts environment?

Last update date:

4/16/2024

Affected products:

Smart Assurance - NCM

Smart Assurance - SMARTS

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

Symptoms

This article explains how to set the MALLOC_CHECK_ environment variable for heap memory debugging of a Smarts environment. Errors similar to the following are found in the stack trace from logs or core files for the Smarts environment:

    Thread 208 (Thread 1157204288 (LWP 2576)):
    #0 0x0000003f7500d9eb in read () from /lib64/libpthread.so.0
    #1 0x00002b8e5788a5e5 in _lineread (fd=4, buf=0x44f92ac0 "", bufsz=<value optimized out>) at /work/blackcurrent/DMT-9.0.2.X/7/smarts/platform/misc/posix/POSIX_stackdump.c:699
    #2 0x00002b8e5788a79b in sm_POSIXstacktrace (fd=-1, stdinfo=<value optimized out>, line=453, file=0x2b8e57915070 "/work/blackcurrent/DMT-9.0.2.X/7/smarts/platform/misc/posix/POSIX_stackdump.c") at /work/blackcurrent/DMT-9.0.2.X/7/smarts/platform/misc/posix/POSIX_stackdump.c:531
    #3 0x00002b8e5788b1ad in sm_LINUXstacktrace (fd=-1, stdinfo=0) at /work/blackcurrent/DMT-9.0.2.X/7/smarts/platform/misc/posix/POSIX_stackdump.c:453
    #4 0x00002b8e57886a68 in fatalHandler (sig=11, info=0x44f96230, context=<value optimized out>) at /work/blackcurrent/DMT-9.0.2.X/7/smarts/platform/thread/posix/sthread.c:926
    #5 0x00002aaabc36345a in call_chained_handler () from /opt/InCharge8/SAM/smarts/jre/lib/amd64/server/libjvm.so
    #6 0x00002aaabc3603fb in os::Linux::chained_handler () from /opt/InCharge8/SAM/smarts/jre/lib/amd64/server/libjvm.so
    #7 0x00002aaabc363f40 in JVM_handle_linux_signal () from /opt/InCharge8/SAM/smarts/jre/lib/amd64/server/libjvm.so
    #8 0x00002aaabc36030e in signalHandler () from /opt/InCharge8/SAM/smarts/jre/lib/amd64/server/libjvm.so
    #9 <signal handler called>
    #10 malloc_consolidate (av=0x2b8e578245c0) at malloc.c:4874
    #11 0x00002b8e5761e47b in _int_malloc (av=0x2aaade075080, bytes=<value optimized out>) at malloc.c:4290
    #12 0x00002b8e57621e1e in malloc (bytes=776) at malloc.c:3655
    #13 0x0000003f75cbd25d in operator new () from /usr/lib64/libstdc++.so.6
    #14 0x0000003f75cbd379 in operator new[] () from /usr/lib64/libstdc++.so.6
    #15 0x00002aaaac80cd05 in CI_Sequence_U<MR_MonitoringSystem::MR_ValueChange>::resize (this=0x44f967b0, nSize=16) at /work/blackcurrent/DMT-9.0.2.X/7/smarts/install/linux_rhAS50-x86-64/optimize/include/clsapi/ci_sequence_t.h:90
    #16 0x00002aaaac808713 in MR_MonitoringLog::commit_end (this=0x1b75f180) at /work/blackcurrent/DMT-9.0.2.X/7/smarts/install/linux_rhAS50-x86-64/optimize/include/clsapi/ci_sequence.h:77
    #17 0x00002aaaaca55785 in MR_LogManager::commit_end_st () at /work/blackcurrent/DMT-9.0.2.X/7/smarts/repos/mr/log.c:311
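
One way to check whether a saved log or stack trace shows this pattern is to count the lines that mention the glibc allocator frames seen in the trace above (malloc_consolidate, _int_malloc, malloc). The following is a minimal sketch, not a Smarts tool: the function name count_malloc_frames is hypothetical, and the file you pass to it is whatever log or trace you have saved.

```shell
#!/bin/sh
# Hypothetical helper (not part of Smarts): count the lines in a saved log or
# stack trace that mention the glibc allocator frames seen in the example trace.
count_malloc_frames() {
    # $1: path to a log file or saved stack trace
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c -E 'malloc_consolidate|_int_malloc|in malloc \(' "$1"
}
```

A non-zero count only suggests that the crash happened inside the allocator; it does not by itself identify the code that corrupted the heap.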

Resolution

On the server where the Smarts SAM/IP or other Smarts domain or broker resides, edit the runcmd_env.sh file and add or edit the MALLOC_CHECK_ value. This enables the diagnostic logging feature for memory heap corruption in glibc.

IMPORTANT! While the MALLOC_CHECK_ variable is set, system performance can be reduced by as much as 25%. When you have finished collecting data, reset the MALLOC_CHECK_ variable to 0 (or unset it); otherwise this setting may continue to cause performance issues.

To set the variable, use the following steps:

1. From the <Basedir>/smarts/bin/ directory, run the following command from the command line:

    sudo ./sm_edit <Basedir>/local/conf/runcmd_env.sh

2. Add or edit the following line in the runcmd_env.sh file, specifying the value corresponding to the level of logging needed. The following example sets the MALLOC_CHECK_ value to 3 (see the following section):

    export MALLOC_CHECK_=3

3. Save your changes and restart the SAM/IP/SMARTS domain or broker.

MALLOC_CHECK_ variable values

The MALLOC_CHECK_ variable can have the following values and logging levels:

0: Any detected heap corruption is silently ignored and no error message is generated. This gives a higher tolerance for errors caused by memory heap corruption.

1: The error message is printed on stderr, but the program is not aborted.

2: abort() is called immediately, but no error message is generated.

3: Combines settings 1 and 2. The error message is printed on stderr and the program is aborted. This can be useful because otherwise a crash may happen much later, and the true cause of the problem is then very hard to track down.
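
The value table above can be kept at hand as a small shell helper when deciding which level to put in runcmd_env.sh. This is only a sketch: describe_malloc_check and current_malloc_check are hypothetical names, not part of Smarts or glibc, and the descriptions simply restate the values listed in this article.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Smarts or glibc).

# Print the behavior this article lists for a given MALLOC_CHECK_ value.
describe_malloc_check() {
    case "$1" in
        0) echo "ignore: corruption silently ignored, no message" ;;
        1) echo "report: message on stderr, program continues" ;;
        2) echo "abort: abort() called, no message" ;;
        3) echo "report+abort: message on stderr, then abort()" ;;
        *) echo "unknown MALLOC_CHECK_ value: $1" >&2; return 1 ;;
    esac
}

# Report whether MALLOC_CHECK_ is set in the current shell, and to what.
current_malloc_check() {
    if [ -n "${MALLOC_CHECK_+set}" ]; then
        echo "MALLOC_CHECK_=${MALLOC_CHECK_}"
    else
        echo "MALLOC_CHECK_ is not set"
    fi
}
```

For example, after sourcing these functions in a shell where `export MALLOC_CHECK_=3` has been run, `current_malloc_check` prints the current setting and `describe_malloc_check "$MALLOC_CHECK_"` prints the matching description. Running `current_malloc_check` again after data collection is a quick way to confirm the variable was reset in that shell.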
