How to kill Zombie processes in Linux

A zombie process in Linux is a process that has already finished executing but still has an entry in the system’s process table. The entry lingers because the parent process has not yet cleaned it up, which normally happens right after the child completes.

Normally, when a process completes its execution, it notifies its parent, which is responsible for removing it from the process table. The entry can only be removed once the parent has read the child’s exit status. If the parent never does so, the dead process stays in the process table. These leftover entries are what we call zombie processes.

What causes Linux Zombie processes?

A poorly written parent process may fail to call the wait() function after creating a child process. As a result, its zombie children will linger in the process table until they are cleaned up.

This means that nothing is monitoring the child process for state changes, and the SIGCHLD signal will be ignored. Another application may also be interfering with the parent process’s execution, either through poor programming or malicious intent.

The proper system housekeeping will not occur if the parent process is not watching for state changes in the child process.

When the child process finishes, its PCB and its entry in the process table will not be removed. As a result, the process stays in the zombie state.
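To see this failure mode first-hand, the following sketch deliberately creates a short-lived zombie. It assumes a Linux system with /proc mounted; the helper name children_of and the timings are illustrative choices, not part of any standard tool. The inner shell starts a child in the background and then replaces itself with sleep, a program that never calls wait(), so the child's exit status is never collected.

```shell
#!/bin/sh
# Print "PID STATE" for every direct child of the parent PID given as $1,
# reading the kernel's per-process status files under /proc.
children_of() {
    for status in /proc/[0-9]*/status; do
        awk -v want="$1" '
            /^State:/ { state = $2 }
            /^Pid:/   { pid = $2 }
            /^PPid:/  { ppid = $2 }
            END { if (ppid == want) print pid, state }
        ' "$status" 2>/dev/null
    done
}

# The inner shell forks "sleep 1", then exec replaces the shell with
# "sleep 5". That sleep is now the parent, and it never calls wait().
sh -c 'sleep 1 & exec sleep 5' &
parent=$!

sleep 2    # give the child time to exit
zombie_report=$(children_of "$parent")
echo "children of $parent: $zombie_report"

# Clean up the demonstration parent and collect its exit status.
kill "$parent"
wait "$parent" 2>/dev/null || true
```

The state column of the reported child should read Z for as long as the demonstration parent is alive.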

Zombies do use a small amount of memory, but that isn’t usually a problem. Because Linux systems have a finite (albeit large) number of PIDs, no new process can start if enough PIDs are held by zombies. It’s unlikely that this will happen.
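To put the PID limit in perspective, a minimal sketch like the one below compares the kernel's PID ceiling with the number of entries currently in the process table. It assumes a Linux system that exposes /proc/sys/kernel/pid_max; the variable names are illustrative.

```shell
#!/bin/sh
# Upper bound on PID values, as exposed by the kernel.
pid_max=$(cat /proc/sys/kernel/pid_max)

# Count the process entries currently listed under /proc (zombies included).
proc_count=0
for dir in /proc/[0-9]*; do
    if [ -d "$dir" ]; then
        proc_count=$((proc_count + 1))
    fi
done

echo "pid_max:   $pid_max"
echo "processes: $proc_count"
```

On most systems the second number is a tiny fraction of the first, which is why running out of PIDs because of zombies is rare.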

However, zombied processes suggest that something has gone wrong with an application and that a specific program may have a bug.
Software bugs in data centers should not be tolerated and must be addressed.
Until the fault is fixed, you should keep an eye out for zombie processes and clear them out.

The process table entry itself is tiny, and the PCB is much bigger. Until the entry is released, however, its process ID cannot be reused.
On a 64-bit operating system, this is unlikely to cause any problems.

The amount of memory available for other processes could be affected by a large number of zombies. However, if you have that many zombies, you have a severe problem with the parent application or a bug in the operating system.

So, what do you do when a process turns into a zombie? You track down and eliminate the zombie processes.

How to find a zombie process?

The first step in killing a zombie process is to identify it. Because the init process adopts orphaned zombies and regularly cleans up after them, all you have to do to get rid of them is kill the process that created them.

The top command is a quick way to see if there are any zombies on your system. To do this, run the following command.

top

top command results

The number of zombie processes on the system is shown in the summary at the top of the output. In our case above, we have 0 zombies.
To get a list of them, we can pipe the output of the ps command into egrep. The state flag for zombie processes is “Z,” and you’ll sometimes see “defunct” as well.

tuts@fosslinux:~$ ps aux | egrep "Z|defunct"

The state flag for zombie processes is Z or defunct

Let’s break down what to look for in the output of this command.

Z in the STAT column of the output identifies a zombie process.
[defunct] in the last (COMMAND) column of the output also identifies a zombie process.
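One caveat is that the pattern "Z|defunct" matches any line containing a capital Z anywhere, including user names, command arguments, and the egrep process itself. A stricter sketch asks ps for explicit columns and tests only the state column. The function name only_zombies is a hypothetical helper; the ps options used here (-e and -o with pid, ppid, stat, comm) are standard ones.

```shell
#!/bin/sh
# Keep the header line plus rows whose third column (STAT) starts with Z.
only_zombies() {
    awk 'NR == 1 || $3 ~ /^Z/'
}

# Explicit columns make the position of the STAT column predictable.
ps -eo pid,ppid,stat,comm | only_zombies
```

Because the PPID is printed next to each zombie, the same listing also shows which parent is failing to clean up.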

Strictly speaking, it is not possible to kill a zombie process because it is already dead. Instead, we ask the parent to read the child’s exit status so that the entry is finally removed from the process table. To trigger this, we send a SIGCHLD signal to the zombie’s parent. To identify the parent process ID (PPID), run the following command:

tuts@fosslinux:~$ ps -o ppid= <Child PID>

Identifying the parent process ID
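The same lookup can be done without ps by reading the files the kernel exposes under /proc. This is a sketch that assumes /proc is mounted; parent_of and name_of are hypothetical helper names. Printing the parent's command name alongside its PID is a useful check before sending it any signal.

```shell
#!/bin/sh
# Print the parent PID (PPID) of the process whose PID is $1.
parent_of() {
    awk '/^PPid:/ { print $2 }' "/proc/$1/status"
}

# Print the command name of the process whose PID is $1.
name_of() {
    cat "/proc/$1/comm" 2>/dev/null || echo unknown
}

# Example: inspect the parent of the current shell ($$ is this shell's PID).
ppid=$(parent_of "$$")
echo "PID $$ has parent $ppid ($(name_of "$ppid"))"
```

In practice you would pass the zombie's PID instead of $$.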

After getting the parent’s PID, use the kill command to send the SIGCHLD signal to the previously identified parent process.

tuts@fosslinux:~$ kill -s SIGCHLD <Parent PID>

use the command SIGCHLD signal

In some cases, this does not clear out the zombie process, and we have to fall back on plan B or plan C. Plan B entails restarting or killing the parent process. Plan C is a system reboot, which is worth considering mainly when the zombie processes could cause an outage or a massive surge in resource usage.

Below is the command to kill the parent process.

tuts@fosslinux:~$ kill -9 <Parent PID>

command to kill the parent process

When a parent process is killed, its zombie children are adopted by the init process, which reads their exit status and removes them. Children that are still running are not killed automatically; they are also adopted by init and keep running unless they depend on the parent. If one of the child processes is critical at the given time, you may need to postpone the killing until it is safe. A quick double-check can also tell you how much memory or processing power the zombie processes are consuming. This helps determine whether the better option is to kill the parent process or to reboot the system during the next scheduled maintenance cycle.
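The escalation described above can be rehearsed safely on a throwaway zombie. The sketch below is an illustration under stated assumptions: /proc is mounted, the helper state_of and the PIDFILE variable are hypothetical names, and the parent is a plain sleep that never calls wait(). The signal is written as CHLD, without the SIG prefix, because that form is accepted by the kill built-in of more shells. It tries the gentler SIGTERM before resorting to kill -9.

```shell
#!/bin/sh
# Print the state letter of PID $1, or "gone" if the process no longer exists.
state_of() {
    awk '/^State:/ { print $2 }' "/proc/$1/status" 2>/dev/null || echo gone
}

# Build a zombie: the inner shell records the child's PID, then turns
# itself into "sleep 30", which will never collect the child's status.
pidfile=$(mktemp)
PIDFILE="$pidfile" sh -c 'sleep 1 & echo $! > "$PIDFILE"; exec sleep 30' &
parent=$!
sleep 2
zombie=$(cat "$pidfile")
rm -f "$pidfile"

before=$(state_of "$zombie")

# Plan A: nudge the parent. This only helps if the parent handles SIGCHLD.
kill -s CHLD "$parent"
sleep 1
after_chld=$(state_of "$zombie")

# Plan B: terminate the parent so that init can adopt and reap the zombie.
kill "$parent"
wait "$parent" 2>/dev/null || true
sleep 1
parent_state=$(state_of "$parent")
after_kill=$(state_of "$zombie")

echo "zombie before SIGCHLD: $before"
echo "zombie after SIGCHLD:  $after_chld"
echo "parent after kill:     $parent_state"
echo "zombie after kill:     $after_kill"
```

On a typical system the last line reads "gone". In a minimal container whose PID 1 does not reap adopted children, the zombie may persist even after its parent is killed.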

How do process states work on Linux?

Of course, Linux must keep track of all the applications and daemons running on your computer. Maintaining the process table is one of the ways it accomplishes this.
The process table is a list of structures in kernel memory. It includes an entry for each process, and each entry contains very little information.

They store the process ID, a few other pieces of information, and a pointer to the process control block (PCB).

The PCB is where Linux stores all of the information it needs to look up or set for each process. The PCB is modified as a process is created, given processing time, and finally destroyed.

There are over 95 fields in the Linux PCB. It’s defined in the task_struct structure, whose definition is over 700 lines long. The following kinds of information can be found in the PCB:

  • Process State: One of the states described below.

  • Process Number: The operating system’s unique identifier for the process.
  • Program Counter: When this process is given access to the CPU again, the system will use this address to locate the next instruction of the process to be executed.
  • Registers: The CPU registers used by this process. The list may include accumulators, index registers, and stack pointers.
  • Open File List: The files associated with this process.
  • CPU Scheduling Information: Used to calculate how often and for how long this process receives CPU processing time.
    The PCB must record the process priority, pointers to scheduling queues, and other scheduling parameters.
  • Memory Management Information: Information about the memory that this process is using, such as the process memory’s start and end addresses, as well as pointers to memory pages.
  • Information on the I/O status: Any devices that the process uses as inputs or outputs.

Any of the following can be the “Process State”:

  • R: A running or runnable process. Running means it is receiving and executing CPU cycles.
    A runnable process is ready to run and is waiting for a CPU slot.
  • S: Sleeping.
    The process is waiting for an action to complete, such as an input or output operation, or for a resource to become available.
  • D: The process is in uninterruptible sleep. It is in a blocking system call and won’t proceed until that call completes. Unlike in the “Sleep” state, a process in this state will not respond to signals until the system call has completed and execution has returned to the process.
  • T: The process has been stopped (suspended) because it received the SIGSTOP signal.
    It will only respond to the SIGKILL or SIGCONT signals, which either kill the process or instruct it to continue. This is what happens when you suspend a foreground job (for example, with Ctrl+Z) and later resume it with fg or bg.
  • Z: Zombie. When a process finishes, it does not simply disappear. Instead, it frees the memory it was using and leaves memory, but its process table entry and PCB remain.
    Its state is set to EXIT_ZOMBIE, and its parent process is notified via the SIGCHLD signal that the child process has finished.
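To see how these states are distributed on a live system, a minimal sketch can tally the state letter of every process. It assumes /proc is mounted; describe_state is a hypothetical helper that covers only the states listed above and labels everything else as "other".

```shell
#!/bin/sh
# Translate a one-letter state code into a short description.
describe_state() {
    case "$1" in
        R) echo "running or runnable" ;;
        S) echo "interruptible sleep" ;;
        D) echo "uninterruptible sleep" ;;
        T) echo "stopped" ;;
        Z) echo "zombie" ;;
        *) echo "other" ;;
    esac
}

# Collect the state letter of every process, then count each letter.
for status in /proc/[0-9]*/status; do
    awk '/^State:/ { print $2 }' "$status" 2>/dev/null
done | sort | uniq -c | while read -r count letter; do
    echo "$count $letter ($(describe_state "$letter"))"
done
```

A healthy system normally shows most processes in state S and no line for Z at all.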

Conclusion

Unless they’re part of a vast horde, Zombies aren’t that harmful. A few aren’t a big deal, and a quick reboot will clear them out. However, there is one point to consider.

Linux systems have a maximum number of processes and, as a result, a maximum number of process IDs. If zombies use up all of the available process IDs, new processes cannot be started.

Zombie processes aren’t really processes; they’re the remnants of dead processes that their parent process hasn’t correctly cleaned up. If you notice that a particular application or process is constantly spawning zombies, you should investigate further.

Most likely, it’s just a poorly written program; in that case, maybe there’s an updated version that cleans up after its child processes properly.
