this post was submitted on 22 Aug 2024
65 points (97.1% liked)

Via Andy Miller (2007), an amusing metaphor for Linux memory overcommit. Originally posted by Andries Brouwer to the linux-kernel mailing list, 2004-09-24, in the thread titled “oom_pardon, aka don’t kill my xlock”:

An aircraft company discovered that it was cheaper to fly its planes with less fuel on board. The planes would be lighter and use less fuel and money was saved. On rare occasions however the amount of fuel was insufficient, and the plane would crash. This problem was solved by the engineers of the company by the development of a special OOF (out-of-fuel) mechanism. In emergency cases a passenger was selected and thrown out of the plane. (When necessary, the procedure was repeated.) A large body of theory was developed and many publications were devoted to the problem of properly selecting the victim to be ejected. Should the victim be chosen at random? Or should one choose the heaviest person? Or the oldest? Should passengers pay in order not to be ejected, so that the victim would be the poorest on board? And if for example the heaviest person was chosen, should there be a special exception in case that was the pilot? Should first class passengers be exempted? Now that the OOF mechanism existed, it would be activated every now and then, and eject passengers even when there was no fuel shortage. The engineers are still studying precisely how this malfunction is caused.

Twenty years later, as far as I know, the OOM killer is still going strong. In fact, if you don’t like the airline’s policy on what counts as an “emergency” (for example, that it might exhaust your swap partition too before killing any bad actor at all), you can hire your own hit man, in the form of the userspace daemon earlyoom.
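For a feel of what such a userspace "hit man" has to decide, here is a minimal sketch of the check at its core: read `/proc/meminfo` and compare the kernel's `MemAvailable` estimate against a threshold. This is not earlyoom's actual code. The function names and the 10% default threshold are assumptions chosen for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kiB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_percent(fields):
    """Share of RAM (in percent) the kernel estimates is available without swapping."""
    return 100.0 * fields["MemAvailable"] / fields["MemTotal"]


def should_act(fields, min_available_percent=10.0):
    """True when available memory has dropped below the (assumed) threshold."""
    return available_percent(fields) < min_available_percent
```

On a real system you would feed it `open("/proc/meminfo").read()` in a polling loop and pick a victim once `should_act` returns true. Choosing the victim is, as the metaphor suggests, the hard part.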

Explanation of the OOM-Killer: Understanding Out of Memory Killer (OOM Killer) in Linux

[–] [email protected] 21 points 2 months ago (1 children)

I started using one of the userspace oom killers a while ago and have been much happier. Instead of the system becoming unresponsive, suddenly Slack just dies. It's great.

[–] [email protected] 3 points 2 months ago (2 children)

Why is it, though, that the system just becomes unresponsive? That is always my experience too, but shouldn't the kernel's OOM killer just kill something?

[–] [email protected] 3 points 2 months ago

Yes, that's true. However, the default OOM killer tries its best to save processes first, and that can take a while. Sometimes it took over 30 minutes for me until it killed the bad process, and then everything became responsive again.

[–] [email protected] 1 points 2 months ago

I don't know the details behind it, but it sure takes its sweet time figuring it out. I've let it sit 20 minutes before giving up.

[–] [email protected] 8 points 2 months ago

The OOM killer is particularly bad with ZFS, since the kernel doesn’t by default (at least on Ubuntu 22.04 and Debian 12, where I use it) see ZFS’s cache as cache, and so it thinks it’s out of memory when really ZFS just needs to free up some of its cache. That only happens after the OOM killer has already killed my most important VM. So I’m left running swap to keep the OOM killer from going around causing chaos.
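To see how much memory is tied up this way, one option is to read the ARC statistics that OpenZFS on Linux exposes. The sketch below assumes the usual `/proc/spl/kstat/zfs/arcstats` layout (rows of `name type data`) and the `size` and `c_min` fields; check these against your own system before relying on them. The helper names are made up for illustration.

```python
def parse_arcstats(text):
    """Parse kstat-style 'name type data' rows into a dict of name -> int."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three columns with a numeric last column;
        # the two header rows do not match and are skipped.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def reclaimable_arc_bytes(stats):
    """ARC bytes above its minimum target, i.e. what ZFS could in principle give back."""
    return max(0, stats["size"] - stats["c_min"])
```

Adding this figure to `MemAvailable` gives a rough idea of how far from "out of memory" the machine really is. It is only an estimate, since the ARC shrinks on its own schedule.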

[–] [email protected] 5 points 2 months ago* (last edited 2 months ago) (2 children)

Can it not be disabled? I've heard so many horror stories about the OOM killer that I'm really not a fan at this point.

And might as well add one of my own.

I needed to unpack a very large file, which I kept running in the background, but it used a ton of memory and took a ton of time. So, to make sure I wasn't bored for 30 minutes, I opened up the browser. Around 10 minutes later, I went to check on the window where the operation was running, only to find that the operation had... stopped? After that, I just started the operation again, closed all other windows and background programs, and checked out stuff on my phone while I waited.

[–] [email protected] 5 points 2 months ago* (last edited 2 months ago) (2 children)

You will always need some sort of OOM killer unless you have endless memory (or swap space, which comes with its own problems in the form of grinding your system almost to a halt). Imagine all memory is in use, and then some system-critical task (or even the kernel itself) needs memory as well. If the kernel can't kill a less important process to free memory in such a situation, you might just crash your system.

[–] [email protected] 1 points 2 months ago

I mean, this is literally what someone in the original mailing list said:

How about a sysctl that does "for the love of kbaek, don't ever kill these processes when OOM. If nothing else can be killed, I'd rather you panic"?
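Something close to the first half of that wish exists per process rather than as a sysctl: writing `-1000` to `/proc/<pid>/oom_score_adj` exempts a process from the OOM killer. Here is a minimal sketch, assuming you run it with enough privileges (lowering the value normally needs root). The function names and the `proc_root` parameter are hypothetical, added so the code can be tried against a scratch directory.

```python
import os

OOM_SCORE_ADJ_MIN = -1000  # documented value that exempts the task from the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # makes the task the preferred victim


def oom_score_adj_path(pid, proc_root="/proc"):
    """Path of the per-process OOM adjustment file."""
    return os.path.join(proc_root, str(pid), "oom_score_adj")


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write an OOM score adjustment for pid, validating the range first."""
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj must be in [-1000, 1000], got {value}")
    with open(oom_score_adj_path(pid, proc_root), "w") as f:
        f.write(str(value))


def protect_from_oom_killer(pid, proc_root="/proc"):
    """Ask the kernel never to pick pid as an OOM victim."""
    set_oom_score_adj(pid, OOM_SCORE_ADJ_MIN, proc_root)
```

For example, `protect_from_oom_killer(os.getpid())` would protect the calling process. The "I'd rather you panic" half is roughly what the `vm.panic_on_oom` sysctl is for.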

[–] [email protected] 2 points 2 months ago

AFAIK Solaris and Haiku don't have an OOM Killer by default. malloc just fails if the kernel can't provide enough memory.
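Linux can be pushed in that direction too: with `vm.overcommit_memory=2` the kernel does strict accounting and refuses allocations beyond a commit limit instead of promising memory it may not have. A rough sketch of how that limit is derived, assuming the documented formula (swap plus a percentage of RAM, with huge pages excluded) and the default `vm.overcommit_ratio` of 50; the function names are illustrative.

```python
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (the usual default)",
    1: "always overcommit, never refuse",
    2: "strict accounting, refuse beyond the commit limit",
}


def describe_mode(mode):
    """Human-readable meaning of a vm.overcommit_memory value."""
    return OVERCOMMIT_MODES.get(mode, "unknown mode")


def commit_limit_kib(mem_total_kib, swap_total_kib, overcommit_ratio=50, hugetlb_kib=0):
    """Approximate commit limit in kiB under strict accounting (mode 2).

    limit = swap + (RAM - huge pages) * overcommit_ratio / 100
    """
    return swap_total_kib + (mem_total_kib - hugetlb_kib) * overcommit_ratio // 100
```

With 16,000,000 kiB of RAM and 4,000,000 kiB of swap, `commit_limit_kib(16_000_000, 4_000_000)` gives 12,000,000 kiB. The kernel reports its own figure as `CommitLimit` in `/proc/meminfo`, which is the number to trust.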