Kernel Hardening - security-misc

raja · July 13, 2022, 1:54pm

Patrick · July 15, 2022, 1:29pm

Merged. Thank you!

Patrick · July 15, 2022, 1:30pm

Someone sent this to my inbox where I cannot handle it.

Spectre Side Channels
Spectre Side Channels — The Linux Kernel documentation

L1TF
L1TF - L1 Terminal Fault — The Linux Kernel documentation

MDS
MDS - Microarchitectural Data Sampling — The Linux Kernel documentation

TAA-TSX Asynchronous Abort
TAA - TSX Asynchronous Abort — The Linux Kernel documentation

ITLB Multihit
iTLB multihit — The Linux Kernel documentation

SRBDS Special Register Buffer Data Sampling
SRBDS - Special Register Buffer Data Sampling — The Linux Kernel documentation

Core Scheduling
Core Scheduling — The Linux Kernel documentation

L1D Flushing
L1D Flushing — The Linux Kernel documentation

Some hardening settings are already applied in security-misc.

Help is welcome in checking what is missing, if anything.
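One way to see what the running kernel reports for the issues linked above is to read the CPU vulnerabilities directory in sysfs. This is only a minimal sketch: the function name `show_cpu_vulns` is made up for illustration, and the optional directory argument exists only so the function can be tried against a test directory.

```shell
# Print "name: status" for each entry in the CPU vulnerabilities sysfs directory.
# show_cpu_vulns is a hypothetical helper name, not part of security-misc.
show_cpu_vulns() {
    dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    if [ ! -d "$dir" ]; then
        echo "no vulnerabilities directory at $dir" >&2
        return 1
    fi
    for f in "$dir"/*; do
        # Skip anything that is not a regular file (also covers an empty directory).
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Calling `show_cpu_vulns` without arguments reads the live system. Which entries exist depends on the kernel version and CPU.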

Patrick · July 23, 2022, 11:53am

## Makes the kernel panic on uncorrectable errors in ECC memory that an attacker could exploit.
GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX mce=0"

Shouldn’t that be mce=off?

And is it useful for security anyhow?

https://www.kernel.org/doc/Documentation/x86/x86_64/boot-options.rst
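To check which value, if any, a running system was actually booted with, the kernel command line can be inspected. The following is a rough sketch. `cmdline_param` is a hypothetical helper, and the optional second argument (a command line string) is an assumption added so it can be tried without rebooting.

```shell
# Print the value of a kernel command line parameter, e.g. "mce" -> "0".
# Returns 1 and prints nothing if the parameter is absent.
# A bare flag such as "nomce" prints an empty line and returns 0.
# If the parameter is repeated, the last occurrence is reported.
cmdline_param() {
    name="$1"
    cmdline="${2:-$(cat /proc/cmdline)}"
    result=1
    value=""
    for word in $cmdline; do
        case "$word" in
            "$name") value=""; result=0 ;;
            "$name"=*) value="${word#*=}"; result=0 ;;
        esac
    done
    if [ "$result" -eq 0 ]; then
        printf '%s\n' "$value"
    fi
    return "$result"
}
```

For example, `cmdline_param mce` inspects the live `/proc/cmdline`, while `cmdline_param mce 'quiet mce=0'` checks a given string.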

Patrick · July 23, 2022, 12:00pm

Thank you!

This was merged.

But I have undone one single commit out of the many:

related:
set oops=panic kernel parameter or kernel.panic_on_oops=1 sysctl for better security

raja · July 25, 2022, 9:50am

Looking into mce=0, I am not entirely sure what its connection to ECC memory is based on the kernel docs.

Regarding mce=off, I think that would be less secure as per the definition of a machine check:
https://www.kernel.org/doc/html/latest/x86/x86_64/machinecheck.html

“Machine checks report internal hardware error conditions detected by the CPU. Uncorrected errors typically cause a machine check (often with panic), corrected ones cause a machine check log entry.”

Therefore, maybe we should remove or comment out that command as I am not sure what it does. On the other hand, replacing it with mce=off may not be desirable.

Patrick · July 25, 2022, 6:06pm

I wonder where this mce=0 thing is coming from.

web search term:

linux “mce” security hardening
linux “mce=off” ecc security
linux “mce=0” ecc security

Quote Tails - kernel hardening

mce=0

Mostly useful for systems with ECC memory, setting mce to 0 will cause the kernel to panic on any uncorrectable errors detected by the machine check exception system. Corrected errors will just be logged. The default is mce=1, which will SIGBUS on many uncorrected errors. Unfortunately this means malicious processes which try to exploit hardware bugginess (such as rowhammer) will be able to try over and over, suffering only a SIGBUS at failure. Setting mce=0 should have no impact. Any hardware which regularly triggers a memory-based MCE is unlikely to even boot, and the default is 1 only for long-lived servers.

https://www.kernel.org/doc/Documentation/x86/x86_64/machinecheck.rst

Patrick · July 25, 2022, 6:15pm

https://www.mcelog.org/

https://www.mcelog.org/faq.html

https://www.mcelog.org/references.html

web search terms:

site:kernel.org “mce=0”
site:kernel.org “mce=off”
site:kernel.org “nomce”

So I guess we do want the checks, but once an exception is found, whether correctable or uncorrectable, a kernel panic is preferred.

I haven’t found a source I would consider authoritative (the kernel documentation) stating that mce=0 (which can no longer be found in the kernel documentation), nomce, or mce=off results in a kernel panic when an exception is found, that exceptions are even still detected with these settings, or that these settings are related to security.

raja · July 25, 2022, 6:34pm

This must be the reference.

Also I think setting nomce or mce=off stops the kernel from panicking if a machine check error is detected.

Therefore the existing command should either be removed or commented out. This way, uncorrected errors will cause a panic as desired.

Patrick · July 25, 2022, 7:16pm

Maybe there is a general kernel feature to always treat =0 as =off?

https://www.kernel.org/doc/html/v5.3/x86/x86_64/machinecheck.html

tolerant
Tolerance level. When a machine check exception occurs for a non corrected machine check the kernel can take different actions.

0: always panic on uncorrected errors, log corrected errors
1: panic or SIGBUS on uncorrected errors, log corrected errors
2: SIGBUS or log uncorrected errors, log corrected errors
3: never panic or SIGBUS, log all errors (for testing only)

As per that link, the closest to the desired outcome of a kernel panic on error would be setting tolerant (tolerance level) to 0.

(The kernel manual does not show any example. Might be chimerical settings. It might be that mce.tolerant=0 is set via sysctl.)

This could be added to file:
/etc/sysctl.d/30_security-misc.conf

But as per the kernel manual, that applies only to non-corrected errors. For the (maybe?) desired outcome of also panicking as soon as a correctable error is detected, I haven’t found a setting yet.

maybe: Not clear yet if changing MCE settings can increase security at all.
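To see what tolerance level a given kernel currently uses, a small check like the following might help. It is a sketch under assumptions: the sysfs path `/sys/devices/system/machinecheck/machinecheck0/tolerant` is taken from my reading of the v5.3 machinecheck documentation and may not exist on other kernel versions, and the function names are hypothetical. The descriptions are copied from the table quoted above.

```shell
# Map a tolerance level to the description quoted from the v5.3 documentation.
describe_tolerant() {
    case "$1" in
        0) echo "always panic on uncorrected errors, log corrected errors" ;;
        1) echo "panic or SIGBUS on uncorrected errors, log corrected errors" ;;
        2) echo "SIGBUS or log uncorrected errors, log corrected errors" ;;
        3) echo "never panic or SIGBUS, log all errors (for testing only)" ;;
        *) echo "unknown tolerance level: $1"; return 1 ;;
    esac
}

# Read the tolerant value for one CPU and print it with its description.
# ASSUMPTION: the default path follows the v5.3 docs; it may be absent.
read_tolerant() {
    file="${1:-/sys/devices/system/machinecheck/machinecheck0/tolerant}"
    if [ -r "$file" ]; then
        level=$(cat "$file")
        printf '%s: %s\n' "$level" "$(describe_tolerant "$level")"
    else
        echo "tolerant is not exposed at $file" >&2
        return 1
    fi
}
```

If `read_tolerant` reports that the file is missing, that only shows the running kernel does not expose the setting at that path. It says nothing about how machine checks are handled.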

Patrick · August 16, 2022, 9:44am

Got this by e-mail.

Kernel panic is a software routine. It should be executed by the kernel
itself. The MCE handler decides whether the system should panic or not based on
the exception that happened. If you disable MCE and the aforementioned file is
the only place that is called upon identifying such exceptions, the panic will
not happen at all.

And by the way I agree with their concern regarding exposing log messages to
malicious processes. But I would expect them to refer to a study, blog post,
article, code example, etc., to show how this concern can be valid in the real
world.

Patrick · August 16, 2022, 9:45am

Therefore, in the absence of any authoritative recommendation to change any mce settings, it’s best that we comment it out and don’t change anything related to mce until there is a better argument.

Patrick · August 22, 2022, 10:56am

Patrick · August 22, 2022, 10:56am

Merged, thanks!

Patrick · September 21, 2022, 5:26pm

HulaHoop · September 23, 2022, 11:46am

Tails enabled 3 boot-time kernel options for hardening. Some were prompted by changes to upstream security features; another affects TTY, and I am not sure how it will impact our current config.

Patrick · September 23, 2022, 6:51pm

That was all previously done already.

Using.

Also done.

Using.

Thanks anyhow since we could be missing something in theory.

To verify, grepping the source code (probably grepping just the security-misc source code is enough) for kernel parameters which should or should not be used would work. I’ve just done that, but anyone else is welcome to check this as well.
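The grep check described above could look roughly like this. It is a sketch only: `find_kernel_param` is a made-up helper name, the directory argument is whatever local checkout is being inspected, and `grep -r` is assumed to be available (it is in GNU and BSD grep).

```shell
# List every line under a directory that mentions a kernel parameter.
# Output format is file:line:content. Returns non-zero if nothing matches.
# find_kernel_param is a hypothetical helper, not part of security-misc.
find_kernel_param() {
    param="$1"
    dir="${2:-.}"
    # -F treats the parameter as a fixed string, so "mce=0" is not a regex.
    grep -rnF -- "$param" "$dir"
}
```

For example, `find_kernel_param 'mce=0' ./security-misc` would show where that parameter is set or commented out, assuming the repository has been cloned to `./security-misc`.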

raja · September 27, 2022, 4:03pm

Awesome, great to see more eyes keeping track of potential hardening methods.

Patrick · October 18, 2022, 6:38pm

hidepid can be re-tested once we are based on Debian bookworm. Maybe pkexec-based applications will no longer be broken by hidepid.
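When re-testing, it may help to first confirm whether hidepid is actually in effect on /proc. A minimal sketch, assuming the usual `/proc/mounts` field layout. `proc_hidepid` is a hypothetical helper, and the optional file argument is only there for testing.

```shell
# Print the hidepid value from the mount options of /proc, or "unset".
# proc_hidepid is a hypothetical helper name.
proc_hidepid() {
    mounts="${1:-/proc/mounts}"
    # Fields: device mountpoint fstype options dump pass
    opts=$(awk '$2 == "/proc" && $3 == "proc" { print $4; exit }' "$mounts")
    for opt in $(printf '%s' "$opts" | tr ',' ' '); do
        case "$opt" in
            hidepid=*) printf '%s\n' "${opt#hidepid=}"; return 0 ;;
        esac
    done
    echo "unset"
}
```

Depending on the kernel version, the value may be shown as a number or as a name, so the output is printed as-is rather than interpreted.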

Patrick · November 10, 2022, 4:27am