opened 04:04PM - 06 Oct 21 UTC
T: bug
C: other
P: default
needs diagnosis
Originally brought up by me in https://github.com/QubesOS/qubes-issues/issues/61…74#issuecomment-936180012
> > > [0.048xxx] random: crng done (trusting CPU's manufacturer)
> >
> >
> > This! I've just rechecked the failed log, and I don't see `trusting CPU's manufacturer` part there. And indeed that CPU does not support RDRAND. This means, the extreme issue I see, applies only to quite old systems (and hopefully does not affect majority of our users - even good old x230 already has RDRAND). So, I'm lowering the priority. But it's still worth improving the situation.
>
> Strongly discouraged to rely on RDRAND for security / entropy quality anyhow as per: https://www.whonix.org/wiki/Dev/Entropy#RDRAND
@marmarek https://github.com/QubesOS/qubes-issues/issues/6174#issuecomment-936226779:
> > Strongly discouraged to rely on RDRAND for security / entropy quality anyhow as per:
>
> In context of _this issue_, it is not a problem, because stubdomain does not use RNG for any security critical task. There is not crypto involved etc. One could argue it may make ASLR for qemu less effective, but we don't consider qemu trusted, so it is not a huge deal (and remember the RDRAND issues are still very hypothetical - see below).
>
> In a broader context of RDRAND, I don't think we should worry about _backdoors_ there. Or rather: if you consider intentional backdoors in your CPU a valid threat, throw away that CPU. There is no really a difference how such hypothetical backdoor could work - whether that would be predictable RDRAND, [reacting to some magic values to any other instruction](http://blog.cr4.sh/2015/07/building-reliable-smm-backdoor-for-uefi.html), or anything else. We could worry about its effectiveness - not intentional bugs, which indeed is hard to reason about, since its being opaque.
Seems like I need to make a better argument.
Quote https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
> random.trust_cpu={on,off}
>
> [KNL] Enable or disable trusting the use of the CPU's random number generator (if available) to fully seed the kernel's CRNG. Default is controlled by CONFIG_RANDOM_TRUST_CPU.
The name of the kernel parameter `random.trust_cpu` is not ideal. There is no need to invoke big words such as "trust" or "backdoor" for the sake of this argument: neither a breach of trust nor a backdoor is required for this to be an issue. An ordinary bug, of the kind that has already happened in the past, is enough to justify this change.
Ars Technica reported, [AMD shipped Ryzen 3000 with a serious microcode bug in its random number generator.](https://arstechnica.com/gadgets/2019/10/how-a-months-old-amd-microcode-bug-destroyed-my-weekend/) Lennart Poettering (@poettering) [summarized](https://twitter.com/pid_eins/status/1149649806056280069) the issue nicely.
> Finally, AMD admits it's their fault, and they are preparing a BIOS update to fix RDRAND. You probably should avoid running a CONFIG_RANDOM_TRUST_CPU=y Linux kernel (Fedora) on a Ryzen system without that BIOS update, or all crypto keys generated are not as random as you hope.
Fortunately, that bug was discovered and publicized by a white hat. Given the large number of different CPU models and production batches, it's not a good idea to rely on white hats to swiftly report every such bug.
Another example: [Kernel bug report from 2014, rdrand instruction fails after resume on AMD family 22 CPU](https://bugzilla.kernel.org/show_bug.cgi?id=85911).
"[D. J. Bernstein isn't a fan of RDRAND either.](https://groups.google.com/g/randomness-generation/c/z3Uid45DV34)" In the same mailing list thread someone else posted:
> On https://spideroak.com/browse/share/UTwente/RNG/Tests/NIST-STS/ you can find the results of randomness tests of several random generators including RDRAND.
>
> In the document No_of_failures_calculation.txt you can find the used testing method and the test results.
>
> The actual number of failed tests of RDRAND deviates more than 4 sigma from the expected number of failed tests.
>
> The used software can also be downloaded from the same link so these tests can be reproduced.
>
> As you also can see the XOR_SHIFT PRNG and the Picoquant PQRNG150 TRNG pass the tests with a number of failed tests within the 3 sigma deviation so the tests seem to work fine.
I didn't verify the latter but for my part I've seen enough.
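For anyone who does want to check a claim like the quoted "more than 4 sigma", here is a minimal sketch of how such a deviation is commonly computed. It assumes a simple binomial model (each test fails independently with probability `alpha`) and is not taken from `No_of_failures_calculation.txt`, which may use a different method. The function name and all numbers below are hypothetical and only illustrate the arithmetic.

```python
import math


def failure_z_score(observed_failures: int, num_tests: int, alpha: float) -> float:
    """Deviation of the observed failure count from its expectation, in sigmas.

    Assumes a binomial model: num_tests independent tests, each failing
    with probability alpha for a good generator.
    """
    expected = num_tests * alpha
    sigma = math.sqrt(num_tests * alpha * (1.0 - alpha))
    return (observed_failures - expected) / sigma


# Hypothetical numbers, NOT the results from the quoted mailing list post:
# 10,000 tests at significance level 0.01, of which 145 failed.
z = failure_z_score(observed_failures=145, num_tests=10_000, alpha=0.01)
print(f"deviation from expectation: {z:.2f} sigma")
```

Under this model a good generator is expected to fail about `num_tests * alpha` tests, so what matters is how far the observed count strays from that, not whether there are any failures at all.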
`random.trust_cpu=on` means that [`RDRAND`](https://en.wikipedia.org/wiki/RDRAND) has a privileged position within the Linux entropy-gathering process.
`random.trust_cpu=off` makes it only a "normal" ("unprivileged") source of entropy among other sources (such as keyboard, mouse, CPU jitter, and the usual).
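To see which case applies on a given machine, something like the following sketch could be used. It assumes an x86 Linux system where `/proc/cpuinfo` has a `flags` line and `/proc/cmdline` holds the boot parameters. The helper names `trust_cpu_setting` and `cpu_has_rdrand` are made up for this example, and treating a later occurrence of the parameter as overriding an earlier one is an assumption.

```python
from typing import Optional


def trust_cpu_setting(cmdline: str) -> Optional[str]:
    """Return the value of random.trust_cpu on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "random.trust_cpu" and sep:
            # Assumption: a later occurrence overrides an earlier one.
            value = val
    return value


def cpu_has_rdrand(cpuinfo: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists rdrand."""
    for line in cpuinfo.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == "flags" and "rdrand" in rest.split():
            return True
    return False


# Illustrative input only; on a real system pass in the contents of
# /proc/cmdline and /proc/cpuinfo instead.
sample_cmdline = "ro quiet random.trust_cpu=off"
sample_cpuinfo = "processor\t: 0\nflags\t\t: fpu vme sse2 rdrand hypervisor\n"
print("random.trust_cpu:", trust_cpu_setting(sample_cmdline))
print("CPU advertises rdrand:", cpu_has_rdrand(sample_cpuinfo))
```

If `trust_cpu_setting` returns `None`, the parameter was not passed at boot and the build-time default (`CONFIG_RANDOM_TRUST_CPU`, per the documentation quoted above) applies.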
----
Current kernel entropy sources in Qubes are:
* "privileged": RDRAND
* "unprivileged": keyboard, etc.
Suggested kernel entropy sources:
* "privileged": none
* "unprivileged": RDRAND, keyboard, etc.
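The difference between the two layouts above can be illustrated with a toy model. This is only a sketch and NOT the kernel's actual mixing or crediting algorithm: the `seed_from` helper, the SHA-256 "pool" and the byte strings are all hypothetical. It shows the worst case of a hardware generator stuck at a constant value. If that generator is the only input that seeds the pool, every boot gets the same seed; if it is just one input among several, the seed still varies with the other inputs.

```python
import hashlib
from typing import Iterable


def seed_from(sources: Iterable[bytes]) -> bytes:
    """Toy entropy pool: hash all inputs together into one 32-byte seed."""
    pool = hashlib.sha256()
    for chunk in sources:
        # Length prefix keeps distinct input lists from colliding trivially.
        pool.update(len(chunk).to_bytes(4, "big"))
        pool.update(chunk)
    return pool.digest()


# Hypothetical broken hardware RNG that returns the same bytes every time.
STUCK_RDRAND = b"\xff" * 8

# "Privileged" layout, worst case: the stuck source alone seeds the pool.
boot_a = seed_from([STUCK_RDRAND])
boot_b = seed_from([STUCK_RDRAND])
print("privileged, seeds identical across boots:", boot_a == boot_b)

# "Unprivileged" layout: the stuck source is mixed with other inputs
# (stand-ins here for interrupt timings, input events and so on).
mixed_a = seed_from([STUCK_RDRAND, b"timing-samples-from-boot-a"])
mixed_b = seed_from([STUCK_RDRAND, b"timing-samples-from-boot-b"])
print("unprivileged, seeds identical across boots:", mixed_a == mixed_b)
```

In this model a bad RDRAND cannot make the mixed seed worse than the other inputs alone, which is the argument for keeping it as an ordinary source rather than disabling it.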
----
`random.trust_cpu=on` advantages:
* Perhaps negligibly faster boot of dom0?
`random.trust_cpu=off` advantages:
* RDRAND is used as one entropy source among many, on an equal footing with the others. It is not disabled entirely.
----
[security-misc](https://github.com/Whonix/security-misc/) does it. (#1885)