Right, large portions can (and, to save time, even must) be skipped. Every audit needs a scope. It cannot be “audit everything”. There are repetitive things. Similar things. Examples of things to skip include the code of dependency libraries that are out of scope and trusted for the duration of the audit. One needs a mental model, a rough overview of what the program is generally doing. At that stage, one would catch a “direct” backdoor. Next is to identify the attack surface: how untrusted inputs are processed and how specifically crafted inputs could have unexpected outcomes. That goes for any review, disassembly or source code.
The GNU Hello program source file hello.c at the time of writing contains 170 lines. The output of objdump -d /usr/bin/hello on Debian buster has 2757 lines.
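The line-count comparison can be reproduced with a small script. This is only a rough sketch: the helper names (count_lines, disassemble, expansion_factor, report) and the example paths are made up for illustration, and it assumes objdump from binutils is installed.

```python
import subprocess


def count_lines(text: str) -> int:
    """Count the lines in a block of text."""
    return len(text.splitlines())


def disassemble(binary_path: str) -> str:
    """Return the output of `objdump -d` for a binary (needs binutils)."""
    result = subprocess.run(
        ["objdump", "-d", binary_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def expansion_factor(source_text: str, disassembly_text: str) -> float:
    """Lines of disassembly per line of source code."""
    return count_lines(disassembly_text) / count_lines(source_text)


def report(source_path: str, binary_path: str) -> float:
    """Compare a source file against the disassembly of its binary."""
    with open(source_path, encoding="utf-8") as handle:
        source_text = handle.read()
    return expansion_factor(source_text, disassemble(binary_path))


# Example call (hypothetical paths, adjust to your system):
# print(report("hello.c", "/usr/bin/hello"))
```

With the figures quoted above, 2757 / 170 is roughly 16 lines of disassembly per line of C. Raw line counts are a crude measure, since the binary also contains startup and library glue code, but they give a feel for the gap.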
It’s to showcase the amount of education required.
And that is far harder than this:
For any program that does real things, the number of lines of disassembly to review will be far greater than in its representation as high-level source code.
Most difficult to least difficult:
- hand written object code
- disassembly code
- assembler source code
- C source code
- ruby
Experiment. Introduce people to reviewing disassembly code, teach assembler programming, teach C and teach ruby. Then see which one is the easiest to learn by most people. Try hello world in assembler language vs ruby.
https://blog.cmpxchg8b.com/2020/07/you-dont-need-reproducible-builds.html
I’ll go through a few points.
You don’t need reproducible builds.
Hyperbolic title and contradicted later in the article.
Currently, in Debian and most if not all other operating systems, a compromised build machine could introduce a bugdoor, a “direct” backdoor, or other malware during compilation. Malware introduced into the binary this way would be difficult to spot.
Build machines could be compromised, for example, by insiders, an evil maid, or remote attacks. In the case of Debian, even an honest maintainer and an otherwise fully honest developer community wouldn’t necessarily notice.
Such malicious third parties can be kept out through the use of reproducible builds.
The problem with this scenario is that the user still has to trust the vendor to do the verification. If the trusted vendor is compromised, then they can provide tampered binaries. If they’re not compromised, then there was no benefit to reproducing it with third parties.
Trust isn’t yes/no.
The vendor, for example in the case of Debian, isn’t a monolithic entity.
At the moment the build machine, the build machine administrator, the data center where the build machine might reside, etc. are all in a position to backdoor a binary. With reproducible builds this could be prevented.
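How that prevention works in practice: independent parties rebuild the same source and compare their result against the binary the vendor ships. A minimal sketch of the comparison step, assuming the binaries are already available as bytes (the function names and the rebuilder labels are hypothetical, not any real Debian tooling):

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a blob."""
    return hashlib.sha256(data).hexdigest()


def mismatching_rebuilders(vendor_binary: bytes,
                           rebuilds: Dict[str, bytes]) -> List[str]:
    """Return the names of rebuilders whose output differs from the vendor's.

    An empty list means every independent rebuild is bit-for-bit identical
    to the shipped binary. Any mismatch means something needs investigating.
    """
    vendor_digest = sha256_hex(vendor_binary)
    return sorted(
        name
        for name, blob in rebuilds.items()
        if sha256_hex(blob) != vendor_digest
    )


# Example with stand-in data instead of real package files:
shipped = b"binary built on the official build machine"
rebuilds = {
    "rebuilder-a": b"binary built on the official build machine",
    "rebuilder-b": b"binary built on the official build machine",
}
print(mismatching_rebuilders(shipped, rebuilds))
```

A mismatch does not say who is wrong, the build machine or a rebuilder, only that the binary does not correspond to the source in the way everyone expected. That is already far more than can be noticed today.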
Now if the vendor is compromised or becomes malicious, they can’t give the user any compromised binaries without also providing the source code. This ignores some complexities, like ensuring security updates are delivered even if one vendor is compromised, what to do if the reproducers stop working, or how to reach consensus if the reproducers and your vendor disagree on what software or fork you should be using.
If the vendor is compromised, stop upgrading. Wait for the issue to be resolved. Use a different operating system. Obviously reproducible builds cannot help against the vendor being coerced to do bad things. But at least it can be noticed and users can make an informed decision.
Regardless, even if we ignore these practicalities,
Yes. Safely ignored for now. These issues seem solvable, theoretic, …
the problem with this solution is that the vendor that was only trusted once still provides the source code for the system you’re using. They can still provide malicious source code to the builders for them to build and sign.
I rest my case. Thanks for confirming that reproducible builds can move the issue from binaries to source code. That’s the point of the exercise.
- Q. It’s easier to audit source code than binaries, and this will make it harder for vendors to hide malicious code.
I don’t think this is true, because of “bugdoors”. A bugdoor is simply an intentional security vulnerability that the vendor can “exploit” when they want backdoor access.
At least the issue of introducing extra backdoors at the binary level / build machine compromise can be resolved.
- Q. Build servers get compromised, and that’s a fact. Reproducible builds mean proprietary vendors can quickly check if their infrastructure is producing tampered binaries.
Ignoring this, since it applies only to proprietary software.
- Q. If a user has chosen to trust a platform where all binaries must be codesigned by the vendor, but doesn’t trust the vendor, then reproducible builds allow them to verify the vendor isn’t malicious.
I think this is a fantasy threat model. If the user does discover the vendor was malicious, what are they supposed to do?
Stop installing upgrades. Monitor the situation. Share information with others. Change the vendor to one that isn’t malicious.
- Q. Whether it’s useful for end users or not, it will allow experts to monitor for compromised build servers producing tampered builds.
I think this is true,
Great!
but there are other attacks against compromised build servers, all of which are more common than producing tampered builds.
What other attacks against build servers?
More often, attackers want signing keys so they can sign their own binaries,
An attacker signing their own binaries is what would happen if a Debian build server was compromised. Reproducible builds would stop that.
Compromise of the Debian APT repository signing key would be a disaster, but it’s an unrelated security issue. In theory that would be survivable too, with end-to-end signed debs: debsign, debsig and dpkg-sig. I hope this will be tackled next after reproducible builds.
Non-applicable to Freedom Software.
inject malicious code into source code tarballs,
or malicious patches into source repositories.
Reproducible builds don’t help with any of those problems.
Of course not. Reproducible builds are meant to force backdoor attempts to target the source code, where they are easier to spot, rather than the binary. However, one unsolved issue should not needlessly be allowed to act as a blocker for solving another.
In summary, not convincing at all. The blog post is making the case for reproducible builds. Not the case against reproducible builds.
Btw, Microsoft is also going for reproducible builds. Quote:
Why are the module timestamps in Windows 10 so nonsensical? - The Old New Thing
One of the changes to the Windows engineering system begun in Windows 10 is the move toward reproducible builds. This means that if you start with the exact same source code, then you should finish with the exact same binary code.
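Timestamps are a typical reason why the same source does not produce the same binary. A toy illustration of that, not how Windows or any real compiler does it: toy_build is a made-up stand-in for a compiler, and SOURCE_DATE_EPOCH is a convention from the Reproducible Builds project that the quote itself does not mention.

```python
import hashlib
import os
import time


def build_timestamp(env=None) -> int:
    """Use SOURCE_DATE_EPOCH if it is set, otherwise the current clock."""
    env = os.environ if env is None else env
    value = env.get("SOURCE_DATE_EPOCH")
    return int(value) if value is not None else int(time.time())


def toy_build(source: bytes, timestamp: int) -> bytes:
    """Stand-in for a compiler that embeds the build time in its output."""
    return b"BUILD-TIME:" + str(timestamp).encode("ascii") + b"\n" + source


source = b"int main(void) { return 0; }\n"

# Two builds at different times: same source, different "binaries".
first = toy_build(source, 1600000000)
second = toy_build(source, 1600000060)
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())

# Two builds with the timestamp pinned: same source, identical "binaries".
pinned = {"SOURCE_DATE_EPOCH": "1600000000"}
third = toy_build(source, build_timestamp(pinned))
fourth = toy_build(source, build_timestamp(pinned))
print(hashlib.sha256(third).hexdigest() == hashlib.sha256(fourth).hexdigest())
```

The first comparison prints False and the second prints True. Real toolchains have more sources of nondeterminism than the clock (file ordering, build paths, and so on), so this shows only one piece of the puzzle.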