Got a verbose answer from Mike Gerwitz.
(Formatting e-mail → forum by me. Otherwise his answer is as is.)
Hey Patrick,

On Mon, Sep 22, 2014 at 01:55:01PM +0000, Patrick Schleizer wrote:
[quote]we’ve been wondering how secure signed git tags actually are. [2] [3]
Linus Torvalds said. [4] [5]
And goes on.
When you wrote that blog post [1], were you aware on what Linus Torvalds thinks on that matter?[/quote]
I did link to a related message from Linus within my article, but it appears that the link[0] is now broken. I’ll have to correct that.
Here’s an archive:
http://web.archive.org/web/20120701215331/http://kerneltrap.org/mailarchive/git/2006/8/27/211020
Signed tags have existed for quite some time and are a separate feature from signed commits. That said, we can consider them together.
To demonstrate what a signed tag contains, I’ll use tag 0.2.4 of GNU ease.js:
$ git cat-file -p 0.2.4
object ee85b058df783ffaa9f8d5ae58f9eb6d7586b0ca
type commit
tag 0.2.4
tagger Mike Gerwitz <mikegerwitz@gnu.org> 1407469612 -0400

GNU ease.js 0.2.4 released
[stable]
[...snip...]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQIcBAABAgAGBQJT5EgsAAoJEPIruBWO4w6rIdkP/14qtZdIJPpJqXBGkifrbVu2
XtCSOH+gXDKSpb4zarmvmJ8LZl6Wt7QidfBNrp5eBOXUTDiKF75Evsj1S61GJxC/
xh/PZtCcyeFInjd1k7kdIJ8ylK1jwVnToJNaaNtckueperVS6ao+ZKUOdnxSHYe3
Pa2jQvjMCdDqmoz0Uzaq9ot7lZ3Fdv2buSnme1DBFqa36OWna1ynJ0VOr5orf9DP
7gvksbG+sdWPkoDXtpzepnpx99CS+oJCUvkZ3pOmfFdU8g7E0rf5+jG3ZfpE0H8c
R5rS3aO7HF8z0kdCrRCaEHetZbrw6M4+M49OnUHLNXcMgFUjGZtQGNgN4nIM3Ukl
GjOI514IW88Ba0K+ybehTuT6A/UCxqUbdLhp0gb34C7rBsTEKNmarfUvO5GZ/9j+
g2w0sHU24tI4XiuBtd/dcYEG1Xudmci1k3T5ffX3yo2Jq/uwhdHtO6WZVE5JOKUh
945lJ9vq/w0jn2TufNbDnleT3QNhp4llBrCGBsq8CWl0XH6nf08R0KrPssUJDnzY
An820qKuuAZ4DkV69XIdjXew4IlhEItdLi4n4iGvu6AzUpYgUD5WmWcp0894E8GG
ruTx8xDlgiT8swCzgHCxMaMcTgxb1Xcwd6+lfPgz4cNKWqRY1THOc6TuzFz5L6MC
eZ0cYtO46O5O1f4O4cJM
=GnrZ
-----END PGP SIGNATURE-----
Before we make too many assumptions here, let’s verify the reasonable assumption that the signature encompasses the entire output that precedes it:
$ t="$(git cat-file -p 0.2.4)"
$ gpg --verify <( tail -17 <<< "$t" ) <( awk '/^-----BEGIN/{exit}{print}' <<< "$t" )
gpg: Signature made Thu 07 Aug 2014 11:46:52 PM EDT using RSA key ID 8EE30EAB
gpg: Good signature from "Mike Gerwitz (Free Software Developer) <mike@mikegerwitz.com>"
gpg:                 aka "Mike Gerwitz (GNU Project Maintainer) <mikegerwitz@gnu.org>"
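The `tail -17` above is tied to the exact length of this particular signature block. A slightly more general sketch, assuming the signature block is the last thing in the tag object, splits on the BEGIN marker instead. The helper names `tag_payload` and `tag_signature` are made up for illustration and are not part of Git or GnuPG:

```shell
# Print everything before the PGP signature block (the signed payload).
tag_payload() {
    awk '/^-----BEGIN PGP SIGNATURE-----/{exit} {print}'
}

# Print the signature block itself, from the BEGIN marker to the end of input.
tag_signature() {
    awk '/^-----BEGIN PGP SIGNATURE-----/{found=1} found{print}'
}

# Usage inside a repository (assumes bash, git, gpg and the signer's public key):
#   t="$(git cat-file -p 0.2.4)"
#   gpg --verify <( tag_signature <<< "$t" ) <( tag_payload <<< "$t" )
```

Git can also do this check in one step with `git tag -v 0.2.4` (or `git verify-tag 0.2.4`); the manual split is only useful for seeing exactly which bytes the signature covers.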
Indeed it does. So the actual value of a GPG-signed tag is that it verifies that I personally issued a release of that object. So what would it take to compromise that tag?
Well, the tag is referenced by name, not hash, so that is immediately removed from the picture; we are therefore left with the object that it references: ee85b05. If we can create an object that is a valid commit and hashes to the same SHA-1 value, then it will appear (to the user of the compromised repository) that I issued that bad commit.
So let’s take a look at that object:
$ git cat-file -p ee85b05
tree 32e5d1faecbc24b16e078ba42c1ab3e2c6515ab6
parent cef45cd0977f5f3f2baa5a5d2da857aff63ee50b
parent a5c89565fe6ceb7ebeef9794afb57415bd9bf099
author Mike Gerwitz <mikegerwitz@gnu.org> 1407466634 -0400
committer Mike Gerwitz <mikegerwitz@gnu.org> 1407466634 -0400
gpgsig -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJT5D1nAAoJEPIruBWO4w6rHgEP/ivLes1/aPI9+a/D3yDs2wBm
 ejjz9KlObNzyKPTylbzEurAjVssFJVutwQW3Q4Rcmi+2lY41+3tQmq3M8k1zG3zw
 G4VxoAPQZ0C4N+garmKytUsTAgpV1NFQS+NjO0nUZH0dpv3bZcnBbnAV0CCQaslN
 WTMEDAo5HI9rEQeY/47Pt2AxGKh+cQaxn9Qnh3wrgAQ9oFrCxYRiV6qcZXDL2O8r
 x/xqgy2vDoAawT70pAqQMgHAGixv4YAklebfr3FQ9/0jT6//sPd+ulrzPbA3nXvv
 Xn9W06jUy4IO9ZSuR2MGMGOrhzW/yNK2UL8L9VLrdVrGsb4Jv3BEVy/pKXjU41hb
 mMsdNvzZpPzMtda9LNPEI7uOV8nYE201vxRzv1EKw4aR6GGuHtPnq025BvdbOBVk
 Gz2L2TVP7u3RZ472ovxmmF8jjutUmp+QbWtiH4p4GgWBcRKNFMPQUI6oZBkyth8Q
 BgpbJRL5fRsBxanns422hB8wfK7nYLf4QbDRnOcefISC0npo0DeGGUPB605mTEtP
 kYo3Uv/fORU5gjgzhaeUiQsXtc0EXLdsQBtzlHkapmpoKJNgBMqj2VVO/YHx3wik
 CaVzVl2zI3q0UFpnu0NkDne9svsjPZbOM/N8MVutvk9oDyZsKc0gxq8v+h+pUySg
 tm7cs9SXXKDvIiw3QpYB
 =z41h
 -----END PGP SIGNATURE-----

Various ES3-related bugfixes for bugs introduced by v0.2.3

GNU ease.js remains committed to supporting environments as far back as ES3; unfortunately, this is important due to the popularity of older IE versions (IE<=8). Btw, ease.js runs on IE 5.5, in case you still need that. But please don't use a proprietary web browser. Indeed, this is why the breaks were introduced in the first place: I neglected to run the browser-based test suite on the proprietary Microsloth browsers until after the v0.2.3 release, because I do not own a copy of Windows; I had to run it at work. But, regardless---my apologies; I'll be more diligent.
Alright, there are a number of things to note here. Firstly, a commit’s hash is generated from all of the content above:
$ git cat-file -p ee85b05 | git hash-object --stdin -tcommit
ee85b058df783ffaa9f8d5ae58f9eb6d7586b0ca
You’ll notice that this is precisely the hash referenced in the tag. If we were to change the commit content in the slightest, we’d get a different hash:
$ cat <( git cat-file -p ee85b05 ) <( echo foo ) | git hash-object --stdin -tcommit
696a73618dd5d0d39f030d19ceab08c14115af4e
If we were to change only the type of the object (leaving the commit data alone), we’d again get a different hash:

$ git cat-file -p ee85b05 | git hash-object --stdin -tblob
441bab4e4006f63d859666322e53740014dcccf0
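The type changes the result because Git does not hash the bare content. As far as I understand the object format, the ID is the SHA-1 of a short header (`<type> <size in bytes>` followed by a NUL byte) and then the content. A minimal sketch that reproduces this without Git, assuming `sha1sum` and `mktemp` are available (on some systems `shasum` would take the place of `sha1sum`); `git_object_id` is a hypothetical helper name:

```shell
# Compute a Git object ID by hand: SHA-1 over "<type> <size>\0<content>".
# Reads the object content from stdin; the first argument is the object type.
git_object_id() {
    obj_type="$1"
    tmp="$(mktemp)" || return 1
    cat > "$tmp"
    # Size in bytes of the content, with any padding from wc stripped.
    size="$(wc -c < "$tmp" | tr -d '[:space:]')"
    { printf '%s %s\0' "$obj_type" "$size"; cat "$tmp"; } | sha1sum | cut -d ' ' -f 1
    rm -f "$tmp"
}

# Usage inside the repository; this should match the git hash-object result:
#   git cat-file -p ee85b05 | git_object_id commit
```

Since the type and the length are part of the hashed data, a forged object has to collide on the whole header-plus-content string, not on the content alone.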
As an attacker, our approach would depend on what it is we are trying to manipulate. Let’s say that the goal is to introduce malicious code into the system. Well, in the case of this commit, it’s signed, so we cannot change its content without invalidating the signature. But not everybody signs their commits, so in another repository, an object referenced by a tag may still be open for exploitation. But in the case of ee85b05, such protection doesn’t necessarily buy us anything: what if we instead looked at the tree?

$ git cat-file -p 32e5d1faecbc24b16e078ba42c1ab3e2c6515ab6
100644 blob ca0ac30fb6f5c008ec7949cf78190876aaaab0ba    .gitignore
100644 blob 0145be4a9a32e42a08de01f100ea243c264f6478    .mailmap
100644 blob 94a9ed024d3859793618152ea559a168bbcbb5e2    COPYING
100644 blob 5a25289c27e078c45be6fa22afc26225b78c65a0    Makefile.am
100644 blob f9d1937bc4209f7d24d6cf41a2c4095f41525aee    README
100644 blob 7254909d0b2b03c680a74bc41b1101d9e3ce903e    README.hacking
100644 blob a088c90395b34a113a9f1c1430ceb7d3a13162ea    README.md
100644 blob 92a128e4015a50f7194ca3f67b4319bdf2715217    README.todo
100644 blob 437da55d65a395ab2fe37536c16783a571832161    README.traits
100644 blob 1c909f9e6c4fc74c49bad30cf78f38bd76e0765c    configure.ac
040000 tree f060b0f430df3a42632bf934a71b6c103ab7b189    doc
100644 blob 171e23e95d28985cdd78c514272c7d44bc5a8ae0    index.js
040000 tree c663e578896401ec3a93ca8034130c984e8eab5e    lib
100644 blob 9d5c6318b8c32fe2784924a52bf23705f56530b2    package.json.in
040000 tree 46e7c203f0e02e8af55ec5b54b813ffbcaabb075    test
040000 tree 5e5fc84526d864d407dfbfc8f99b9c72645ffb60    tools
One option is to rewrite any of those blobs such that they hash to the same value, but contain malicious code. But that is difficult: not only is it hard to generate working code that hides a working exploit such that it hashes to the same value, but it must also play nicely with Git’s deltas.[1] While the topmost blob will contain the full content, each previous commit will contain only a delta relative to it. So if you compromise the repository, someone is going to notice, because the delta won’t apply; you’d need to distribute fresh repositories with properly resolved deltas.

So on and so forth, for every referenced hash, be it a tree, blob, or other commit: anything you modify, you have to either keep the hash the same and consider related consequences, or generate a new hash and be able to amend history such that, at some point, hashes match up so that you can join your amended history with the original. As an example:
--o--A--o--o--B--H
      \      /
       `--X-`
Perhaps you were able to figure out some commit X that hashes to B’s parent and references A as its own parent.

So the goal is to mitigate this as much as possible. SHA-1 already does this fairly well by making it incredibly difficult to come up with sensible data that both works with Git and produces the intended result. There is finally a strong move by Google to put a stop to SHA-1 web certificates, and the article about it[2] contains much rationale as to why SHA-1 is bad. And that’s all true: you should not be using SHA-1 cryptographically. But as you yourself mentioned, SHA-1 was never intended to provide any cryptographic assurances in Git. But we aren’t dealing with randomly generated keys here: we’re dealing with human-readable content. The content still has to be useful and still has to make sense; even if a SHA-1 collision is found for a commit in your repository, that doesn’t mean that it will get the job done, since it’s likely to be utter nonsense (random data). Therefore, the risk is much less serious than with certificate verification.

Nonetheless, SHA-1 is still involved in the signing process, because Git references objects by their SHA-1 hashes as a part of the signed text.
So you approached me with one problem, and I’ve demonstrated another:
- What is the value of GPG-signed tags?
- What is the value of GPG-signed commits?
The value of GPG-signed tags should at this point be clear: it can state, with confidence, that the signer did indeed issue a tag of the referenced object (SHA-1), with the included message. And that’s good—you want to know that a maintainer did actually make a release, and not just some random person that got a hold of credentials to the public repository and release server. But any further assurances are difficult, because the tag signs only a single hash: the only restriction an attacker has is that he/she must create a commit object that hashes to that same value (technically tags don’t have to be hashes, but users are expecting them to be in this context). There is no restriction on the content of that object, because a tag can point to any arbitrary commit. Therefore, an attacker has a great degree of flexibility with signed tags.
Take note of Linus’ reply mentioned above:
Which brings us to signed commits. In the case of GNU ease.js’ 0.2.4 tag, we have an additional protection, although it’s unfortunately enforced only by convention: I tag only signed commits. Therefore, even if an attacker were to compromise a tag, he/she would be unable to sign the newly minted commit, and an automated system (or user) could see this and say “no, something is wrong”.
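The convention described above (only signed commits get tagged) could be checked automatically along these lines. This is a sketch rather than a vetted release gate: `check_release` is a hypothetical helper name, and it assumes a Git new enough to ship `git verify-commit` plus a keyring that already holds the maintainer’s public key:

```shell
# Succeed only if both the tag and the commit it points to carry signatures
# that verify against the local keyring.
check_release() {
    tag="$1"
    if ! git verify-tag "$tag"; then
        echo "tag $tag: tag signature did not verify" >&2
        return 1
    fi
    # "<tag>^{commit}" peels the tag down to the commit it references.
    if ! git verify-commit "${tag}^{commit}"; then
        echo "tag $tag: tagged commit is not validly signed" >&2
        return 1
    fi
}

# Usage inside a repository:
#   check_release 0.2.4 && echo "release looks sane"
```

Note that this only says that some key in the keyring signed both objects; deciding whether it is the right key is still up to whoever maintains the keyring.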
But that does not prevent an attacker from going after any of the other commits in the repository—and there are plenty.
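To get a rough idea of how many commits are exposed in this way, one can list the commits that lack a good signature. A minimal sketch, assuming `git log` supports the `%G?` signature-status placeholder; `list_unverified` is a hypothetical helper name:

```shell
# Print the hashes of all commits reachable from a revision (default: HEAD)
# whose signature status is anything other than "G" (good).
# %G? prints a one-letter status: G = good, B = bad, N = no signature,
# plus further letters for cases such as unknown validity.
list_unverified() {
    git log --format='%H %G?' "${1:-HEAD}" | awk '$2 != "G" { print $1 }'
}

# Usage inside a repository:
#   list_unverified 0.2.4 | wc -l
```

The filter is deliberately strict: a signature that is good but made by a key of unknown validity is also reported, since it is not marked "G".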
The goal is to make it as difficult as possible. The odds are already heavily against an attacker, so the effort you put into further deterring crackers is up to you. My article mentions three options: signing every commit, signing only merge commits, and squashing into a single commit. I’d argue that squashing into a single commit is the least effective (for reasons described above), whereas signing every commit is quite effective, since the meaningful collision space is then limited to tree and blob manipulation. But signing only merge commits isn’t all that bad either: you still reduce the odds of a lucky find.

Certainly. I hope the level of detail I have provided is useful for everyone involved. As a disclaimer: please note that I am not a cryptographer, and I do not work on (hack) Git. I also didn’t have time to proofread, which is perhaps the most important part of this disclaimer.
[0]: An Interview With Linus Torvalds: Linux and Git - Part 1 | Tag1 Consulting
[1]: Git - Packfiles
[2]: Why Google is Hurrying the Web to Kill SHA-1

--
Mike Gerwitz
Free Software Hacker | GNU Maintainer
http://mikegerwitz.com
FSF Member #5804 | GPG Key ID: 0x8EE30EAB