[SECURITY] Git General Verification + Verifying Whonix submodules

When using the Qubes template installer, the main Whonix package is verified automatically with the code excerpt below. This works for the main package, but it seems the submodules may not be set up in a way that lets the same process run on them (running git tag --points-at=HEAD on master in a submodule returns an empty response):

verify_tag() {
        sig_header="-----BEGIN PGP SIGNATURE-----"
        temp_name=`mktemp -d sig-verify.XXXXXX`
        git cat-file tag $1 | sed "/$sig_header/,//d"  > $temp_name/content
        git cat-file tag $1 | sed -n "/$sig_header/,//p" > $temp_name/content.asc
        gpg --verify --status-fd=1 $temp_name/content.asc 2>/dev/null|grep -q '^\[GNUPG:\] TRUST_\(FULLY\|ULTIMATE\)$'
        ret=$?
        rm -r $temp_name
        return $ret
}

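REF=HEAD
VALID_TAG_FOUND=0
# Check every tag that points at $REF (the else branch of the original script is not part of this excerpt)
for tag in $(git tag --points-at="$REF"); do
        if verify_tag "$tag"; then
                VALID_TAG_FOUND=1
        fi
done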
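A side note on the grep pattern above: newer GnuPG releases may append extra fields after the TRUST_ keyword on the status line, in which case the trailing `$` anchor would no longer match. Below is a minimal sketch of a more tolerant check, assuming a git version whose `verify-tag` supports `--raw` (which prints the GnuPG status lines to standard error). The function names `status_is_trusted` and `tag_is_trusted` are made up for this illustration and are not part of the Qubes script.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from qubes-builder.

# Read GnuPG status lines on stdin; succeed if the signing key is
# reported as fully or ultimately trusted.
status_is_trusted() {
    grep -Eq '^\[GNUPG:\] TRUST_(FULLY|ULTIMATE)( |$)'
}

# Verify the tag named in $1 and require a trusted key.
tag_is_trusted() {
    git verify-tag --raw "$1" 2>&1 | status_is_trusted
}
```

Usage would be along the lines of `tag_is_trusted "$tag" && VALID_TAG_FOUND=1`.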

So my question is: do these submodules get verified by the Whonix installer at run time, or at any other point, or is there a better way of automatically verifying them?

Edit by Patrick:
Changed title.

Submodules are implicitly verified. The main repository Whonix/Whonix has a signed tag. By verifying and checking out that one you did your part. Later the build-steps.d/1100_prepare-build-machine build step will run.

Then the build-steps.d/1200_create-debian-packages build step will run.

And check if its output is empty. If it’s non-empty, something changed (a different commit in a submodule than the one recorded by Whonix/Whonix, or an uncommitted change in a submodule).
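The commands used by those build steps are not reproduced in this thread. A minimal sketch of such a check, assuming it is built on `git submodule status` and `git status --porcelain` (an assumption, not necessarily what the Whonix build scripts do), could look like this. The function name `submodules_clean` is hypothetical.

```shell
#!/bin/sh
# Hypothetical check, run from the top level of the main repository.
# Succeeds only if every submodule is checked out at the commit recorded
# by the superproject and nothing in the working tree has been modified.
submodules_clean() {
    status=$(git submodule status --recursive) || return 1
    # Leading '-' = not initialized, '+' = different commit checked out,
    # 'U' = merge conflicts.
    if printf '%s\n' "$status" | grep -q '^[-+U]'; then
        return 1
    fi
    # Any porcelain output means modified or untracked content,
    # including submodules with local changes.
    [ -z "$(git status --porcelain)" ]
}
```

A build script could then abort with something like `submodules_clean || exit 1` after the signed tag of the main repository has been verified and checked out.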

I hope this is a sufficient end-to-end trust chain, but scrutiny is welcome. Hopefully we don’t need individual tags for each submodule and write that manually into Whonix/Whonix. That would be awful.

Maybe we should move this thread to the general development area to get more input. I am not too familiar with submodule security and only learnt how Qubes did it a few days ago, which I found interesting and which led to the initial question.

Without being able to verify the submodules, could a MITM attack substitute one submodule for another, or could someone with complete access to GitHub change parts of the submodule code without you knowing?

I am sure there is some way you can create tags automatically (so nothing needs to be done manually), and I can point you to the code Qubes uses to retrieve (https://github.com/nrgaway/qubes-builder/blob/master/get-sources.sh) and to verify (https://github.com/nrgaway/qubes-builder/blob/master/verify-git-tag.sh) the submodules. I can also assist implementing this behaviour within Whonix if you choose to go down that path.

Or maybe just store a hash of each submodule directory within the main Whonix repo, which could be used to verify the submodule; although I still like the way Qubes handles the submodules.

Anyway, let me know how you wish to proceed. As I stated, I am willing to write the code or assist in any way to make it happen if you and the Whonix community believe we should add additional measures to ensure the integrity of the code base.

Not that I know. As far as I understand, all that git does when creating a (signed) tag is tie the SHA-1 hash of a commit to a tag. But the main source Whonix/Whonix (which can be verified) also ties each git submodule to the SHA-1 hash of a git commit.
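This can be inspected directly: in the tree of the main repository, a submodule appears as an entry of type `commit` (mode 160000) that carries the hash of the submodule commit. A small sketch follows; the function name `recorded_submodule_commit` is hypothetical, and its arguments are a revision of the main repository and the path of the submodule.

```shell
#!/bin/sh
# Hypothetical helper: print the submodule commit hash that revision $1
# of the main repository records for the submodule at path $2.
recorded_submodule_commit() {
    git ls-tree "$1" -- "$2" |
        awk '$1 == "160000" && $2 == "commit" { print $3 }'
}
```

Comparing its output with `git -C <path> rev-parse HEAD` shows whether the checked-out submodule matches what the verified revision of the main repository records.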

“I can also assist implementing this behaviour within Whonix if you choose to go down that path.”
Please do.
“Or maybe just store a hash of each submodule directory within the main Whonix repo which could be used to verify the submodule”
This is what git submodules automatically do, as I understand it. I would wonder if we have to manually do the same.

There is already some code for mass git tag creation. [The purpose was to support building individual submodules without getting the main source code. (For those who want to pick certain functionality and use it outside of Whonix.)]
https://github.com/Whonix/whonix-developer-meta-files/blob/master/debug-steps/packaging-helper-script#L491 in function git_tags().

Looks like I ought to ask some questions…

First one… Not that related, but still.

Combined command for git tag verification and git checkout?

Also updated the wiki:
https://www.whonix.org/wiki/Dev/Build_Documentation/9_full#Choose_Version

Questions to come:

  • How to git checkout branch/tag + git submodule update --init --recursive
  • How to git checkout + get rid of extraneous submodules?
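For the questions above, one possible sequence is sketched below. It assumes the tag is verified with `git verify-tag` against keys already present in the local GnuPG keyring; note that a good signature from any key in that keyring passes this check. The function name `checkout_verified` is hypothetical. The final `git clean -ffd` is one way to remove leftover directories of submodules that the checked-out version no longer references; it also deletes all other untracked files, so it is only suitable for a dedicated build checkout.

```shell
#!/bin/sh
# Hypothetical helper: verify the tag named in $1, then check it out
# together with its submodules.
checkout_verified() {
    tag=$1
    # Refuse to touch the working tree unless the tag signature verifies.
    git verify-tag "$tag" || return 1
    git checkout --quiet "refs/tags/$tag" || return 1
    git submodule update --init --recursive || return 1
    # Two -f flags are needed before git clean removes untracked
    # nested repositories such as stale submodule directories.
    git clean -ffd
}
```

It would be called as `checkout_verified <tag>` from the top level of the main repository.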
“Anyway, let me know how you wish to proceed. As I stated, I am willing to write the code or assist in any way to make it happen if you and the Whonix community believe we should add additional measures to ensure the integrity of the code base.”

That would be a welcome move. Thank you for offering to help.

I am not sure if it’s possible to do this from just our end, but I think using something stronger than SHA-1 is necessary. Please use either SHA-256 or SHA-512 to be on the safe side. Google is hurrying to kill SHA-1 .pems:
https://www.schneier.com/blog/archives/2014/09/security_of_the.html

Git does not have a SHA-256 / SHA-512 option. But I haven’t found any claims that git is insecure because of SHA-1. It uses it as an integrity check, not for cryptographic verification.

Linus Torvalds explains it here: https://www.youtube.com/watch?v=4XpnKHJAok8&t=56m20s

(or excerpt of it as text: SHA-1 - Wikipedia)

Haven’t found how git does cryptographic verification, but I also haven’t found any claims that “git tag verification is useless as soon as someone can get a hash collision”.

Good article on repo security: A Git Horror Story: Repository Integrity With Signed Commits, by Mike Gerwitz

From what I understand, git tags are important in proving that commits were actually made by whom they claim to be made by.

It also has advice on alternative measures to safeguard your code.

That said, it is important to understand that the integrity of your repository is guaranteed only if a hash collision cannot be created---that is, if an attacker were able to create the same SHA-1 hash with different data, then the child commit(s) would still be valid and the repository would have been successfully compromised. Vulnerabilities have been known in SHA-1 since 2005 that allow hashes to be computed faster than brute force, although they are not cheap to exploit. Given that, while your repository may be safe for now, there will come some point in the future where SHA-1 will be considered as crippled as MD5 is today. At that point in time, however, maybe Git will offer a secure migration solution to an algorithm like SHA-256 or better. Indeed, SHA-1 hashes were never intended to make Git cryptographically secure.

That paragraph is about the consequences of SHA-1 for git security, but SHA-1 collisions are very likely feasible today, and the cost is low for the adversaries in our threat model:

https://www.schneier.com/blog/archives/2012/10/when_will_we_se.html

2^13 * 2^8.4 = 2^21.4 ~ $2.77M in 2012
2^11 * 2^8.4 = 2^19.4 ~ $700K by 2015
2^9 * 2^8.4 = 2^17.4 ~ $173K by 2018
2^7 * 2^8.4 = 2^15.4 ~ $43K by 2021
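Each figure above multiplies an estimated amount of computation by a unit cost. The small sketch below reproduces the arithmetic, reading the first factor as a number of server-years and the second factor (2^8.4) as the dollar cost of one server-year. That reading is an assumption based on the linked post, and the function name `sha1_collision_cost` is made up here.

```shell
#!/bin/sh
# Hypothetical helper: print the estimated cost in dollars of
# 2^$1 server-years at 2^8.4 dollars per server-year.
sha1_collision_cost() {
    awk -v e="$1" 'BEGIN { printf "%.0f\n", 2 ^ (e + 8.4) }'
}
```

For example, `sha1_collision_cost 13` gives roughly 2.77 million, and `sha1_collision_cost 7` gives roughly 43 thousand.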
warning: unable to rmdir

Quote from Patrick (post 5): Questions to come:

Also mailed Jens Lehmann to ask if there are any updates on that issue.

That SHA-1 is weak is not in question. I wouldn’t like to rely on it for verification purposes.

However, it’s not inherently insecure. If git does not rely on it for verification purposes, it wouldn’t even matter if git used MD5 for non-verification integrity checks. (OpenPGP also uses SHA-1. dkg once explained to me that one may have reasons to criticize gpg, but SHA-1 fingerprints are not among the valid points, because… This is probably not the best place to go into this.)

Quote from HulaHoop (post 8): Good article on repo security: A Git Horror Story: Repository Integrity With Signed Commits, by Mike Gerwitz

From what I understand, git tags are important in proving that commits were actually made by whom they claim to be made by.

It also has advice on alternative measures to safeguard your code.

I don’t know if he is aware of what Linus Torvalds said. Will point Mike Gerwitz to what Linus said and this thread.

Asked on security.se.

How safe are signed git tags? Only as safe as SHA-1 or somehow safer?
https://security.stackexchange.com/questions/67920/how-safe-are-signed-git-tags-only-as-safe-as-sha-1-or-somehow-safer

Done.

question about security of signed git tags
Hi Mike,

in reference to your blog post “A Git Horror Story: Repository Integrity With Signed Commits” [1],

we’ve been wondering how secure signed git tags actually are. [2] [3]

Linus Torvalds said. [4] [5]

Git uses SHA-1 not for security

And goes on.

The security parts are elsewhere

When you wrote that blog post [1], were you aware of what Linus Torvalds thinks on that matter?

May I publish your answer?

Cheers,
Patrick

[1] A Git Horror Story: Repository Integrity With Signed Commits — Mike Gerwitz
[2] Whonix Forum
[3] hash - How safe are signed git tags? Only as safe as SHA-1 or somehow safer? - Information Security Stack Exchange
[4] https://www.youtube.com/watch?v=4XpnKHJAok8&t=56m20s
[5] SHA-1 - Wikipedia

Got a verbose answer from Mike Gerwitz. :)

(Formatting e-mail → forum by me. Otherwise his answer is as is.)

Hey Patrick,

On Mon, Sep 22, 2014 at 01:55:01PM +0000, Patrick Schleizer wrote:

[quote]we’ve been wondering how secure signed git tags actually are. [2] [3]

Linus Torvalds said. [4] [5]

And goes on.

When you wrote that blog post [1], were you aware of what Linus Torvalds thinks on that matter?[/quote]

I did link to a related message from Linus within my article, but it appears that the link[0] is now broken. I’ll have to correct that.

Here’s an archive:

http://web.archive.org/web/20120701215331/http://kerneltrap.org/mailarchive/git/2006/8/27/211020

Signed tags have existed for quite some time and are a separate feature from signed commits. That said, we can consider them together.

To demonstrate what a signed tag contains, I’ll use tag 0.2.4 of GNU ease.js:

  $ git cat-file -p 0.2.4
  object ee85b058df783ffaa9f8d5ae58f9eb6d7586b0ca
  type commit
  tag 0.2.4
  tagger Mike Gerwitz <mikegerwitz@gnu.org> 1407469612 -0400

  GNU ease.js 0.2.4 released [stable]

  [...snip...]

  -----BEGIN PGP SIGNATURE-----
  Version: GnuPG v1.4.11 (GNU/Linux)

  iQIcBAABAgAGBQJT5EgsAAoJEPIruBWO4w6rIdkP/14qtZdIJPpJqXBGkifrbVu2
  XtCSOH+gXDKSpb4zarmvmJ8LZl6Wt7QidfBNrp5eBOXUTDiKF75Evsj1S61GJxC/
  xh/PZtCcyeFInjd1k7kdIJ8ylK1jwVnToJNaaNtckueperVS6ao+ZKUOdnxSHYe3
  Pa2jQvjMCdDqmoz0Uzaq9ot7lZ3Fdv2buSnme1DBFqa36OWna1ynJ0VOr5orf9DP
  7gvksbG+sdWPkoDXtpzepnpx99CS+oJCUvkZ3pOmfFdU8g7E0rf5+jG3ZfpE0H8c
  R5rS3aO7HF8z0kdCrRCaEHetZbrw6M4+M49OnUHLNXcMgFUjGZtQGNgN4nIM3Ukl
  GjOI514IW88Ba0K+ybehTuT6A/UCxqUbdLhp0gb34C7rBsTEKNmarfUvO5GZ/9j+
  g2w0sHU24tI4XiuBtd/dcYEG1Xudmci1k3T5ffX3yo2Jq/uwhdHtO6WZVE5JOKUh
  945lJ9vq/w0jn2TufNbDnleT3QNhp4llBrCGBsq8CWl0XH6nf08R0KrPssUJDnzY
  An820qKuuAZ4DkV69XIdjXew4IlhEItdLi4n4iGvu6AzUpYgUD5WmWcp0894E8GG
  ruTx8xDlgiT8swCzgHCxMaMcTgxb1Xcwd6+lfPgz4cNKWqRY1THOc6TuzFz5L6MC
  eZ0cYtO46O5O1f4O4cJM
  =GnrZ
  -----END PGP SIGNATURE-----

Before we make too many assumptions here, let’s verify the reasonable assumption that the signature encompasses the entire output that precedes it:

  $ t="$(git cat-file -p 0.2.4)"
  $ gpg --verify <( tail -17 <<< "$t" ) <( awk '/^-----BEGIN/{exit}{print}' <<< "$t" )
  gpg: Signature made Thu 07 Aug 2014 11:46:52 PM EDT using RSA key ID 8EE30EAB
  gpg: Good signature from "Mike Gerwitz (Free Software Developer) <mike@mikegerwitz.com>"
  gpg:                 aka "Mike Gerwitz (GNU Project Maintainer) <mikegerwitz@gnu.org>"

Indeed it is. So the actual value provided by a GPG-signed tag message is that it provides verification that I personally issued a release of that object. So what would it take to compromise that tag?

Well, the tag is referenced by name, not hash, so that is immediately removed from the picture; we are therefore left with the object that it references: ee85b05. If we can create an object that is both a valid commit object and hashes to the same value, then it will appear (to the user of the compromised repository) that I issued that bad commit.

So let’s take a look at that object:

  $ git cat-file -p ee85b05
  tree 32e5d1faecbc24b16e078ba42c1ab3e2c6515ab6
  parent cef45cd0977f5f3f2baa5a5d2da857aff63ee50b
  parent a5c89565fe6ceb7ebeef9794afb57415bd9bf099
  author Mike Gerwitz <mikegerwitz@gnu.org> 1407466634 -0400
  committer Mike Gerwitz <mikegerwitz@gnu.org> 1407466634 -0400
  gpgsig -----BEGIN PGP SIGNATURE-----
   Version: GnuPG v1.4.11 (GNU/Linux)
  
   iQIcBAABAgAGBQJT5D1nAAoJEPIruBWO4w6rHgEP/ivLes1/aPI9+a/D3yDs2wBm
   ejjz9KlObNzyKPTylbzEurAjVssFJVutwQW3Q4Rcmi+2lY41+3tQmq3M8k1zG3zw
   G4VxoAPQZ0C4N+garmKytUsTAgpV1NFQS+NjO0nUZH0dpv3bZcnBbnAV0CCQaslN
   WTMEDAo5HI9rEQeY/47Pt2AxGKh+cQaxn9Qnh3wrgAQ9oFrCxYRiV6qcZXDL2O8r
   x/xqgy2vDoAawT70pAqQMgHAGixv4YAklebfr3FQ9/0jT6//sPd+ulrzPbA3nXvv
   Xn9W06jUy4IO9ZSuR2MGMGOrhzW/yNK2UL8L9VLrdVrGsb4Jv3BEVy/pKXjU41hb
   mMsdNvzZpPzMtda9LNPEI7uOV8nYE201vxRzv1EKw4aR6GGuHtPnq025BvdbOBVk
   Gz2L2TVP7u3RZ472ovxmmF8jjutUmp+QbWtiH4p4GgWBcRKNFMPQUI6oZBkyth8Q
   BgpbJRL5fRsBxanns422hB8wfK7nYLf4QbDRnOcefISC0npo0DeGGUPB605mTEtP
   kYo3Uv/fORU5gjgzhaeUiQsXtc0EXLdsQBtzlHkapmpoKJNgBMqj2VVO/YHx3wik
   CaVzVl2zI3q0UFpnu0NkDne9svsjPZbOM/N8MVutvk9oDyZsKc0gxq8v+h+pUySg
   tm7cs9SXXKDvIiw3QpYB
   =z41h
   -----END PGP SIGNATURE-----
  
  Various ES3-related bugfixes for bugs introduced by v0.2.3
  
  GNU ease.js remains committed to supporting environments as far back as ES3;
  unfortunately, this is important due to the popularity of older IE versions
  (IE<=8). Btw, ease.js runs on IE 5.5, in case you still need that. 
  
  But please don't use a proprietary web browser. Indeed, this is why the
  breaks were introduced in the first place: I neglected to run the
  browser-based test suite on the proprietary Microsloth browsers until after
  the v0.2.3 release, because I do not own a copy of Windows; I had to run it
  at work. But, regardless---my apologies; I'll be more diligent.

Alright—there’s a number of things to note here. Firstly: a commit’s hash is generated from all of that above content:

  $ git cat-file -p ee85b05 | git hash-object --stdin -tcommit
  ee85b058df783ffaa9f8d5ae58f9eb6d7586b0ca

You’ll notice that this is precisely the hash referenced in the tag. If we were to change the commit content in the slightest, we’d get a different hash:

  $ cat <( git cat-file -p ee85b05 ) <( echo foo ) | git hash-object --stdin -tcommit
  696a73618dd5d0d39f030d19ceab08c14115af4e

If we were to change only the type of the object (leaving the commit data alone), we’d still get a different hash:

  $ git cat-file -p ee85b05 | git hash-object --stdin -tblob
  441bab4e4006f63d859666322e53740014dcccf0
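The type matters because git hashes a short header, consisting of the object type, the content length in bytes and a NUL byte, followed by the content itself. Below is a minimal sketch that recomputes an object ID by hand, assuming a repository that uses the default SHA-1 object format and a system that provides `sha1sum`. The function name `manual_object_id` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read object content on stdin and print the ID git
# would assign to it as an object of type $1 (blob, commit, tree, tag).
manual_object_id() {
    type=$1
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    size=$(wc -c < "$tmp" | tr -d ' ')
    # Header "<type> <size>" plus a NUL byte, followed by the raw content.
    { printf '%s %s\0' "$type" "$size"; cat "$tmp"; } | sha1sum | cut -d' ' -f1
    rm -f "$tmp"
}
```

For any input, `manual_object_id commit` should print the same value as `git hash-object --stdin -t commit`, and changing only the type argument changes the result because the header differs.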

As an attacker, our approach would depend on what it is we are trying to manipulate. Let’s say that the goal is to introduce malicious code into the system. Well, in the case of this commit, it’s signed: we cannot change its content without invalidating the signature. But not everybody signs their commits, so in another repository, an object referenced by a tag may still be open for exploitation. But in the case of ee85b05, such protection doesn’t necessarily buy us anything: what if we instead looked at the tree?

  $ git cat-file -p 32e5d1faecbc24b16e078ba42c1ab3e2c6515ab6
  100644 blob ca0ac30fb6f5c008ec7949cf78190876aaaab0ba    .gitignore
  100644 blob 0145be4a9a32e42a08de01f100ea243c264f6478    .mailmap
  100644 blob 94a9ed024d3859793618152ea559a168bbcbb5e2    COPYING
  100644 blob 5a25289c27e078c45be6fa22afc26225b78c65a0    Makefile.am
  100644 blob f9d1937bc4209f7d24d6cf41a2c4095f41525aee    README
  100644 blob 7254909d0b2b03c680a74bc41b1101d9e3ce903e    README.hacking
  100644 blob a088c90395b34a113a9f1c1430ceb7d3a13162ea    README.md
  100644 blob 92a128e4015a50f7194ca3f67b4319bdf2715217    README.todo
  100644 blob 437da55d65a395ab2fe37536c16783a571832161    README.traits
  100644 blob 1c909f9e6c4fc74c49bad30cf78f38bd76e0765c    configure.ac
  040000 tree f060b0f430df3a42632bf934a71b6c103ab7b189    doc
  100644 blob 171e23e95d28985cdd78c514272c7d44bc5a8ae0    index.js
  040000 tree c663e578896401ec3a93ca8034130c984e8eab5e    lib
  100644 blob 9d5c6318b8c32fe2784924a52bf23705f56530b2    package.json.in
  040000 tree 46e7c203f0e02e8af55ec5b54b813ffbcaabb075    test
  040000 tree 5e5fc84526d864d407dfbfc8f99b9c72645ffb60    tools

One option is to rewrite any of those blobs such that they hash to the same value, but contain malicious code. But that is difficult: not only is it hard to generate working code that hides a working exploit such that it hashes to the same value, but it must also play nicely with Git’s deltas[1]. While the topmost blob will contain the full content, each previous commit will contain only a delta relative to it. So if you compromise the repository, someone is going to notice, because the delta won’t apply; you’d need to distribute fresh repositories with properly resolved deltas.

So on and so forth, for every referenced hash, be it a tree, blob, or other commit: for anything you modify, you have to either keep the hash the same and consider related consequences, or generate a new hash and be able to amend history such that, at some point, hashes match up so that you can join your amended history with the original. As an example:

  --o--A--o--o--B--H
        \      /
         `--X-`

Perhaps you were able to figure out some commit X that hashes to B’s parent and references A as its own parent.

So the goal is to mitigate this as much as possible. SHA-1 already does this fairly well by making it incredibly difficult to come up with sensible data that both works with Git and produces the intended result. There is finally a strong move by Google to put a stop to SHA-1 web certificates[2], which contains much rationale as to why SHA-1 is bad. And that’s all true: you should not be using SHA-1 cryptographically. But as you yourself mentioned, SHA-1 was never intended to provide any cryptographic assurances in Git. But we aren’t dealing with randomly generated keys here: we’re dealing with human-readable content. The content still has to be useful and still has to make sense. Even if a SHA-1 collision is found for a commit in your repository, that doesn’t mean that it will get the job done, since it’s likely to be utter nonsense (random data). Therefore, the risk is much less serious than certificate verification.

Nonetheless, SHA-1 is still involved in the signing process, because Git references them as a part of the signed text.

So you approached me with one problem, and I’ve demonstrated another:

  1. What is the value of GPG-signed tags?
  2. What is the value of GPG-signed commits?

The value of GPG-signed tags should at this point be clear: it can state, with confidence, that the signer did indeed issue a tag of the referenced object (SHA-1), with the included message. And that’s good—you want to know that a maintainer did actually make a release, and not just some random person that got a hold of credentials to the public repository and release server. But any further assurances are difficult, because the tag signs only a single hash: the only restriction an attacker has is that he/she must create a commit object that hashes to that same value (technically tags don’t have to be hashes, but users are expecting them to be in this context). There is no restriction on the content of that object, because a tag can point to any arbitrary commit. Therefore, an attacker has a great degree of flexibility with signed tags.
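Since the signature covers the `object` line of the tag, a verifier can read the signed hash from the tag itself and compare it with what is actually checked out. A small sketch follows, with the hypothetical function name `signed_tag_target`. It only extracts the hash and does not verify the signature, which still has to be done separately (for example with `git verify-tag`).

```shell
#!/bin/sh
# Hypothetical helper: print the object hash named on the first line
# ("object <hash>") of the annotated tag $1.
signed_tag_target() {
    git cat-file tag "$1" | sed -n '1s/^object //p'
}
```

A check such as `[ "$(signed_tag_target 0.2.4)" = "$(git rev-parse HEAD)" ]` would then confirm that the working tree is at the commit the tag refers to.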

Take note of Linus’ reply mentioned above:

Which brings us to signed commits. In the case of GNU ease.js’ 0.2.4 tag, we have an additional protection, although it’s unfortunately enforced only by convention: I tag only signed commits. Therefore, even if an attacker were to compromise a tag, he/she would be unable to sign the newly minted commit, and an automated system (or user) could see this and say “no, something is wrong”.
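Such an automated check could be sketched as follows, assuming the relevant public keys are in the local GnuPG keyring. `git verify-commit` checks the signature embedded in a commit, and `<tag>^{commit}` resolves the tag to the commit it points at. The function name `verify_tag_and_commit` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both the tag named in $1 and the
# commit it points at carry signatures that verify.
verify_tag_and_commit() {
    git verify-tag "$1" || return 1
    git verify-commit "$1^{commit}"
}
```

A build script could call `verify_tag_and_commit 0.2.4 || exit 1` before using the checkout.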

But that does not prevent an attacker from going after any of the other commits in the repository—and there are plenty.

The goal is to make it as difficult as possible. The odds are already very difficult, so the effort you put into further deterring crackers is up to you. My article mentions three options: signing every commit, signing only merge commits, and squashing into a single commit. I’d argue that squashing into a single commit is the least effective (for reasons described above), whereas signing every commit is quite effective, since the meaningful collision space is then limited to tree and blob manipulation. But signing only merge commits isn’t all that bad either; you still reduce the odds of a lucky find.
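To see how much of a history is covered by signatures, the `%G?` format placeholder of `git log` reports a one-letter signature status per commit, where `N` means no signature. A sketch, with the hypothetical function name `unsigned_commits`:

```shell
#!/bin/sh
# Hypothetical helper: print the hashes of all commits reachable from $1
# (default: HEAD) that carry no signature at all.
unsigned_commits() {
    git log --format='%H %G?' "${1:-HEAD}" | awk '$2 == "N" { print $1 }'
}
```

An empty result means every commit carries some signature; whether those signatures are good and made by trusted keys is a separate question that this sketch does not answer.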

Certainly. I hope the level of detail I have provided is useful for everyone involved. As a disclaimer: please note that I am not a cryptographer, and I do not work on (hack) Git. I also didn’t have time to proofread, which is perhaps the most important part of this disclaimer.

[0]: An Interview With Linus Torvalds: Linux and Git - Part 1 | Tag1 Consulting
[1]: Git - Packfiles
[2]: Why Google is Hurrying the Web to Kill SHA-1


Mike Gerwitz
Free Software Hacker | GNU Maintainer
http://mikegerwitz.com
FSF Member #5804 | GPG Key ID: 0x8EE30EAB

My answer.

Hi Mike,

thank you a lot, Mike for the great verbosity of your answer!

I find it useful. Probably useful for lots of other people. Since you took great effort and time by writing it, I’d consider posting it in your blog.

Much appreciated, thank you again, Mike!

Cheers,
Patrick

Excellent response from Mike. Thanks for asking him in the first place Patrick. I think there might be a thing or two useful for Whonix repo assurance.

Oh, this is where this thread went too :)

I will post a link to it within the Qubes forum since this is an outstanding question that needs to be resolved and I hope the Qubes developers can also weigh in on it

I started a thread related to this topic on the Qubes developer list at https://groups.google.com/forum/#!topic/qubes-devel/MUTO9eC7nx4.

If there is any response I asked them to come here, but also stated I would let Patrick know the thread existed over there.

I am not sure you surpassed the threshold of this being recognized as a possible security issue.

And what makes this difficult is mixing up topics. We were mixing at least 3 topics here:

  1. In general, are git submodules safe against an adversary that can break SHA-1?
  2. In general, is it sufficient, when a main repository is signed/verified to ensure integrity of git submodules?
  3. Is Whonix’s use of git submodules safe?

So perhaps wipe all Whonix specificness from your question and start with 1)?

Another even more basic question. Make that question number 0). Forget about git submodules for this question.
Small rehash: a signed git tag maybe is just as safe as a signed SHA-1 hash.
The question is, is git tag verification suitable in the Whonix or Qubes threat model at all due to SHA-1?

  • git tag verify without further checking the code (because you either trust the signer or verified that tag yourself already on another machine)
  • adversary perfectly capable of creating any SHA1 collisions

Edit:
Fixed typo.

On one hand, splitting Whonix into multiple packages was a great move. Functionality is now abstracted into packages, which makes grasping and developing Whonix much simpler. So I really want to keep each package as standalone as possible and in its own repository.

On the other hand, git submodules are a huge PITA when working with git branches, because they do not work as usual git branches. Backporting fixes from Whonix 10 development branch to Whonix 9 stable branch is a huge PITA. Also we haven’t finished scrutinizing security of git submodule verification, but due to open question 0) (see above) this might not be the biggest argument against them at the moment.

So in any case, I’d be interested in moving away from git submodules. If there is an alternative that works more like the usual git branching approach, but that still allows separate packages in separate git repositories, that’d be awesome! Perhaps git subfolders could accomplish that, but I didn’t get to research that yet.

Notified the tails-dev mailing list, the freepto mailing list, and boyska (boyska (BoySka) · GitHub) about this forum thread.

Quote from Patrick (post 18): Another even more basic question. Make that question number 0). Forget about git submodules for this question.
Small rehash: a signed git tag maybe is just as safe as a signed SHA-1 hash.
The question is, is git tag verification suitable in the Whonix or Qubes threat model at all due to SHA-1?