Identifiability in Debian package metadata

Has anyone looked at the extent to which the apt/dpkg metadata might make a Debian-based VM identifiable when correlated with repository access times?

As long as the repository is accessed over its own Tor circuit, it shouldn’t matter too much, I suppose. I’m just wondering how much work might be involved in cleaning that data.

So far, all I know is that removing /var/{log,lib,cache}/{apt,dpkg} breaks things pretty badly. :slight_smile:
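
Before deleting anything, a read-only look at what is actually stored there may help. The following is a minimal sketch, not a tested tool: it expands the brace pattern above into its six paths and lists every readable file with its modification time, since those timestamps are the kind of data that could line up with repository access times. The helper names `expand_paths` and `inventory` are made up for this example, and not all six directories necessarily exist on a given system.

```python
import os
from datetime import datetime, timezone
from itertools import product
from pathlib import Path


def expand_paths(root="/var"):
    """Expand /var/{log,lib,cache}/{apt,dpkg} in the order a shell would."""
    return [Path(root) / top / tool
            for top, tool in product(("log", "lib", "cache"), ("apt", "dpkg"))]


def inventory(paths):
    """Yield (file, mtime in UTC) for each readable file below the given paths."""
    for base in paths:
        # os.walk silently skips missing or unreadable directories.
        for dirpath, _dirs, files in os.walk(base):
            for name in sorted(files):
                f = Path(dirpath) / name
                try:
                    mtime = f.stat().st_mtime
                except OSError:
                    continue  # vanished or not accessible without root
                yield f, datetime.fromtimestamp(mtime, tz=timezone.utc)


if __name__ == "__main__":
    # Read-only: nothing is modified or removed.
    for path, mtime in inventory(expand_paths()):
        print(f"{mtime:%Y-%m-%d %H:%M:%S}  {path}")
```

Running it as an unprivileged user will miss some root-only directories, so the listing should be treated as a lower bound.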

> Has anyone looked at the extent to which the apt/dpkg metadata might make a Debian-based VM identifiable when correlated with repository access times?

To my knowledge, not a lot. This post is related: https://guardianproject.info/2014/10/16/reducing-metadata-leakage-from-software-updates/

> Just wondering how much work might be involved in cleaning that data.

Patching apt-get and upstreaming these patches to Debian. Or... You tell me.

Thanks, Patrick, that link is very interesting.

I was thinking about cleaning the files after an upgrade. A little fragile, but hopefully not too much upfront work.

I don’t think that’s the best spot to start.

The TODO I see is learning what the leaked metadata actually is before attempting to fix it. That analysis could be done by running apt-get through a proxy, capturing its traffic with Wireshark, and/or reading apt-get’s source code.
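
On the local side, one small piece of that analysis could be checking which timestamps apt itself writes down. The sketch below assumes the usual `/var/log/apt/history.log` layout, where each transaction begins with a `Start-Date:` line, and pulls those dates out. The function name `start_dates` is hypothetical, and rotated, gzip-compressed logs are not handled.

```python
import re
from pathlib import Path

# Matches lines such as "Start-Date: 2014-10-16  09:12:44".
START_RE = re.compile(r"^Start-Date:[ \t]*(\S+)[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def start_dates(log_text):
    """Return the 'date time' string of every apt transaction in the log text."""
    return [f"{date} {time}" for date, time in START_RE.findall(log_text)]


if __name__ == "__main__":
    log = Path("/var/log/apt/history.log")  # assumed default location
    if log.is_file():
        for stamp in start_dates(log.read_text(errors="replace")):
            print(stamp)
    else:
        print(f"{log} not found")
```

Each printed line is a moment at which this machine ran apt, which is exactly the kind of record that could be matched against a repository's access log.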

Modifying files without knowing what they contain, in the hope of improving something, could leave you with worse results than before.