Has anyone looked at the extent to which the apt/dpkg metadata might make a debian-based VM identifiable with repository access times?
As long as the repository is accessed over its own Tor circuit it shouldn’t matter too much, I suppose. Just wondering how much work might be involved in cleaning that data.
So far, all I know is that removing /var/{log,lib,cache}/{apt,dpkg} breaks things pretty badly. 
Has anyone looked at the extent to which the apt/dpkg metadata might make a debian-based VM identifiable with repository access times?
To my knowledge, not a lot. This post is related:
https://guardianproject.info/2014/10/16/reducing-metadata-leakage-from-software-updates/
Just wondering how much work might be involved in cleaning that data.
Patching apt-get and upstreaming these patches to Debian. Or... You tell me.
Thanks, Patrick, that link is very interesting.
I was thinking about cleaning the files after an upgrade. A little fragile, but hopefully not too much upfront work.
I don’t think that’s the best spot to start.
The TODO I see is learning what the leaked metadata actually is before attempting to fix it. That analysis could be done by running apt-get through a proxy and/or using Wireshark and/or reading apt-get’s source code.
Just modifying files somehow, with no prior knowledge, in the hope of improving things could end up with worse results than before.
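As a read-only first step toward learning what is actually stored locally, something like the following sketch could list the modification times of files under the directories named earlier in this thread. It only covers on-disk timestamps, not what apt-get sends over the network, and it changes nothing. The function name `collect_mtimes` and the directory list are illustrative assumptions, not part of any apt tooling.

```python
import os
import sys
import time

# Assumption: the brace expansion /var/{log,lib,cache}/{apt,dpkg} from this
# thread. Directories that do not exist on a given system are skipped.
DEFAULT_DIRS = [
    f"/var/{top}/{tool}"
    for top in ("log", "lib", "cache")
    for tool in ("apt", "dpkg")
]


def collect_mtimes(roots):
    """Return a sorted list of (mtime, path) for every file under roots."""
    found = []
    for root in roots:
        # os.walk yields nothing for a missing or unreadable directory.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    st = os.lstat(path)
                except OSError:
                    continue  # file vanished or is not accessible
                found.append((st.st_mtime, path))
    return sorted(found)


def main(argv):
    roots = argv[1:] or DEFAULT_DIRS
    for mtime, path in collect_mtimes(roots):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(mtime))
        print(f"{stamp} UTC  {path}")


if __name__ == "__main__":
    main(sys.argv)
```

Running it before and after an upgrade and comparing the two listings would show which files pick up new timestamps. Some files may only be readable as root.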