Idea proposal: Cover Traffic / a "Fake-Workstation"

Hi everyone,

Traffic analysis, i.e. correlation of traffic size or timing, is probably the weakest part of anonymity on Tor, and I have been thinking about this topic recently. I wanted to share my current ideas so that interested people can brainstorm together and develop tools and usage patterns.

Some related document links about traffic analysis:
Users Get Routed: Traffic Correlation on Tor
Towards Efficient Traffic-analysis Resistant Anonymity Networks
Inferring users’ online activities through traffic analysis

Some of my ideas:

  • Traffic analysis should become harder as the number and variety of connections increase. For example, torrenting or web downloading, chatting, and browsing at the same time should each make the others harder to analyze. Mixing activities may also be bad for anonymity depending on the threat model, but the main topic here is traffic analysis.

  • Using a VPN should help against traffic analysis, assuming not all of the traffic is Tor. For example, regular web browsing or torrent downloading over the VPN could benefit concurrent Tor-over-VPN connections. Nested chains of VPNs and Tor are another topic.

  • To achieve better anonymity, the user needs more connections and connection types, preferably planned separately for each nested chain.

For example:
Browser traffic over VPN
Tor traffic over VPN
Torrent traffic over VPN
i2p/freenet/… traffic over VPN
Torrent traffic over i2p over VPN
Whonix traffic over VPN
Torrent traffic over Whonix over VPN (just an example)

  • Some possible features:

By default, many different types of connections using low bandwidth, which would especially protect regular Tor Browser traffic, in terms of both traffic size and timing.

Traffic creation at random times

Traffic size selection or limits

Cron job tasks, download tasks, etc. (a rough sketch of such a scheduler follows this list)

  • Some suggestions:
    I believe that rather than wasting traffic, we / the user could support other projects while protecting Tor connections at the same time. For example, when using a VPN, running other anonymizers like i2p and freenet in the background would continuously create traffic that benefits those networks and the user at the same time. Whether or not a VPN is used, maybe we could do similar things in this “Fake-Workstation” without harming the Tor network. For example, torrents might be good as cover traffic, but they are bad for Tor. Is i2p or freenet over Tor a good idea? What about downloading/seeding torrents over i2p over Tor?
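
To make the "possible features" list above more concrete, here is a minimal sketch of what such a low-bandwidth scheduler could look like. This is only an illustration under my own assumptions: the URL list, size cap, and timing windows are placeholders, and a real tool would have to respect the network-load and courtesy concerns discussed later in this thread.

    # Minimal cover-traffic scheduler sketch (illustrative only).
    # Assumptions: DECOY_URLS, MAX_BYTES and the pause window are placeholders;
    # all traffic is expected to be routed through the system's VPN/Tor setup.
    import random
    import time
    import urllib.request

    DECOY_URLS = [
        "https://example.org/",     # placeholder targets, not real suggestions
        "https://example.com/",
    ]
    MAX_BYTES = 512 * 1024          # per-fetch cap ("traffic size selection or limits")
    MIN_PAUSE, MAX_PAUSE = 30, 600  # seconds ("traffic creation at random times")

    def fetch_some(url, limit=MAX_BYTES):
        """Download at most `limit` bytes from `url` and discard them."""
        with urllib.request.urlopen(url, timeout=30) as resp:
            return len(resp.read(limit))

    def main():
        while True:
            url = random.choice(DECOY_URLS)
            try:
                print("fetched", fetch_some(url), "bytes of cover traffic from", url)
            except OSError as exc:
                print("fetch failed:", exc)
            # Sleep a random amount so the pattern is not a fixed beacon.
            time.sleep(random.uniform(MIN_PAUSE, MAX_PAUSE))

    if __name__ == "__main__":
        main()

A cron-driven variant would simply perform one fetch per invocation instead of looping.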

Since Whonix is probably the only complete system dedicated to “real” anonymity, I believe there may be people here interested in researching this topic further and sharing ideas.

My ultimate goal here is the creation of a minimal “Fake-Workstation” that would include several tools to create cover traffic: not necessarily fake traffic, but any means of protecting the user's other activities. This is a broad topic, and such a system could also be useful in other places, like routers. Perhaps we could create tools that other systems, organizations, or people could benefit from and collaborate on in the future.


Are there any research papers on cover traffic / padding?

The Tor Project made a rather devastating statement about its usefulness.

See:

See also:

(search for “cover traffic”)

I think it would be best to start a discussion about this on the tor-talk mailing list first to see if this has a chance to really improve things.


2011: quote from Experimental Defense for Website Traffic Fingerprinting | The Tor Project:

We disagree with the background fetch approach because it seems that a slightly more sophisticated attack would train a separate classifier to recognize the background cover traffic and then subtract it before attempting to classify the remainder. In the face of this concern, it seems that the background request defense is not worth the additional network load until it can be further studied in detail.

Comment by Mike Perry on the same page:

We believe the defense provided by the researchers doesn’t seem like it will stand up to a more sophisticated attack, and will just waste network resources. We believe this because there is a history of failures of background cover traffic in the academic literature. The basic stuff definitely doesn’t work, and even the complex schemes are regarded as questionable.

2019: quote from New low cost traffic analysis attacks and mitigations | The Tor Project:

Users: Do multiple things at once with your Tor client. Because Tor uses encrypted TLS connections to carry multiple circuits, an adversary that externally observes Tor client traffic to a Tor Guard node will have a significantly harder time performing classification if that Tor client is doing multiple things at the same time. This was studied in section 6.3 of this paper (archive.org) by Tao Wang and Ian Goldberg. A similar argument can be made for mixing your client traffic with your own Tor Relay or Tor Bridge that you run, but that is very tricky to do correctly (archive.org) for it to actually help.


The research papers linked by Patrick, especially the Website Fingerprinting paper by the University of Waterloo, show that we would need a daemon in Whonix that can:

  • Download files of varying sizes, somehow all at once (to avoid the University of Waterloo attack), or donate bandwidth to another project, as the OP suggested (see the sketch after this list)
  • Not give away a vector to infiltrate the Workstation (the source MUST be trusted)
  • Not significantly disrupt the user’s regular operations.
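
As a rough illustration of the first point, the sketch below starts several downloads of deliberately different sizes at the same time instead of one after another. The URLs and size limits are hypothetical placeholders; per the second point, a real daemon would only pull from trusted sources.

    # Sketch: start several downloads of varying sizes concurrently
    # (illustrative only; URLs and size limits are hypothetical placeholders).
    from concurrent.futures import ThreadPoolExecutor
    import urllib.request

    JOBS = [  # (url, max_bytes) pairs of deliberately different sizes
        ("https://example.org/small", 64 * 1024),
        ("https://example.org/medium", 1024 * 1024),
        ("https://example.org/large", 8 * 1024 * 1024),
    ]

    def download(url, limit):
        """Fetch up to `limit` bytes and discard them."""
        with urllib.request.urlopen(url, timeout=60) as resp:
            return len(resp.read(limit))

    def run_all():
        # Launching everything at once mixes the packet streams on the wire,
        # rather than producing one clean, classifiable transfer after another.
        with ThreadPoolExecutor(max_workers=len(JOBS)) as pool:
            futures = {pool.submit(download, url, limit): url for url, limit in JOBS}
            for fut, url in futures.items():
                try:
                    print(url, fut.result(), "bytes")
                except OSError as exc:
                    print(url, "failed:", exc)

    if __name__ == "__main__":
        run_all()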

Getting rid of traffic analysis this way is not impossible. The issue is that we sacrifice one thing to protect another. If a daemon runs constantly for all Whonix users, we risk letting any Whonix user's ISP identify them as one, since Whonix clients' Tor connections currently only carry data sporadically, and constant daemon traffic would make them stand out.

If the daemon is instead opt-in, users who need protection from advanced adversaries can decide, based on their situation, whether the security of their communications is worth giving up not glowing like a beacon on the adversary's map. However, users who choose this route then stand out as even more suspicious to a passive and/or advanced adversary.

A Whonix user who is torrenting massive files or donating bandwidth to Hyphanet (possible once UDP support comes to Tor!) while browsing a forum is technically protecting their forum browsing. But if the adversary knows they are one of a handful of Tor users downloading big files like that (because they talk about it on the forum, for example), that is still really bad.

OP's suggestion is better off as an addition to the Wiki for advanced users fighting advanced adversaries, with a very big warning: the user needs to consider their threat model and decide whether sticking out to an adversary (inviting further monitoring and handing over a fingerprint) is worth making it harder for that adversary to read their communications. The user should choose their own obfuscation method (the decoy activities). Why? Because anything we code is predetermined and vulnerable to the background fetch attack!

We disagree with the background fetch approach because it seems that a slightly more sophisticated attack would train a separate classifier to recognize the background cover traffic and then subtract it before attempting to classify the remainder.


That's no longer a concern. Quote from Network, Browser and Website Fingerprint:

Network Stack Hardening: Whonix has implemented various security hardening measures like disabling TCP timestamps, ICMP redirects, firewalling invalid packets, and more. Unfortunately these measures can increase the risk of ISP or Local Network fingerprinting. Despite this, security hardening has been prioritized.

Security and anonymity are more important than blending in / avoiding operating system detection.
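
For reference, the measures quoted above are ordinary Linux sysctls, so whether they are in effect can be checked from inside the guest. The key list below is only an assumption for illustration; Whonix's actual configuration may tune a different or larger set.

    # Read a few network-hardening sysctls of the kind mentioned above.
    # The key list is illustrative; Whonix's actual configuration may differ.
    from pathlib import Path

    SYSCTLS = {
        "net/ipv4/tcp_timestamps": "0",             # TCP timestamps disabled
        "net/ipv4/conf/all/accept_redirects": "0",  # ICMP redirects not accepted
        "net/ipv4/conf/all/send_redirects": "0",    # ICMP redirects not sent
    }

    def check():
        for key, expected in SYSCTLS.items():
            path = Path("/proc/sys") / key
            try:
                value = path.read_text().strip()
            except OSError:
                print(key, "not available")
                continue
            status = "ok" if value == expected else "unexpected"
            print(key, "=", value, "[" + status + "]")

    if __name__ == "__main__":
        check()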

Please explain.

Background fetch attack?

Is there a solution?


There are other reasons to keep this opt-in, at least initially.

There are two issues.

A) Tor network capacity limitations.

The Tor Project has expressed that they prefer file sharing not be used over the Tor network. References here:

Such continuous traffic would be similar.

B) Legal liability; courtesy:

Making all Whonix users generate cover traffic by default would add a lot of additional automated ("robot") Tor traffic: traffic that isn't even being looked at by any human, or even by any program. (Solutions such as Proposal: add Noisy to default software simply discard the website after fetching it, or don't even store it.)

Remote servers (websites) would receive increased load. This could be considered a DDoS.

There might be a legal risk in doing that.

Courtesy may also require not doing that.

Maybe it would be different if noisy (or some other implementation) adhered to the robots.txt standard, which allows websites to opt out.
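
As a sketch of how that could work, Python's standard urllib.robotparser can check whether a site permits automated fetching before it is used for noise. The user-agent string here is a made-up placeholder, not the name of any existing tool.

    # Sketch: consult robots.txt before using a site for cover traffic.
    # USER_AGENT is a hypothetical placeholder for a noise generator.
    import urllib.robotparser
    from urllib.parse import urljoin, urlparse

    USER_AGENT = "CoverTrafficBot"

    def allowed_by_robots(url):
        """Return True if robots.txt permits USER_AGENT to fetch `url`."""
        root = "{0.scheme}://{0.netloc}".format(urlparse(url))
        rp = urllib.robotparser.RobotFileParser()
        rp.set_url(urljoin(root, "/robots.txt"))
        try:
            rp.read()
        except OSError:
            # If robots.txt cannot be fetched, err on the side of leaving the site alone.
            return False
        return rp.can_fetch(USER_AGENT, url)

    if __name__ == "__main__":
        print(allowed_by_robots("https://example.org/some/page"))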


Paper: Maybenot: A Framework for Traffic Analysis Defenses | Proceedings of the 22nd Workshop on Privacy in the Electronic Society

Source code: GitHub - maybenot-io/maybenot: a framework for traffic analysis defenses


Looks interesting, and complicated.

This seems like a feature too big for the Whonix project to implement. It should be integrated directly into Tor, not only into Whonix. If Tor implements this feature, then Whonix would also get access to it.

Unfortunately, I cannot even find a feature request on the torproject.org issue tracker to implement a similar feature in Tor.

2 Likes