Tor Project + Namecoin intern working on ptrace-based proxy leak detector

Hi Whonix devs!

I wanted to give you a heads up that I’m mentoring an Outreachy intern, Robert Nganga, on a project to detect and block proxy leaks with ptrace. ptrace has major advantages over LD_PRELOAD (which torsocks uses), primarily that ptrace works for statically linked binaries such as Go programs. ptrace also doesn’t interfere with stream isolation like transproxying does. The project is mentored by The Namecoin Project (and was motivated by the unpleasantness I had to deal with when manually auditing the Namecoin integration in Tor Browser for proxy leaks), under the Outreachy umbrella of The Tor Project.

As with any Outreachy project, obviously there’s no guarantee that the project will produce a working final product, but I wanted to bring it to your attention so that if Whonix has any interesting requirements in this space, we can try to ensure that Robert’s project meets those requirements. It would be cool if Robert’s project could eventually be included in Whonix alongside stuff like torsocks.

Cheers!

Without having looked into it… Is ptrace the best, or the only better, alternative? I wonder whether Linux namespaces might be more capable or more suitable as the basis for a torsocks version 3.

Other alternatives to consider:

Once upon a time there was torsocks version 1. If I remember right, version 2 was a complete rewrite which came with many improvements. If the current torsocks maintainer agrees, this might become torsocks version 3. That would help get the tool into Debian, boost its popularity, and make it easier for Whonix to adopt it.

Ideally, it would also be compatible with the torsocks command-line options.

Good feedback, thanks @Patrick ! (As usual, you tend to ask very good questions.)

One of the reasons we picked ptrace is that (like LD_PRELOAD) it offers a lot of flexibility in terms of exactly how the SOCKSification is performed. Here’s a use case that we had in mind:

P2P nodes such as Bitcoin seeders need to open connections to a large number of peers, and need to resist Sybil attacks (whether by malicious peers or malicious/broken relays). A naive approach to this use case is to stream-isolate by destination address, so that no single Tor exit can fully control our view of the P2P network. However, this will generate hundreds or thousands of Tor circuits in the Bitcoin seeder use case, which would DoS the Tor network. Isolating by destination address also doesn’t prevent a broken exit relay from cutting us off from whichever P2P nodes we happen to have assigned to that exit relay.

A different approach is to use a new circuit per TCP connection. While this fixes the latter problem, it magnifies the first one: if a P2P node connection fails, our node will keep opening new Tor circuits while trying to get a connection.

AFAIK the “right solution” (IIRC recommended by Isis Lovecruft to the Bitcoin Core devs many years ago) is to create a small number (~10) of SOCKS usernames for the affected application, and randomly assign each TCP connection to one of those SOCKS usernames. This guarantees that if a few of the Tor circuits are broken, the affected TCP connections will migrate over to the other Tor circuits, while preventing Sybil/eclipse attacks from malicious Tor exit relays. It also doesn’t DoS the Tor network, since it limits the circuit count to 10 circuits every 10 minutes.

ptrace and LD_PRELOAD allow handling this kind of logic, since they allow us to use custom code to construct the SOCKS username based on whatever input we want. As far as I know, Linux namespaces do not give us this kind of flexibility. Maybe I’ve missed some feature of Linux namespaces that would let us do this – if so, please let us know.
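
To make the username-pool idea concrete, here is a minimal sketch in Python (not the project's actual Go code). The function names, the username format, and the placeholder password are all made up for illustration. The only parts taken from elsewhere are the pool size of about 10 from the description above and the RFC 1929 username/password wire format. It assumes Tor isolates streams by SOCKS credentials, which is the behaviour the scheme relies on.

```python
import secrets

POOL_SIZE = 10  # "a small number (~10)" of SOCKS usernames, as described above


def make_username_pool(app_name, size=POOL_SIZE):
    """Create a fixed pool of SOCKS usernames for one application.

    The "<app>-<random hex>" format is a hypothetical choice for this sketch.
    """
    return [f"{app_name}-{secrets.token_hex(8)}" for _ in range(size)]


def pick_username(pool):
    """Randomly assign a new TCP connection to one username in the pool."""
    return secrets.choice(pool)


def socks5_userpass_request(username, password="x"):
    """Encode an RFC 1929 username/password authentication request.

    Layout: VER (0x01), ULEN, UNAME, PLEN, PASSWD.
    The default password "x" is an arbitrary placeholder.
    """
    u = username.encode("utf-8")
    p = password.encode("utf-8")
    if not (1 <= len(u) <= 255 and 1 <= len(p) <= 255):
        raise ValueError("username and password must be 1..255 bytes each")
    return bytes([0x01, len(u)]) + u + bytes([len(p)]) + p
```

A tracer would call `pick_username(pool)` once per intercepted `connect()`, so a retry after a failed connection may land on a different username (and therefore a different circuit) without ever creating more than `POOL_SIZE` distinct identities.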

I’m familiar with orjail. Based on the fact that it uses the Trans/DNS ports rather than a SOCKS port, my understanding is that it is not compatible with the above use case (and since it uses Linux namespaces, my assumption is that Linux namespaces in general are not able to handle this use case).

I notice that the issue you linked mentions Go and Rust. Robert’s project is using Go, which has excellent memory safety (better than C), excellent performance (better than Bash or Python), and excellent bootstrapping security (better than Rust). We’re using the ptrace library from u-root, which is a well-respected project with highly competent developers and (we think) reasonably good auditing (u-root is a spinoff of Coreboot, and I’ve used it in Namecoin for unrelated things before).

I think we’d be favorable to this, but I’m hesitant to approach the torsocks maintainer until Robert’s code is more mature.

Good point. Right now we have a different command-line syntax due to different functionality goals (we intend the functionality to be a superset of torsocks eventually, but for now it’s partially disjoint), but I think we’d be favorable to having some kind of “torsocks compatibility mode” in the command-line interface so that it could be used as a drop-in replacement for the torsocks binary.

How leak-proof do you think a ptrace-based approach would be by comparison? Are there cases where both torsocks v1 and torsocks v2 (as well as most if not all other proxifiers) fail but a ptrace-based approach works? For example, could it be used to proxify Firefox / Chromium? (Not because that would necessarily be a great idea, just because it is a good test with a complex application.)


Feature request (that’s hopefully lightweight):
Support for “simple” HTTPS / SOCKS4 / SOCKS5 proxies.

  • (Initially) not supporting credentials, if that’s a lot of extra work.
  • UDP support might be a lot of extra work and not justified.
  • HTTPS proxy support might be a lot of extra work and not justified.
  • SOCKS4 support might not be worthwhile (why not use SOCKS5 instead?).
  • SOCKS5 support might come “almost free”, since I assume there’s no way around SOCKS5 for interfacing with Tor anyhow.

This might be nice for testing. (Using a local SSH socks5 tunnel port.)


torsocks feature parity wishlist:

  • AllowInbound
  • AllowOutboundLocalhost

[default off - same as torsocks]

These are on by default in Whonix and required for some applications that open server ports (onions) such as OnionShare.
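
As a rough illustration of how these two switches could feed into a tracer's decision logic, here is a small Python sketch. The function names and verdict strings are hypothetical, and this is not how torsocks or the ptrace-based tool actually implements them. It treats both options as plain booleans that default to off, and it ignores the question of how the tracer's own connection to the local SOCKS port is exempted.

```python
import ipaddress


def connect_verdict(dest_ip, allow_outbound_localhost=False):
    """Decide what to do with an outgoing connect() to dest_ip.

    Returns one of the (hypothetical) verdicts:
      "proxy"        - redirect the connection through the SOCKS proxy
      "allow-direct" - let the connection go out unmodified
      "deny"         - block the syscall
    """
    addr = ipaddress.ip_address(dest_ip)
    if addr.is_loopback:
        # Loopback traffic cannot be routed through Tor, so it is either
        # permitted as-is or blocked outright.
        return "allow-direct" if allow_outbound_localhost else "deny"
    return "proxy"


def listen_verdict(allow_inbound=False):
    """Decide what to do with bind()/listen()/accept() on a server socket."""
    return "allow-direct" if allow_inbound else "deny"
```

With both flags left at their defaults, a program that tries to open a local server port (as OnionShare does) would be blocked; with both enabled, it could listen locally and talk to other services on localhost while all non-loopback traffic is still proxied.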


I see you did your homework during preliminary research when selecting which technologies to base this on, and you didn’t miss any prior work that I am aware of.

Trying to come up with a more clever idea…

I don’t know that. However, here’s a wild idea:
Could you do all of what you’re describing and then use Linux namespaces wrapped around all of that as a leak-shield?

According to Hugo Landau, there are various projects using ptrace to implement generic sandboxes (which I was able to confirm by a brief search, though I didn’t try to audit any of those projects). So it’s probably fairly leak-proof. Not sure exactly how well its leak-proofing compares to Linux namespaces; both ptrace and Linux namespaces are likely to be much better than LD_PRELOAD on this metric.

Robert’s code isn’t quite mature enough to test this yet, but I just ran both Firefox and Chromium in strace (which uses ptrace to simply log all the syscalls) and it worked fine (the following logs are from accessing https://www.namecoin.org/ in the browser):

user@debian:~$ strace -f -e trace=socket,getsockopt,setsockopt,getsockname,connect,bind,sendto,sendmsg,recvfrom,recvmsg,accept,shutdown,listen,getpeername,socketpair,accept4,recvmmsg,sendmmsg firefox 2>&1 | grep 91.219.237.223
[pid 126141] connect(147, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16 <unfinished ...>
[pid 126141] <... getpeername resumed>{sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] getpeername(147, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] connect(74, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16 <unfinished ...>
[pid 126141] connect(75, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 126141] connect(77, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16 <unfinished ...>
[pid 126141] connect(78, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 126141] getpeername(74, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] getpeername(74, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] getpeername(75, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] <... getpeername resumed>{sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] <... getpeername resumed>{sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] <... getpeername resumed>{sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] <... getpeername resumed>{sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
[pid 126141] <... getpeername resumed>{sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, [112->16]) = 0
user@debian:~$
user@debian:~$ strace -f -e trace=socket,getsockopt,setsockopt,getsockname,connect,bind,sendto,sendmsg,recvfrom,recvmsg,accept,shutdown,listen,getpeername,socketpair,accept4,recvmmsg,sendmmsg chromium 2>&1 | grep 91.219.237.223
[pid 126948] connect(35, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16 <unfinished ...>
[pid 126948] connect(37, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16 <unfinished ...>
[pid 126948] connect(41, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 126948] connect(42, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16 <unfinished ...>
[pid 126948] connect(43, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("91.219.237.223")}, 16) = -1 EINPROGRESS (Operation now in progress)
user@debian:~$
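
For anyone who wants to turn logs like the above into a crude leak check, here is a minimal Python sketch that scans strace output for IPv4 `connect()` calls whose destination is not the proxy. The function name and the default proxy address of 127.0.0.1:9050 are assumptions for illustration. The regex only understands the `AF_INET` line format shown in the logs above, so `AF_INET6`, `AF_UNIX`, and the other socket syscalls are ignored.

```python
import re

# Matches the destination of an IPv4 connect() call in strace output, e.g.
# connect(147, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("1.2.3.4")}, ...
CONNECT_RE = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, '
    r'sin_port=htons\((\d+)\), sin_addr=inet_addr\("([\d.]+)"\)\}'
)


def find_direct_connects(strace_lines, proxy=("127.0.0.1", 9050)):
    """Return (ip, port) pairs for IPv4 connect() calls that bypass the proxy.

    The default proxy address is an assumption; adjust it to the SOCKS port
    actually in use.
    """
    leaks = []
    for line in strace_lines:
        match = CONNECT_RE.search(line)
        if match is None:
            continue  # not an IPv4 connect() line (e.g. getpeername)
        dest = (match.group(2), int(match.group(1)))
        if dest != proxy:
            leaks.append(dest)
    return leaks
```

Fed the unproxied Firefox log above, this reports every `connect()` to 91.219.237.223:443 as a direct connection, which is expected, since nothing was proxying the browser in that run.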

Yes, this will be necessary for Cirrus CI testing anyway, because Cirrus CI infrastructure blacklists Tor traffic (I think due to cryptocurrency mining abuse). The main difference between Tor and a generic SOCKS5 proxy is that Tor handles DNS differently, but that will hopefully not be too hard to handle. Tor Project is intending to add UDP support to Arti soon (it’s part of their TorVPN initiative), so we’ll need to handle UDP eventually, but yeah, stuff like UDP and HTTPS proxies are not a super high priority.
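
On the DNS point, one relevant detail of SOCKS5 (RFC 1928) is that a CONNECT request can carry either an IP address or a hostname, and in the hostname case the proxy does the resolution. The Python sketch below only encodes that request. The function name is made up, and it does not cover Tor-specific extensions or how the ptrace-based tool will actually deal with DNS.

```python
import ipaddress
import struct


def socks5_connect_request(host, port):
    """Encode an RFC 1928 CONNECT request for host:port.

    Layout: VER (0x05), CMD (0x01 = CONNECT), RSV (0x00), ATYP, DST.ADDR, DST.PORT.
    ATYP is 0x01 for IPv4, 0x04 for IPv6, and 0x03 for a domain name.
    """
    header = b"\x05\x01\x00"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: send the hostname so the proxy resolves it.
        name = host.encode("ascii")
        if not 1 <= len(name) <= 255:
            raise ValueError("hostname must be 1..255 bytes")
        body = b"\x03" + bytes([len(name)]) + name
    else:
        atyp = b"\x01" if addr.version == 4 else b"\x04"
        body = atyp + addr.packed
    return header + body + struct.pack("!H", port)
```

Passing `"www.namecoin.org"` yields a domain-name request (no local lookup involved), while passing `"91.219.237.223"` yields an IPv4 request for an address that something must already have resolved.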

Allowing inbound connections was already on our radar; special-casing localhost wasn’t on our radar yet, but it probably would have gotten there once we inspected the torsocks config file format. Thanks for getting it onto our radar early.

Hmm, so the idea would be to use Linux namespaces to just drop any leaking traffic (in contrast to orjail, which redirects it)? Yeah, I think that should be doable (and provides some good defense-in-depth); good idea.

Early WIP code is here. Documentation is still WIP, and the code should not be used in production at this time, mainly because it has known leak vulnerabilities: it currently uses a syscall blacklist, which will soon be replaced with a whitelist. But, if anyone here wants to experiment with it (or report bugs and/or do code review), you’re more than welcome to do so.
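
To show why a blacklist tends to leak where a whitelist doesn't, here is a toy Python sketch. Both sets are invented for illustration and are not the lists used in the actual code. The point is only the default: a blacklist lets through anything it forgot to name, while a whitelist blocks anything it hasn't explicitly approved.

```python
# Hypothetical example lists; NOT the ones used by the actual project.
BLACKLIST = {"connect", "sendto", "sendmsg"}
WHITELIST = {"read", "write", "close", "getpeername"}


def blacklist_permits(syscall):
    """Default-allow: only syscalls named in the blacklist are blocked."""
    return syscall not in BLACKLIST


def whitelist_permits(syscall):
    """Default-deny: only syscalls named in the whitelist are let through."""
    return syscall in WHITELIST


def missed_by_blacklist(observed_syscalls):
    """List observed syscalls that the blacklist permits but the whitelist denies."""
    return sorted(
        s for s in set(observed_syscalls)
        if blacklist_permits(s) and not whitelist_permits(s)
    )
```

With these example lists, `sendmmsg` passes the blacklist simply because nobody listed it, whereas the whitelist rejects it until someone has reviewed it and added it deliberately.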

What will be the project name so I can refer to it?

Is torsocks compatible with flatpak?

Would the ptrace-based proxy leak detector be compatible with flatpak?