Tor is not yet fully bootstrapped. 30 % done

Hi All,

I’m trying to set up Whonix-Gateway in my KVM environment. The VM boots fine but I can’t get it to establish a Tor connection. Instead it gets “stuck” at 30%.

More detail:

KVM is working for multiple other Windows/Linux VMs and is stable.

Whonix image imported per instructions.

All services started
user@host:~$ sudo systemctl list-units --failed
0 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

Time sync has been checked and is more or less correct (up to 2 minutes out).

Clearnet test works from the gateway - I can curl the Tor check page as per the troubleshooting instructions.

The sdwdate log gets spammed with the following:
2020-01-19 04:27:32 - sdwdate - INFO - The clock is sane.
Within build timestamp Fri 22 Nov 2019 04:16:03 PM UTC and expiration timestamp Tue 17 May 2033 10:00:00 AM UTC.
2020-01-19 04:27:32 - sdwdate - WARNING - Tor is not yet fully bootstrapped. 30 % done.
Tor reports: NOTICE BOOTSTRAP PROGRESS=30 TAG=loading_status SUMMARY="Loading networkstatus consensus"

Eventually the daemon times out with this message:
Tor reports: WARN BOOTSTRAP PROGRESS=30 TAG=loading_status SUMMARY="Loading networkstatus consensus" WARNING="Connection timed out" REASON=TIMEOUT COUNT=1

But it appears to keep trying anyway.

syslog/daemon/journald logs are spammed with this:
New control connection opened.

I’ve found that if I drop the firewall (iptables -F) and adjust the default chain policies to ACCEPT, I get a connection.
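Concretely, something along these lines (for testing only, since this disables the gateway firewall entirely; I did not narrow down which single chain policy mattered):

sudo iptables -F
sudo iptables -P INPUT ACCEPT
sudo iptables -P FORWARD ACCEPT
sudo iptables -P OUTPUT ACCEPT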

Tor itself is not blocked in my country. I can use the tor-browser without problems.

Tor Documentation for Whonix Users

This is the only weird thing. Everything else is expected. The first such report in 7 years. To debug, enable logging in the Whonix firewall:

sudoedit /usr/bin/whonix-gateway-firewall

Search for
## Log

You’ll find:

#$iptables_cmd -A FORWARD -j LOG --log-prefix "Whonix blocked forward4: "
#$iptables_cmd -A OUTPUT -j LOG --log-prefix "Whonix blocked output4: "
#$iptables_cmd -A INPUT -j LOG --log-prefix "Whonix blocked input4: "

Uncomment these by removing the # in front of each line.

$iptables_cmd -A FORWARD -j LOG --log-prefix "Whonix blocked forward4: "
$iptables_cmd -A OUTPUT -j LOG --log-prefix "Whonix blocked output4: "
$iptables_cmd -A INPUT -j LOG --log-prefix "Whonix blocked input4: "

Then the journal should show what’s blocked.
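For example, the blocked packets can be followed live by filtering on the log prefixes configured above, with something like:

sudo journalctl -f | grep "Whonix blocked"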

1 Like

I have this issue with a fresh install of Whonix-Gateway 16.0.9.8 running on a Debian 11 KVM host, hardened with Kicksecure.

I found that a fix is to update /usr/bin/whonix-gateway-firewall and ensure that both ESTABLISHED and RELATED connections are allowed in the INPUT iptables chain.

Line 383 can be changed from:

$iptables_cmd -A INPUT -m state --state ESTABLISHED -j ACCEPT

To:

$iptables_cmd -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
1 Like

For reference:

RELATED,ESTABLISHED → ESTABLISHED


1 Like

Hey Patrick,

I do find it quite interesting that your change has been in place from 2014 to 2023 with no problem reports, yet despite days of trying I cannot find another way to make this setup work.

I’m pretty sure that I did not have to do this with a KVM host running stock Debian (without Kicksecure) so it’s possible that there could be an alternative fix at the host level.

If I have time to investigate further then I will try to set up a new bare metal box without Kicksecure at the host level and compare my experience.

1 Like

I have now re-tested with a stock KVM installation, no Kicksecure on the host, and I still have the issue of Tor being stuck at 30%.

It looks like this could be related to:

https://forums.whonix.org/t/screenshot-of-gw-active-tor-and-correct-sdwdate-ws-with-systemcheck-error/16613

…But I don’t see any improvement by force-setting the VM’s date/time. So it seems not.

When I ran the below:

sudo iptables -I INPUT -m state --state RELATED -j ACCEPT

…Tor connects successfully.

It seems that the RELATED state is required for the latest version of Whonix-Gateway to work with KVM.

Software versions for reference:

  • Debian 11.7.0 (Note: with non-free NIC driver using bnx2 module)
  • qemu-system-x86 1:5.2+dfsg-11+deb11u2
  • Whonix-Gateway-XFCE-16.0.9.8

Happy to test anything else that could be useful.

I would suggest my PR could be helpful, as it provides an optional configuration option which may help alleviate this issue for other users:

Can anyone disprove my findings in their environment?

Thanks, merged!

1 Like

related:

1 Like

Added to Whonix firewall config file.

And now also documented here:
Troubleshooting - Whonix chapter RELATED Fix in Whonix wiki

1 Like

It appears this is very related actually.

My issue is fixed when adding a more restrictive rule:

iptables -I INPUT -p icmp --icmp-type destination-unreachable -m state --state RELATED -j ACCEPT

That being the case, do you feel it would be worth my creating another PR and an alternate config option?

Happy to do so as in my opinion the more restrictive we can be the better.

1 Like

PR is welcome however Whonix firewall is currently being ported to nftables. (Help welcome with that btw.)

https://phabricator.whonix.org/T509

Optional: perhaps also send a PR for the nftables version using iptables-translate?
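For illustration, iptables-translate accepts the same rule syntax as iptables and prints an nftables equivalent that could serve as a starting point (the translated output should still be reviewed rather than copied blindly):

iptables-translate -I INPUT -p icmp --icmp-type destination-unreachable -m state --state RELATED -j ACCEPT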

1 Like

Ahh yes, I wasn’t aware of the active nftables work. I’ll look to submit a working version for nftables.

I may have a better (or preferable to some?) solution in mind (MSS clamping), rather than allowing enough ICMP through the firewall for PMTUD to take place.

Investigating on my end and will post back with potentially multiple options. I guess it will be worth writing this up in a way that could form a section on the wiki page about troubleshooting connectivity on a connection with a reduced MTU (PPPoE, GRE, other tunnels).

Interestingly, Tails doesn’t have such an issue (I think), so I should have a look at their iptables rules and see if they’re doing any magic to support lower MTUs out of the box.

Update:

Investigating on my end and will post back with potentially multiple options.

I found that applying MSS clamping at the KVM host level seems to work well.

Just double checking which of the many iptables rules I tested with did the trick.
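For reference, the commonly used clamp rule on the host (in the mangle table) looks something like the one below; I still need to confirm that this exact variant is the one that did the trick:

iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu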

I do find it quite interesting that your change has been in place from 2014 to 2023 with no problem reports, yet despite days of trying I cannot find another way to make this setup work.

Well, I’m not sure why it didn’t come to mind sooner as this is something I’ve had to deal with in other circumstances - but now it makes sense.

The issue is only going to present itself for users on connections with reduced MTUs. So this won’t be all that common for most users :slight_smile:

1 Like

I think we can add this rule to the firewall before the ICMP drop rule to fix this problem:
sudo iptables -I INPUT -p icmp --icmp-type fragmentation-needed -m state --state RELATED -j ACCEPT

1 Like

I’ve created a pull request (can’t include the link normally since it’s a new account):

2 Likes

This isn’t a good solution in my opinion.

This issue affects hardly any users so enabling this for everyone is not good.

What implications does this have for the anonymity of the gateway? What leaks does it allow?

It’s better to fix this at the router level. All consumer routers will clamp the MSS to the MTU, and anyone with an enterprise network can configure this.

This issue only becomes a problem when using a NAT gateway on the KVM host. So it should really be fixed at the KVM level.

Slightly less optimal is to make it optional and only affected users can enable it.

Please do not abolish the GATEWAY_ALLOW_INCOMING_ICMP feature.

Fixed.
(Kicksecure ™ Forums Usage Instructions, Best Practices and FAQ chapter Posting Links for New Users in Kicksecure wiki)
(Whonix is based on Kicksecure.)

It’s not enabled by default.

The beauty of the Whonix design is that even without the Whonix-Gateway firewall, there’s still no connections from Whonix-Workstation to clearnet possible. →
Technical Introduction chapter Security Overview in Whonix wiki

Hey Patrick,

I’m not suggesting we could be introducing a leak from the workstation itself, though it may aid in correlation of end-to-end traffic, similar to what I describe below.

More importantly I would say there are instances where the gateway itself is an important component.

For example, where a user is using the gateway to host onion services, allowing PMTUD to take place could make us stand out more. This “fragmentation required” response won’t be coming from our local router/gateway; it is sent from the guard node and gives them the ability to influence the size of the packets we then send.

This can be used to work around an MTU issue, but equally a compromised guard node could request that each client reply with a different MTU and begin to profile us.

This reduced MTU size would apply end to end through the Tor network and could actually allow correlation from guard to end user.

I’ve been giving this a lot of thought and I feel there could be major implications of using anything but the standard 1500 MTU when it comes to anonymity.

Or do you think I’m way off?

A compromised guard relay has many ways to tamper with the traffic; MTU isn’t required. One way would be to arbitrarily delay the traffic, thereby introducing a recognizable pattern.

Another way to think about this is as per Generic Bug Reproduction. The Tor Project to my knowledge doesn’t have a strong, generalized recommendation to block all ICMP. Also Free Haven's Selected Papers in Anonymity doesn’t mention ICMP. I am not aware of any other Tor anonymity research mentioning ICMP either.

1 Like

Hey Patrick,

Thanks for the literature. I will most certainly consume it.

I suspect that I am probably being overly cautious then. My interest in Tor is from an academic perspective, so I certainly could be overthinking this.

Just so I’m clear about my point regarding MTU: whilst the ICMP protocol allows the MTU that is used to be scaled back dynamically based on the request, the data would then be sent over TCP for the rest of that stream.

But I guess if none of the Tor Project literature suggests this is a concern then I’m almost certainly overthinking it :slight_smile:

As for some thoughts on a fix…

If the MTU issue is on the Whonix user’s side (where the user is connecting from), then I would personally consider it best that the user configures their system to work around it with a static MTU. This would likely be beyond the knowledge of the average user though, especially someone new to Linux, so perhaps automation is required to allow a seamless connection for those users.
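For example, a static MTU can be set on the relevant host interface with something like the following, where the interface name and value are placeholders for the user’s actual uplink and path MTU:

sudo ip link set dev eth0 mtu 1452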

[On that note, I feel a command could be run during the KVM setup that would ascertain the user’s maximum MTU size and configure this before the first time the gateway is run, which would alleviate the user’s problem.]
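As a rough sketch, such a check could probe the usable path MTU from a Linux host with a do-not-fragment ping; the target address is just a placeholder, and 1472 bytes of payload plus 28 bytes of headers corresponds to a 1500 MTU:

ping -c 1 -M do -s 1472 1.1.1.1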

If the MTU issue is on the other end (that the user is connecting to), which in this case would be a guard node, then it would require PMTUD to agree on the MTU size, or fallback to attempting to connect to another guard node. However I suspect most (if not all) guard nodes will be hosted on a datacenter connection where they will have a full 1500 MTU size.

What I’ve looked at since my last post 8 days or so ago

I checked out the iptables rules that Tails is using and there is no ICMP allowed on INPUT, yet it’s able to connect just fine.

My thoughts are that Tails does not require this because users with a lower MTU have a router performing MSS clamping, which takes care of the MTU at the user’s network side without necessitating any ICMP.

What I’ll do (outstanding still to do)

I’ll look to commit my conditional RELATED fix with the nftables rules to git.

Can I ask, is there a confirmed way that I can get nftables running to test this?

What I’ll do (further testing)

I should have some time this coming weekend to confirm if this issue is specifically KVM related or if it also applies to a gateway running via VirtualBox and/or physical isolation on bare metal.

If it is purely an issue that users on KVM will experience then I think a solution to fix it during the KVM provisioning stage is best, but if it affects users across the board then a fix at the Whonix gateway level is likely to be more desirable.

I note that we now have two proposed ICMP solutions as well.

Solution 1:

iptables -I INPUT -p icmp --icmp-type destination-unreachable -m state --state RELATED -j ACCEPT

Solution 2:

iptables -I INPUT -p icmp --icmp-type fragmentation-needed -m state --state RELATED -j ACCEPT

I’m not immediately sure which is best, or if both should become optional conditions, leaving it up to the user.

I’ll re-run tests on my end to ensure that both the destination-unreachable and fragmentation-needed rules do fix the issue that I found and reported here.
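One way to confirm which rule is actually being matched during these tests is to watch the per-rule packet counters while Tor bootstraps, e.g.:

sudo iptables -L INPUT -v -n --line-numbers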

1 Like

The docs stated that the ICMP Fix is applicable to this issue with MTU, so I thought that I could just replace it:

Using a dial-up connection? Or is Tor bootstrapping stuck at 45%?

http://www.dds6qkxpwdeubwucdiaord2xgbbeyds25rbsgr73tbfpqpt4a6vjwsyd.onion/wiki/Troubleshooting#ICMP_Fix
If it’d be needed for something else as well then I’ll rewrite the pull request and leave this option.

MSS clamp is a hack and not a clean solution:

This target is used to overcome criminally braindead ISPs or servers which block ICMP Fragmentation Needed packets. The symptoms of this problem are that everything works fine from your Linux firewall/router, but machines behind it can never exchange large packets:

https://linux.die.net/man/8/iptables
More info:
https://serverfault.com/a/376757

It’s not safe to assume that everyone can support the standard 1500 MTU. If someone’s ISP uses PPTP, or the user is connected to a VPN before Whonix-Gateway, then they will have a problem.

destination-unreachable is a broad ICMP type, and fragmentation-needed is a destination-unreachable message with code 0x4 (fragmentation needed and DF set).
Check page 4, Destination Unreachable Message:
https://www.rfc-editor.org/rfc/rfc792
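In numeric terms, destination-unreachable matches all of ICMP type 3, while fragmentation-needed matches only type 3 code 4, so the two candidate rules could equivalently be written as:

iptables -I INPUT -p icmp --icmp-type 3 -m state --state RELATED -j ACCEPT
iptables -I INPUT -p icmp --icmp-type 3/4 -m state --state RELATED -j ACCEPT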

1 Like