Boot Clock Randomization - bootclockrandomization

@marmarek wrote:

BTW, why is 0-5 excluded in bootclockrandomization? I think this leaks some data. For example, if an attacker already has some selected set of data to correlate (like “all Tor users in area X”), he/she can easily narrow the search further by eliminating those with time almost in sync (±5 sec). Given the popularity of NTP, which (when it works) is pretty good at keeping the time in sync, this makes it easy to eliminate all such users, leaving probably only those using Whonix, or maybe allowing even more precise correlation…

bootclockrandomization/usr/share/bootclockrandomization/start at 77d3e620017dd067e1660e7bebf582f042484c47 · Kicksecure/bootclockrandomization · GitHub

boot clock randomization can indeed use some more discussion and scrutiny. Here is what it’s about…

(Trying to speak more generally, less Qubes specific here.)

Assumption: Most users are

  • a) using NTP and usually have correct, quite precise clocks
  • b) using NTP but currently have a clock that is slow or fast by 1, 2, … seconds
  • c) running buggy setups with clocks that are far off

There are various local clock leaks. We try to prevent, or recommend preventing, all of them, but it's impossible to be sure we are aware of every one.

Let’s take a not so uncommon use case. A user wants to visit a website that does not work in Tor Browser or requires some special plugin that Tor Browser blocks. In such cases users switch to other browsers such as iceweasel, and iceweasel will leak the local VM clock through JavaScript [and more].

Before sdwdate has finished setting the clock after boot to unlink it from the host [in Qubes terms: from sys-net and other non-Tor VMs], or in case sdwdate fails, there is at least bootclockrandomization to achieve that. [The long-term plan is to block all networking except sdwdate until sdwdate is done, but it will take a while until we get there. → sdwdate-gui] [And for sdwdate's own fetches it is desirable to at least have bootclockrandomization, just in case.]

In the wiki the only rationale explanation currently is the following.

Using Boot Clock Randomization, i.e. after boot, the clock is set randomly between 5 and 180 seconds into the past or future. This is useful to enforce the design goal that the host clock and Whonix-Workstation clock should always slightly differ. It’s also useful to obfuscate the clock while sdwdate itself is running, because naturally at that time, sdwdate hasn’t finished.

  • Unlinking is the keyword.
  • It’s sane to assume that a non-torified host or other VMs may leak the local clock. (local clock leaks)
  • It’s also sane to assume that a Whonix VM may leak the local clock. (local clock leaks)
  • Its purpose is to prevent correlation / anonymity set reduction by comparing a local clock leak from, say, iceweasel in a non-Tor VM with a local clock leak from iceweasel within a Whonix VM.
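As a rough illustration, the behavior described here (offset of 5-180 seconds, random sign, randomized sub-second part) could be sketched as follows. This is a simplified stand-in, not the shipped bootclockrandomization script; the variable names are made up:

```shell
#!/bin/bash
# Simplified sketch of boot clock randomization as described above.
# NOT the real bootclockrandomization script; names and details are illustrative.

SIGN=$(( RANDOM % 2 ))                      # 0 = past, 1 = future
SECONDS_OFF=$(( (RANDOM % 176) + 5 ))       # 5..180 s; the contested part is the excluded 0..4 s
# $RANDOM is only 15 bits; combine two draws to cover 0..999999999 ns
# (slightly biased, acceptable for a sketch)
NANOSECONDS=$(( (RANDOM * 32768 + RANDOM) % 1000000000 ))
(( SIGN == 0 )) && SECONDS_OFF=$(( -SECONDS_OFF ))

NEW_TIME=$(( $(date +%s) + SECONDS_OFF ))
echo "would set clock to: $NEW_TIME.$NANOSECONDS (offset ${SECONDS_OFF}s)"
# The real service then does something along the lines of:
#   date --set "@$NEW_TIME.$NANOSECONDS"
```

The sketch only prints the time it would set, since actually calling `date --set` requires the appropriate privileges.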

Now, how big are groups a), b) and c)? It’s impossible to say. I am not aware of any research on that, and we don't have the resources to do that research ourselves.

Let’s assume only 10% of users have a clock that is 4 seconds slow. If boot clock randomization added only +1 second, it seems to me the anonymity set reduction could still work. And if it randomly picked 0, it would not help at all.

However, it’s right, there is a point here. For users running Whonix VMs and doing things that suffer from some local clock leak, we may indeed give some of the group a) users “with a perfectly synced clock” an artificial, “unnecessary” fingerprintable attribute that otherwise would not exist. On the other hand, users of groups b) and c) would be better off. And since it is public knowledge that Whonix uses bootclockrandomization and sdwdate, all Whonix users can blend into the group of Whonix users and thereby gain anonymity. And by obfuscating whatever local clock state is leaked from within the VM, such clock correlation attacks hopefully become unattractive.

While I am at it… what is sdwdate good for, then? Mostly useful for users of groups b) and c): it sets the clock to a time that is as securely obtained and as correct as it can get, while still being independent from the host and other non-Tor VMs, and it keeps it that way during long-running sessions. The time set by sdwdate should then be similar enough for all Whonix users to make clock correlation attacks unattractive.

In an ideal world, we would require neither boot clock randomization nor sdwdate. The host would always boot with a perfectly synchronized time to begin with. Everyone would always have a perfectly synchronized time. And online time syncing would be impossible to manipulate via man-in-the-middle attacks.

@marmarek wrote:

For example, if an attacker already has some selected set of data to
correlate (like “all Tor users in area X”), he/she can easily narrow
the search further by eliminating those with time almost in sync
(±5 sec).

Note that this ±5 sec (emitted from within Whonix VMs) should only be observable at Tor exit relays, destination websites and onion services. Not at ISP level. (The ISP might observe local clock leaks by the host or other non-Tor VMs.)

//cc @HulaHoop

In an ideal world, we would require neither boot clock randomization nor sdwdate. The host would always boot with a perfectly synchronized time to begin with. Everyone would always have a perfectly synchronized time. And online time syncing would be impossible to manipulate via man-in-the-middle attacks.

Yes.

@marmarek wrote:

For example, if an attacker already has some selected set of data to
correlate (like “all Tor users in area X”), he/she can easily narrow
the search further by eliminating those with time almost in sync
(±5 sec).

Note that this ±5 sec (emitted from within Whonix VMs) should only be observable at Tor exit relays, destination websites and onion services. Not at ISP level. (The ISP might observe local clock leaks by the host or other non-Tor VMs.)

Exactly the point. This gives negative correlation between data gathered
at local ISP level and data from the target server/exit relays.
Additionally, if you have some other way to link multiple sessions of
the same person on the target server (like use of the same pseudonym),
you gain a lot in terms of the host clock leak, just because of this 10s
range exclusion. Every session (boot clock randomization run)
gives you information about which 10s range is surely not the user's host
clock. After a while you will have excluded pretty much the whole ±180s
range, giving you a quite precise approximation of the host clock. In an
extreme situation, 18 sessions would be enough (each excluding some 10s range).
Then you need to correlate it with ISP-level data.
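This narrowing argument can be made concrete with a toy model. Assume the ±5..180 s behavior, an attacker who can link sessions, and (for determinism) twenty leaks whose offsets happen to be spread across the range in 9 s steps; all the numbers here are illustrative, not from any real deployment:

```shell
#!/bin/bash
# Toy model of the narrowing attack described above (illustrative numbers only).
# Each linked session leaks host_clock + offset with 5 <= |offset| <= 180.
# A candidate host clock is consistent with a leak only if their distance is
# also within 5..180 s, so each leak excludes a roughly 10 s band.

host=1000   # hypothetical host clock, in epoch seconds
# 20 hypothetical sessions whose offsets happen to tile the range in 9 s steps
offsets=(-90 -81 -72 -63 -54 -45 -36 -27 -18 -9 9 18 27 36 45 54 63 72 81 90)

survivors=()
for cand in $(seq $(( host - 200 )) $(( host + 200 ))); do
  ok=1
  for o in "${offsets[@]}"; do
    leak=$(( host + o ))
    d=$(( cand - leak )); (( d < 0 )) && d=$(( -d ))
    # candidate survives this leak only if an allowed offset could explain it
    if (( d < 5 || d > 180 )); then ok=0; break; fi
  done
  (( ok )) && survivors+=( "$cand" )
done
echo "${#survivors[@]} candidate host clocks remain: ${survivors[0]}..${survivors[-1]}"
```

With these twenty leaks the candidate set collapses to the nine seconds around the real host clock. With a true ±180 s randomization (no exclusion), the real host clock could never be ruled out this way.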

The point is, when you want to unlink Whonix time from host time, you
need to use as little host time as possible. If you just
choose randomly from ±180s, choosing an offset of 0s would be
indistinguishable from some other user randomly choosing an offset of 30s
and having host time 30s off (or any other value, as long as both are
the same). But if you exclude the ±5s range (or in fact any range), you
help reduce anonymity by excluding those “clearnet hosts” with time
in the ±5s range of the leaked Whonix-ws one.

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


In other words:
If the attacker knows that the time is strictly randomized (±180s without
exclusions), any clock leak is untrustworthy; if the attacker knows that the
time is randomized by ±5-180s, some times can be excluded and we in fact
leak more information than before.

Full randomization: the true machine time has 361 equally possible values
(approx.; this doesn’t count the milliseconds randomization, but it’s a passable simplification)
Randomization with exclusion: the true machine time has 350 equally
possible values.
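The counts quoted above follow from simple inclusive counting of the integer-second offsets:

```shell
#!/bin/bash
# Inclusive count of integer-second offsets, matching the figures above.
full=$(( 2 * 180 + 1 ))       # -180..+180 -> 361 values
excluded=$(( 2 * 5 + 1 ))     # -5..+5     -> 11 values removed by the exclusion
echo "full randomization:         $full possible offsets"
echo "randomization w/ exclusion: $(( full - excluded )) possible offsets"
```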

Assuming two leaked times, say 00:04:15 and 00:05:01, we have:
full randomization: the real time is between 00:02:01 and 00:07:15, that is, we
have 314 possible times to choose from;
randomization with exclusion: the real time is between 00:02:01 and 00:07:15,
with the exclusion of the periods 00:04:10-00:04:20 and 00:04:56-00:05:06,
which gives 292 possible times.

In layman’s terms, it’s the difference between “I rolled a die and adjusted
the result by 1, 0 or -1; now I have a three” and “I rolled a die and
adjusted the result by 1 or -1; now I have a three”. Although the second
method never results in the starting roll, it actually leaks more, not
less, information about it.
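The die analogy can be enumerated directly. Observing a three after the adjustment leaves these candidate starting rolls under each method:

```shell
#!/bin/bash
# Enumerate which starting rolls are consistent with an observed, adjusted 3.
observed=3
for offsets in "-1 0 1" "-1 1"; do
  starts=""
  for roll in 1 2 3 4 5 6; do
    for off in $offsets; do
      (( roll + off == observed )) && starts="$starts $roll"
    done
  done
  echo "offsets {$offsets}: candidate starting rolls:$starts"
done
```

The exclusion variant rules out roll 3 itself, i.e. it shrinks the candidate set, leaking more information about the starting roll.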


Patrick, I agree.

The choice is between passively leaking to exit relays and beyond that the WS clock is a little off from NTP time, versus leaking something very close to the host clock. The latter is much worse and is protected against by the randomization. This creates an anonymity set (all Whonix users).

This is similar to how all Tor-designed applications like TBB have a fingerprint different from everything else but provide the same safe defaults across the board.

The choice is between passively leaking to exit relays and beyond that the WS clock is a little off from NTP time, versus leaking something very close to the host clock. The latter is much worse and is protected against by the randomization. This creates an anonymity set (all Whonix users).

The attacker has no idea how close the leaked time is to the host time.
Unless you announce it.
Following your reasoning to the extreme, always adding +3 min to the host
clock would be best, because it would always differ from the host
clock. But in fact (together with public knowledge of this
implementation) it leaks the precise host clock.

Just excluding the -5..+5 range doesn’t leak information that precise, but
it leaks more than a truly random offset would (as explained in the previous
message).

This is similar to how all Tor-designed applications like TBB have a fingerprint different from everything else but provide the same safe defaults across the board.

This would make sense only if everyone had precisely the same
host time (in which case time randomization wouldn’t be needed at
all…). Otherwise it isn’t true that every user will look the same - it
will still depend on the host clock, which may differ.

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Your suggestion is to:

  • remove the minimum +/- 5 seconds thing
  • keep randomizing nanoseconds (we really don’t want a high res local clock leak to aid correlation)
  • otherwise keep boot clock randomization as is

?

Your suggestion is to:

  • remove the minimum +/- 5 seconds thing
  • keep randomizing nanoseconds (we really don’t want a high res local clock leak to aid correlation)
  • otherwise keep boot clock randomization as is

Yes, exactly.

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Agreed!

bootclockrandomization fails inside a systemd-nspawn chroot.

Mar 31 06:51:19 host systemd[1]: Starting Boot Clock Randomization...
Mar 31 06:51:19 host start[102]: + 129 262440680
Mar 31 06:51:19 host start[102]: date: cannot set date: Operation not permitted
Mar 31 06:51:19 host start[102]: ERROR: exit_code: 1 | BASH_COMMAND: date --set "@$NEW_TIME.$NANOSECONDS" > /dev
Mar 31 06:51:19 host systemd[1]: bootclockrandomization.service: Main process exited, code=exited, status=1/FAIL
Mar 31 06:51:19 host systemd[1]: bootclockrandomization.service: Failed with result 'exit-code'.
Mar 31 06:51:19 host systemd[1]: Failed to start Boot Clock Randomization.

Same holds true for sdwdate.

This is expected. Quoting the systemd-nspawn documentation:

The host’s network interfaces and the system clock may not be changed from within the container.

What should bootclockrandomization do in that case? Ignore the error? That's probably not a great solution.

Maybe not start at all in these cases, using ConditionVirtualization=!systemd-nspawn?
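If that route is taken, a drop-in along these lines might work. The path and contents here are a suggestion, not a shipped unit:

```ini
# /etc/systemd/system/bootclockrandomization.service.d/30_nspawn.conf (hypothetical drop-in)
[Unit]
# Do not start inside systemd-nspawn containers, which cannot set the system clock.
ConditionVirtualization=!systemd-nspawn
```

With an unmet condition, systemd skips the unit instead of marking it failed, which avoids the error above without ignoring genuine failures elsewhere.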

Clock support in namespaces is a special case that was only recently added, this year.


Due to issues reported in “whonix-ws-16 Template fails to update due to timing issue”:

How useful is Boot Clock Randomization in Qubes-Whonix Templates?

In other words, what is the chance of the VM time from inside a Qubes-Whonix Template being leaked to a Tor exit relay? (Local Clock Leaks)

Qubes Templates do not have “full” networking. APT is running through Qubes UpdatesProxy.

Flatpak Update

This step is only required if the user previously installed any software manually using flatpak. It can be skipped otherwise.

Qubes-Whonix ™ Template:

http_proxy=http://127.0.0.1:8082 flatpak update
  • I’ve also seen http_proxy=http://127.0.0.1:8082 gpg --recv-keys [...].
  • Tor Browser downloader by Whonix developers (tb-updater).

The boot clock randomization was also leading to some instability in the automated testing.


Needs to be re-considered.

@marmarek in Qubes templates build command gpg error: `Signature by [key] was created after the --not-after date.` · Issue #8520 · QubesOS/qubes-issues · GitHub

Generally, this all feels like a Whonix bug - yet another issue caused by such aggressive time randomization (next to apt seeing InRelease signed in the future). I don’t think adding workarounds for that left and right is the way to go, better fix the root issue and either drastically reduce time range in which clock is randomized