improving compression of Whonix image downloads

Firefox via Whonix Forum:

Well, the principles are the same for both KVM and VirtualBox. Both are just virtual machines with their own VM image format, and both either understand the raw image format or can convert from raw to their own format.

Thus building 3 raw images and converting them to qcow2 and vdi should be all that is needed to make them VM specific.
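A minimal sketch of that conversion step, assuming `qemu-img` is installed and used for both target formats (the post does not say which tool would be used). The file name `common.raw` and the helper names are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import List

# Target formats named in the post, mapped to the file suffix used here.
TARGET_FORMATS = {"qcow2": ".qcow2", "vdi": ".vdi"}


def convert_command(raw_image: str, target_format: str) -> List[str]:
    """Build the qemu-img call that converts a raw image to target_format."""
    if target_format not in TARGET_FORMATS:
        raise ValueError(f"unsupported target format: {target_format}")
    output = Path(raw_image).with_suffix(TARGET_FORMATS[target_format])
    return ["qemu-img", "convert", "-f", "raw", "-O", target_format,
            raw_image, str(output)]


def convert(raw_image: str, target_format: str) -> None:
    """Run the conversion; raises CalledProcessError if qemu-img fails."""
    subprocess.run(convert_command(raw_image, target_format), check=True)
```

Calling `convert("common.raw", "qcow2")` and `convert("common.raw", "vdi")` would write `common.qcow2` and `common.vdi` next to the raw file.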

VirtualBox: This would have to be done on the user’s machine. Something
simple (an ova image to import) would be turned into something more
complex and platform specific: a script for Linux, another one for
Windows, and perhaps another one for Mac. Rather than “just import the
ova using VirtualBox” it’s “extract, make the script executable, run the
script” (or in the case of Windows perhaps an installer). Possible, but
there are no development resources for that.

A common VM image also seems risky from a security point of view. Which VM would be authorized to make changes to it (update it)?

No, the common VM image is only shared for step 1, which is downloading the files. The common image file makes the download smaller by removing the extra size taken up by duplicates; that’s the whole point of this thread.

I see.

In step 2 the user clones the common image for whonix workspace and renames the other for whonix gateway, or vice versa, as described above.
Thus in step 3, when the user boots whonix workspace and whonix gateway, both have their own image file for /usr.
So to sum it up, at runtime and later usage these are individual independent images.

That is also needed because in step 4, when the workspace or gateway images are booted for the first time, a script automatically installs the non-duplicate deb packages, as described above, from /var/cache/apt/archives/, making the image files gateway_usr.qcow2 and whonix_usr.qcow2 completely independent and different from each other.
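A rough sketch of what such a first-boot script could do, assuming the remaining packages are plain `.deb` files in one directory and that handing them all to a single `dpkg --install` call is good enough. The function name and its `cache_dir` parameter are hypothetical.

```python
from pathlib import Path
from typing import List


def build_install_command(cache_dir: str) -> List[str]:
    """Return a dpkg call for every .deb file found in cache_dir.

    An empty list means there is nothing to install.
    """
    debs = sorted(str(path) for path in Path(cache_dir).glob("*.deb"))
    if not debs:
        return []
    # All packages go into one call so dpkg sees the whole set at once.
    return ["dpkg", "--install", *debs]
```

The resulting list could be passed to `subprocess.run(..., check=True)` as root. A plain `dpkg` call does not fetch missing dependencies, so everything the packages need would already have to be in that directory or on the image.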

I see.

The only things to keep in mind are:
A. By cloning the common.qcow2 file, the UUID of the disk image changes.
B. The virtual machine config file may need to be changed to the new UUID if it is UUID based and not filename based.
C. Inside the workspace or gateway, the /etc/fstab file must be changed to match the new UUID. But this should be doable by the script that also installs the deb files on first boot.

Steps A and B should be doable by a script the user has to run after download.
Step C can run automatically when the images are booted the first time.
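Step C could look roughly like the sketch below. It assumes the fstab entries use the `UUID=...` form and that the old and new values are already known (in fstab these are filesystem UUIDs, as reported by `blkid`). The function name is made up for illustration.

```python
import re


def replace_fstab_uuid(fstab_text: str, old_uuid: str, new_uuid: str) -> str:
    """Swap old_uuid for new_uuid in fstab entries of the form UUID=<value>.

    Only lines that start with UUID= (after optional indentation) are
    touched, so commented-out entries stay as they are.
    """
    pattern = re.compile(
        r"^([ \t]*UUID=)" + re.escape(old_uuid) + r"(?=\s)",
        re.MULTILINE,
    )
    return pattern.sub(lambda match: match.group(1) + new_uuid, fstab_text)
```

A first-boot script would read /etc/fstab, pass its content through this function, and write the result back before the next mount or reboot.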

A lot of code to be written.

Most open source software offers a way to set the installation destination when the makefile is created.

Whonix packages using genmkfile also support setting DESTDIR. However,
make install is only used during the package creation process.
Software by Whonix is installed through packages. These use the
default paths as per FHS. Packages install to / as usual. No packages,
just using make install → no good upgrade path. Packages provide a
very good upgrade path. Having the packages install to /opt would mean
a lot of work updating paths, and new bugs. Also, package installation
requires their dependencies to be installed already. So Whonix packages
could go to the same directory for initial installation. (To resolve
dependencies and to install in the right order, a local apt repository
would be required.)

In theory it’s all doable but super complex and error prone. I would
veto such changes and suggest to fork Whonix instead if this is desired.

I also don’t think Whonix needs to invent something new and as complex
as this. Whonix isn’t an outsider in using VM images. These are very
popular in data centers, which need to back up and transfer them. So
someone must have sorted out deduplication and compression of VM images
in a generic way already.

These two blog posts indicate that just switching to another compression
algorithm could do the trick.

http://www.doublecloud.org/2012/06/best-tool-to-compress-virtual-machines/

If that does not suffice, for deduplication we could also research huge
compression algorithm dictionary sizes and/or preprocessors to remove
duplication.
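To illustrate why the dictionary size matters for duplicated content, here is a small sketch using Python's standard `zlib` and `lzma` modules. The input is a stand-in, not a real image: 512 KiB of random bytes stored twice, so the only redundancy is a long-distance duplicate. The helper name `xz_size` is made up for this example.

```python
import lzma
import os
import zlib


def xz_size(data: bytes, dict_size: int) -> int:
    """Compressed size using LZMA2 with an explicit dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))


# Stand-in for two images sharing content: the same incompressible
# 512 KiB block appears twice, 512 KiB apart.
block = os.urandom(512 * 1024)
data = block + block

sizes = {
    "zlib (32 KiB window)": len(zlib.compress(data, 9)),
    "xz, 64 KiB dictionary": xz_size(data, 64 * 1024),
    "xz, 4 MiB dictionary": xz_size(data, 4 * 1024 * 1024),
}
for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / len(data):.0%} of input)")
```

Only the run whose dictionary spans the distance between the two copies can encode the second copy as a reference to the first. Real images are far larger than this test input, which is why very large dictionaries or a deduplicating preprocessor would be the things to look into.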
