Reducing size of ova images

Patrick · April 30, 2018, 7:33am

If you want more output and sanity checking:

- progress indicator (libvirt_compress uses pv to create a progress
  indicator, I think; it might be suitable for adjusting / copying,
  otherwise “mygrep” for pv to see other usage examples)
- check size before running (and output it)
- check size after running
- output a human readable summary
- output size in GB, X.XX or so

Should be smaller. If not, virt-sparsify failed, obviously. In that
case, make the build script fail with an error. (Might happen in
future versions of virt-sparsify or its dependencies, or due to build
script bugs.)
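A minimal sketch of the size check described above, not the actual build script code. The function names are hypothetical, GB is assumed to mean 10^9 bytes, and GNU du and awk are assumed. For a sparse raw image the allocated size is the number that shrinks; for a finished .ova the plain file size would be compared instead.

```shell
#!/bin/bash

## Print a byte count as decimal gigabytes with two decimals (X.XX).
## Assumption: 1 GB = 10^9 bytes.
bytes_to_gb() {
   awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / 1000000000 }'
}

## Space actually allocated on disk, in bytes (GNU du). Unlike the
## apparent size, this shrinks when a raw image is made sparse.
allocated_bytes() {
   du --block-size=1 -- "$1" | cut -f1
}

## Print a summary and return non-zero unless the size after shrinking
## is below the size before.
check_shrunk() {
   local before="$1" after="$2"
   echo "INFO: before: $(bytes_to_gb "$before") GB | after: $(bytes_to_gb "$after") GB"
   if [ "$after" -ge "$before" ]; then
      echo "ERROR: image did not get smaller." >&2
      return 1
   fi
}
```

In a build step this would be used by storing allocated_bytes "$binary_image_raw" before and after the shrinking command and passing both numbers to check_shrunk. With set -e active, the non-zero return makes the build fail.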

But all of that is very optional. (Meaning: can merge either way even as
is.) Seems pretty good. Perhaps I’ll include even in Whonix 14.

Maybe even decreases build times (libvirt compression)? (Unimportant
question.)

Decreases upload time due to smaller sizes, and download times for
everyone, which is amazing!

onion_knight:

Patrick:

Looks really good! Without having tested it, I don’t see anything wrong with it at first look.

Nice! What is the procedure now? Do I suggest this script on phabricator? Or do we continue the discussion here?

A phabricator ticket would be good as reminder (used as todo list).
Description can be minimal. With link to this forum discussion.

Script on phabricator: no need.

Script:
Could you post a github pull request (or git push a git branch somewhere
(github if you don’t mind)) please?

Discussion can continue here.

Patrick:

probably adding debugging

I guess no need or not possible.

I would expect the virt-sparsify command to work every time, and when it fails (such as if the file does not exist…?), it exits non-zero, right? Should be enough for now. If it makes trouble, which I doubt, we can think about this more.

It exits non-zero if file doesn’t exist, yes. The command can be run in verbose mode (-v option) which might be helpful.

-v sounds good in theory. What does the output look like?

Users love some sort of progress indicator. Otherwise they have a
nervous ctrl+c finger.

Note that virt-sparsify can be run against any kind of file; it doesn’t seem to check whether the file is a virtual disk image. Not sure of the implications if the file is not a virtual disk, probably none (it runs normally and then quits with “Sparsify in-place operation completed with no errors”, although the shrinking is not performed).

Ok. Can live with that.

Patrick:

error handling capacities?

No need. This is generally sorted out already. (As a fallback:)

set -e

Yes, but I was thinking more of the checks performed in the main() function ( if [ "$WHONIX_BUILD_FLAVOR" = "whonix-gateway" ]; then ), do we need to add more checking?

Ah. I guess no. Should be quite similar to
2300_run-chroot-scripts-post-d. So if it is as good as that one,
should be alright.

Regarding the unmounting bug, I will give it a few more tries and open a new thread. But it may be worth taking the following into account:

All the build is done in a virtual machine (virtualbox)

The VM vdi is located on an external hdd which is formatted in NTFS (maybe kpartx doesnt like that?)

This might make a difference indeed even if it should not.

the VM already has a user “user” with password “changeme”

For now, the only solution I found is to reboot the machine… It only happened with the Workstation, not the Gateway.
With and without the added 2350 script.

Yes. I doubt the new script is at fault.

onion_knight · April 30, 2018, 9:25am

Ok, then I’ll leave it as it is.

It seems to decrease the .ova build time, probably also for libvirt. But it needs more thorough benchmarking. From the end user side, however, importing the ova is much faster, as the image is about half the size.

I don’t know how to do that yet. I’ll look into the documentation.

virt-sparsify does have a built-in progress indicator.

Regarding the verbose output: it is indeed very verbose. Please tell me where I can upload the output file; it will be more convenient for you to read than copy-pasting all the lines here. For peace of mind, I think it ought to be reviewed by you and/or other Whonix devs.

Output log shows that the raw image seems to be booted with qemu-system-x86_64 during the shrinking operation. Is it bad news? Does it leave logs? Is there anything dangerous taking place? Does it change anything inside the virtual disk apart from removing unused space? How can one verify that?

Here is the qemu-system command used by virt-sparsify --in-place:

 /usr/bin/qemu-system-x86_64 \
    -global virtio-blk-pci.scsi=off \
    -nodefconfig \
    -enable-fips \
    -nodefaults \
    -display none \
    -machine accel=kvm:tcg \
    -m 500 \
    -no-reboot \
    -rtc driftfix=slew \
    -no-hpet \
    -global kvm-pit.lost_tick_policy=discard \
    -kernel /var/tmp/.guestfs-1000/appliance.d/kernel \
    -initrd /var/tmp/.guestfs-1000/appliance.d/initrd \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-pci,rng=rng0 \
    -device virtio-scsi-pci,id=scsi \
    -drive file=/home/user/whonix_binary/Whonix-Workstation-14.0.0.6.9-13-g58ed9c2b63c8baddc3ebb0b65fbdd9c50d679cf7.raw,cache=writeback,discard=unmap,format=raw,id=hd0,if=none \
    -device scsi-hd,drive=hd0 \
    -drive file=/var/tmp/.guestfs-1000/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none,format=raw \
    -device scsi-hd,drive=appliance \
    -device virtio-serial-pci \
    -serial stdio \
    -device sga \
    -chardev socket,path=/run/user/1000/libguestfsgb8EHA/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -append 'panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm-256color'

NOTE: I have imported in VirtualBox the Whonix-Workstation ova file produced after virt-sparsify and inspected the content of /var/log. I see no logs other than the ones produced on first boot with VirtualBox. wtmp log only shows the first boot with VirtualBox, nothing seems to have been left by the qemu-system command run by virt-sparsify. Where else should I look?

Patrick · April 30, 2018, 9:39am

Shrinking (without booting during shrinking) could even help with Verifiable Builds - Whonix (if it is deterministic).

Will for sure be reviewed / tested by me after a git branch or pull request. (Even without one I can’t let this drop on the floor, but the “usual” way would really help.)

Very bad news. - If it boots the actual Whonix raw image and not a separate helper image or so. Maybe it boots it in a way not modifying anything.

Yes.

Quite possibly. Could result in starting services such as the systemd entropy that creates a seed file, which should not be published and/or shared among all Whonix users. In worst case starts Tor service and prepopulates /var/lib/tor. Booting the image and making changes really ought to be avoided - very unprofessional.

We can look inside the images. Compare files and contents. Build script with --report true to have it analyzed. Build twice. Compare the reports.

Or manually build twice and then use help-steps/analyze_image.
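As a rough illustration of “compare files and contents”, here is a sketch that compares two already mounted image trees by checksum. It is not what help-steps/analyze_image does; the function names are hypothetical, GNU find, sort, xargs and sha256sum are assumed, and only regular file contents are compared (no ownership, permissions or timestamps).

```shell
#!/bin/bash

## Print "checksum  ./relative/path" for every regular file below a
## directory, sorted by path so two manifests can be compared line by line.
tree_manifest() {
   local dir="$1"
   ( cd -- "$dir" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum -- )
}

## Show files that were added, removed or changed between two trees.
## Returns non-zero if the trees differ.
compare_trees() {
   local manifest_a manifest_b rc=0
   manifest_a="$(mktemp)"
   manifest_b="$(mktemp)"
   tree_manifest "$1" > "$manifest_a"
   tree_manifest "$2" > "$manifest_b"
   diff "$manifest_a" "$manifest_b" || rc=1
   rm -f -- "$manifest_a" "$manifest_b"
   return "$rc"
}
```

Run against the mount point of an image before shrinking and the mount point of the same image after shrinking, any output would point at files that the shrinking tool created or modified, such as a seed file or a machine id.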

onion_knight · April 30, 2018, 9:42am

Where could I upload the output log so you can have a look on it?

Patrick · April 30, 2018, 9:46am

phabricator. But no need. Since the image gets mounted we ought to analyze the image anyhow.

Any tool that doesn’t boot the image to shrink it?

onion_knight · April 30, 2018, 9:49am

Maybe by looking at the output log you would see if something unclean happens?

virt-sparsify can be run without the --in-place option. I will try it now, to see if it changes anything / doesn’t require booting the image.

Patrick · April 30, 2018, 9:51am

I can try to look, but booting an image is pretty bad, so we would have to check carefully. Better avoided at all costs.

Yes. That might be better.

Patrick · April 30, 2018, 9:54am

Brings us back to the dd method suggested by @Algernon.

Maybe it can be implemented here…

https://github.com/Whonix/whonix-initializer/blob/master/usr/lib/anon-dist/chroot-scripts-post.d/80_cleanup#L421-L432

You see… We had it before (very first Whonix releases, a very long time ago, when images were still booted to build them). It broke when changing to the chroot image building method. Maybe that could be fixed.

onion_knight · April 30, 2018, 9:58am

Yes, dd or zerofree. The dd command during the chroot script requires a lot of available space on the build machine. I will try it again, but it is slow and heavy.

Zerofree is quicker and lighter but requires mounting, which is prone to errors.
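For reference, the dd method boils down to filling the free space of the mounted filesystem with a file of zeros and deleting it again. The sketch below is an assumption about the general technique, not the code from 80_cleanup; the function name, the file name zero.fill and the optional size limit are hypothetical. GNU dd is assumed.

```shell
#!/bin/bash

## Fill the free space below target_dir with zeros, then delete the file.
## max_mib is optional and only limits the amount written (for testing).
zero_fill_free_space() {
   local target_dir="$1" max_mib="${2:-}"
   local zero_file="$target_dir/zero.fill"
   if [ -n "$max_mib" ]; then
      dd if=/dev/zero of="$zero_file" bs=1M count="$max_mib" status=none
   else
      ## Without a limit, dd stops with "No space left on device",
      ## which is the expected outcome here and therefore ignored.
      dd if=/dev/zero of="$zero_file" bs=1M status=none || true
   fi
   sync
   rm -f -- "$zero_file"
}
```

Every zero written this way gets allocated in the sparse raw image on the build machine, which is why the image temporarily grows towards its full virtual size. zerofree avoids that by only overwriting blocks that are free and not already zero.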

Please have a look at the output log… Maybe I missed or misinterpreted something and it is not that bad?

https://phabricator.whonix.org/T790

Patrick · April 30, 2018, 10:13am

Creates a machine_id.

100 GB? Pretty bad! Best avoided.

Hm… Still preferred.

Or we need to consolidate the amount of times the image gets mounted / unmounted. But not great. Breaks the logical flow of the separate steps.

Or better… Fix the root cause… Why unmounting fails. Rehashing the old bug report. Fixing the root cause is long term always best.

onion_knight · April 30, 2018, 11:52am

The output log without the --in-place option shows that the image is nevertheless booted with qemu-system.

So we agree to drop the virt-sparsify solution? It’s a pity, but agreed booting images is not tolerable.

I’ll do a few more tests with zerofree… Simple dd’ing takes too much space and time.

onion_knight · April 30, 2018, 1:13pm

This is a new attempt at a 2350 script, this time with zerofree (2350_zerofree-raw). So far I had success with whonix-gateway, manually launching scripts 2350 to 2700. Zerofree takes about 5 minutes. The ova image is still 934 MB.

Now I will try the script for full builds for both images.

#!/bin/bash

## Copyright (C) 2012 - 2018 ENCRYPTED SUPPORT LP <adrelanos@riseup.net>
## See the file COPYING for copying conditions.

set -x
set -e

true "INFO: Currently running script: $BASH_SOURCE $@"

MYDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

cd "$MYDIR"
cd ..
cd help-steps

source pre
source colors
source variables

error_handler_mount-raw() {
   : echo "
${red}${bold}BASH_COMMAND${reset}: $BASH_COMMAND
${red}${bold}ERROR $BASH_SOURCE: | caller: $(caller)${reset}
"
   exit 1
}

errorhandlerunmount-raw() {
   true "${red}${bold}BASH_COMMAND${reset}: $BASH_COMMAND
${red}${bold}ERROR $BASH_SOURCE: | caller: $(caller)${reset}"
   exit 1
}

mount_raw_read_only() {
   trap "error_handler_mount-raw" ERR INT TERM

   if [ "$mount_folder" = "" ]; then
      true
   else
      ## hack for help-steps/analyze-image
      CHROOT_FOLDER="$mount_folder"
   fi

   sync

   if [ "$WHONIX_BUILD_MOUNT_RAW_FILE" = "" ]; then
      local img="$binary_image_raw"
   else
      local img="$WHONIX_BUILD_MOUNT_RAW_FILE"
   fi

   ## Debugging.
   losetup --all
   sync

   sleep 2 &
   wait "$!"

   ## Better not use this, because this can lead to a kpartx bug:
   ## "ioctl: LOOP_CLR_FD: Device or resource busy"
   ## Difficult to reproduce.
   ## Debugging.
   #kpartx -l -s -v "$img"
   #sync

   local kpartx_output a b device
   kpartx_output="$(kpartx -a -s -v "$img" 2>&1)"
   sync

   if [ "$kpartx_output" = "" ]; then
      local msg="kpartx did not output anything."
      error "$msg"
   fi

   ## Debugging.
   losetup --all
   sync

   read a b device _ <<< "$kpartx_output"
   dev_mapper_device="/dev/mapper/$device"

}


zerofree_raw() {
   zerofree "$dev_mapper_device"
}

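unmount_raw_read_only() {
   trap "errorhandlerunmount-raw" ERR INT TERM

   ## $img was local to mount_raw_read_only, so it has to be set again here.
   if [ "$WHONIX_BUILD_MOUNT_RAW_FILE" = "" ]; then
      local img="$binary_image_raw"
   else
      local img="$WHONIX_BUILD_MOUNT_RAW_FILE"
   fi

   sync

   ## Sleep to work around some obscure bug.
   ## http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734794
   sleep 2 &
   wait "$!"
   sync

   ## Debugging.
   losetup --all
   sync

   ## Remove the partition mappings created by "kpartx -a".
   kpartx -d -s -v "$img"
   sync

   ## Debugging.
   losetup --all
   sync

}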


main() {
   root_check
   mount_raw_read_only
   zerofree_raw
   unmount_raw_read_only
}

main "$@"
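The way the script turns the kpartx output into a device path can be tried in isolation. The sample output below is an assumption about what kpartx -a -s -v typically prints (loop device names and numbers will differ); the parsing itself is the same read as in the script.

```shell
#!/bin/bash

## Illustrative sample only, not taken from a real build log.
kpartx_output="add map loop0p1 (253:0): 0 204800 linear 7:0 2048
add map loop0p2 (253:1): 0 409600 linear 7:0 206848"

## read consumes only the first line of the here-string. The first two
## words ("add", "map") are dropped into a and b, the third word is the
## mapper device name, and everything else lands in _.
read -r a b device _ <<< "$kpartx_output"
dev_mapper_device="/dev/mapper/$device"

echo "$dev_mapper_device"
```

Two things follow from this: only the first partition is ever picked up, and because the script captures stderr as well (2>&1), any warning printed before the first "add map" line would end up in the variables instead of the device name.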

NOTE: I think this part is useless in the script

   if [ "$mount_folder" = "" ]; then
      true
   else
      ## hack for help-steps/analyze-image
      CHROOT_FOLDER="$mount_folder"
   fi

   if [ "$WHONIX_BUILD_MOUNT_RAW_FILE" = "" ]; then
      local img="$binary_image_raw"
   else
      local img="$WHONIX_BUILD_MOUNT_RAW_FILE"
   fi

Patrick · April 30, 2018, 2:40pm

mygrep

cat ~/bin/mygrep

#!/bin/bash
set -x
exec \
grep \
--exclude=README.md \
--exclude=GPLv2 \
--exclude=GPLv3 \
--exclude=COPYING \
--exclude=changelog.upstream-old1 \
--exclude-dir=mnt \
--exclude-dir=qubes-src/linux-template-builder/mnt \
--exclude=changelog.upstream \
--exclude-dir=".git" \
--exclude-dir=chroot-debian \
--exclude-dir=chroot-jessie "$@"

Patrick · April 30, 2018, 2:42pm

WHONIX_BUILD_MOUNT_RAW_FILE is not useless.

But I agree, the implementation is not pretty and should be improved.

mygrep -r WHONIX_BUILD_MOUNT_RAW_FILE
+ exec grep --exclude=README.md --exclude=GPLv2 --exclude=GPLv3 --exclude=COPYING --exclude=changelog.upstream-old1 --exclude-dir=mnt --exclude-dir=qubes-src/linux-template-builder/mnt --exclude=changelog.upstream --exclude-dir=.git --exclude-dir=chroot-debian --exclude-dir=chroot-jessie -r WHONIX_BUILD_MOUNT_RAW_FILE
help-steps/mount-raw:   if [ "$WHONIX_BUILD_MOUNT_RAW_FILE" = "" ]; then
help-steps/mount-raw:      local img="$WHONIX_BUILD_MOUNT_RAW_FILE"
help-steps/unmount-raw:   if [ "$WHONIX_BUILD_MOUNT_RAW_FILE" = "" ]; then
help-steps/unmount-raw:      local img="$WHONIX_BUILD_MOUNT_RAW_FILE"
help-steps/analyze_image:   ## WHONIX_BUILD_MOUNT_RAW_FILE us read by help-steps/mount-raw
help-steps/analyze_image:   export WHONIX_BUILD_MOUNT_RAW_FILE="$raw_file_short_link"
help-steps/analyze_image:   ## WHONIX_BUILD_MOUNT_RAW_FILE us read by help-steps/mount-raw
help-steps/analyze_image:   export WHONIX_BUILD_MOUNT_RAW_FILE="$raw_file_short_link"

mygrep -r mount_folder=

Patrick · April 30, 2018, 2:45pm

Could you remove the code duplication please?

Perhaps move the shared code into functions in a new file (in the help-steps folder, which is ok for now, or perhaps in a new folder for things that are sourced rather than executed, which we should do long term).

And then call the shared code from the two scripts that use it?

onion_knight · April 30, 2018, 2:50pm

Yes I agree

   if [ "$WHONIX_BUILD_MOUNT_RAW_FILE" = "" ]; then
      local img="$binary_image_raw"
   else
      local img="$WHONIX_BUILD_MOUNT_RAW_FILE"
   fi

is NOT useless. Actually mounting/unmounting fails without this part (it defines $img variable). I thought I had edited my comment, sorry. I left this part in the script. Please see github, I did a pull request. Or is it too early?

I confirm it works with whonix-workstation too (1.1GB ova file, as expected).

You mean removing the mount/unmount functions from the script and putting them in separate files in the help-steps folder? Yes.
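One possible way to remove the duplicated if/else is a small shared function that both the mount and the unmount code call. This is only a sketch: the function name resolve_raw_image is hypothetical, and it relies on the ${variable:-default} expansion, which picks the default when the variable is unset or empty, matching the = "" test in the scripts.

```shell
#!/bin/bash

## Print the raw image to operate on: WHONIX_BUILD_MOUNT_RAW_FILE if it is
## set and non-empty (as help-steps/analyze_image does), otherwise the
## default $binary_image_raw.
resolve_raw_image() {
   printf '%s\n' "${WHONIX_BUILD_MOUNT_RAW_FILE:-$binary_image_raw}"
}
```

Each function that needs the image path would then start with local img and img="$(resolve_raw_image)", so mount and unmount can no longer disagree about which file is meant.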

Patrick · April 30, 2018, 3:01pm

Yes.

As you feel comfortable. There is no fixed process. Too many rules deter contributors, so I try to keep things easy. For me, github pull requests are comfortable. Easy to review. However, if they create a burden on your side, other things can work too.

onion_knight · April 30, 2018, 3:09pm

OK. Pull requests are OK for me. But maybe a problem for you if I pull too early and must add a number of commits later? Like it will happen now

Patrick · April 30, 2018, 3:54pm

onion_knight:

But maybe a problem for you if I pull too early

No problem here. Well, if you pull too early, I just wait with the merge.
Then either improve the current pull (by modifying that git branch) or
close the pull request and send a new one.

and must add a number of commits later? Like it will happen now

I don’t mind to add changes on top at all. That’s what collaborative
development is for.

onion_knight · April 30, 2018, 6:16pm

OK, I need some help here.

So I removed the mount/unmount functions from the 2350 script and put them in new separate scripts in the help-steps directory. So far so good. The function now looks like this:

zerofree_raw() {
   "$WHONIX_SOURCE_HELP_STEPS_FOLDER"/mount-raw-nochroot

   zerofree -v "$dev_mapper_device"

   "$WHONIX_SOURCE_HELP_STEPS_FOLDER"/unmount-raw-nochroot
}

But of course it fails because the variable "$dev_mapper_device" is not defined here. I need to import its value defined in the "$WHONIX_SOURCE_HELP_STEPS_FOLDER"/mount-raw-nochroot script, in the following function:

mount_raw_nochroot() {
   trap "error_handler_mount-raw" ERR INT TERM

   if [ "$WHONIX_BUILD_MOUNT_RAW_FILE" = "" ]; then
      local img="$binary_image_raw"
   else
      local img="$WHONIX_BUILD_MOUNT_RAW_FILE"
   fi

   ## Debugging.
   losetup --all
   sync

   sleep 2 &
   wait "$!"

   ## Better not use this, because this can lead to a kpartx bug:
   ## "ioctl: LOOP_CLR_FD: Device or resource busy"
   ## Difficult to reproduce.
   ## Debugging.
   #kpartx -l -s -v "$img"
   #sync

   local kpartx_output a b device
   kpartx_output="$(kpartx -a -s -v "$img" 2>&1)"
   sync

   if [ "$kpartx_output" = "" ]; then
      local msg="kpartx did not output anything."
      error "$msg"
   fi

   ## Debugging.
   losetup --all
   sync

   read a b device _ <<< "$kpartx_output"
   dev_mapper_device="/dev/mapper/$device"

}

How do I “export” this dev_mapper_device variable in my 2350 script? I have no idea, I’ve been trying for a few hours now with export commands without success.

EDIT: I found a workaround:
- added export dev_mapper_device to the mount-raw-nochroot script
- added source "$WHONIX_SOURCE_HELP_STEPS_FOLDER"/mount-raw-nochroot to the 2350_zerofree-raw script
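Why export alone cannot work here, and why sourcing does: a script that is executed runs in a child process, and a child can never change the variables of its parent. The sketch below demonstrates the difference with a temporary stand-in for mount-raw-nochroot; the helper content and the device name are made up for illustration.

```shell
#!/bin/bash

## Stand-in for help-steps/mount-raw-nochroot (hypothetical content).
helper="$(mktemp)"
cat > "$helper" << 'EOF'
dev_mapper_device="/dev/mapper/loop0p1"
export dev_mapper_device
EOF

## 1) Executed: the helper runs in a child process. Its variables,
##    exported or not, are gone when the child exits.
unset dev_mapper_device
bash "$helper"
after_exec="${dev_mapper_device:-unset}"

## 2) Sourced: the helper runs inside the current shell, so the
##    assignment is still visible afterwards.
source "$helper"
after_source="${dev_mapper_device:-unset}"

echo "executed: $after_exec | sourced: $after_source"
rm -f -- "$helper"
```

So the source line is what makes the workaround function; the export only matters for processes started afterwards. Note that a sourced file also shares traps, set options and any exit with the calling script. An alternative that keeps the helper a separate process would be to let it print the device name and capture it with dev_mapper_device="$(...)".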