scripts to shrink vbox and kvm images

ok. i will remove the option.

so, i created a quick test vm with 16 gigs of total disk space to simulate a more conservative external drive. i can confirm that, in that setup, virt-sparsify stalled once the hard drive was full while processing the first whonix disk. however, the method documented above with qemu-img worked fine.

Please run through shellcheck.

it’s not quite there yet. i’m aware that it’s quite dirty. i’ll tidy it up with the appropriate quoting and such once a final version is ready. still looking into whether it can be used to control a vm to boot in recovery mode and do a zerofree pass.

here’s the first beta version. it comes up clean on shellcheck.net.

i have given up on trying to script booting the vms into recovery mode and running the zerofree pass for now. however, i did make some additions:

  • the script will sort disks based on whether they are a primary disk (/dev/vda1) or an additional disk (/dev/vdb1, etc.).

  • if the disk is a primary disk, the script will check if a backing image is present for the disk based on the filename extension “.backing.image”. if a backing image is not present, the script will create a backing image for the disk and then create a thin provisioned disk to be used going forward. the advantages of this are that the backing image is read only and can offer a means of restoration if something unforeseen happens, whether it’s the thin disk getting corrupted or the user making an error in persistent mode. an additional advantage is that, because the thin provisioned disk is smaller and the backing image is not read, fewer disk reads/writes are needed for future passes of zerofree and qemu-img.

  • if the disk is an additional disk, which in my usage is only used for persistent storage, the “qemu-img convert” command is used to create a new image of the disk. since the files stored on such disks tend to be important, i think it makes more sense to keep the data in one disk image, rather than having it fractured across thin provisioned images with the additional disk as a backing image.

  • the script creates a backup image of the disk image first and then creates a new disk image from the backup image. after the qemu-img pass is complete, the user is prompted to boot the corresponding virtual machine to make sure the process was successful and did not result in corruption. the user is then given the option to remove the backup file. if something bad happened during the conversion, which is discovered on booting the vm, the user has the option of restoring the image from the backup file.

i will likely add some colors to the script later in order to make it easier to read and clean up the stdout formatting. but, if anyone who is using qemu/kvm wants to test this, please give it a try.

#!/bin/bash
#
# Version 1.0.Beta1
#
# This script will search for installed virtual machines and attempt to reduce
# their size.
#
ORIGINALIFS="$IFS"
IFS=$'\n'
# Get list of installed and inactive virtual disks and write to a hidden temporary file.
echo "You must shut down your virtual machines for the shrink process to work." 
read -r -e -p "Please shut down your virtual machines and press [enter] to continue."
tempvmlistnp=$(for i in $(sudo virsh list --name --inactive); do sudo virsh domblklist "$i"; done| grep / | grep -E "sda|vda|hda" | awk -F" {2,}" '{print $2}' | sed -e 's|[[:space:]]|\ |g'; )
tempvmlistp=$(for i in $(sudo virsh list --name --inactive); do sudo virsh domblklist "$i"; done| grep / | grep -E -v "sda|vda|hda" | awk -F" {2,}" '{print $2}' | sed -e 's|[[:space:]]|\ |g'; )
# Read from list of vms, check if the disk is non-persistent and filter files that have not been modified in the past 24 hours.
tempvmlistnp2=$(for f in $tempvmlistnp; do sudo find "$f" -type f -mtime -1 -size +10M | sed -e 's|[[:space:]]|\\ |g'; done )
# Read from list of vms, check if the disk is persistent and filter files that have not been modified in the past 24 hours.
tempvmlistp2=$(for f in $tempvmlistp; do sudo find "$f" -type f -mtime -1 -size +10M | sed -e 's|[[:space:]]|\\ |g'; done )
for f in $tempvmlistnp2
do
	# Ask if user wants to shrink the disk.
	read -r -e -p "Shrink $f? [y/N] " choice
	if [[ "$choice" == [Yy]* ]]
	   # If yes, make a backup copy
	   then
		   sudo mv "$f" "$f".backup
			# Check if a backing image exists. If not, create one and create an external snapshot."
			        if [ ! -e "$f".backing.image ]
					then
				        	echo "No backing image datected for $f."
				        	echo "Creating a backing image for $f."
				        	sudo qemu-img convert -O qcow2 -p "$f".backup "$f".backing.image
				        	sudo qemu-img create -f qcow2 -b "$f".backing.image "$f"
						echo Original file size = "$(sudo du -h "$f".backup |cut -f1)"
						echo Backing image file size = "$(sudo du -h "$f".backing.image |cut -f1)"
						backingcreate="y"
				fi
        	if [[ "$backingcreate" != [y] ]]
			then
				# Shrink the disk
				sudo qemu-img convert -B "$f".backing.image -O qcow2 -p "$f".backup "$f"
				# Show file sizes after the qemu-img conversion is complete.
				echo Original file size = "$(sudo du -h "$f".backup |cut -f1)"
				echo Shrunken file size = "$(sudo du -h "$f" |cut -f1)"
		fi
		# Prompt user to test their virtual disk by starting the corresponding cirtual machine. 
		# Ask if the user wants to delete the backup file.
		echo "Start virt-manager to check if you can boot the virtual machine" 
		echo "that uses the disk image entitled"
		echo "$f." 
		echo "If you reach the desktop, it is probably safe to remove" 
		echo "$f.backup."
		read -r -e -p "Remove $f.backup? [y/N] " choice3
		if [[ "$choice3" == [Yy]* ]]
			# If yes, delete the backup file.
			then
				sudo rm "$f".backup
		elif [[ "$choice3" != [Yy]* ]]
				# If no, ask user if they want to restore the backup.
				then
					read -r -e -p "Restore backup? [y/N] " choice4
					if [[ "$choice4" == [Yy]* ]]
						# If yes, restore the backup file.
						then
							sudo mv "$f".backup "$f"
							# If a backing image was created, delete it.
							if [[ "$backingcreate" == [y] ]] 
								then
									sudo rm "$f".backing.image
							fi
					fi
		fi
	fi
backingcreate=n
done
for f in $tempvmlistp2
do
	# Ask if user wants to shrink the disk.
	read -r -e -p "Shrink $f? [y/N] " choice
	if [[ "$choice" == [Yy]* ]]
		# If yes, make a backup copy
		then
			sudo mv "$f" "$f".backup
			# Shrink the disk
			sudo qemu-img convert -O qcow2 -p "$f".backup "$f"
			# Show file sizes after the shirnk procedure is complete.
			echo Original file size = "$(sudo du -h "$f".backup |cut -f1)"
			echo Shrunken file size = "$(sudo du -h "$f" |cut -f1)"
			# Prompt user to test their virtual disk by starting the corresponding cirtual machine. 
			# Ask if the user wants to delete the backup file.
			echo "Start virt-manager to check the intregrity of the storge disk for the virtual machine" 
			echo "that uses the disk image entitled"
			echo "$f."
			echo "If you are able to read the files on it, it is probably safe to remove" 
			echo "$f.backup."
			read -r -e -p "Remove $f.backup? [y/N] " choice3
				if [[ "$choice3" == [Yy]* ]]
					# If yes, delete the backup file.
					then
						sudo rm "$f".backup
				elif [[ "$choice3" != [Yy]* ]]
					# If no, ask user if they want to restore the backup.
						then
							read -r -e -p "Restore backup? [y/N] " choice4
								if [[ "$choice4" == [Yy]* ]]
									# If yes, restore the backup file.
										then
										sudo mv "$f".backup "$f"
								fi
				fi


	fi
done
IFS="$ORIGINALIFS"

Could you please post an example of the commands in which order the script would modify a file? This is to allow others to verify if this procedure makes sense. Many will not be able to read the script and imagine what commands it will be actually running.
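
As a rough illustration of what is being asked for, the sketch below prints, without executing anything, the commands the posted script runs for one primary disk, in order. `print_shrink_plan` and the example path are hypothetical names that are not part of the script, and the `sudo` prefix used in the posted version is left out.

```bash
#!/bin/bash
# Dry run only: print the commands for one primary disk image, do not run them.
# print_shrink_plan is a hypothetical helper, not part of the posted script.
print_shrink_plan() {
  local disk="$1"
  local backup="$disk.backup"
  local backing="$disk.backing.image"

  # Step 1: the original image is moved aside.
  printf 'mv %q %q\n' "$disk" "$backup"

  if [ ! -e "$backing" ]; then
    # Step 2a (first run): write a compacted copy as the backing image,
    # then create an empty thin provisioned image on top of it.
    printf 'qemu-img convert -O qcow2 -p %q %q\n' "$backup" "$backing"
    printf 'qemu-img create -f qcow2 -b %q %q\n' "$backing" "$disk"
  else
    # Step 2b (later runs): rewrite the thin image against the existing
    # backing image.
    printf 'qemu-img convert -B %q -O qcow2 -p %q %q\n' "$backing" "$backup" "$disk"
  fi

  # Step 3: the user decides what happens to the moved-aside file.
  printf '# after a successful test boot: rm %q\n' "$backup"
  printf '# otherwise, to restore:        mv %q %q\n' "$backup" "$disk"
}

# Example path (an assumption, adjust to the real image location).
print_shrink_plan "/var/lib/libvirt/images/Whonix-Gateway.qcow2"
```

Which branch of step 2 is printed depends only on whether a file ending in `.backing.image` already exists next to the disk image.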

What happens if the user doesn’t delete the backup image ("Remove $f.backup? [y/N] "), which many will probably think is a good idea, and then re-runs the script? Would it overwrite the already existing backup?
Maybe then that shouldn’t be called a backup but temporary file? After all, the purpose of this script isn’t backups but shrinking. Mixing the two purposes / terminology seems wrong.

What is a backing image? Versus backup image?

use of sudo: Probably better to check if the script is running as root and abort otherwise? Better than running sudo inside the script?

The indent style seems off here:

if [ ! -e "$f".backing.image ]


Could use some newlines.

These long lines are pretty difficult to grasp.

tempvmlistnp=$(for i in $(sudo virsh list --name --inactive); do sudo virsh domblklist "$i"; done| grep / | grep -E "sda|vda|hda" | awk -F" {2,}" '{print $2}' | sed -e 's|[[:space:]]|\ |g'; )

It would be easier to read if the long step were split into multiple smaller steps, also providing examples for variable contents. Example from genmkfile:

   while read -r -d $'\n' dpkg_line; do
      ## Example dpkg_line:
      ## Version: 3:0.1-1
      read -r first second _ <<< "$dpkg_line"
      ## Example first:
      ## Version:
      ## Example second:
      ## 3:0.1-1
      first="${first,,}"
      ## Example first:
      ## version:
      if [ "$first" = "version:" ]; then
         make_changelog_version="$second"
         ## Example make_changelog_version:
         ## 3:0.1-1
         make_pkg_revision="${second#*-}"
         ## Example make_pkg_revision:
         ## 1
         temp="${second%-*}"
         ## Example temp:
         ## 3:0.1
         make_pkg_version="${temp#*:}"
         ## Example make_pkg_version:
         ## 0.1
         make_epoch="${second%:*}"
         ## Example make_epoch:
         ## 3
         break
      fi
   done < <( dpkg-parsechangelog )
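
Applied to the disk list, the same commenting style might look like the sketch below. It assumes the two-column Target/Source table that `virsh domblklist` prints. `list_disk_sources` and `is_primary_target` are hypothetical helper names, and matching the target name exactly (rather than grepping the whole line for sda|vda|hda) is a change from the posted script.

```bash
#!/bin/bash
# list_disk_sources (hypothetical helper): read "virsh domblklist" output on
# stdin and print "target<TAB>source" for each row with an absolute path.
list_disk_sources() {
  local target source
  # IFS is set for this read only, so a global IFS=$'\n' does not interfere.
  while IFS=$' \t' read -r target source; do
    ## Example input line:
    ##  vda      /var/lib/libvirt/images/Whonix-Gateway.qcow2
    ## Example target:
    ## vda
    ## Example source:
    ## /var/lib/libvirt/images/Whonix-Gateway.qcow2
    case "$source" in
      # Header, separator, blank lines and "-" sources do not start with "/".
      /*) printf '%s\t%s\n' "$target" "$source" ;;
    esac
  done
}

# is_primary_target (hypothetical helper): succeed for the device names the
# posted script treats as primary disks.
is_primary_target() {
  case "$1" in
    sda|vda|hda) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use (not executed here; "$vm" is a domain name from virsh list):
#   virsh domblklist "$vm" | list_disk_sources
```

Because `read` puts the remainder of the line into the last variable, a source path containing spaces stays in one piece.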

minor, some typos:

datected
cirtual
intregrity
storge
backing → backup?

  		        	echo "No backing image datected for $f."
  		        	echo "Creating a backing image for $f."
  		        	sudo qemu-img convert -O qcow2 -p "$f".backup "$f".backing.image

Why convert the backup image, which is a qcow2 image to another qcow image?

  		        	sudo qemu-img create -f qcow2 -b "$f".backing.image "$f"

What is being created here?

Could you please simplify the following:

backingcreate="y"

if [[ "$backingcreate" != [y] ]]

if [ "$backingcreate" = "y" ]

or

if [ ! "$backingcreate" = "y" ]


I find this style rather hard to follow:

     if [[ "$choice3" == [Yy]* ]]
        # If yes, delete the backup file.
        then
           sudo rm "$f".backup

An indent for then, followed by another indent for the actual command sudo rm "$f".backup.


Seems not used:

backingcreate=n

Read from list of vms, check if the disk is non-persistent and filter files that have not been modified in the past 24 hours.

Why care about past 24 hours?

Get list of installed and inactive virtual disks and write to a hidden temporary file.

Seems like no temporary file used.


Read from list of vms, check if the disk is non-persistent and filter files that have not been modified in the past 24 hours.

Read from list of vms, check if the disk is persistent and filter files that have not been modified in the past 24 hours.

Why distinguish between non-persistent and persistent disks?


Could use set -e or trap ERR and set -o pipefail.
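
A minimal sketch of what that could look like follows. `enable_strict_mode` is a hypothetical helper name, and the two small functions only demonstrate how `pipefail` changes the status reported for a pipeline.

```bash
#!/bin/bash
# enable_strict_mode (hypothetical helper): call once near the top of a script.
enable_strict_mode() {
  set -o errexit    # same as set -e: stop at the first failing command
  set -o pipefail   # a pipeline fails if any command in it fails
  set -o errtrace   # let the ERR trap fire inside functions and subshells
  trap 'echo "Error: a command failed near line $LINENO." >&2' ERR
}

# Without pipefail, only the status of the last command (cat) is reported.
without_pipefail() (
  set +o pipefail
  status=0
  false | cat || status=$?
  echo "$status"
)

# With pipefail, the failure of the first command (false) is reported.
with_pipefail() (
  set -o pipefail
  status=0
  false | cat || status=$?
  echo "$status"
)
```

One thing to keep in mind with the long `virsh ... | grep ... | awk ...` pipelines: `grep` exits with status 1 when it selects no lines, so under `pipefail` and `errexit` an empty disk list would end the script unless that case is handled explicitly.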


if [ ! -e "$f".backing.image ]; then

Ok, creates only if not existing.
Why would "$f".backing.image already exist and would it be correct to continue then?

elif [[ "$choice3" != [Yy]* ]]

This could be a simpler else.

i will get to the whole of this later. but, here is what i can address quickly right now.

a good point. and, yes, if the script is run again with a “backup” file that hasn’t been restored, it will be overwritten.
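
one way to avoid that overwrite, sketched here under the assumption that the script should simply stop for that disk when an old “backup” file is still present. `move_aside` is a hypothetical helper name, not something in the posted script.

```bash
#!/bin/bash
# move_aside (hypothetical helper): rename IMAGE to IMAGE.backup, but refuse
# to overwrite a .backup file left over from an earlier run.
move_aside() {
  local image="$1"
  local backup="$image.backup"

  if [ -e "$backup" ]; then
    echo "Refusing to overwrite existing $backup." >&2
    echo "Restore or remove it first, then run the script again." >&2
    return 1
  fi

  mv -- "$image" "$backup"
}

# Intended use inside the loop (not executed here):
#   move_aside "$f" || continue
```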

a “backing image” is a disk image that thin provisioned images use as the base file. so, changes made to the thin provisioned disk image are not written to the “backing image.” aside from protecting the initial base whonix install from accidental error, it requires less data to be written in the future, since the new images created by “qemu-img convert” that use a backing file will tend to be in the range of megabytes rather than gigabytes.

additional format cleaning will be done.

“qemu-img convert” does not appear to do in place conversion. when it converts the temporary “backup” image into the new image, it removes all the sectors that were written with zeros by zerofree, thus reducing the file size.

this creates a thin provisioned disk image that uses “$f.backing.image” as the backing image. the thin provisioned image will be nearly empty upon creation. future disk writes go to the thin provisioned image.

this will be disappearing from the script. it’s a variable i was using for debugging at one point. there are other means to achieve the same thing without a variable.

because if one has not used the virtual machine in the past 24 hours, it likely does not have any data written to it that requires the time and disk writes to shrink the image. ideally, one will run the script after they have installed operating system updates in the virtual machine.

because the persistent disks tend to have more personalized data on them. so, i implemented a separate routine for those disks in order to keep everything confined to one disk image. if i used the method that implements a backing image, then additional data would get written to a thin provisioned disk, and if one wanted to copy that virtual disk to another machine, they’d have to execute additional commands to have all of the data contained in one virtual disk.

a “backing image” for thinly provisioned clients only needs to be created once. the script will create one if it isn’t there. thus, if it exists, the steps that would otherwise create a “backing image” are skipped. if the script didn’t check for its existence and created a new thin provisioned disk with the same name as the thin provisioned image created earlier, it would overwrite the data in the old thin provisioned disk. the easiest way to think of these is as snapshots. the thin provisioned disks are essentially external snapshots that rely on the “backing image” to provide the base of the data needed to boot the system.

These are good comments. Many are suitable as script comments.

here’s the updated version. i took a number of the points into consideration.

  1. the initial long commands to populate the variables were not something i was able to shorten unfortunately. however, i broke them up so that they are easier to read.

  2. sudo has been removed. sudo was there because, before this blew up into a fairly long script, it was intended to work as a function in .bashrc, similar to the more simplistic way virtual disk shrinking can be done with virtualbox. unfortunately, that’s not currently a reality with kvm. the script will check if it is being run with root privileges and, if it isn’t, will prompt a user to use sudo to execute the script and exit.

  3. comments are now at nearly every step to explain what the script is doing.

  4. the text is all visible in nano from a standard sized terminal window. however, it still may be a little annoying to view it with the scroll bars in this forum. so, while i’m going to paste it here, it may be easier to review from the pastebin link i created at https://pastebin.com/tNz06ETB

one of the more annoying user-unfriendly aspects of kvm at the moment is creating snapshots. unfortunately, if one uses the default method of internal snapshots with virt-manager, which is supposedly deprecated, the new slimmer disk image will not include the snapshots. thus, any snapshots become ghost entries that don’t work. while this was not intended to be a “backup” script, i could probably play with it further to create unique “snapshot” folders for the temporary external thin provisioned images. it would probably be easy enough to create a filename structure based on disk image name, date and time. but, i have not tried that yet.
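
a filename structure of that kind could be sketched as below. this is untested against real snapshots. `snapshot_path`, the `snapshots` folder name and the timestamp format are all assumptions made for illustration.

```bash
#!/bin/bash
# snapshot_path (hypothetical helper): build a per-image snapshot file name
# from the image path plus a date and time stamp.
# Usage: snapshot_path IMAGE [STAMP]
snapshot_path() {
  local image="$1"
  # Default to the current date and time, e.g. 2024-01-02_030405.
  local stamp="${2:-$(date +%Y-%m-%d_%H%M%S)}"

  printf '%s/snapshots/%s.%s\n' \
    "$(dirname -- "$image")" "$(basename -- "$image")" "$stamp"
}

# Intended use (not executed here):
#   snap=$(snapshot_path "$f")
#   mkdir -p -- "$(dirname -- "$snap")"
```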

#!/bin/bash
#
# QEMU/KVM Anti-Ballooning Disk Image Shrink Script version 1.0.Beta2
#
# This script will search for installed QEMU/KVM virtual machines 
# and attempt to reduce their size.
#

# Check if script is run with root privileges.  Require root privileges.
if [ "$EUID" -ne 0 ]
  then
  # If the script is not run with root privileges, echo messages instructing
  # the user to run the script with sudo and then exit the script.
  echo ""
  echo "This script must be run with root privileges."
  echo "Please use sudo to execute the script."
  echo ""
  exit 1
fi

# Backup original IFS environment settings to variable.
ORIGINALIFS="$IFS"

# Set IFS to terminate on new lines. This prevents
# problems with files and directories that contain whitespaces.
IFS=$'\n'

# Inform the user that zerofree must be run in recovery mode
# before this script will work. Prompt the user to shutdown the virtual machines.
# The script will not work on any active virtual machine.
echo ""
echo "QEMU/KVM Anti-Ballooning Disk Image Shrink Script version 1.0.beta2"
echo ""
echo "THIS IS A BETA RELEASE AND IS FOR TESTING ONLY. IT MAY INADVERTENTLY"
echo "CAUSE DAMAGE TO YOUR DISK IMAGES."
echo ""
echo "Ballooning is a term for the scenario where the size of virtual disk"
echo "images continues to increase, despite the fact that a lot of the data"
echo "that contributes to the file size increase is discarded data and no longer of"
echo "any use.  With the use of the zerofree utility while a virtual disk image"
echo "is mounted as read only in the Linux recovery mode, the discarded data"
echo "can be converted to zeros.  This script uses the tools provided by"
echo "QEMU/KVM to create new disk images while discarding all data"
echo "that constitutes a zero, thus saving considerable disk space at times."
echo ""
echo "The shrink function will only work if you ran the zerofree utility"
echo "on your virtual machine disks while booted in Recovery Mode. Additionally,"
echo "you must shut down your virtual machines for the shrink process to work." 
echo ""
read -r -e -p "Please shut down your virtual machines and press [enter] to continue."
echo ""

# Create a variable that contains the paths and file names
# for the primary virtual disks of all inactive QEMU/KVM
# virtual machines. The primary disks are intended to be
# booted in "Live Mode" in most circumstances and contain
# the majority of the operating system files. This function
# will enable the script to create a "backing image" which is
# a read only file that a thin provisioned image uses as a base
# to access the majority of the required operating system data.
# The advantages it offers are that the thin provisioned disk
# images are smaller in size and can be processed faster in the
# future. Additionally, for users who regularly boot a virtual
# machine in a persistent mode, the backing image can be used
# as a safe restore point if something goes wrong.
tempvmlistnp=$(for i in $(virsh list --name --inactive); do 

	     # Parse the output to obtain the disk device name,
             # path and file name for a virtual disk connected
             # to a virtual machine.
	     virsh domblklist "$i"; done |

             # Parse the output for a leading directory "/" character.
             # This prevents $tempvmlistnp from being populated
             # with useless text from "virsh domblklist."
             grep / |

             # Parse the output for the common device names used
             # for primary disks. This ensures that $tempvmlistnp
             # is only populated with the path and file names of
             # virtual disks that contain the majority of the operating
             # system files. This will enable the script to create a 
             # backing image and thin provisioned image for the virtual disk
             # later, which reduces disk writes for the script in the future.
	     grep -E "sda|vda|hda" |

             # Print the field from the output that contains the full
             # path and file name of the virtual disk. This will add
             # the full path and file name of the virtual disk to
             # $tempvmlistnp.
             awk -F" {2,}" '{print $2}'; )

# Create a variable that contains the paths and file names
# for any additional virtual disks connected to inactive QEMU/KVM
# virtual machines. The additional disks are for persistent
# storage of files. Therefore, the creation of a backing image
# and thin provisioned image is avoided because it may make
# copying the file less user friendly.
tempvmlistp=$(for i in $(virsh list --name --inactive); do 

            # Parse the output to obtain the disk device name,
            # path and file name for a virtual disk connected
            # to a virtual machine.
            virsh domblklist "$i"; done|

            # Parse the output for a leading directory "/" character.
            # This prevents $tempvmlistp from being populated
            # with useless text from "virsh domblklist."
            grep / |

            # Parse the output for the common device names that are
            # not used for primary disks. This ensures that $tempvmlistp
            # is only populated with the path and file names of
            # virtual disks that are used for persistent storage.
            # This will enable the script to ensure that all the files
            # that are intended to remain unchanged after a "Live Mode"
            # virtual machine is shut down are contained in one virtual 
            # disk file.
            grep -E -v "sda|vda|hda" |

            # Print the field from the output that contains the full
            # path and file name of the virtual disk. This will add
            # the full path and file name of the virtual disk to
            # $tempvmlistp.
            awk -F" {2,}" '{print $2}'; )

# Read from the list of primary virtual disks in $tempvmlistnp
# to populate a new variable that contains the full path and
# file names of virtual disks that are larger than 10 megabytes
# and have been altered in the past 24 hours. Disks that are booted
# in "Live Mode" with the "readonly" flag set in the virtual machine do not
# have a new date of modification applied to them on boot. Therefore,
# this function prevents the script from processing files that are unlikely
# to achieve any change in size upon being processed.
tempvmlistnp2=$(for f in $tempvmlistnp; do 

              # Use the find command to populate $tempvmlistnp2 with
              # the full paths and file names of disk images that are larger
              # than 10 megabytes and were modified in the past 24 hours.
              find "$f" -type f -mtime -1 -size +10M; done )

# Read from the list of additional virtual disks in $tempvmlistp
# to populate a new variable that contains the full path and
# file names of virtual disks that are larger than 10 megabytes
# and have been altered in the past 24 hours. This will populate
# the variable to be parsed in the function that will not create
# backing images and thin provisioned images for persistent
# storage disks.
tempvmlistp2=$(for f in $tempvmlistp; do 

             # Use the find command to populate $tempvmlistp2 with
             # the full paths and file names of disk images that are larger
             # than 10 megabytes and were modified in the past 24 hours.
             find "$f" -type f -mtime -1 -size +10M; done )

# Start the for loop to process the primary virtual disk images.
for f in $tempvmlistnp2
do

  # Ask if user wants to shrink the disk.
  read -r -e -p "Shrink $f? [y/N] " choice
  echo ""

  # Start of If Statement NP1
  # Any input text that starts with Y or y will be read as "yes."
  if [[ "$choice" == [Yy]* ]]

    # If yes, move the disk image to a temporary backup copy.
    # The temporary backup copy will be used to create the shrunken images
    # and can also be used to revert to the old disk image in case the virtual
    # disk is corrupted after the process.
    then
    mv "$f" "$f".temp

    # Start of If Statement NP2
    # Check if a backing image exists. If not, create one and
    # create a supporting thin provisioned disk image.
    if [ ! -e "$f".backing.image ]
      then
      echo "This script will now create a backing image and thin provisioned"
      echo "image for $f."
      echo ""
      echo "A backing image will only be created the first time a primary"
      echo "disk is processed by this script. It is a read only file"
      echo "that will not be altered by your use of the virtual machine"
      echo "to which it is connected. Rather, all disk writes go to a smaller"
      echo "thin provisioned disk image. The use of a backing image will result"
      echo "in future shrink operations running much faster because there is"
      echo "less data to process."
      echo ""
      echo "Also, in an emergency, the backing image can serve to restore the"
      echo "virtual machine disk to its original state by typing the following"
      echo "command:"
      echo ""
      echo "sudo mv $f.backing.image $f"
      echo ""
      echo "Creating backing image for $f."

      # Create the reduced size backing image from the temporary backup file.
      qemu-img convert -O qcow2 -p "$f".temp "$f".backing.image
      echo ""

      # Create the thin provisioned file.
      qemu-img create -f qcow2 -b "$f".backing.image "$f"
      echo ""

      # Show the difference in file size between the old image and the new backing image.
      echo Old file size = "$(du -h "$f".temp |cut -f1)"
      echo New backing image file size = "$(du -h "$f".backing.image |cut -f1)"

      # Create a variable to mark that a backing image was created by the script.
      # This is used later so that, if the user decides to restore the temporary
      # backup file to its original file name, only a backing image that was
      # created during this run is deleted.
      backingcreate="y"

    # If a backing image already exists, the following steps will be executed.
    # This will save time and ensure that data in a previously created
    # thin provisioned disk image is not lost. 
    else

      # Create a new thin provisioned disk image that will copy the data
      # from the temporary backup file and rely on the backing image as the
      # base file.
      qemu-img convert -B "$f".backing.image -O qcow2 -p "$f".temp "$f"

      # Show the file sizes of the images after the procedure is complete.
      echo ""
      echo Old file size = "$(du -h "$f".temp |cut -f1)"
      echo New shrunken file size = "$(du -h "$f" |cut -f1)"
    # End of If Statement NP2
    fi

    # Prompt user to test their virtual disk by starting the corresponding
    # virtual machine.  Ask if the user wants to delete the temporary
    # backup file.
    echo ""
    echo "Start the virtual machine in virt-manager to check if you can"
    echo "boot the virtual machine that uses the disk image entitled"
    echo "$f."
    echo "If you reach the desktop, it is probably safe to remove" 
    echo "$f.temp."
    echo ""
    echo "If you encounter a problem booting the virtual machine,"
    echo "do not remove the temporary file. If you choose not to"
    echo "remove the temporary file, you will be given the option to"
    echo "restore the temporary file to its original name which will"
    echo "fix the problems you are experiencing."
    echo ""

    # Ask the user if they want to remove the temporary backup file.
    read -r -e -p "Remove $f.temp? [y/N] " choice3
    echo ""

    # Start of If Statement NP3
    # Any input text that starts with Y or y will be read as "yes."
    if [[ "$choice3" == [Yy]* ]]
       then

       # If yes, delete the temporary backup file.
       rm "$f".temp

       # If the choice is not Y or y, ask the user
       # if they want to restore the temporary backup.
       else
       read -r -e -p "Restore the virtual disk to its original state? [y/N] " choice4
       echo ""

       # Start of If Statement NP4
       # Any input text that starts with Y or y will be read as "yes."
       if [[ "$choice4" == [Yy]* ]]
          then

          # If yes, restore the temporary backup file to its original name.
          mv "$f".temp "$f"

          # Start of If Statement NP5
          # If a backing image was created during this script run, delete it.
          if [ "$backingcreate" = "y" ]
            then
            rm "$f".backing.image
          #End of If Statement NP5
          fi
       # End of If Statement NP4
       fi
    #End of If Statement NP3
    fi
  # End of If Statement NP1
  fi

# Reset the $backingcreate variable to "n".  This will prevent the script
# from deleting a backing image that corresponds to a disk when a backing
# image was not created during the execution of this script as the for loop
# progresses.
backingcreate=n

# End the for loop that corresponds to primary disk images.
done

# Start the for loop to process the additional virtual disk images that
# will not implement a backing image.
for f in $tempvmlistp2
do

  # Ask if user wants to shrink the disk image.
  read -r -e -p "Shrink $f? [y/N] " choice
  echo ""

  # Start of If Statement P1
  # Any input text that starts with Y or y will be read as "yes."
  if [[ "$choice" == [Yy]* ]]

    # If yes, move the disk image to a temporary backup copy.
    # The temporary backup copy will be used to create the shrunken images
    # and can also be used to revert to the old disk image in case the virtual
    # disk is corrupted after the process.
    then
    mv "$f" "$f".temp

    # Create a new virtual disk image and remove the zeros to conserve space.
    qemu-img convert -O qcow2 -p "$f".temp "$f"

    # Show file sizes after the procedure is complete.
    echo Old file size = "$(du -h "$f".temp |cut -f1)"
    echo Shrunken file size = "$(du -h "$f" |cut -f1)"
    echo ""

    # Prompt user to test their virtual disk by starting the corresponding
    # virtual machine.  Ask if the user wants to delete the temporary
    # backup file.
    echo "Start virt-manager to check the integrity of the persistent disk for"
    echo "the virtual machine that uses the disk image entitled"
    echo "$f."
    echo "If you are able to read the files on it, it is probably safe to remove" 
    echo "$f.temp."
    echo ""
    echo "If you encounter a problem reading files from the persistent"
    echo "virtual disk, do not remove the temporary file. If you choose not to"
    echo "remove the temporary file, you will be given the option to"
    echo "restore the temporary file to its original name which will"
    echo "fix the problems you are experiencing."
    echo ""

    # Ask the user if they want to remove the temporary backup file.
    read -r -e -p "Remove $f.temp? [y/N] " choice3

    # Start of If Statement P2
    # Any input text that starts with Y or y will be read as "yes."
    if [[ "$choice3" == [Yy]* ]]
      then

      # If yes, delete the temporary backup file.
      rm "$f".temp

      # If the choice is not Y or y, ask the user
      # if they want to restore the temporary backup.
      else
      read -r -e -p "Restore the virtual disk to its original state? [y/N] " choice4
      echo ""

      # Start of If Statement P3
      # Any input text that starts with Y or y will be read as "yes."
      if [[ "$choice4" == [Yy]* ]]
        then

        # If yes, restore the temporary backup file to its original name.
        mv "$f".temp "$f"

      #End of If Statement P3
      fi

    #End of If Statement P2
    fi

  # End of If Statement P1
  fi

# End the for loop to process the additional virtual disk images that
# will not implement a backing image.
done

# Restore original IFS environment settings.
IFS="$ORIGINALIFS"

What if the backing image exists but is broken? Such as when the machine crashed, was powered off, rebooted or the process interrupted last time the script was run?
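
One possible guard, sketched under the assumption that an interrupted run is the main risk: write the new image under a temporary name and rename it only after the command succeeded, so a file with the final name is never half-written. `create_atomically` and the `.partial` suffix are hypothetical, not part of the posted script.

```bash
#!/bin/bash
# create_atomically (hypothetical helper).
# Usage: create_atomically DEST CMD [ARGS...]
# Runs CMD ARGS... "DEST.partial" and renames the result to DEST only if
# CMD exits successfully.
create_atomically() {
  local dest="$1"
  shift
  local partial="$dest.partial"

  # Remove a leftover from an earlier interrupted run.
  rm -f -- "$partial"

  if "$@" "$partial"; then
    # A rename within the same directory replaces the name in one step.
    mv -- "$partial" "$dest"
  else
    rm -f -- "$partial"
    return 1
  fi
}

# Intended use (not executed here; qemu-img takes the output file last):
#   create_atomically "$f.backing.image" qemu-img convert -O qcow2 -p "$f.temp"
```

This does not detect a backing image that was damaged later on. For that, `qemu-img check` on the existing file before reusing it might be worth trying.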


Currently:

If no backing image exists:

qemu-img convert -O qcow2 -p "$f".temp "$f".backing.image
qemu-img create -f qcow2 -b "$f".backing.image "$f"

If backing image exists:

qemu-img convert -B "$f".backing.image -O qcow2 -p "$f".temp "$f"

I don’t understand why these are two different procedures / commands.

Shouldn’t that be…?

if no backing image exists:
    create backing image

continue steps that assume existence of backing image (out side the if)

Why care about past 24 hours?

I find that a leap. Take a simpler use case: someone using only Whonix-Gateway and Whonix-Workstation. They run the shrink script. Nothing happens and the user doesn’t even know why. Maybe this would be a useful optional feature, but a non-configurable default seems surprising and confusing.

I am trying to follow principle of least astonishment here.

Why distinguish between non-persistent and persistent disks?

That is also a very personalized conclusion. Many users probably are not using secondary disk images. They use only the defaults and keep everything there.

If that sounds cumbersome for additional disks, then it is also cumbersome for primary disks. I guess a huge number (the absolute majority) of users use only primary disks and no additional disks.


# Reset the $backingcreate variable to "n".  This will prevent the script
# from deleting a backing image that corresponds to a disk when a backing
# image was not created during the execution of this script as the for loop
# progresses.
backingcreate=n

Really no need for that since there is still no use of $backingcreate after that point. That variable will be unset after the script finished nonetheless. It will not linger around on subsequent script runs or for other commands run on the same console. It would only linger around if exported and another child script called from that script.

if [[ "$choice" == [Yy]* ]]

Instead you could use

if [[ ! "$choice" == [Yy]* ]]; then
  continue
fi

This would imo simplify the code and remove one level of indentation.

script comments:
While it might be interesting to have 1 (or a few) scripts with comments on everything for educational purposes, this shouldn’t become the standard. The usual rule that made sense to me was the following:

  • The ideal goal is to have code being so clear that it doesn’t need any comments.
  • If you failed to express yourself in code, if something won’t be clear by reading the script, add a comment.
  • Obvious things such as

## echo output to user
echo "$output"

shouldn’t be commented for brevity.

1 Like

since the backing image is read only, this should not matter. virtual machine crashes won’t affect the state of the backing image. absent disk corruption, the backing image should be rock solid. additionally, since the disk images, per instructions, should be in a read only state in order to get the full advantages of “live mode,” a new image created from a disk image that would be problematic from such crashes would have to involve a crash where someone was executing the zerofree run after an upgrade procedure or the installation of new software in a persistent mode setting, since the vm should always be run in “live mode” with the “readonly” flag set to “true” on the virtual disk otherwise. for this reason, after new images are created, the user is prompted to run the vm to make sure it boots and the data is readable before removing the temp file. if it is not bootable, or data is not readable, the user is given the option to restore the temp file, which will return the vm to its state before such a problem occurred. if such a problematic crash occurred before the zerofree run and the running of the script, it wouldn’t be script related and i’m not sure how this script could address it. instead, this would require instructions informing the user to backup their image before they put it in persistent mode to either do a dist-upgrade or install new software. but, that would be a whonix qemu/kvm use problem that exists outside of this script.

the first part of the above creates a backing image and creates a new thin provisioned image that refers to the backing image. the second part, which executes only if a backing image exists, converts a previously existing thin provisioned image to a new thin provisioned image that copies the data from the old thin provisioned image, which exists as the main disk that kvm boots. based on how the existing kvm config works, it refers to one unique specific file name for booting whonix. the script is written to work seamlessly with that in mind.

if the former executed every time, it would treat the thin provisioned image as the main disk, and then make it the backing image from which a new thin provisioned image would be created. this would result in data loss, since the majority of the data that makes up whonix would be erased, as a thin provisioned image does not include all the data of a backing image.

the commands involved are different. when no backing image is detected, a routine is run to remove the zero written space from the existing image and make it a backing image. then a command is executed to create a new thin provisioned image from there. the part that runs when a backing image already exists renames the existing thin provisioned image to a temp file and creates a new thin provisioned image from there, which includes all the data in the old thin provisioned image, and refers to the backing image as the base.

to simplify the above, when a backing image does not exist, a backing image is created and a thin provisioned disk is CREATED that refers to the backing image.

when a backing image and thin provisioned image do exist, the thin provisioned image is CONVERTED into a new image that refers to the backing image.
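The two branches described above can be sketched as one function. This is only an illustration built from the commands quoted earlier in the thread, not the actual script. The function name `shrink_primary` and the `QEMU_IMG` override variable are hypothetical, and the sketch assumes the original image has already been renamed to `$f.temp`. The `-F qcow2` on `create` is an addition that states the backing format explicitly, which newer qemu-img releases expect.

```shell
# Hypothetical override so the command can be swapped out for a dry run.
QEMU_IMG="${QEMU_IMG:-qemu-img}"

# Assumes "$f.temp" already holds the original (renamed) image.
shrink_primary() {
  local f="$1"
  if [ ! -e "$f.backing.image" ]; then
    # No backing image yet: the compacted copy becomes the backing image,
    # and a brand new thin provisioned image is CREATED on top of it.
    "$QEMU_IMG" convert -O qcow2 -p "$f.temp" "$f.backing.image" || return 1
    "$QEMU_IMG" create -f qcow2 -F qcow2 -b "$f.backing.image" "$f"
  else
    # Backing image exists: the old thin image is CONVERTED into a new
    # thin image that keeps its data and still refers to the backing image.
    "$QEMU_IMG" convert -B "$f.backing.image" -O qcow2 -p "$f.temp" "$f"
  fi
}
```

Running it with `QEMU_IMG=echo` prints the commands that would be executed without touching any image.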

no. because, when the backing image doesn’t exist, the backing image is created and a new thin provisioned image is created that uses it. when the backing image already exists with a corresponding thin provisioned image, the script converts the old thin provisioned image into a new one, which preserves the data from the old thin provisioned image. if it created a thin provisioned image from the existing backing image every time, rather than the existing thin provisioned image, all the data in the existing thin provisioned image would be lost, since the backing image would be used as the base, and thus upgrades contained in the existing thin provisioned image that were implemented over time would be lost. perhaps i’m overlooking something here (my sleep and work have been off lately). but, the way it is scripted does not pose the risk of data loss, which would require more downloads to make whonix current with the latest upgrades and require more disk writes.

these are separate and distinct commands. when no backing image exists, a thinner backing image is created and a brand new thin provisioned image that refers to it is created. when the backing image already exists, the existing thin provision image is used to create a new thin provisioned image that refers to the backing image.

first and foremost, i am not taking any of this personally. critique as much as possible. i am working on this in my free time which is limited. thus, there is room for much perfection here that i will miss. also, i wrote this script with instructions that i will be providing to users that reference my guide in mind. so, that will color this script. if i am making problematic errors in that regard, they absolutely need to be addressed. :slight_smile:

on this section, the goal is to prevent someone from executing unneeded disk writes. while i could leave it as an option, perhaps it would make more sense if i simply explained in output to the user from the script that images not modified in the past 24 hours will not be touched? i could tweak the script to give a message that “this image has not been modified in the past 24 hours and likely does not need to be processed.” or, i could provide a prompt that states something like “if you did not run zerofree in recovery mode on this image since you last ran this script, you can choose ‘no.’” with those, i could remove the time restriction if you think it is more beneficial. but, especially for an ssd drive, if someone simply ignores it and runs the process, it will result in additional disk writes that will result in no benefit.
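A minimal sketch of the "inform instead of silently skip" idea, assuming GNU `stat` as found on Debian. The function names and the 24 hour default are illustrative and not taken from the script.

```shell
# Succeeds if the image was modified within the last N hours (default 24).
modified_recently() {
  local image="$1"
  local hours="${2:-24}"
  local now mtime
  now=$(date +%s)
  # GNU stat: %Y prints the modification time in seconds since the epoch.
  mtime=$(stat -c %Y -- "$image") || return 2
  [ $(( now - mtime )) -lt $(( hours * 3600 )) ]
}

# Tells the user why an image may not need processing instead of
# skipping it without explanation.
describe_image() {
  local image="$1"
  if modified_recently "$image"; then
    echo "$image was modified in the past 24 hours."
  else
    echo "$image has not been modified in the past 24 hours and likely does not need to be processed."
  fi
}
```

The caller could follow the message with the usual y/N prompt, so the decision stays with the user.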

since the script will only run on an image where someone has set it in persistent mode in the past 24 hours, they will arguably run it within that time frame after they have manually done a zerofree run. again, this is something i could explain in the script. but, it makes more sense to explain in the instructions on how to put the vm into a state where the script will be useful. if they don’t run the script within 24 hours of a manual zerofree run in persistent mode, the next time they run a zerofree run in persistent mode, the same beneficial results will occur if they run the script within 24 hours. so, the only downside here is an extra length of time for a larger disk i believe.

this is arguably a scenario where virt-sparsify works as a better tool, since it would take all user guessing out of it, doesn’t require a manual zerofree run, and could be set to simply run on every image located. but, based on my tests, it involves much greater writing of data that is not required, and may pose problems for people using smaller disks, particularly a small usb thumb drive. the script as currently written does not affect such drives in the same way.

if they do not have a secondary disk, this part of the script will not appear. for people who use a secondary disk to store persistent data, such as emails, gpg keys, ssh keys, etc, it will. it’s personalized in the sense that i’ve taken into consideration that users may have followed the instructions i’ve written to preserve email, keepassxc, ssh, gpg, etc, related files on a persistent hd, which would otherwise get forgotten in a straight “all files on one virtual hd” live mode config with kvm. thus, it offers the ability to have live mode forget any changes to the core / files on reboot, while preserving others that may be changed, and need to be persistent, in common use.

this may warrant another discussion. unless i’ve missed something, the only other method i’ve seen discussed to store persistent data is to open the host hd to saving files in a shared folder. aside from opening the host hd to potential malware writes, i believe there are also issues involving snapshotting or other things if users remove files from a shared directory from inside the vm. so, using a secondary persistent virtual disk seemed cleaner. but, i could be mistaken there. the persistent virtual disk is also obviously susceptible to malware writes. but, it made more sense to me to confine such a threat to a virtual disk rather than the host disk.

true. however, perhaps it is a false assumption on my part. i’m envisioning a scenario where someone had an issue with the base whonix os image, and wanted to reinstall a new version. in that regard, they are unlikely to copy the primary disk over. rather, they will create a new vm from a fresh whonix install. for the secondary disks, which contain custom personalized data, it’s simply easier to add that as an additional disk, and mount it from the /etc/fstab in the new whonix image. if the secondary disks used the backing image method, they’d first need to commit the thin provisioned disk image to the backing image if they make a copy, which is an extra step that people might forget, and might result in data loss. ideally, everything on the primary disk should be relatively default, which makes a reinstall in that regard less of an issue compared to a secondary persistent disk which contains custom data that will be very problematic to lose.

check the end of the if statement that creates the backing image. it is set to “y” at that point. if the for loop catches other disks where a backing image is not created, but a backing image exists, if the user chooses to remove the “temp” file due to a boot issue, it will also remove the backing image without this flag set, unless i am mistaken. the point of this variable is to deal with a situation where a new backing image is created, the user attempts to boot per the instructions after the backing image is created, something goes wrong, and they choose to restore the temp file to its original state. with the flag set, it will copy the temp file back to the original file name, thus restoring the original data, and delete the backing file. in a situation without this flag set, if the issue is with a newly created thin image from a previous thin provisioned image, the backing file will be deleted and result in extreme data loss, which will require a fresh install since the backing image is gone. without the flag, the option is left to either result in the script potentially removing the backing image, or requiring the user to remove an unneeded multi-gigabyte backing image manually.

i’ll play with that.

since this is a beta copy for review, i decided to document every step. since it is bash, some steps may not be obvious. when it comes closer to a final form, i will remove comments involving obvious steps like “echo” and such.

thanks again for taking the time to review. keep in mind, this script is envisioned only for someone using the kvm whonix images on a debian install. so, it’s fairly specific to one use scenario.

on another note, should i consider adding steps to it to backup the temp thin provisioned images as snapshots? i could probably do that and create another script for snapshot restoration purposes that currently aren’t available in virt-manager with this process.

tempest via Whonix Forum:

but, absent a warning in instructions, i’m not quite sure how to address it.

Me neither.

Generally, I might not be getting the full context of this. I wasn’t
considering live mode or the context in which this might appear in the
guide. I was looking at it from a very generalized perspective. Such as
for inclusion into the usability-misc package. For users who don’t read
your guide, don’t use live mode and just use the default images for
everything. For these an /easy/tool/to/compact-vm would be cool. One
tool could be /path/to/compact-this-disk-image. Or
/path/to/compact-disk-images-of-vm-by-vm-name. Also compact-all-vms.

As per murphy’s law, anything can go wrong, will go wrong. At some
point, someone will cancel/out of memory killed/power loss within an
operation (such as running qemu-img) and then lead to unexpected
results. At some point such users are going to ask about it in the forums.

Therefore I am trying to foresee such situations and prevent these if
possible. Scripts should be idempotent.

Ideally, no matter where a previous run of the script might have been
interrupted, a re-run of the script shouldn’t make assumptions about
previous runs. Otherwise murphy’s law is going to hit in future.

Sometimes a script can make assumptions about previous runs. For
example, if an action (such as qemu-img something) succeeded, a success
file can be created. Next time the script is run, use of that success
file would be reliable. (Well, it wouldn’t cover cases in which data
corruption damaged the file, e.g. echo "" > /path/to/file.) Ideally there
would be a sanity test but I doubt that exists. md5sum might be
sufficient here since it is fast and this is not a security relevant test?
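A rough sketch of the success file idea with an md5sum sanity test. The `.ok` marker name and both function names are made up for illustration. The check only makes sense for files that are not expected to change between runs, such as the read only backing image.

```shell
# Record a checksum next to the image after an action succeeded.
# The ".ok" suffix is a hypothetical marker name.
mark_success() {
  local image="$1"
  # Reading from stdin keeps the file name out of md5sum's output.
  md5sum < "$image" | cut -d' ' -f1 > "$image.ok"
}

# Succeeds only if the marker exists and the image still matches it.
verify_success() {
  local image="$1"
  [ -f "$image.ok" ] || return 1
  [ "$(md5sum < "$image" | cut -d' ' -f1)" = "$(cat -- "$image.ok")" ]
}
```

A later run could call `verify_success` on the backing image and recreate it when the check fails.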

I am also wondering if parts of this could be refactored. A guided
compact all images could also call an external script
/path/to/compact/just/one/image.

first and foremost, i am not taking any of this personally.

It’s certainly not personal.

critique as much as possible. i am working on this in my free time which is limited. thus, there is room for much perfection here that i will miss.

Sure.

perhaps it would make more sense if i simply explained in output to the user from the script that images not modified in the past 24 hours will not be touched?

I think so.

check the end of the if statement that creates the backing image. it is set to “y” at that point. if the for loop catches other disks where a backing image is not created, but a backing image exists, if the user chooses to remove the “temp” file, it will also remove the backing image without this flag set, unless i am mistaken.

Ah. Indeed. backingcreate=n happens inside the loop. What confused me
about:

backingcreate=n

# End the for loop that corresponds to primary disk images.
done

is that backingcreate=n is not properly indented. It needs spaces in
front of it to indicate it’s part of something (something could be a
loop or function) and not standalone at the “top level” (outside of any
function or loop).

i could probably do that and create another script for snapshot restoration purposes that currently aren’t available in virt-manager with this process.

Probably best discussed in separate thread since different functionality?

the script should work as intended for someone who is using a default kvm whonix package in persistent mode. in fact, to get it to work for someone using the “live mode” function generally with the flag in virt-manager set to “readonly” on the disk, additional steps outside of the script need to be taken. however, i will remove the time restriction and explore having the script inform the user that a file has not been recently modified. that may make more sense.

true. the first step of the script that physically alters a file is when the image is moved to the “.temp” file. the final run of the routine, assuming a user decides that the process was successful, removes the “.temp” file. so, a simplistic means of addressing such crash scenarios could be to have the script check if a “.temp” file exists before proceeding and, if it does, inform the user about it while mentioning the above referenced crash scenarios, and ask if they want to restore from it. i can play with this to attempt to be more cautious about such scenarios.

i will put a couple spaces with it to line it up with the first if statement. that will make it more clear that it is part of the corresponding for loop.

it is a different functionality, yes. however, it would be simple enough to add an additional routine that asks “do you want to create a snapshot of the disk image’s current disk state.” something else to mull over.

1 Like

tempest via Whonix Forum:

however, i will remove the time restriction and explore having the script inform the user that a file has not been recently modified. that may make more sense.

Nice.

true. the first step of the script that physically alters a file is when the image is moved to the “.temp” file. the final run of the routine, assuming a user decides that the process was successful, removes the “.temp” file. so, a simplistic means of addressing such crash scenarios could be to have the script check if a “.temp” file exists before proceeding and, if it does, inform the user about it while mentioning the above referenced crash scenarios, and ask if they want to restore from it.

Well, if the “.temp” status file still exists which indicates that the
process previously did not complete, then there is even no need to ask
the user? Could this be handled the same way as “did not exist yet”?

1 Like

based on the options that are currently presented to a user, there is the potential for a user to choose not to delete the .temp file, even if everything is working. so, the presence of a .temp file may not be evidence of a crash or something else going wrong. for example, it may involve a user who wanted to play with the recently changed vm and wants to keep the .temp file as a backup in case they want to roll back the vm to a previous state. or, it simply could involve a scenario where the user pressed “enter” too quickly when asked if they want to remove the .temp file and, since the script defaults to treat the choice as “no,” the .temp file is simply a useless extra file.

i could set the script to automatically run from the conversion steps if a .temp file is located. however, if a user had recently made changes to the current disk image that they intend to keep, automatically converting from the old .temp image would erase those changes.

so, i think presenting the user with a prompt about the presence of the .temp images, and then offering options that may include “restore the temp file,” “create a new shrunken image from the temp file,” or “delete the temp file” is probably the safer option.

1 Like

I am suggesting a lockfile mechanism.

sudo touch /path/to/started-some-action
run long running command that might be aborted or fail
rm -f /path/to/started-some-action

Next time the script runs it can check if /path/to/started-some-action exists. If yes, you know that “run long running command that might be aborted or fail” previously did not complete. Therefore you can run “run long running command that might be aborted or fail” without asking the user.
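The lockfile idea could look roughly like this. `run_with_lock` and `was_interrupted` are hypothetical helper names, and the long running command is passed in as arguments.

```shell
# Create the lock, run the command, remove the lock only on success.
# If the command fails or the script is killed, the lock stays behind.
run_with_lock() {
  local lock="$1"
  shift
  touch -- "$lock" || return 1
  "$@" || return $?
  rm -f -- "$lock"
}

# Succeeds if a previous run left its lock behind.
was_interrupted() {
  [ -e "$1" ]
}
```

On the next run, `was_interrupted /path/to/started-some-action` tells the script that the previous attempt did not finish, so the command can simply be run again.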

1 Like

i think i can do that.

if $f.lock file is present, assume crash and use old $f.temp file for process.

if $f.lock file is not present, but old $f.temp is present, ask the user what they want to do with $f.temp (restore, erase, etc.).
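The two rules above could be sketched as a small decision function. The name `next_action` and the returned keywords are illustrative only. The sketch does not handle a lock file without a matching `$f.temp`, which a real script would probably want to check as well.

```shell
# Decide what to do with an image based on leftover files.
#   resume: $f.lock exists, assume a crash and reuse the old $f.temp
#   ask:    only $f.temp exists, let the user choose (restore, erase, etc.)
#   fresh:  nothing left over, start a normal run
next_action() {
  local f="$1"
  if [ -e "$f.lock" ]; then
    echo "resume"
  elif [ -e "$f.temp" ]; then
    echo "ask"
  else
    echo "fresh"
  fi
}
```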

btw, to clarify the statement i made earlier about not taking this personally, since that likely came off as confusing, i misread the “principle of least astonishment” as “the principle of least admonishment” and had visions of some odd linus torvalds folk rule. lol! my apologies for any confusion.

1 Like

That is a weird but possible corner case.

Again without me fully understanding it:
Would it be sane not asking the user and just overwrite $f.temp?

That’s a new word in my non-native English speaker brain dictionary. :slight_smile:

1 Like

it would not be sane if the user is keeping it as a rollback file. it’s unlikely, yes. however, i could better address this if i made the storing of the .temp file as a snapshot image and then create another user script to manage them. i just haven’t moved that far into the project yet.

with the current script, a temp “copy” is not made. rather, the original image to be shrunken is renamed to $f.temp. this was done to minimize the amount of disk writes and disk space needed. in that regard, the “$f.temp” file could be treated as the lock file as well. i could rewrite the script to automatically remove $f.temp if a user decides that they don’t want to restore from $f.temp after they boot the virtual machine.

it means “to scold.” lol.

1 Like

i’m debugging a new “alpha” version right now. it’s a little more of a complex process than i expected.

so far, it will create the thin images. if a user chooses to create a snapshot of the primary disk images, it will check for a “deballoon-snapshots” directory in the same directory where the disk image is located, create a unique subdirectory named after the disk image in the “deballoon-snapshots” directory, and create a snapshot file in the subdirectory that includes the date and time in the file name. if the snapshot directories do not exist, the script will create them. i named the main snapshot directory “deballoon-snapshots” so that nobody will confuse it with a snapshot directory created by virt-manager.
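The snapshot directory layout described above might be built along these lines. Only the "deballoon-snapshots" directory name comes from the post. The function names and the exact time stamp format in the file name are assumptions.

```shell
# Print the path of a new snapshot file for the given disk image:
#   <image dir>/deballoon-snapshots/<image name>/<image name>.<date_time>
snapshot_path() {
  local image="$1"
  local dir name stamp
  dir=$(dirname -- "$image")
  name=$(basename -- "$image")
  stamp=$(date +%Y-%m-%d_%H-%M-%S)
  printf '%s/deballoon-snapshots/%s/%s.%s\n' "$dir" "$name" "$name" "$stamp"
}

# Create the snapshot directories if they do not exist yet.
prepare_snapshot_dir() {
  mkdir -p -- "$(dirname -- "$1")"
}
```

A caller would compute the path once, run `prepare_snapshot_dir` on it, and then write the snapshot to that path.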

also, various lock files are created at different steps in the script to determine where a crash occurred or where a user hit ctrl-c. the checking for the lock files throughout the script is where it has gotten a little complex.

1 Like