Derivative Maker Automated CI Builder

As for whether the build script and/or the CI scripts should check for stray mounts, I guess it is best if you decide.

The build script will check for stray mounts anyhow, because non-CI builds should fail early with a build error when there is a stray mount. That test can stay independent of any potential test by the CI scripts.

The tests that I am currently using:

Usually a CI sets the environment variable CI=true. Depending on that, the build script could perform some action it does not do outside of CI.
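
For example, a check in the build script could look something like this (just a sketch; the CI-only action is a placeholder):

   if [ "${CI:-}" = "true" ]; then
      true "INFO: Running under CI."
      # CI-only behavior would go here, for example stricter handling of stray mounts.
   fi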

But whether the build script is a clean place to say "please reboot the CI" is highly questionable. It could be a rather unclean hack.

I guess -l, --lazy won't help much here. The mount will probably stay forever until a reboot.

Sure, if that works.

An additional test that would probably be useful in the CI scripts (similar to the test already in the build script):

   losetup_output=$($SUDO_TO_ROOT losetup --all)

   if [ "$losetup_output" = "" ]; then
      true "INFO: Output of losetup_output is empty. No stray loop devices, OK."
      return 0
   else
      error "TODO: bail out here and reboot the CI server"
   fi

That is probably ideal and better than what I said above.

Does it work for you if you run that command manually? I mean, the command will exit without error, but will the mount point actually be unmounted? If yes, I'll happily add it to the build script.
Actually, I might just try that now.

Yup

ansible@host:~$ df -h
Filesystem           Size  Used Avail Use% Mounted on
udev                 974M     0  974M   0% /dev
tmpfs                199M  540K  198M   1% /run
/dev/vda1             50G   17G   31G  35% /
tmpfs                992M     0  992M   0% /dev/shm
tmpfs                5.0M     0  5.0M   0% /run/lock
/dev/vda15           124M  5.9M  118M   5% /boot/efi
/dev/mapper/loop0p1   98G  2.8G   91G   4% /home/ansible/derivative-binary/Whonix-Gateway-XFCE_image
tmpfs                199M     0  199M   0% /run/user/1001

ansible@host:~$ sudo umount -l /home/ansible/derivative-binary/Whonix-Gateway-XFCE_image

ansible@host:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            974M     0  974M   0% /dev
tmpfs           199M  540K  198M   1% /run
/dev/vda1        50G   17G   31G  35% /
tmpfs           992M     0  992M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/vda15      124M  5.9M  118M   5% /boot/efi
tmpfs           199M     0  199M   0% /run/user/1001

I am very surprised that a normal umount doesn't work but umount with -l / --lazy does. Yeah, that would mean there's some sort of bug in umount, some other low-level tool, or even the kernel.

Therefore I've already rewritten it for a more defensive unmount:
derivative-maker/unmount-helper at master · derivative-maker/derivative-maker · GitHub

It also attempts umount --lazy.

Not yet tested but I guess that’s what the CI is for.
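
Conceptually, the defensive unmount works roughly like this (just an illustrative sketch, not the actual unmount-helper code; the function name is made up):

   defensive_unmount() {
      local mount_folder="$1"
      # Skip mount points that are not actually mounted.
      if ! mountpoint --quiet "$mount_folder" ; then
         true "INFO: '$mount_folder' is not mounted, skipping."
         return 0
      fi
      # Try a normal umount first.
      if $SUDO_TO_ROOT umount "$mount_folder" ; then
         return 0
      fi
      # Fall back to a lazy umount if the normal umount failed.
      true "WARNING: normal umount failed, attempting umount --lazy."
      $SUDO_TO_ROOT umount --lazy "$mount_folder"
   }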

yes it is :slight_smile:

Thanks!


As for the file automated_builder/tasks/create_vm.yml, may I suggest keeping the actual clean and build commands out of the CI?

There are various CI scripts in derivative-maker/help-steps at master · Mycobee/derivative-maker · GitHub (those starting with ci_), but these can probably all be deleted since they are defunct nowadays.

The help-steps folder, however, could be the place where we keep one or several short scripts which contain the actual build command.

Or better, let's add a ci subfolder in the derivative-maker repository where the small scripts that define the actual clean and build commands reside?

Reason: if the mount continues to cause issues, I would use simpler build commands to focus just on the mount issue, for quicker iteration.


The following…

- name: Build new gateway VM
  shell: "dist_build_non_interactive=true /home/ansible/derivative-maker/derivative-maker --flavor whonix-gateway-xfce --target virtualbox --build >> /home/ansible/build.log 2>&1"

…would stay mostly the same. It would just do something like this:

- name: Build new gateway VM
  shell: "/home/ansible/derivative-maker/derivative-maker/ci/build-gateway >> /home/ansible/build.log 2>&1"

(dist_build_non_interactive=true would be set within ci/build-gateway.)
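
A rough sketch of what ci/build-gateway could contain (path and flags copied from the existing Ansible task; this is just an assumption, not the final script):

   #!/bin/bash
   set -o errexit -o nounset -o pipefail

   export dist_build_non_interactive=true

   /home/ansible/derivative-maker/derivative-maker \
      --flavor whonix-gateway-xfce \
      --target virtualbox \
      --build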


Yup, I will do that :slight_smile:

---
- name: Clean existing gateway VM
  shell: "dist_build_non_interactive=true /home/ansible/derivative-maker/derivative-maker --flavor whonix-gateway-xfce --target virtualbox --clean > /home/ansible/build.log 2>&1"

- name: Clean existing workstation VM
  shell: "dist_build_non_interactive=true /home/ansible/derivative-maker/derivative-maker --flavor whonix-workstation-xfce --target virtualbox --clean >> /home/ansible/build.log 2>&1"

- name: Reboot VPS for stray loop devices
  reboot:
    reboot_timeout: 60
  become: true

- name: Build new gateway VM
  shell: "dist_build_non_interactive=true /home/ansible/derivative-maker/derivative-maker --flavor whonix-gateway-xfce --target virtualbox --build >> /home/ansible/build.log 2>&1"

- name: Build new workstation VM
  shell: "dist_build_non_interactive=true /home/ansible/derivative-maker/derivative-maker --flavor whonix-workstation-xfce --target virtualbox --build >> /home/ansible/build.log 2>&1"

Not sure it would be wise to combine all of them into a single script in the ci folder?

Except for "- name: Reboot VPS for stray loop devices", which might make more sense for the CI to take care of.

How about:

  1. ansible calls a script in the ci folder to clean the VMs
  2. ansible reboots the machine to clear stray loop devices
  3. ansible calls a script in the ci folder to build the VMs

I agree that leaving the reboot functionality in ansible is a good idea, so ansible knows to expect the connection to the machine to break during the reboot.

For longer term maintenance, I have a question.

Is there any way to speed up the builds so that they only run the steps affected by the code changes?

I.e., do we need to build ../monero-gui_0.18.1.0-1_all.deb when you only change a few things in the help steps or something?

It would be nice to have a "light" build or something for iterating more quickly. Currently it takes over 1.5 hours to build fresh workstation and gateway VMs. I'd love it if we could make it possible for you to get feedback more quickly.

I guess, though, that when troubleshooting you can always just SSH into the VPS, run the troublesome build step manually, and see what is causing the problem. Just a thought; I want this CI feature to make your life easier :man_shrugging:


We have this thingy here: Whonix build script now optionally supports installing packages from Whonix remote repository rather than building packages locally

So just by adding…

--remote-derivative-packages true

…no packages would be built; all packages would instead be downloaded from the Whonix binary repository. That would skip all the lengthy package creation.
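
For example, the gateway build command from the Ansible task above would then look something like this:

   dist_build_non_interactive=true \
      /home/ansible/derivative-maker/derivative-maker \
      --flavor whonix-gateway-xfce \
      --target virtualbox \
      --build \
      --remote-derivative-packages true \
      >> /home/ansible/build.log 2>&1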

How does that sound?

How often would we use --remote-derivative-packages true? Maybe use it for git commits? For git tags, do it "properly" and drop it?

Though, when using --remote-derivative-packages true, we would not notice when package builds fail. But that isn't very likely, since if packages are updated, I need to build them locally anyhow.

Absolutely makes sense. In my previous build, the rookie mistake of forgetting $SUDO_TO_ROOT led to a failed build. Now I have another 40 minutes to wait until I can see whether that is fixed, unless I do a local build with local hacks, which is what we are trying to avoid with the CI.

That's also why I suggested Derivative Maker Automated CI Builder - #74 by Patrick: because then I could hack the build command down to a much simpler variant, to the point where only a minimal raw image gets created with nothing useful inside, just to test the various mount / umount steps and quickly get that fixed.


My latest commit fix · derivative-maker/derivative-maker@03f6496 · GitHub didn’t get picked up by the CI on https://github.com/Mycobee/derivative-maker/actions yet.

Last time that was faster, I think. I am not complaining about the speed. I am just wondering whether that commit got lost rather than simply not being there yet, because if it got lost, it will never come.

Maybe the automated CI reboots could lead to some commits being overlooked by the CI?

This happened because I rebased and pushed pretty quickly after you commented, and didn’t give enough time for you to finish your stuff.

It isn't an issue with the CI or anything, simply me being a bit too trigger-happy and pushing without having rebased 03f64961 yet.


The text below outlines longer-term goals. Initially I think we can ship this stripped down, but getting all of this running shouldn't be too heavy of a lift.

Conditional logic for CI builds

# if ci_trigger == commit
  # run build suite using --remote-derivative-packages
  # send logs as artifacts to github actions and notify success or failure
  # nuke excess VM data (OVAs, VDI, etc.) to save space on VPS
# elsif ci_trigger == tag
  # run build suite without using remote derivative packages
  # load and start VMs into VBoxManage
  # push OVAs to S3 storage bucket so Devs/Testers can download and experiment 
  # Allow VNC access to VPS and VMs for quicker testing
  # Run the WATS suite on the Tagged VM
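
As a rough shell sketch of that branching (ci_trigger and ci/build-suite are placeholders, not existing scripts):

   # Placeholder names only, not existing code.
   case "$ci_trigger" in
      commit)
         # Fast path: download packages instead of building them locally.
         ci/build-suite --remote-derivative-packages true
         ;;
      tag)
         # Full path: build all packages locally, then import, publish and test.
         ci/build-suite
         ;;
   esac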

Thoughts, @Patrick?

@Patrick, it is unclear to me what happened with this build:

https://github.com/Mycobee/derivative-maker/actions/runs/3023870954

Any chance we could hop on a Jitsi call at your convenience and debug together? I would love to speed up our iteration a bit.

Perfect!

Except…

This one will probably not be used. We might as well save the storage / storage costs.

(Btw, the build logs can also be trimmed. We probably don't want to keep (all) build logs for every build forever once these take up significant space. An occasional manual wipe would probably also be OK.)

Using S3 and cloud hosting for development-only purposes without a strong dependency and without introducing any risk that a compromised CI could break security for users is OK.

However, if any testers had to download something from S3, that would probably be criticized. Not as bad, but somewhat similar to Whonix adding Google Analytics to its website (not going to happen!). It might backfire.

Found it already.

++ realpath /home/ansible/derivative-binary/Whonix-Gateway-XFCE_image/etc/network/interfaces
realpath: /home/ansible/derivative-binary/Whonix-Gateway-XFCE_image/etc/network/interfaces: No such file or directory

It was just a test that I had added for debugging the mount issue (and which might no longer be needed) that caused a non-zero exit code.

But indeed, finding the error in the log is non-trivial due to the log size. In the latest commit, I prefixed error messages with error: or ERROR: (best to search case-insensitively). To find this issue, I actually searched the log for "detected" rather than "error".
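
For example, something like (log path as used in the Ansible tasks above):

   grep --ignore-case --line-number 'error' /home/ansible/build.log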

To make the build log contain the word "error" less often, I'll refactor and rename some functions (the error handlers). But I'll only do that once some builds are passing completely.

Already fixed in git.

Not sure it would be faster. I am using the CI to debug this now. Currently waiting for the new git commit to show up under https://github.com/Mycobee/derivative-maker/actions to see if the last bug has been squashed. Feel free to contact me on Telegram (as previously added). For an actual call, many apps will work for me.


Fixed for me locally.

As of git commit bf65075d4cf4edc2f3c0291dee37c80c0207c5b7 (same as git tag 16.0.7.6-developers-only), if I make a local build with --remote-derivative-packages true, there is no build issue and there are no stray mounts. Checked. Both the gateway and workstation builds were successful.

(I am building the tag, not the commit. There is a slim chance this might make a difference, perhaps by triggering a bug caused by overly long file names.)

The umount bug avoidance strategy of not attempting to umount non-existing mount points, combined with umount --lazy, seems to be functional.


Git tag 16.0.7.7-developers-only contains only some cosmetic build script improvements. (Less unnecessary lsof output.)


True. In the event you needed it for some reason, you could scp it from the VPS to another machine.

Yes, I was going to keep the bucket password-protected, so that CI builds would only be shared among people using them in situations where it is known not to be a "secure" build. But since it is not needed, it doesn't matter. Also, I was using the DigitalOcean equivalent, not quite as bad as Amazon data-collection-wise I'd imagine, but who knows what companies do behind the curtains :man_shrugging:

The logs are deleted each time a CI build run occurs. If you look at the first task in the create_vm.yml file, it redirects stdout and stderr to build.log, and the subsequent build steps append to that file. The next time a build runs, the redirect with > overwrites the log. It is a lot of output, but storage is no concern, as it heals itself.

If you want to trim the logs, I am open to suggestions on how best to do it.

I have to rebase my branch on your upstream master and force-push my commit manually to trigger the build, but once my automated_builder is merged into your master and configured in GitHub, pushing to derivative-maker will automatically trigger the build. (I pushed the rebased changes, btw.)
