Continuous Integration (CI) Whonix Testing - Automated Test Suite

  1. Does the server have a physical graphics card (GPU), or a CPU with an integrated GPU? To debug, run on the host operating system:

    lspci

It’s quite conceivable that the server has no GPU, to save costs: most server workflows are done over SSH, so no graphics are needed on the server and no monitor is ever connected (perhaps a serial console, but that doesn’t need a graphics card either).

If the server does not have a GPU, that might cause some issues. I don’t know if VirtualBox can emulate a virtual graphics card while the host operating system has no real GPU. Even if VirtualBox can do this, it might still cause issues. This is all speculation.
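As a sketch, the GPU presence check could be scripted like this; the only assumption is that `lspci` from the pciutils package is available:

```shell
# Rough check for any graphics device; matches the common PCI class
# names for GPUs ("VGA", "3D controller", "Display controller").
if lspci 2>/dev/null | grep -Eiq 'vga|3d controller|display controller'; then
    gpu_status="GPU found"
else
    gpu_status="no GPU detected (or lspci unavailable)"
fi
echo "$gpu_status"
```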


  1. Does the host operating system have a functional graphical desktop environment such as Xfce?

How could you even test that? It’s not trivial. A VNC server is not really the same: a VNC server can be run without a GPU, using a framebuffer. (Running a VNC client without X - Server Fault) But that’s not really the same as having a real GPU and sitting in front of it, meaning there could be issues in corner cases.

Using a framebuffer, it might be the case that applications using hardware acceleration (which likely includes browsers) will fail. → linux - How can OpenGL graphics be displayed remotely using VNC? - Server Fault
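For reference, a GPU-less framebuffer display of the kind mentioned above can be sketched with Xvfb (package name `xvfb`; display `:99` and the geometry are arbitrary choices):

```shell
# Start a virtual framebuffer X server, which gives a VNC server
# something to serve without any GPU, then stop it again.
if command -v Xvfb >/dev/null 2>&1; then
    Xvfb :99 -screen 0 1280x800x24 >/dev/null 2>&1 &
    xvfb_pid=$!
    sleep 1
    kill "$xvfb_pid" 2>/dev/null || true
    fb_status="Xvfb started and stopped on display :99"
else
    fb_status="Xvfb not installed; skipped"
fi
echo "$fb_status"
```

A VNC server pointed at such a display will work for basic GUI testing, but, as the Server Fault link notes, hardware-accelerated rendering is exactly the part this does not exercise.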

Getting all the packages for a functional desktop environment (including the X server) is non-trivial. A functional shortcut could be installing kicksecure-xfce-host on the host operating system. (instructions: Install Kicksecure ™ inside Debian)

Getting to the stage of a functional graphical desktop environment such as Xfce on the host operating system might require a local physical testing system (to get the package selection right).
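As a hedged shortcut on a plain Debian host, the distribution’s own Xfce metapackage could be dry-run first to review the package selection (`task-xfce-desktop` is Debian’s Xfce task; `kicksecure-xfce-host` is the alternative mentioned above; `-s` only simulates, so nothing is installed):

```shell
# Simulate installing Debian's Xfce metapackage to inspect the
# package selection it would pull in (xorg plus a full desktop).
pkg="task-xfce-desktop"
if command -v apt-get >/dev/null 2>&1; then
    apt-get install -s "$pkg" >/dev/null 2>&1 || true
    de_status="simulated install of $pkg"
else
    de_status="apt-get unavailable; would install $pkg"
fi
echo "$de_status"
```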

  1. Does the host operating system run an X server? If it does not, that might also confuse VirtualBox and, in turn, Tor Browser.
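A quick, non-authoritative check for a running X server (assumes `xdpyinfo` from the x11-utils package):

```shell
# If DISPLAY is set and xdpyinfo can talk to it, an X server is up.
if [ -n "${DISPLAY:-}" ] && xdpyinfo >/dev/null 2>&1; then
    x_status="X server reachable on display $DISPLAY"
else
    x_status="no X server detected"
fi
echo "$x_status"
```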

  1. The VNC server software.

Even if the host operating system already has A) a functional physical GPU, B) an X server, C) a desktop environment (such as Xfce), and D) a VNC server, applications using hardware acceleration, such as browsers, might still fail.

I am using x11vnc on a remote computer that has A, B, C, and D, and I can confirm that browsers are functional both on the host operating system and inside (Kicksecure / Whonix) VMs.

tightvnc might not work for this purpose.

To test VNC, a computer you have physical access to (such as one on your LAN) might be handy.

For the commands to set up x11vnc, start the server, and connect the client, ask me; I can look that up. (Although I never managed to make logging into a VNC session fully non-interactive, but then I would have a reason to look into that again.)
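In the meantime, here is a hedged sketch of a typical x11vnc invocation; the flags are standard x11vnc options, but whether this matches what WATS expects is untested:

```shell
# -display attaches to the running X session, -localhost restricts
# connections to loopback (tunnel in over SSH), -usepw uses a stored
# password, -forever keeps serving after a client disconnects.
vnc_cmd="x11vnc -display :0 -localhost -usepw -forever -rfbport 5900"
if command -v x11vnc >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    echo "to serve the current session, run: $vnc_cmd"
else
    echo "x11vnc or X display unavailable; command would be: $vnc_cmd"
fi
# Client side, through an SSH tunnel (hostnames illustrative):
#   ssh -L 5900:localhost:5900 user@server
#   vncviewer localhost:5900
```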

I also don’t know if x11vnc would be compatible with the way WATS currently expects VNC to function.


  1. Try running WATS on the host operating system. This is to figure out whether this is a VM-specific issue or something already happening on the host operating system. The idea is: if it fails on the host operating system, there’s a good chance it will fail within the VM.

  1. Maybe don’t start with a complex GUI application such as Tor Browser? Maybe the issue isn’t even specific to Tor Browser?

Try a simpler GUI application such as mousepad or xfce4-terminal first. Just open and close it using WATS, and let’s see if that works.

Then try Chromium.

Then try Firefox.
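The open-and-close idea can be sketched independently of WATS as a plain shell smoke test; mousepad is the example app, and everything is guarded so it degrades to a skip on a headless machine:

```shell
# Launch a simple GUI app, give it a moment, then terminate it.
app="mousepad"
if command -v "$app" >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    "$app" >/dev/null 2>&1 &
    app_pid=$!
    sleep 2
    kill "$app_pid" 2>/dev/null || true
    smoke_result="launched and closed $app"
else
    smoke_result="skipped: $app or X display unavailable"
fi
echo "$smoke_result"
```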


  1. Is there any way to do a video recording of the host operating system?
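One possible answer, sketched with ffmpeg’s x11grab input (assumes ffmpeg is installed and an X display is reachable; the geometry, duration, and output path are arbitrary):

```shell
# Capture the X display for 5 seconds; harmless no-op when headless.
if command -v ffmpeg >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    ffmpeg -y -f x11grab -video_size 1280x800 -i "$DISPLAY" -t 5 \
        /tmp/ci-recording.mp4 >/dev/null 2>&1 || true
    rec_status="attempted a 5s capture to /tmp/ci-recording.mp4"
else
    rec_status="skipped: ffmpeg or X display unavailable"
fi
echo "$rec_status"
```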

  1. Qubes uses openQA for automated GUI testing. I don’t know how Qubes / openQA handles (hardware-accelerated) graphical tests. Are we re-inventing openQA?

  1. If Tor Browser (or hardware-accelerated graphical tests in general) is getting too complex to fix, just skip it for now.

Ah this makes more sense now.

I did install XFCE on the host, but it has always been super laggy and weird. I have observed WATS run on the VMs using VNC, but again, it is very glitchy and laggy.

Upon further research, it seems as though Digital Ocean might not be a solution we can use for any GUI testing, since they do not offer graphics with their instances. Whatever is working on the machine, I imagine, is virtualized in some way, hence the problems.

OpenQA seems like a GREAT tool, and I think we should aim to use it long term, since WATS is flaky and not the best, honestly. That is not to knock the work that the capstone project did, but I do not think it is a good way to test an OS through a GUI long term. OpenQA seems more maintainable for the long-term usage of automated testing.

OpenQA will not solve our short-term problem of needing a machine with a GPU in order to effectively test the Whonix GUI.


I see 3 potential solutions

Dedicated Home Machine

Solution 1 is for me to acquire a dedicated machine and set it up in my house, leaving it online 24/7. CI on GitHub Actions would trigger a build, which would run WATS on the tower.

Pros:

  1. CI will run faster. Every time a build happens, there will be no need to install things like VirtualBox, apt packages, XFCE, VNC, etc.
  2. No long term dependence on cloud providers.
  3. Quicker iteration and debugging. No need to spin up a cloud server every time something needs troubleshooting.
  4. More realistic. Whonix users by and large are using the operating system on laptops and desktops.

Cons:

  1. “Linus doesn’t scale”…if I have a tower machine in my house, what happens if I get in a car wreck and die? What happens if I quit being able to maintain or I am displaced from my housing? I will build things in a way where it can easily be spun up by someone else (documentation, automation, etc). But there is no guarantee that someone will be willing and available to do it.
  2. I would have to buy a dedicated tower machine…500ish usd?
  3. I will have to set it up so that the machine can be SSH’d into from the open internet. This is not a huge deal, but it is a bit of a security concern to put it on my home network…perhaps I set up a VPN and run it on that? But this adds additional complexity, and networking is certainly not my area of expertise.
  4. My ladyfriend will not be super excited about another computer in the house, but she can get over it lol

Cloud based machine with GPU

Some of the larger cloud providers offer servers with GPUs. Switching the pipeline to build on a cloud server that has a GPU could be a solution. AWS EC2 G4 instances seem like they would work.

Pros:

  1. AWS isn’t going anywhere if I die, give up, or become homeless. They will likely just consume the whole world as we spiral further and further into dystopia. No downtime if something happens to me. Business as usual.
  2. On demand and reliable. It can spin up or down at a moment’s notice, with no worries about networking, snowstorms, or hardware failure on our end.
  3. No security risk of opening up a CI pipeline to my home network.

Cons:

  1. AWS is AWS. It is expensive, no privacy, project funds supporting amazon, etc.
  2. There are unknowns, perhaps more complicated configurations, and additional overhead in programmatically provisioning resources. I could use Terraform for this no problemo, but it still adds a bit of complexity. That said, I am comfortable taking on this complexity.
  3. If Whonix scaled in a way where lots of people were pushing to this, it would become more expensive. In the short term, it is less expensive than buying a dedicated computer to run at my house.

Forgo GUI testing

I do not like this option, especially because I spent so much time trying to get WATS running on a cloud server. It finally runs, and this GPU curveball occurs.

Pros:

  1. Frees me up to work on other projects (Whonix-Native or whatever else)
  2. CI builds still in place. CLI testing is possible, and additional build testing is no problem…we could test KVM builds, VirtualBox, Whonix-CLI, Whonix-XFCE…everything except automated clicking through the operating system

Cons:

  1. Unable to catch GUI bugs without manually testing
  2. It will hurt my ego a good bit (non-consequential to the project).

Thoughts

I honestly am torn between all three options. I would love some input and guidance @Patrick or anyone else who might be reading this (long shot). Which direction do you think we should go?


I wouldn’t worry about this worst case too much.

Ok.

Cannot comment on your personal situation. Certainly choose a solution that is comfortable to live with.

How much are we talking about approximately?

We don’t need that many builds. Often I work on very auxiliary things such as mediawiki-shell (for wiki backups). Editing some script in that package has an almost 0% chance of breaking Whonix’s build process.

There are other types of changes where breakage is more likely, but even then, while the build script might look daunting, the build issues are not so complex that every commit needs to be CI tested. Often, for example, it’s just detail enhancements such as better-written code comments.

I guess costs will be manageable so if this is the more comfortable solution for you, let’s go this path.

Acceptable too, of course, but also not the nicest solution.

I have been a bit busy with life, but this GUI testing is high on my priority list. It was a bit disappointing to run into these issues with the Digital Ocean VPS, because lots of time has been invested, but it is okay; these things happen.

For the time being, all commit builds are still working and reliable.

The Plan

I am going to utilize AWS; it makes the most sense for long-term testing and maintenance. I don’t want any more computer parts in my house anyway…too many half-assembled thinkpads as it is lol

Steps

  1. Create a shared AWS account, like our DigitalOcean account. We will use my credit card for the time being, same as with DigitalOcean. This work is my donation to our cause.

  2. Create a Terraform suite to dynamically create and destroy the resources that run our CI builds on AWS. This should be comparable in cost to our DigitalOcean builds.

  3. Create Terraform logic that provisions and destroys GPU servers to run our GUI tag builds.

  4. Run WATS on the new GPU server. This code already exists; it just doesn’t work on DigitalOcean.

  5. Rebuild the test suite with OpenQA. WATS currently does not do much beyond verifying that Tor Browser works. If OpenQA works for our use case, I think it will be much quicker to test everything and to add new tests as the suite grows.


FWIW I routinely run Whonix on my Talos II, which only has an AST2500 GPU (which is just an unaccelerated framebuffer, so it should be comparable to no GPU at all in terms of what APIs are available to a VM guest). Everything works fine graphically; pretty much any modern GNU/Linux distro will automatically fall back to LLVMpipe if no accelerated GPU is present. LLVMpipe won’t work well for high-end games, since there are certain recent OpenGL features that it doesn’t implement, but for standard GUI stuff (including web browsers, as long as you’re not using WebGL) it generally works fine (even games like OpenArena work OK; one of the Talos community members reports ~30 fps in OpenArena with LLVMpipe). My guess is that whatever issues you’re running into are caused by not having graphics-related packages installed (and therefore Firefox can’t initialize LLVMpipe). @Patrick’s suggestion of installing kicksecure-xfce-host is likely to fix it, or maybe installing whatever meta-package Debian uses for the XFCE desktop.
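A quick way to confirm which renderer is actually in use, and whether the LLVMpipe fallback described above kicked in, is `glxinfo` from the mesa-utils package:

```shell
# Print the active OpenGL renderer; "llvmpipe" in the output means
# Mesa's software rasterizer is handling rendering (no GPU accel).
renderer=$(glxinfo 2>/dev/null | grep -i "opengl renderer" || true)
[ -n "$renderer" ] || renderer="glxinfo unavailable (no X display or mesa-utils missing)"
echo "$renderer"
```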

(If you already figured out the above, apologies for the noise.)


Building Whonix for Debian bookworm on Debian bookworm is still a bit difficult because VirtualBox is available neither from the Debian stable repository, nor from Debian fasttrack, nor from the Oracle repository.

At the moment it can only be installed from Debian sid, which is good enough at least for development purposes and CI. Since this is a moving target (right now it’s Debian sid; in the future, hopefully at least fasttrack), could you please leave VirtualBox installation on the CI server to a script that I would maintain?

This command should work for VirtualBox installation on Debian bookworm:

~/derivative-maker/packages/kicksecure/usability-misc/usr/bin/installer-dist --non-interactive --virtualbox-only --log-level=info

And if not, then I would fix it.

(related: Whonix Linux Installer - Development Discussion)

Aight @Patrick, it took a few hours, but the CI should be running on Debian 12 and installing VirtualBox from the installer-dist script.

I am heads down in my ecommerce work for the next few weeks still, but I will come back in full force and make package_parser more performant and satisfy the HTML requirements. Then we can work on getting OpenQA running for GUI testing.


Build successful with Debian 12 and the install script.

Remove tasks to delete existing VMs since VPS is newly created · Whonix/derivative-maker@3a045b1 · GitHub


Excellent! Thank you, most helpful!


State of Automated GUI Testing

Posting so I remember the historical context later of all the shit I did. It might be a while until this gets revisited, due to all the work that must be done with the whonix/ks package site builder.

Current Progress

  • Pulled WATS into a repo I could work on: GitHub - Mycobee/whonix_automated_test_suite: Whonix Automated GUI Testing Suite for CI
  • Set up the automated builder CI pipeline to have two different paths…GUI builds happen when a tag is pushed, and headless builds happen when a commit is pushed. Ideally, in the future, GUI testing will occur any time a tag is pushed.
  • Made some progress installing a GUI on the VPS host here and here
  • Ran into glitchy bugs using the GUI and VNC on the remote Debian VPS host

Steps to Move Forward

  1. Fix GUI host bugs…hopefully with Jeremy’s suggestion…otherwise, we may have to use Terraform and AWS to remotely provision an EC2 instance with a GPU
  2. Have CI run the existing flaky WATS suite as a proof of concept (a small lift once the GUI bugs are ironed out…see the previous PRs listed above)
  3. Implement OpenQA testing…once this is in place, automated GUI testing should be much more stable and dependable, and iteration should be quicker.

CI broken.

Sorry for the delays. I was travelling, then I got sick as shit. I finally got around to fixing it this morning.


Do we need gather_build_logs.yml, or could we just use GitHub’s default log function?

Maybe GitHub has a maximum log size?

GitHub’s default log function is very useful for the container they spin up, but it does not give us much observability into our DigitalOcean VM, where the derivative-maker building occurs. The Ansible log-gathering logic gives us much more detailed output when things break during the build step itself.


CI Roadmap

Moving forward, here are some useful improvements to our CI.

Preconfigured Runner VPS

We should utilize an existing image file in DigitalOcean instead of installing and configuring the box every time CI runs. We should split the Ansible tasks out into a build_machine_image role. The role should install all needed dependencies, including VirtualBox, and do all configuration unrelated to the actual code changes in derivative-maker.

The role should run conditionally, ONLY when git changes are in the ./automated_builder directory. It should tag the new image in a sane way and publish the image to DigitalOcean.

When a code change happens in any directory other than ./automated_builder, the CI should load this image and run the steps to install the source code and build the Whonix images.
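The "only when automated_builder/ changed" condition could be approximated in plain shell for local experimentation; this is illustrative, and the real gate would live in the CI configuration:

```shell
# Decide whether the image-build role should run, based on whether
# the last commit touched anything under automated_builder/.
if git diff --name-only HEAD~1 2>/dev/null | grep -q '^automated_builder/'; then
    rebuild_image="yes"
else
    rebuild_image="no"
fi
echo "rebuild_image=$rebuild_image"
```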

Parallelize Builds

The workstation and the gateway do not need to be built on the same machine. CI should spin up two VPS machines and run the builds in parallel. The logs should be updated to reflect these separate builds, and the tagging of machines should be updated in DigitalOcean. Teardown should successfully destroy all of these machines and run regardless of build outcome.

We should assess the CPU and memory usage of our images and then decide whether we can scale down based on that. We should only use as much machine as we need to run the build efficiently, to save costs. CI is cheap as it is, but the cheaper we can make things, the better.

This might mean making different-sized images for commit builds (headless) and tag builds (GUI + full dependencies).

Update: upon further research, both builds must be on the same host. Investigate running them with GNU parallel on the same host, and determine how big a box is needed to run as fast as possible without being over-provisioned.
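A same-host parallel build can be sketched with plain shell job control (GNU parallel would work similarly); the derivative-maker invocation in the comment is an assumption about its flags, so check the real build script before relying on it:

```shell
# Run both flavor builds concurrently on one host and wait for both.
build_flavor() {
    flavor="$1"
    # Real invocation would be something like (unverified):
    #   ./derivative-maker --flavor "$flavor" --target virtualbox
    echo "building $flavor (placeholder)"
}
build_flavor whonix-gateway-xfce &
build_flavor whonix-workstation-xfce &
wait
parallel_status="both flavor builds finished"
echo "$parallel_status"
```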

Documentation Overhaul

As CI becomes more leveraged, we need good documentation so that it can be maintained and understood. The wiki should be updated, and we should ensure that we have some sort of visual representation of how the system works.

OpenQA Setup in CI

This one is going to be a big fucking task, but it is super important to the future of both Whonix and KS. We need a useful OpenQA setup. This will probably need to be split into multiple subtasks.

Exploration of Further CI Usage

Take stock of other things we can automatically build with CI to ensure integrity.

The VirtualBox installer script? Etc. Some of this can likely be done with OpenQA, but there might be areas of our codebase where we can utilize CI to make life easier for us all and improve the development cycle.


Sounds all good!

This is nice; however, the prepare-release step requires both images to be present on the same hard drive.


Ahh, good to know. It would suck to do all that work and get blocked by that.
