Systemcheck fails for unclear reason

Some of the tests started failing recently. Initially this was due to a privleap issue, but that one is already fixed. Yet systemcheck still exits with code 1. Here is the full log: https://openqa.qubes-os.org/tests/131604/file/suspend-whonixcheck-sys-whonix.log
I see there:

[WARNING] [systemcheck] systemd journal check Result:

yet the quoted lines don’t really look like failures, but rather like normal audit messages (which incidentally include the word “fail” because of the pam_faillock module). There are also apparmor messages, but all are “ALLOWED” (though they also contain denied_mask="r"; maybe that confuses systemcheck?). There is also a long tor/sdwdate output, yet it completes with “Exiting with exit_code ‘0’ indicating ‘success’.”

I’m confused: is this an actual issue, or just a false positive? Please help…


I wanted to make systemcheck --verbose (“developer mode”) more useful. Therefore, I made systemcheck exit non-zero on any journal message that matches error, warning, fail, or AppArmor allowed/denied. That is, unless it’s an expected message that is covered by a grep ignore pattern.

(There are many messages that I’d rather be gone (false positives) or fixed, but that’s unfortunately unrealistic.)

The purpose of that is to remind me to add more and more messages. Ignore the unfixable ones, fix and/or report the fixable ones.
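To make the mechanism described above concrete, here is a minimal sketch of that kind of filter. It is not systemcheck’s actual code: the function name `journal_check`, the `flag_regex` keyword list, and the `pam_faillock` ignore entry are all hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical sketch; not systemcheck's actual implementation.

# Keywords that flag a journal line (matched case-insensitively).
flag_regex='error|warning|fail|apparmor="(ALLOWED|DENIED)"'

# Hypothetical ignore list: extended regexes for known, expected messages.
ignore_patterns=(
  'pam_faillock'
)

journal_check() {
  # Reads journal lines on stdin and prints the ones that remain suspicious.
  # Returns 1 if any suspicious line remains, 0 otherwise.
  local flagged pattern
  flagged="$(grep -E -i -- "$flag_regex" || true)"
  for pattern in "${ignore_patterns[@]}"; do
    flagged="$(printf '%s\n' "$flagged" | grep -E -v -- "$pattern" || true)"
  done
  if [ -n "$flagged" ]; then
    printf '%s\n' "$flagged"
    return 1
  fi
  return 0
}
```

It could be fed with something like `journalctl --boot --no-pager | journal_check`. Note how a plain keyword match on “fail” also catches harmless lines that mention pam_faillock, which is why an ignore list is needed at all.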

I didn’t keep in mind that Qubes QA also uses the --verbose option.

Potential solutions:

Thoughts?

B/C seems to be the cleanest approach.


Merged.

ignore_pattern_add “AVC apparmor="ALLOWED"”

I am not sure these should be ignored. AppArmor profiles in permissive mode throw “allowed” messages so that we can fix them, with the goal of hopefully switching the profiles to enforce mode at some point in the future.

What are the “allowed” messages? If they come from third-party packages, is it feasible to ignore those specifically?

See the full log I linked above


So, you mean those apparmor messages are about things that would be denied in enforcing mode? Indeed I see denied_mask there too. In that case, I agree, those shouldn’t be ignored, and instead the apparmor profile should be fixed.


That’s correct.

Today’s test run has only these:

Mar 12 05:35:02 host kernel: RETBleed: WARNING: Spectre v2 mitigation leaves CPU vulnerable to RETBleed attacks, data leaks possible!
Mar 12 05:35:02 host (udev-worker)[364]: Error running install command '/usr/bin/disabled-intelpmt-by-security-misc' for module pmt_class: retcode 1
Mar 12 05:35:02 host (udev-worker)[363]: Error running install command '/usr/bin/disabled-intelpmt-by-security-misc' for module pmt_class: retcode 1
Mar 12 05:35:03 host augenrules[633]: failure 1
Mar 12 05:35:03 host augenrules[633]: failure 1
Mar 12 05:35:03 host augenrules[633]: failure 1
Mar 12 05:35:05 host PAM_tmpdir[880]: /tmp/user/1000 owned by uid 0 instead of uid 1000. Failed to create safe $TMPDIR

The last one looks like a real issue. I don’t know about the augenrules one. The intelpmt one is probably missing error handling (PMT isn’t visible in a VM, so that should be a no-op). And the first one may or may not be a false positive, but this check is likely not reliable in openQA (which runs Xen inside KVM)…
Is there a way to ignore specific messages via systemcheck config? I’d add ignoring the first one in openqa runs.


I just checked, this is most likely related to openQA env. I don’t see this message on real hardware. And also, in openQA tests we disable some of the speculative mitigations for performance reasons.


There isn’t yet but I’ll work next on making that configurable.

Journal ignore patterns can now be configured.

You could create a file /etc/systemcheck.d/40_qubes_openqa_autogenerated.conf with the following contents:

## Mar 05 12:22:50 host sdwdate[1319]: ['error', 'ok', 'ok']
journal_ignore_pattern_add "sdwdate.*\['.*error.*'\]"
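As a quick sanity check, the pattern above can be tried against the sample line with grep before putting it into the config file. This is only a sketch: how systemcheck applies the pattern internally is an assumption here (grep with extended regexes), and the helper name `pattern_matches` is hypothetical.

```shell
#!/bin/bash
# Sanity check only. Assumption: systemcheck matches ignore patterns
# roughly the way grep would; its real matching code may differ.

sample="Mar 05 12:22:50 host sdwdate[1319]: ['error', 'ok', 'ok']"
pattern="sdwdate.*\['.*error.*'\]"

pattern_matches() {
  # $1: regex, $2: log line. Succeeds if the line matches the regex.
  printf '%s\n' "$2" | grep -E -q -- "$1"
}

if pattern_matches "$pattern" "$sample"; then
  echo "ignored"
else
  echo "still reported"
fi
```

An openQA-specific entry for the RETBleed warning quoted earlier could presumably take the same form, for example `journal_ignore_pattern_add "RETBleed: WARNING"`, though that exact line is untested.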

This is now in the testers repository.

Seems to be working, thanks!

In the meantime, I’ve got another one that may be a real issue, after system suspend:

Mar 13 02:40:42 host sdwdate-start-anondate-set-file-watcher[3785]: /usr/sbin/anondate: ERROR: Variable 'vstart' is empty or contains only whitespace/newlines.
Mar 13 02:40:42 host anondate[3849]: /usr/sbin/anondate: ERROR: Variable 'vstart' is empty or contains only whitespace/newlines.

Full output: https://openqa.qubes-os.org/tests/132359/file/suspend-whonixcheck-sys-whonix.log
And if you need the full journal of sys-whonix, it’s in https://openqa.qubes-os.org/tests/132359/file/suspend-var_log.tar.gz (the log/xen/console/guest-sys-whonix.log file)


As for Intel PMT, looking at the bigger context, it actually looks like an intentional error:

[2025-03-12 21:38:51] [    3.358990] systemd-udevd[408]: /usr/bin/disabled-intelpmt-by-security-misc: ALERT: This Intel Platform Monitoring Technology (PMT) Telemetry kernel module is disabled by package security-misc by default. See the configuration file /etc/modprobe.d/30_security-misc_disable.conf for details. | args:
[2025-03-12 21:38:51] [    3.359266] systemd-udevd[409]: /usr/bin/disabled-intelpmt-by-security-misc: ALERT: This Intel Platform Monitoring Technology (PMT) Telemetry kernel module is disabled by package security-misc by default. See the configuration file /etc/modprobe.d/30_security-misc_disable.conf for details. | args:
[2025-03-12 21:38:51] [    3.359879] (udev-worker)[351]: Error running install command '/usr/bin/disabled-intelpmt-by-security-misc' for module pmt_class: retcode 1
[2025-03-12 21:38:51] [    3.360172] (udev-worker)[342]: Error running install command '/usr/bin/disabled-intelpmt-by-security-misc' for module pmt_class: retcode 1

Is exit 1 in /usr/bin/disabled-firewire-by-security-misc actually needed? The man page of modprobe.conf says the install command is run instead of loading the module, so it looks like exit 1 isn’t needed to prevent loading. Or is the intention to have modprobe exit with 1 too? In that case, something needs to be done about the udev message: at the very least it should be ignored in systemcheck, but it would be better to avoid the message somehow.


Created

for it.

Will disable.

Will investigate.

All fixed. Uploaded to testers repository just now.

Now the journal check is clean.

But I’ve got this:

[WARNING] [systemcheck] Check Logs Result: /home/user/.msgcollector/msgdispatcher-error.log exists. Please consider reporting any bugs inside this log.

I assume this causes the non-zero exit code, right? Unfortunately, I don’t have that msgcollector log preserved in the test. I’ll check it manually on the next failure.

root@dom0:~# qvm-run -p sys-whonix cat /home/user/.msgcollector/msgdispatcher-error.log
msgdispatcher: BASH_COMMAND: wait "$inotifywait_pid" | exit_code: