GUYS thank you, this is fire
Thanks to @maltfield for pointing this.
Hey @nurmagoz, I just wanted to let you know that I’ve updated the article.
There were a bunch of errors in the GitHub Actions workflow (for automatically checking for unicode in new PRs). I’ve fixed them and added much better debugging and error reporting. The updated workflow can be found in the repo:
@Patrick I’m not sure if you rolled this out for the Whonix repos, but you might want to look into it.
What has to be done? Just copy/paste that file into the repository?
Possible to add this as organization wide github action?
Seems Managing rulesets for repositories in your organization - GitHub Enterprise Cloud Docs (repository rules) is only for Enterprise plan.
https://github.com/Kicksecure/org/settings/rules
Organization rulesets won’t be enforced until you upgrade this organization account to GitHub Enterprise.
Oh, it’s free. I don’t know about org-wide, but to add this workflow (which will automatically have the github bot leave a comment on all PRs, letting you know if Unicode was detected in it), you create a file in .github/workflows/
in your repo with this contents:
You might need to enable GitHub Actions (it’s free) if you didn’t already.
There’s another potentially malicious “invisible” character that is quite commonplace - trailing whitespace. In most instances this is harmless or only mildly annoying, but there are real issues it can cause.
Any language construct that requires a particular character to be the last character in a line will behave unexpectedly if trailing whitespace is involved. The easiest example I can think of here is backslash-escaping of newlines, for instance in Bash:
#!/bin/bash
echo 'No whitespace after backslash'
for i in 'abc' 'def' 'ghi' \
'jkl' 'mno'; do
echo "$i";
done
echo 'Whitespace after backslash'
# there's a space right v here
for i in 'abc' 'def' 'ghi' \
'jkl' 'mno'; do
echo "$i";
done
Executing this script results in the following:
No whitespace after backslash
abc
def
ghi
jkl
mno
Whitespace after backslash
./whitespace.sh: line 12: syntax error near unexpected token `'jkl''
./whitespace.sh: line 12: ` 'jkl' 'mno'; do'
In some contexts this would probably just result in a syntax error and be easily caught. Subtle syntax errors can come with serious consequences though - in the xz-utils attack, the threat actor inserted a single period in an inconspicuous location that resulted in Landlock sandbox support being unconditionally disabled. Trailing whitespace can sometimes do the same kind of thing.
Another way trailing whitespace can go wrong in Bash is with heredocs:
#!/bin/bash
cat <<EOF
Test one
EOF
cat <<EOF
Test two
EOF
# ^ there's a space here
This one’s particularly insidious because it causes the “comment” (and everything that comes after it, up until a true heredoc end marker is found) to become part of the heredoc itself, as can be seen when executing it:
Test one
whitespace.sh: line 9: warning: here-document at line 6 delimited by end-of-file (wanted `EOF')
Test two
EOF
# ^ there's a space here
Python also supports newline escaping, and suffers from the exact same problems as Bash in this regard:
# Good function
def sum_ints(int_a, int_b, \
int_c, int_d):
print(int_a + int_b + int_c + int_d)
# Bad function
# there's a space right v here
def sum_ints(int_a, int_b, \
int_c, int_d):
print(int_a + int_b + int_c + int_d)
# Running the bad function throws the following error:
# File "/path/to/file.py", line 1
# def sum_ints(int_a, int_b, \
# ^
#
# SyntaxError: unexpected character after line continuation character
I haven’t figured out a way to break C with this - gcc does state warning: backslash and newline separated by space
when compiling code with an escaped space where you’d expect an escaped newline, but the actual executable produced seems to work without problems. I tried backslash-escaped space tricks in both actual C code and in a macro, and the compiler outsmarted me on both attempts.
Rust on the other hand gets quite confused:
fn main() {
// space hidden v here
println!("Hello \
World!");
}
┌─╴aaron@kf-m2g5:~
└─╴$ rustc whitespace.rs
error: unknown character escape: ` `
--> whitespace.rs:3:20
|
3 | println!("Hello \
| ^ unknown character escape
|
= help: for more information, visit <https://doc.rust-lang.org/reference/tokens.html#literals>
help: if you meant to write a literal backslash (perhaps escaping in a regular expression), consider a raw string literal
|
3 ~ println!(r"Hello \
4 ~ World!");
|
error: aborting due to previous error
(And yes, the version of this without the offending space compiles without problems and the produced executable prints Hello World!
)
There’s probably other languages that handle this scenario poorly, but I think this proves my point well enough. Whether this can be easily used maliciously in a real-world context will depend on the situation, but in build systems that compile bits of code to check for certain features or bugs, this could potentially be used to cause trouble.
dm-check-unicode
(to check derivative-maker, Kicksecure, Whonix sourcecode) has been improved:
- now also detects ASCII control characters
- detects trailing spaces
- detects missing newlines at the end of the file
- verifies all git symlinks