Exif data view and remove

torbrowser · May 30, 2018, 1:26pm

I want to make sure images I prepare or edit do not contain any EXIF data. Is there a tool included in Whonix for that? If not, is there anything you would recommend?

tempest · May 30, 2018, 1:57pm

mat is still a tool used by many for this and is included with whonix. however, it may not be up to the task any more. the developer announced they were not giving it priority for awhile due to health issues and the last update was 3 january, 2016.

https://mat.boum.org

Current status
The MAT maintenance and development is currently on hold, mostly for health reasons. I might go back to it at some point in the future.

exiftool can also do this to a degree and will also be installed in the whonix workstation. see: Remove EXIF Metadata from Photos with exiftool » Linux Magazine

however, best bet is to set whatever software you are using to create the images to not save any metadata at all if possible, or as little as possible.
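As a rough sketch of the exiftool route mentioned above: the function below prints whatever metadata exiftool finds in an image, strips every writable tag, and prints what is left. The name `scrub_image` is made up for illustration. `-all=` and `-overwrite_original` are standard exiftool options; the second one matters because exiftool otherwise keeps an untouched copy named `<file>_original` next to the cleaned file.

```shell
#!/bin/bash
# scrub_image: hypothetical helper, not part of whonix or exiftool.
scrub_image() {
    if [ $# -ne 1 ]; then
        echo "usage: scrub_image <image-file>" >&2
        return 2
    fi
    if [ ! -f "$1" ]; then
        echo "scrub_image: no such file: $1" >&2
        return 1
    fi
    echo "--- before ---"
    exiftool "$1"
    # -all= deletes every writable tag; -overwrite_original stops exiftool
    # from leaving an unmodified "<file>_original" backup beside the result.
    exiftool -all= -overwrite_original "$1" || return 1
    echo "--- after ---"
    exiftool "$1"
}
```

Call it as `scrub_image photo.jpg`. File-level properties such as size and type will still be listed afterward, since exiftool reads those from the file itself rather than from embedded tags.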

torbrowser · May 30, 2018, 2:50pm

Right. I will use GIMP so I think everything should actually appear in File - properties.

I might be especially dumb today, I don’t see any printer configuration option in KDE’s System Settings.
I use a USB printer, it appears in the “Devices -> USB” menu of VirtualBox, checked. It works on the host. When I try to print from kwrite for example, I only have option to print to a pdf.

Patrick · May 30, 2018, 3:01pm

Metadata - Whonix

torbrowser · May 30, 2018, 3:09pm

Cool, the CLI of MAT is quick and easy. Still struggling with setting a printer.

HulaHoop · June 2, 2018, 6:02pm

EXIF is a small part of the story. Your camera sensors leave a detectable fingerprint that can be traced back to you. MAT and EXIF removal won’t save you. You need a new offline burner phone camera for this purpose.

Patrick · June 3, 2018, 5:18am

Surfing Posting Blogging - Whonix

torbrowser · June 4, 2018, 4:21am

Thanks for the information. In the case I’m dealing with I don’t share photos taken by myself. It’s actually scanned documents I receive (I don’t have control over the scanning device either). I need to redact some information using editing tools, clean them of any identifiable data, then share.

tempest · June 5, 2018, 1:45pm

question whether or not you actually need to share such documents rather than simply sharing the information you received by them. something as innocuous as the placement of a comma or 2 could correlate to a person who accessed and provided the documents. additionally, depending on where the document came from, access to such documents may be logged and closely guarded, which may narrow the pool of suspected leakers. imho, people share documentary proof from sources too easily when it’s not even needed, which leads to situations like reality winner’s. meanwhile, in that case, the publication of the actual documents didn’t actually add anything to the story, but has landed a source in confinement who may face additional years in prison.

HulaHoop · June 6, 2018, 3:18am

Scanners have the same pitfalls. For best results you want to run the docs thru an OCR and publish the (edited/redacted) text output as plain text.
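A minimal sketch of the OCR step, assuming the tesseract command-line program is installed. The thread does not name an OCR tool, so tesseract is only an example, and `ocr_to_text` is a made-up helper name. The output is plain text that still has to be read through and redacted by hand.

```shell
#!/bin/bash
# ocr_to_text: hypothetical helper. tesseract is an assumption; the thread
# does not name an OCR program.
ocr_to_text() {
    if [ $# -ne 2 ]; then
        echo "usage: ocr_to_text <scan-image> <output-base>" >&2
        return 2
    fi
    if [ ! -f "$1" ]; then
        echo "ocr_to_text: no such file: $1" >&2
        return 1
    fi
    # tesseract appends ".txt" to the output base name itself.
    tesseract "$1" "$2" || return 1
    echo "text written to $2.txt; review and redact it before sharing"
}
```

For example, `ocr_to_text scan.png redacted` would produce `redacted.txt`. OCR output often contains recognition errors, so it is worth comparing against the scan before publishing.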

tempest · June 6, 2018, 4:34am

ocr can still have the issue of exposing a leaker if a host was paranoid enough to offer different text versions of docs that correlate to id. i’ve honestly viewed the sharing of leaked docs as more of a journalistic fetish than something that is required. in the past, committed journalists could confirm what was leaked to them in docs by dogged questioning of other sources and mouthpieces. on top of which, i’ve found the lack of ethics around the publication of leaks especially troubling. given that publications have ridden this rodeo before, i don’t see why there isn’t some agreed upon version of a type of release in which a journalist informs a source of the potential risks of their employer publishing the leaked material, and then asks for permission to publish the leaked physical material. frankly, a bit amazed greenwald’s publication didn’t have such foresight, particularly given how they’ve claimed they can’t release all of snowden’s material based on such an agreement.

torbrowser · June 6, 2018, 6:54am

I am also concerned about PDF metadata. MAT does not support PDFs, and when I use exiftool:

exiftool -all= filename.pdf

It cleans the data, but warns:

Warning: [minor] ExifTool PDF edits are reversible. Deleted tags may be recovered!

And indeed, running

exiftool -PDF-update:all= filename.pdf

recovers the metadata.

As stated in http://owl.phy.queensu.ca/~phil/exiftool/TagNames/PDF.html

All metadata edits are reversible. While this would normally be considered an advantage, it is a potential security problem because old information is never actually deleted from the file. (However, after running ExifTool the old information may be removed permanently using the “qpdf” utility with this command: “qpdf --linearize in.pdf out.pdf”.)

This looks good. After using qpdf I don’t manage to recover the data.

torbrowser · June 6, 2018, 6:56am

I must have a PDF as output, with the original image included (after redaction).

tempest · June 6, 2018, 7:33am

yes. there are some added steps for removing metadata from a pdf with exiftool. it’s not simply a cli command. a config file needs to go with it. and then qpdf, iirc.

however, here’s another trick. again, the main concern should be not creating the problematic metadata. create a vm that is torified. scrub your images (if you need any) before adding to a pdf format in the vm. create pdf and remove metadata from there. left over may be trivial and torified. but, if metadata is in image itself, and it’s traceable, game over.

update.
install programs with following command:

sudo apt-get install pdftk qpdf

here’s a script that can work to a degree. paste the following into an executable bash script.

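cp "$1" temp.delete.pdf
cp "$1" backup.pdf # keep the original in case something goes wrong
pdftk "$1" dump_data > output.txt # dump the metadata fields to a text file
nano output.txt # blank out the metadata fields by hand, then save
pdftk "$1" update_info output.txt output temp.delete.pdf
exiftool -all:all= temp.delete.pdf
qpdf --linearize temp.delete.pdf temp.delete2.pdf # rewrite the file so the exiftool edits cannot be undone
mv temp.delete2.pdf "$1"
rm -f output.txt temp.delete.pdf temp.delete.pdf_original # the last one is exiftool's automatic backup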

the script will take an original pdf file and overwrite it with a new stripped pdf file. execute it with the name of your bash script and the pdf you want to work on. when nano opens, blank out all the metadata fields and save. when you are done, run a test for metadata with exiftool and mat. foolproof? no. but worked in the past. if it goes wrong, original pdf is saved as backup.pdf.
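If editing in nano gets tedious, the blanking step could in principle be done by a filter. The sketch below assumes the dump_data text lists the Info dictionary as `InfoKey:` / `InfoValue:` line pairs, and simply empties every `InfoValue` line. The name `blank_info_values` is made up for illustration, and how pdftk treats an empty value on `update_info` is not confirmed here, so the result should still be checked with exiftool afterward.

```shell
#!/bin/bash
# blank_info_values: hypothetical filter. Reads "pdftk ... dump_data" text on
# stdin and empties every InfoValue line, leaving all other lines untouched.
blank_info_values() {
    sed 's/^InfoValue:.*$/InfoValue:/'
}
```

It would slot in as `pdftk file.pdf dump_data | blank_info_values > output.txt`, in place of the dump and nano lines.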

torbrowser · June 6, 2018, 7:38am

For what it’s worth:

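#!/bin/bash
# Strip metadata from a PDF, then linearize so the removal cannot be undone.

if [ $# -eq 0 ]
then
echo "No arguments supplied"
else

exiftool -all= "$1" # Clean all metadata (reversible until linearized; also leaves a "$1_original" backup)
qpdf --linearize "$1" "$1.tmp" && mv "$1.tmp" "$1" # Linearize via a temp file instead of overwriting qpdf's input in place
exiftool -PDF-update:all= "$1" # Test: try to recover metadata

echo ""
echo "Result:"
echo "======="
set -x
exiftool "$1" # Verify final state

fi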
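Reading the final exiftool report by eye works, but the check can also be roughly automated. The sketch below reads exiftool's "Tag Name : value" report on stdin and fails if any of a few commonly identifying tags is still listed. `check_clean` is a made-up name and the tag list is only an illustrative assumption; a clean result here does not prove the file is free of identifying data.

```shell
#!/bin/bash
# check_clean: hypothetical helper. Reads exiftool's "Tag Name : value" report
# on stdin and fails if any commonly identifying tag is still present.
# The tag list is an illustrative assumption, not an exhaustive one.
check_clean() {
    local pattern='^(Author|Creator|Producer|Title|Subject|Keywords|Create Date|Modify Date|XMP Toolkit) *:'
    local hits
    # "|| true" keeps a no-match grep from aborting scripts run with set -e.
    hits=$(grep -E "$pattern" || true)
    if [ -n "$hits" ]; then
        echo "metadata still present:"
        echo "$hits"
        return 1
    fi
    echo "no listed tags found"
}
```

Usage would be along the lines of `exiftool filename.pdf | check_clean`.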