Offline Documentation Discussion

I hope we can use something that respects freedom and privacy more, like GitLab or similar. I wish there were no more GitHub usage.

@mig5 Thank You! This has long been a PITA for us.

Would it help to use an HTML minifier and/or run the images through a lossy-compression tool before they are incorporated?

EDIT: Is there a way to dedupe the images by having all pages point to one common copy of the asset?

If I had a way to do that I would obviously do it :slight_smile: that’s the conundrum. wget can’t fetch the assets as static assets because MediaWiki is stupid design. webpage2html is the complete opposite: it has a way to fetch those assets, but it can only load them entirely inline per URL.
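For illustration only, here is a minimal Python sketch of what deduplicating inlined images could look like as a post-processing step. It assumes the inlined assets are base64 `data:` URIs in `src` attributes (an assumption about the tool output, not something confirmed in this thread). The function name `externalize_images`, the `assets/` prefix and the hash-based file names are all hypothetical.

```python
import base64
import hashlib
import re

# Assumption: images were inlined as src="data:image/<type>;base64,<payload>".
DATA_URI_RE = re.compile(r'src="data:image/(png|jpeg|gif);base64,([A-Za-z0-9+/=]+)"')


def externalize_images(html, assets, prefix="assets/"):
    """Replace inline image data URIs with links to shared files.

    `assets` is a dict that collects file name -> raw bytes. Identical
    images hash to the same name, so they are stored only once no matter
    how many pages embed them.
    """
    def repl(match):
        ext, payload = match.group(1), match.group(2)
        data = base64.b64decode(payload)
        # Content hash as file name: same bytes, same file.
        name = hashlib.sha256(data).hexdigest()[:16] + "." + ext
        assets[name] = data
        return f'src="{prefix}{name}"'

    return DATA_URI_RE.sub(repl, html)
```

Running this over every page with the same `assets` dict and then writing the dict out once would leave a single copy of each image on disk. Whether that is workable depends on how the pages were actually inlined.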

I’ll try and experiment some more with just bash/sed/awk/cut/perl hacks, to try and ‘grab’ the <style> tag stuff and maybe load them into their own stylesheet or at least some ‘header.html’ and then see if I can somehow ‘include’ that in each html file.
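As a rough sketch of the idea above (in Python rather than bash/sed/awk, and not the script actually used in this thread): pull every `<style>` block out of a page, replace them with a single `<link>` to a shared stylesheet, and merge the collected CSS without duplicates. The names `extract_styles`, `merge_css` and `style.css` are placeholders, and a regex is only a crude stand-in for a real HTML parser.

```python
import re

STYLE_RE = re.compile(r"<style\b[^>]*>(.*?)</style>", re.IGNORECASE | re.DOTALL)
HEAD_CLOSE_RE = re.compile(r"</head>", re.IGNORECASE)


def extract_styles(html, css_href="style.css"):
    """Return (html without inline <style> blocks, list of CSS blocks)."""
    blocks = [css.strip() for css in STYLE_RE.findall(html)]
    stripped = STYLE_RE.sub("", html)
    if blocks:
        link = f'<link rel="stylesheet" href="{css_href}">'
        if HEAD_CLOSE_RE.search(stripped):
            # Put the link just before the first </head>.
            stripped = HEAD_CLOSE_RE.sub(lambda m: link + m.group(0), stripped, count=1)
        else:
            # No <head> found: fall back to prepending the link.
            stripped = link + stripped
    return stripped, blocks


def merge_css(all_blocks):
    """Join CSS blocks, keeping only the first copy of each identical block."""
    seen = []
    for block in all_blocks:
        if block not in seen:
            seen.append(block)
    return "\n".join(seen)
```

The merged output would be written once as the common stylesheet, and each stripped page saved in place of the original.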

Total bespoke job and I’m keen to try and avoid spending too much of Patrick’s money on a bandaid fix (IMO the money is better spent paying someone to move all the content out of MediaWiki entirely and into markdown)

Indeed. :slight_smile:

Size aside… I am uncomfortable creating a deb of this and installing it by default. All my packages are built from source code except for packages installed by apt-get. The HTML offline documentation would be built on the server (lower trust level than my local machine), so it could be compromised. Does that make sense? @HulaHoop

The only way to reach the same security level would be to build from markdown (which can be verified to not include any strange character sequences which could exploit vulnerabilities). Then there are also images. And some content generator to convert markdown to something else would be required. Tons of work.
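To make the "can be verified" part a bit more concrete, here is a minimal Python sketch of one possible check. It only flags control, format (for example bidirectional overrides and zero-width characters), private-use and unassigned code points. That is just one interpretation of "strange character sequences", not an established Whonix check, and the function name is hypothetical.

```python
import unicodedata

# Tabs are allowed; newlines never reach the check because we split on them.
ALLOWED_CONTROL = {"\t"}

# Cc = control, Cf = format, Co = private use, Cn = unassigned.
SUSPICIOUS_CATEGORIES = ("Cc", "Cf", "Co", "Cn")


def suspicious_characters(text):
    """Return a list of (line, column, code point) for flagged characters."""
    findings = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in ALLOWED_CONTROL:
                continue
            if unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
                findings.append((lineno, col, f"U+{ord(ch):04X}"))
    return findings
```

An empty result does not prove a file is safe. It only means none of these character classes are present.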

We can browse the Whonix documentation inside Whonix-Workstation, on the host, etc., but the question is: how can we browse the documentation inside Whonix-Gateway? For example, someone may want to copy and paste the commands (because they are too long to type), or there may be no host or workstation, just the gateway. So I was thinking: why don't we save the Whonix documentation as an offline wiki inside the gateway? It could be updated by apt-get dist-upgrade, with each new Whonix version (12, 13, 14, etc.), or manually by the reader (if they can).

I don't know if this is possible, but I would find it useful. I also don't know of any programs that do this, how to do it, or whether it is easy, so if anyone can share their knowledge I will be thankful. I would also like to hear your suggestions on how to view the documentation inside the gateway if offline documentation or an offline wiki is a bad idea.

I have found 3 methods of doing this:

1- https://www.httrack.com/

2- wget, as explained for example at this link: Downloading an Entire Web Site with wget | Linux Journal

3- as full-screen screenshots, for example by using this add-on: FireShot: Full Web Page Screenshots (♥♥♥♥♥) – Get this Extension for 🦊 Firefox (en-US)

(I will download the documentation as images so there is no need to view the website with a browser.)

The remaining question is: if I download the whole documentation and upload it to a server, etc., how can these images be included in the Whonix-Gateway .ova (I mean for the users)? I think I should get permission from Whonix, and how would it be added, etc.?

Screenshots are bad. Not text anymore. Not searchable. Not clickable.

Conditions for copying Whonix wiki are already explained in the wiki footer.

[html] Unless otherwise noted above, content of this page is copyrighted and licensed under the same Free (as in speech) license as Whonix itself. [/html]

These tools are a hack, not a clean, workable solution for a distribution. Maybe Extension:Offline - MediaWiki, but there is a lack of manpower.

Thanks for your honest opinion.

I see.

I have done some truly horrible things in my shell script that will haunt me til the end of my days (or at least until I find time to do it more elegantly, whichever comes first), but all the same, I have achieved that goal of loading in a ‘common’ style.css instead of duplicating. (Well, it doesn’t quite load yet in TB, probably a bug, but I will see if I can fix that. The important thing is I can strip the css out and put it in a separate file. It works in Chromium for me so probably a TB thing…)

This has dropped the repo from being over 900MB when unpacked, to about 168MB. I can probably even remove a whole 60MB or so by deleting/re-creating the git repo again, as the .git is mostly that big due to history regarding the previous versions of the files.

Hahaha :slight_smile:

Bravo. Impressive indeed, and although it's not as ideal/secure as the markdown solution, it delivers an offline and easy means to share the information.

Can this be further pruned by adding a switch to exclude the /Dev and /Deprecated pages?
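A switch like that could be as small as a URL filter applied before each page is fetched. The Python sketch below is only an illustration: the `/wiki/` root, the `example.org` URLs and the function name `keep_page` are assumptions, not how the existing scrape script works.

```python
from urllib.parse import urlparse


def keep_page(url, excluded=("Dev", "Deprecated"), wiki_root="/wiki/"):
    """Return False for pages whose first title segment is excluded.

    Assumes page URLs look like <site>/wiki/<Title>/<Subpage>.
    """
    path = urlparse(url).path
    if not path.startswith(wiki_root):
        # Not a wiki page (for example an asset): leave the decision to the caller.
        return True
    title = path[len(wiki_root):]
    first_segment = title.split("/", 1)[0]
    return first_segment not in excluded
```

For example, `keep_page("https://example.org/wiki/Dev/Build_Documentation")` returns `False`, while a page titled `Developer_Notes` is kept because only the exact first segment is compared.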

I would guesstimate these would have a marginal effect on size. The major source of size would be duplicated content as well as images.

Not a solution, but a summary of the closest thing we have for now:

Browsing Whonix ™ Documentation Offline

fix offline documentation - pdfbook
https://phabricator.whonix.org/T933


GitHub - WhonixBOT/whonix-wiki-html: HTML-only copy of the Whonix wiki is outdated. Script https://github.com/WhonixBOT/whonix-wiki-html/blob/master/scrape-whonix-wiki.sh is broken.

fix whonix-wiki-html backup / fix scrape-whonix-wiki.sh
https://phabricator.whonix.org/T934

This was fixed.

No idea how to fix.

A post was split to a new topic: BlackArch Offline Documentation

Hello,

Would it be possible to publish a GitHub site with all of the Whonix documentation for offline viewing? Other privacy-oriented projects like Qubes OS have all the documentation published this way.

Thank you

Welcome to Whonix forums and thank you for your question!

Yes, that would be very good to have.

The current, far from ideal state of things is summarized in this post:
Offline Documentation Discussion - #73 by Patrick

Technical challenges and limited resources prevent it from being improved. See also discussion in this forum thread. To make it better than that, someone capable needs to help.

wget can be used to download the entire site, correct?

wget is not easy. It would need to be scripted, but other tools such as HTTrack do this.
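For anyone who wants to try scripting it anyway, a minimal Python wrapper around standard wget mirroring options might look like the sketch below. The function names, the one-second wait and the destination directory are assumptions. As discussed earlier in this thread, a plain mirror may still not pick up MediaWiki's assets properly, so treat this as a starting point only.

```python
import shutil
import subprocess


def build_wget_command(start_url, dest_dir, wait_seconds=1):
    """Build a wget argument list for mirroring a site for offline reading."""
    return [
        "wget",
        "--mirror",             # recursive download with timestamping
        "--convert-links",      # rewrite links to point at the local copies
        "--adjust-extension",   # save HTML pages with an .html suffix
        "--page-requisites",    # also fetch images/CSS needed to render pages
        "--no-parent",          # do not climb above the start URL
        f"--wait={wait_seconds}",          # be polite to the server
        f"--directory-prefix={dest_dir}",  # where to store the mirror
        start_url,
    ]


def mirror(start_url, dest_dir):
    """Run wget; raises if wget is missing or exits with an error."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget is not installed")
    subprocess.run(build_wget_command(start_url, dest_dir), check=True)
```

Calling `mirror("https://example.org/wiki/", "mirror")` would start the download. The URL here is a placeholder.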