Alright, I tried it (in my free time, not during my normal contracted work). It seems to border on usable, even in a virtual machine, if you have a really fast CPU. My original message:
I went ahead and tried OLMo 3 to see if it could work as a stylometry cloak mechanism. Giving a Kicksecure 18 VM all 32 cores of an i9-14900HX system, the model runs pretty slowly under Ollama, seconds per token rather than tokens per second. This is similar to the results Patrick found, though I'm running CPU-only inference and assume my NVIDIA GPU would speed things up. Is it practically usable? Kinda, I guess. If you're patient, have fairly fast hardware, and are hyper-paranoid, it might be worth it. Of course, this raises the question of how much of one's style can leak through AI "sanitization".
(Note that I actually got better than seconds-per-token performance when using all 32 cores. 16 cores was awful; 32 cores was somewhat livable, especially after making sure my CPU was in a very high-performance mode. However, this prompt is the one I used to find out whether 32 cores was faster than 16, so I wrote the message as if I had already tested it. Apologies for the inaccuracy. Also, I'm referencing Patrick's results here. Lastly, yes, my original wording about CPU- vs. GPU-based operation was sloppy, and the AI made it worse. :P)
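(As an aside, if anyone wants to measure the speed more precisely than "seconds per token vs. tokens per second": Ollama's /api/generate JSON responses include eval_count (tokens generated) and eval_duration (nanoseconds spent generating), which give an exact rate. A minimal Python sketch; the numbers in the example are made up:)

```python
# Sketch: compute generation speed from an Ollama /api/generate response.
# eval_count and eval_duration are documented fields of the Ollama API;
# eval_duration is reported in nanoseconds.

def tokens_per_second(response: dict) -> float:
    """Generation speed in tokens per second from an Ollama response."""
    return response["eval_count"] / response["eval_duration"] * 1e9

# Made-up example: 120 tokens generated over 60 seconds -> 2.0 tok/s.
print(tokens_per_second({"eval_count": 120, "eval_duration": 60 * 10**9}))
```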
The prompt I used to “sanitize” the text was:
Please rewrite the following text, keeping all meaningful comment but obscuring the writing style: …
(Yes, I wrote “comment” rather than “content”. Whoops.)
After about… oh, maybe 10-15 minutes of processing and a bunch of "thinking" output, this is what it gave me:
I assessed OLMo 3’s suitability as a stylometry obfuscation tool by assigning all 32 cores of an i9-14900HX processor to a Kicksecure 18-vm environment. The model demonstrated suboptimal performance, producing outputs at a tokens-per-second rate inversely proportional to desired efficiency (i.e. seconds per token), consistent with Patrick’s findings. His results may reflect GPU-accelerated implementations, which my CPU-only setup did not employ. While not entirely impractical, its utility depends on factors like patience, robust hardware availability, and a security-first mindset. This experiment highlights unresolved questions about the efficacy of AI-driven content filtering in obscuring authorial stylistic patterns.
That’s reasonably close to what I said, so I’d say that’s a minor success.
(Edit: For clarification, I’m using the 7B parameter variant of OLMo 3 here.)