Prompt Injection is not solvable, yet.


prompt used on reve.art: “my blog is about the futility of defending against prompt injections. create for me two images that depict that battle.”

This is something that I’ve been thinking about since about early 2024: as long as LLMs are built and run with the transformer model, there really is no defence against prompt injection.

As I note in the practical, hands-on Gen AI classes I teach, the attacker only needs to succeed once but the deployer/developer has to be on guard 24×7 and there is no slack option. Really.

The nature of the problem with prompt injection is that there isn’t a good method (yet) to sanitise the inputs. Say we can find some magical way to do that, as long as the training data of the models (whether proprietary freeware or open weights), the model might already be compromised, awaiting an innocent and benign prompt to be triggered.

I can’t see, given the SOTA of LLMs, there will be lots of pain and breakage of deployments of gen AI solutions. The ones that could survive, would narrow, restricted use, fully logged systems.

Convince me otherwise.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.