Srikanth Sastry

Encoded Guardrails

🌿 Budding

Encoded guardrails are checks built into the software lifecycle that live in the same codebase the agent is already changing, so the agent can modify them in situ. Linters, static analysis, unit tests, integration tests, and regression tests all belong to this class. The agent responds to them because violations produce errors that block progress, and errors are the contextual feedback the suggestible actor is most susceptible to. But the agent can also satisfy them trivially: delete a failing test, drop a precondition check, or suppress a linter warning. The error is gone. The vulnerability is not.
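To make the failure mode concrete, here is a minimal Python sketch of one of the trivial fixes named above, dropping a precondition check. Everything in it is a hypothetical illustration rather than code from a real project: the `withdraw` function, its patched variant, and the `guardrail_fired` helper are assumed names chosen for the example.

```python
def withdraw(balance: int, amount: int) -> int:
    """Hypothetical function guarded by an encoded precondition."""
    if amount > balance:
        # Encoded guardrail: the violation surfaces as a blocking error.
        raise ValueError("insufficient funds")
    return balance - amount


def withdraw_patched(balance: int, amount: int) -> int:
    """The same function after a trivial 'fix': the check is simply dropped."""
    return balance - amount


def guardrail_fired(fn, balance: int, amount: int) -> bool:
    """Return True if calling fn raises the guardrail's ValueError."""
    try:
        fn(balance, amount)
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    # The original blocks the overdraft with an error.
    print(guardrail_fired(withdraw, 100, 150))          # True
    # The patched version produces no error at all...
    print(guardrail_fired(withdraw_patched, 100, 150))  # False
    # ...and silently returns a negative balance.
    print(withdraw_patched(100, 150))                   # -50
```

In the patched version nothing blocks progress any more, yet the overdraft the check existed to prevent now goes through. Deleting the failing test or suppressing the linter warning would have the same shape: the signal disappears while the underlying condition stays.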