PII redaction
The removal of personally identifiable information from prompts, logs and stored records - ideally before the data ever reaches the model.
PII redaction - and in healthcare, PHI redaction - is the practice of stripping sensitive personal fields before they reach the model or the logs. In regulated AI, it is often the control that makes the compliance case for deployment at all.
Pre-inference, not post
The difference that matters: redact before the prompt hits the model, not after. Once the raw field has been sent, the redaction debate is already lost. Pre-inference redaction means the model cannot leak a field it never saw: the raw value is replaced with a token before the request leaves the trust boundary, the reversal key stays inside a policy-bounded context, and the audit trail proves what was redacted and when.
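A minimal sketch of the pre-inference pattern, assuming email addresses as the example field and a hypothetical `redact` helper; the regex is illustrative, not a production-grade PII detector:

```python
import re
import uuid

def redact(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace email addresses with opaque tokens before the prompt is
    sent to the model; return the redacted prompt plus a reversal map
    that never leaves the caller's trust boundary."""
    reversal: dict[str, str] = {}

    def _sub(match: re.Match) -> str:
        token = f"<PII_{uuid.uuid4().hex[:8]}>"
        reversal[token] = match.group(0)
        return token

    # Illustrative email pattern only; real detectors cover many field types.
    redacted = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", _sub, prompt)
    return redacted, reversal

redacted, key = redact("Contact jane.doe@example.com about the claim.")
# The model only ever sees `redacted`; `key` stays server-side.
```

Because the substitution happens before the request is built, the model's context window never contains the raw value, which is what makes the "cannot leak a field it never saw" guarantee hold.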
What a production redaction layer usually includes
- A configurable policy - which fields are redacted, with what replacement tokens, under what conditions.
- Post-inference validation that no redacted field leaked back into the output.
- An audit record of every redaction event - field, policy, timestamp.
- A human-reviewable interface - someone in risk or compliance must be able to inspect the redaction policy without reading code.
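The pieces above can be sketched together in one small layer. This is an illustrative composition, not a reference implementation: the `RedactionRule` and `RedactionLayer` names are hypothetical, and the SSN regex stands in for whatever detectors a real policy would carry.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class RedactionRule:
    name: str         # field type, e.g. "ssn"
    pattern: str      # detection regex (illustrative only)
    replacement: str  # token written into the prompt

@dataclass
class RedactionLayer:
    rules: list[RedactionRule]
    audit_log: list[dict] = field(default_factory=list)

    def apply(self, text: str) -> str:
        """Configurable policy: each rule says what to redact and with
        what token; every hit is written to the audit log."""
        for rule in self.rules:
            text, n = re.subn(rule.pattern, rule.replacement, text)
            if n:
                self.audit_log.append({
                    "field": rule.name,
                    "policy": rule.replacement,
                    "count": n,
                    "at": datetime.now(timezone.utc).isoformat(),
                })
        return text

    def validate_output(self, original: str, model_output: str) -> bool:
        """Post-inference validation: no raw value redacted upstream may
        reappear in the model's output."""
        for rule in self.rules:
            for raw in re.findall(rule.pattern, original):
                if raw in model_output:
                    return False
        return True

layer = RedactionLayer([RedactionRule("ssn", r"\b\d{3}-\d{2}-\d{4}\b", "<SSN>")])
safe = layer.apply("Patient SSN is 123-45-6789.")
```

The declarative rule list is also what makes the human-review requirement tractable: a compliance reviewer can read the rules as data without touching the redaction code.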
Where it usually fails
The two common failure modes: regex-only redaction that misses context-sensitive identifiers (employee numbers, internal case IDs), and post-inference scrubbing, which cannot stop a model from having already used the raw field in its reasoning.
The second is solved by moving redaction upstream; the first by detection rules that look at surrounding context, not just the token's shape. Both are policy-as-code problems.
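One hedged sketch of what "context-sensitive" means in practice: a bare number is only an internal identifier when nearby text says so. The keyword list and window size here are hypothetical; a real policy would tune both.

```python
import re

# Assumed signal words; a real policy would define these per field type.
CONTEXT_KEYWORDS = ("employee", "emp id", "case")

def redact_contextual(text: str) -> str:
    """Redact a 5-8 digit number only when a keyword appears in the
    preceding 30 characters; plain numbers are left alone."""
    def _sub(match: re.Match) -> str:
        window = text[max(0, match.start() - 30):match.start()].lower()
        if any(k in window for k in CONTEXT_KEYWORDS):
            return "<INTERNAL_ID>"
        return match.group(0)
    return re.sub(r"\b\d{5,8}\b", _sub, text)
```

A regex alone cannot make this distinction, which is why these rules belong in the upstream policy layer rather than in an after-the-fact scrubber.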