Sandbox Testing Overview
GOVERN Sandbox (Surface 6) provides isolated, ephemeral environments for testing AI governance policies, running red team exercises, and benchmarking governance effectiveness before production deployment.
What Is the Sandbox?
The sandbox is a fully isolated copy of the GOVERN platform that:
- Shares no data with production (separate database, separate AI system connections)
- Resets on demand: ephemeral environments can be wiped and recreated in minutes
- Accepts adversarial inputs without risk to production systems
- Records everything for post-exercise analysis
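
The isolation property in the first bullet can be checked mechanically before an exercise starts. The following is a minimal sketch, assuming each environment can be described as a mapping from a resource kind to a set of identifiers; the function name, the resource kinds, and the identifiers are hypothetical and are not part of the GOVERN API.

```python
from typing import Dict, Set


def shared_resources(
    sandbox: Dict[str, Set[str]],
    production: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return, per resource kind, the identifiers present in both environments.

    An empty result means the two descriptions share nothing.
    Hypothetical helper for illustration only.
    """
    overlap: Dict[str, Set[str]] = {}
    # Only resource kinds declared on both sides can overlap.
    for kind in sandbox.keys() & production.keys():
        common = sandbox[kind] & production[kind]
        if common:
            overlap[kind] = common
    return overlap


# Example descriptions (all identifiers are made up).
sandbox_env = {"databases": {"db-sandbox"}, "ai_connections": {"mock-llm"}}
production_env = {"databases": {"db-prod"}, "ai_connections": {"llm-live"}}

if shared_resources(sandbox_env, production_env):
    raise RuntimeError("sandbox is not isolated from production")
```

A check like this only compares declared identifiers; it does not prove network-level isolation.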
Use Cases
| Use Case | Who Uses It | Frequency |
|---|---|---|
| Policy validation | Policy authors | Before every policy change |
| Red team exercises | Security researchers | Quarterly or on demand |
| Governance benchmarking | Platform engineers | Before major releases |
| Pre-production testing | DevOps, QA | Every release candidate |
| Training scenarios | New SOC analysts | At onboarding and annually |
| Vendor evaluation | Procurement teams | When evaluating new AI systems |
Environment Types
Ephemeral Sandbox
A short-lived isolated environment (default lifetime: 4 hours; maximum: 24 hours). It is created in under 2 minutes and destroyed automatically when the session expires.
Best for: One-off tests, quick policy validation, ad hoc exploration.
Persistent Sandbox
A long-lived environment (up to 90 days) that retains its state between sessions. Useful for extended red team exercises or benchmarking campaigns.
Best for: Multi-day exercises, longitudinal benchmarks, training curricula.
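
The lifetime limits for the two environment types can be captured in a small validation helper. This is a sketch under stated assumptions: the function and constant names are hypothetical, the choice to reject out-of-range requests instead of clamping them is an assumption, and the text gives no default lifetime for persistent sandboxes, so the sketch assumes the 90-day maximum.

```python
from datetime import timedelta
from typing import Optional

# Limits taken from the environment descriptions above.
EPHEMERAL_DEFAULT = timedelta(hours=4)
EPHEMERAL_MAX = timedelta(hours=24)
PERSISTENT_MAX = timedelta(days=90)


def resolve_lifetime(env_type: str, requested: Optional[timedelta] = None) -> timedelta:
    """Return the lifetime to apply, or raise ValueError if it is out of range.

    Hypothetical helper; not part of the GOVERN API.
    """
    if env_type == "ephemeral":
        default, maximum = EPHEMERAL_DEFAULT, EPHEMERAL_MAX
    elif env_type == "persistent":
        # Assumption: no default is documented, so fall back to the maximum.
        default, maximum = PERSISTENT_MAX, PERSISTENT_MAX
    else:
        raise ValueError(f"unknown environment type: {env_type!r}")

    lifetime = default if requested is None else requested
    if lifetime <= timedelta(0) or lifetime > maximum:
        raise ValueError(f"{env_type} lifetime must be positive and at most {maximum}")
    return lifetime
```

For example, `resolve_lifetime("ephemeral")` yields the 4-hour default, while `resolve_lifetime("ephemeral", timedelta(hours=25))` raises `ValueError`.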
Pre-loaded Scenarios
GOVERN Sandbox ships with pre-loaded test scenarios covering common governance challenges:
| Scenario | Type | Description |
|---|---|---|
| Basic PII leakage | Policy validation | AI output contains an SSN or email address |
| Prompt injection | Red team | Classic injection attempts |
| Bias in hiring recommendations | Bias testing | Protected class disparate impact |
| Medical advice without disclaimer | Safety | Unauthorized medical guidance |
| GDPR right-to-erasure | Compliance | Data subject request handling |
| Model drift over time | Drift | Gradual score degradation |
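
To make the "Basic PII leakage" scenario concrete, the sketch below shows the kind of check a test policy might apply to AI output. It is an illustration only: the function name and the regular expressions are assumptions, the patterns are deliberately simplified (US-style `NNN-NN-NNNN` SSNs and common email shapes), and nothing here reflects how GOVERN actually detects PII.

```python
import re
from typing import Dict, List

# Simplified patterns for illustration; real detectors need broader coverage.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_PATTERN = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def find_pii(text: str) -> Dict[str, List[str]]:
    """Return SSN-like and email-like substrings found in the text."""
    return {
        "ssn": SSN_PATTERN.findall(text),
        "email": EMAIL_PATTERN.findall(text),
    }


def violates_pii_policy(text: str) -> bool:
    """True if the text contains at least one SSN-like or email-like match."""
    return any(find_pii(text).values())
```

Running `find_pii` on sandbox output before and after a policy change gives a quick signal of whether the change closed the leak.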
Sandbox vs. Production
| Feature | Sandbox | Production |
|---|---|---|
| Data isolation | Complete | N/A |
| Adversarial testing allowed | Yes | No |
| SOC alerts generated | Sandbox SOC only | Production SOC |
| Audit trail | Full (for analysis) | Compliance-grade |
| Auto-reset capability | Yes | No |
| GOVERN policies | Test policies | Live policies |
| AI system connections | Mock, or real but isolated | Live |
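
Two rows of the table above (adversarial testing and auto-reset) translate naturally into guard checks in test tooling. The sketch below assumes a hypothetical `Environment` record and hypothetical function names; it illustrates the rule that adversarial inputs are accepted only in the sandbox and is not the GOVERN client API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Environment:
    """Hypothetical record mirroring two rows of the comparison table."""
    name: str
    adversarial_testing_allowed: bool
    auto_reset: bool


SANDBOX = Environment("sandbox", adversarial_testing_allowed=True, auto_reset=True)
PRODUCTION = Environment("production", adversarial_testing_allowed=False, auto_reset=False)


def submit_adversarial_input(env: Environment, payload: str) -> Dict[str, object]:
    """Accept the payload only where adversarial testing is allowed."""
    if not env.adversarial_testing_allowed:
        raise PermissionError(f"adversarial testing is not allowed in {env.name}")
    # A real tool would forward the payload; this sketch only acknowledges it.
    return {"environment": env.name, "accepted": True, "length": len(payload)}
```

Failing closed with an exception keeps a misconfigured red team script from reaching production by accident.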