Deterministic AI Safety Governor for iOS — Seeking Feedback on App Review Approach

I've built an iOS app with a novel approach to AI safety: a deterministic, pre-inference validation layer called Newton Engine.

Instead of relying on the LLM to self-moderate, Newton validates every prompt BEFORE it reaches the model. It uses shape theory and semantic analysis to detect:

• Corrosive frames (self-harm language patterns) • Logical contradictions (requests that undermine themselves) • Delegation attempts (asking AI to make human decisions) • Jailbreak patterns (prompt injection, role-play escapes) • Hallucination triggers (requests for fabricated citations)

The system achieves a 96% adversarial catch rate across 847 test cases, with zero false positives on benign prompts.

Key technical details: • Pure Swift/SwiftUI, no external dependencies • Runs entirely on-device (no server calls for validation) • Deterministic (same input always produces same output) • Auditable (full trace logging for every validation)

I'm preparing to submit to the App Store and wanted to ask:

  1. Are there specific App Review guidelines I should reference for AI safety claims?
  2. Is there interest from Apple in deterministic governance layers for Apple Intelligence integration?
  3. Any recommendations for demonstrating safety compliance during review?

The app is called Ada, and the engine is open source at: github.com/jaredlewiswechs/ada-newton

Happy to share technical documentation or discuss the architecture with anyone interested.

See: parcri.net

Hi there!

Sorry for the long delay, at Apple we were on winter break. :)

Are there specific App Review guidelines I should reference for AI safety claims? Any recommendations for demonstrating safety compliance during review?

For these two questions, there are no one specific App Review guidelines for AI-safety, since the guidelines are meant to be broad enough to cover a wide range of technologies. Be sure to be familiar with Section 1. Safety and Section 2. Legal, since those topics tend to have the most overlap with AI safety.

Currently App review does fairly minimal testing of your model feature, so you don't need to provide any specific documentation, but the onus is on you as a developer to do due diligence in safety testing your feature.

Here are some resources that can help:

Is there interest from Apple in deterministic governance layers for Apple Intelligence integration?

Finally, when it comes to Apple's own approach to AI safety, you'll find a combination of deterministic and model-based mitigation strategies occur 1) before input reaches a model, 2) in the model safety alignment itself, and 3) on the model output before the output is given to a user. Here are some references to learn more:

Hope that helps!

Deterministic AI Safety Governor for iOS — Seeking Feedback on App Review Approach
 
 
Q