Valid and Reliable AI
- Jurisdiction
- US-Federal
- Issuer
- NIST
Valid and reliable AI is a foundational characteristic of trustworthy AI systems. Validation confirms, through objective evidence, that an AI system fulfills the requirements for its specific intended use, while reliability is the system's ability to perform as required, without failure, for a given time interval under given conditions.
Key Components:
Accuracy: Closeness of AI system results to true or accepted values. Accuracy measurements should:
- Consider computational metrics (e.g., false positive and false negative rates)
- Account for human-AI teaming scenarios
- Demonstrate external validity, i.e., that results generalize beyond the training conditions
- Include disaggregated results for different data segments
- Be paired with clearly defined, representative test sets
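As an illustration of the accuracy points above, the following minimal sketch computes false positive and false negative rates separately for each data segment of a labeled test set. The function name `error_rates`, the segment names, and the sample records are hypothetical and chosen only for this example; a real evaluation would use a representative test set and metrics suited to the system's context.

```python
from collections import defaultdict

def error_rates(records):
    """Return {segment: {"fpr", "fnr", "n"}} for binary labels (0 or 1).

    Each record is a tuple (segment, y_true, y_pred).
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "tp": 0, "tn": 0})
    for segment, y_true, y_pred in records:
        if y_true == 1:
            key = "tp" if y_pred == 1 else "fn"
        else:
            key = "fp" if y_pred == 1 else "tn"
        counts[segment][key] += 1

    report = {}
    for segment, c in counts.items():
        negatives = c["fp"] + c["tn"]
        positives = c["fn"] + c["tp"]
        report[segment] = {
            # A rate is None when the segment has no examples of that class.
            "fpr": c["fp"] / negatives if negatives else None,
            "fnr": c["fn"] / positives if positives else None,
            "n": negatives + positives,
        }
    return report

# Hypothetical test set: (segment, true label, predicted label)
test_set = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 0, 1), ("group_a", 1, 1),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 1), ("group_b", 0, 0),
]
report = error_rates(test_set)
for segment, metrics in sorted(report.items()):
    print(segment, metrics)
```

Reporting the rates per segment rather than as a single aggregate makes it visible when a system that looks accurate overall performs noticeably worse for one part of the data.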
Robustness/Generalizability: The ability to maintain performance across various circumstances, including unanticipated uses. Robust systems should:
- Function appropriately across a broad set of conditions and circumstances
- Minimize potential harms when operating in unexpected settings
- Maintain performance as conditions change over time
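One way to probe the robustness properties above is to re-score a model on deliberately shifted inputs and compare against its baseline accuracy. The sketch below assumes a toy threshold classifier and two hypothetical perturbations (a sensor offset and a gain change); `robustness_report`, `threshold_model`, and the data are illustrative names and values, not part of any standard.

```python
def accuracy(model, inputs, labels):
    """Fraction of inputs for which the model's prediction matches the label."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(labels)

def robustness_report(model, inputs, labels, perturbations):
    """Compare baseline accuracy with accuracy under each named perturbation."""
    baseline = accuracy(model, inputs, labels)
    report = {"baseline": baseline, "conditions": {}}
    for name, perturb in perturbations.items():
        shifted = [perturb(x) for x in inputs]
        acc = accuracy(model, shifted, labels)
        report["conditions"][name] = {"accuracy": acc, "drop": baseline - acc}
    return report

# Hypothetical model: classify a scalar reading as 1 when it reaches 0.5.
def threshold_model(x):
    return 1 if x >= 0.5 else 0

inputs = [0.1, 0.3, 0.45, 0.55, 0.7, 0.9]
labels = [0, 0, 0, 1, 1, 1]

# Hypothetical operating conditions the system was not tuned for.
perturbations = {
    "sensor_offset_plus_0.1": lambda x: x + 0.1,
    "sensor_gain_times_0.8": lambda x: x * 0.8,
}

robustness = robustness_report(threshold_model, inputs, labels, perturbations)
for name, result in robustness["conditions"].items():
    print(name, result)
```

In this toy case each perturbation flips one prediction near the decision boundary, which is the kind of degradation such a check is meant to surface before deployment.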
Ongoing Assessment: Validity and reliability for deployed systems require:
- Continuous testing and monitoring
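- Confirmation that the system is performing as intended
- Prioritization based on potential harm, since certain types of failures cause greater harm than others
- Human intervention capabilities for cases where the system cannot detect or correct its own errors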
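The assessment requirements above can be sketched as a small monitoring routine: flag checks whose observed error rate exceeds a tolerance, order the failures by potential harm, and route low-confidence predictions to a person. Everything here is an assumption made for illustration, including the `Check` fields, the 1 to 5 harm scale, the tolerances, and the 0.8 confidence threshold.

```python
from dataclasses import dataclass

@dataclass
class Check:
    name: str
    observed_error_rate: float
    max_error_rate: float  # hypothetical tolerance set by the system owner
    harm_weight: int       # hypothetical scale: 1 (low harm) to 5 (severe harm)

def triage(checks):
    """Return failing checks, most harmful first (ties broken by name)."""
    failing = [c for c in checks if c.observed_error_rate > c.max_error_rate]
    return sorted(failing, key=lambda c: (-c.harm_weight, c.name))

def needs_human_review(confidence, threshold=0.8):
    """Escalate a prediction to a person when model confidence is low."""
    return confidence < threshold

# Hypothetical monitoring snapshot for a deployed system.
checks = [
    Check("false_negative_rate", 0.08, 0.05, harm_weight=5),
    Check("latency_timeouts", 0.02, 0.01, harm_weight=2),
    Check("false_positive_rate", 0.03, 0.10, harm_weight=3),
]

for check in triage(checks):
    print(f"investigate {check.name} (harm weight {check.harm_weight})")

print("escalate to human:", needs_human_review(0.62))
```

Sorting by harm weight rather than by the size of the deviation reflects the idea that a small breach in a high-harm failure mode can deserve attention before a larger breach in a low-harm one.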
Challenges: AI systems may be trained on data that changes over time, sometimes significantly and unexpectedly, affecting functionality and trustworthiness in ways that are difficult to understand. Accuracy and robustness can also be in tension with one another, so balancing them depends on the system's context and risk tolerance.
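Changes in data over time can be made measurable by comparing the distribution of a feature at training time with its distribution in production. The sketch below uses the Population Stability Index (PSI), a common drift statistic that is not prescribed by the source; the bin count, the smoothing constant `eps`, and the sample data are assumptions for this example.

```python
import math

def psi(expected, actual, bins=5, eps=1e-6):
    """Population Stability Index between a reference and a live sample.

    Bins are equal-width over the range of `expected`; live values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical feature values: training data spans 0.00 to 0.99,
# while live data has shifted upward to 0.50 to 0.995.
training = [i / 100 for i in range(100)]
live = [0.5 + i / 200 for i in range(100)]

print("PSI, training vs itself:", psi(training, training))
print("PSI, training vs live:  ", psi(training, live))
```

A PSI of zero means the binned distributions match, and larger values indicate a larger shift; the level at which to alert or retrain is a judgment call that depends on the system's context and risk tolerance.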
Valid and reliable AI forms the foundation for the other trustworthy AI characteristics: without basic validity and reliability, characteristics such as fairness or safety cannot be meaningfully achieved.