AI Red-Teaming

AI red-teaming is defined in executive-order-14110 as "a structured testing effort to find flaws and vulnerabilities in an AI system, often in a controlled environment and in collaboration with developers of AI."

AI red-teaming is most often performed by dedicated "red teams" that adopt adversarial methods to identify:

Flaws and vulnerabilities
Harmful or discriminatory outputs from an AI system
Unforeseen or undesirable system behaviors
Limitations
Potential risks associated with the misuse of the system

Under executive-order-14110, companies developing dual-use foundation models must report results of AI red-teaming tests to the federal government, including testing related to:

Lowering barriers to biological weapons development
Software vulnerability discovery and exploit development
Use of software or tools to influence real or virtual events
Potential for self-replication or propagation

nist is directed to develop guidelines for AI red-teaming procedures and processes.

AI Red-Teaming

Backlinks (3)