According to news.microsoft.com, generative AI’s new capabilities bring new risks, prompting Microsoft’s AI red team to change how it identifies and mitigates them.
The Origins of the Red Team
The term “red team” was coined during the Cold War, when the U.S. Department of Defense conducted simulation exercises with a red team representing the Soviet Union and a blue team representing the U.S. and its allies. The cybersecurity community adopted the language decades ago, forming red teams that act as adversaries trying to break, damage, or misuse technology. The goal is to find and fix weaknesses before they cause harm.
Microsoft AI Red Team Composition
In 2018, Ram Shankar Siva Kumar formed Microsoft’s AI Red Team, following the traditional model of bringing together cybersecurity experts to proactively probe for vulnerabilities, as the company does for all its products and services. Meanwhile, Forough Poursabzi-Sangdeh led researchers across the company in studying generative AI from a responsible AI perspective, examining whether the technology could be harmful, either intentionally or because of systemic problems in models that were overlooked during training and evaluation.
Collaboration for Comprehensive Risk Assessment
The two groups quickly realized that they would be stronger together and joined forces to create a broader red team that assesses both security risks and societal harms. The new team includes neuroscientists, linguists, national security specialists, and many other experts from diverse backgrounds.
Adapting to New Challenges
This collaboration represents a major shift in how red teams operate, incorporating a multidisciplinary approach to address the unique challenges posed by generative AI. By thinking like hackers, the team aims to identify vulnerabilities and mitigate risks before they are exploited in real-world scenarios.
This initiative is part of Microsoft’s broader effort to deploy AI responsibly, ensuring new capabilities don’t compromise safety or societal well-being.