Shujaat Mirza | AI Safety and Security Researcher

I want to understand how frontier AI can fail in ways we don’t expect, and to catch those failures before they happen. On Microsoft’s AI Red Team, I probe these systems for emerging risks, focused on autonomy and loss of control.

I received my PhD from the NYU Center for Cyber Security, advised by Dr. Christina Pöpper. Previously, I was a Research Manager at MATS and worked on privacy-preserving ML at Spotify Tech Research.

Interests

Emerging Risks
Autonomy and Loss of Control
Privacy Enhancing Technologies
Adversarial Machine Learning

Education

PhD in Computer Science, New York University

MPhil in Computer Science, NYU Courant Institute

BS in Computer Science, NYU Abu Dhabi

Recent Highlights

Apr 2026 "None of us is as dangerous as all of us." We explored what breaks when AI agents become each other's infrastructure. Read the blog post →

Feb 2026 Serving on the Program Committee for USENIX Security 2026.

Aug 2025 Distinguished Artifact Award at USENIX Security 2025 for reverse-engineering safety filters in DALL·E.

Jul 2025 Red-teamed OpenAI's GPT-5 Reasoning model at Microsoft, evaluating autonomy, persuasion, and deception capabilities. Read more →

Dec 2024 Elevated to IEEE Senior Member.

Aug 2024 Defended PhD dissertation: Towards Responsible AI: Safeguarding Privacy, Integrity, and Fairness. From my advisor →

Jul 2024 Runner-up, Andreas Pfitzmann Best Paper Award at PETS 2024.

Mar 2024 Publication Chair, ACNS 2024.

Research

Red-teaming a Network of Agents: Understanding What Breaks When AI Agents Interact at Scale

Microsoft Research Blog, April 2026

Exposing the Guardrails: Reverse-Engineering and Jailbreaking Safety Filters in DALL·E Text-to-Image Pipelines

USENIX Security Symposium, Seattle, US, 2025 (Distinguished Artifact Award)

Unraveling a Decade of Privacy Discourse around the World

Privacy Enhancing Technologies Symposium (PETS), Bristol, UK, 2024 (Pfitzmann Best Paper Runner-up)

How Fair are Medical Imaging Foundation Models?

ML4H @ NeurIPS, New Orleans, US, 2023 (Best Paper Award)

CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot

USENIX Security Symposium, Anaheim, US, 2023

Tactics, Threats & Targets: Modeling Disinformation and its Mitigation

NDSS, San Diego, US, 2023

Complete list on Google Scholar