Seungju Han

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms
