Can Language Models Reason about Individualistic Human Values and Preferences?

An Empirical Investigation of Machines' Capabilities for Moral Judgment with the Delphi Experiment

As AI systems become increasingly powerful and pervasive, there are growing concerns about machines' morality or a lack thereof. Yet, teaching morality to machines is a formidable task, as morality remains among the most intensely debated questions …

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic exploration of …

DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life

HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions

WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries

While hallucinations of large language models (LLMs) prevail as a major challenge, existing evaluation benchmarks on factuality do not cover the diverse domains of knowledge that the real-world users of LLMs seek information about. To bridge this …

AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text

CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs

CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting

As the utilization of large language models (LLMs) has proliferated world-wide, it is crucial for them to have adequate knowledge and fair representation for diverse global cultures. In this work, we uncover culture perceptions of three SOTA models …

Information-Theoretic Distillation for Reference-less Summarization

The current winning recipe for automatic summarization is using proprietary large-scale language models (LLMs) such as ChatGPT as is, or imitation learning from them as teacher models. While increasingly ubiquitous dependence on such large-scale …