Tutorials

Tutorials Schedule

July 27, 2025
Time	Location	Session Title
09:00 - 12:30	Hall B	T1: Inverse Reinforcement Learning Meets Large Language Model Alignment
09:00 - 12:30	Hall M	T2: Eyetracking and NLP
09:00 - 12:30	Hall C	T3: Uncertainty Quantification for Large Language Models
09:00 - 12:30	Hall N	T4: Human-AI Collaboration: How AIs Augment Human Teammates
14:00 - 17:30	Hall M	T5: Navigating Ethical Challenges in NLP: Hands-on strategies for students and researchers
14:00 - 17:30	Hall N	T6: NLP for Counterspeech against Hate and Misinformation
14:00 - 17:30	Hall B	T7: Synthetic Data in the Era of Large Language Models
14:00 - 17:30	Hall C	T8: Guardrails and Security for LLMs: Safe, Secure, and Controllable Steering of LLM Applications

Tutorials Description

Navigating Ethical Challenges in NLP: Hands-on strategies for students and researchers
Luciana Benotti, Fanny Ducel, Karen Fort, Guido Ivetta, Zhijing Jin, Min-Yen Kan, Seunghun Lee, Margot Mieskes, Minzhi Li, and Adriana Pagano

With NLP research being rapidly productionized into real-world applications, it is important to be aware of and think through the consequences of our research. Such ethical considerations are important in both authoring and reviewing (e.g. privacy, consent, fairness, etc).
This tutorial will equip participants with basic guidelines for thinking deeply about ethical issues and review common considerations that recur in NLP research. The methodology is interactive and participatory, including case studies and working in groups. Participants will gain practical experience on when to flag a paper for ethics review and how to write an ethical consideration section, that will be shared with the broader community. Importantly, the participants will be co-creating the tutorial outcomes and extending tutorial materials to share as public outcomes.

Inverse Reinforcement Learning Meets Large Language Model Alignment
Mihaela van der Schaar and Hao Sun

Large Language Models (LLMs) are characterized as universal samplers or generators in the literature, yet maximizing their capabilities in different roles with post-training is a complex challenge. Previous efforts in the NLP community have extensively explored the diverse applications of LLMs across various domains, including enhancing chat abilities, solving mathematical problems, adopting LLMs for evaluation, generating synthetic data, improving Bayesian optimization, and designing external systems such as the reward function in reinforcement learning. Despite these advancements, key methods in post-training for improving LLM performance — such as prompt optimization, in-context learning, supervised fine-tuning, and reinforcement learning from human feedback — are typically studied in isolation. In this tutorial, we intend to use a unified Inverse Reinforcement Learning (IRL) perspective to characterize the applications of LLMs, and highlight the importance of task-specific post-training and alignment using the techniques of IRL, which has been shown empirically to achieve great success in the recent o1 model from OpenAI. We will demonstrate when and why post-training with IRL is necessary to ensure safe and human-centric AI systems. We will also introduce what would be the optimized practices in different applications under various data and knowledge availability.

Eyetracking and NLP
David Reich, Omer Shubi, Lena Jäger and Yevgeni Berzak

We propose a cutting-edge in CL/NLP tutorial on the growing research area that combines eyetracking during reading with NLP. The tutorial will outline how eye movements in reading can be leveraged for NLP, and, vice versa, how NLP methods can advance psycholinguistic modeling of eye movements in reading. We will cover four main themes: (i) fundamentals of eye movements in reading, (ii) experimental methodologies and available data, (iii) integrating eye movement data in NLP models, and (iv) using NLP for modeling eye movements in reading. The tutorial is tailored to NLP researchers and practitioners, and will provide attendees with the essential background for conducting research on the joint modeling of eye movements and text.

Guardrails and Security for LLMs: Safe, Secure, and Controllable Steering of LLM Applications
Traian Rebedea, Leon Derczynski, Shaona Ghosh, Makesh Narsimhan Sreedhar, Faeze Brahman, Liwei Jiang, Bo Li, Yulia Tsvetkov, Christopher Parisien and Yejin Choi

Pretrained generative models, especially large language models, provide novel ways for users to interact with computers. While generative NLP research and applications had previously aimed at very domain-specific or task-specific solutions, current LLMs and applications (e.g. dialogue systems, agents) are versatile across many tasks and domains. Despite being trained to be helpful and aligned with human preferences (e.g., harmlessness), enforcing robust guardrails on LLMs remains a challenge. And, even when protected against rudimentary attacks, just like other complex software, LLMs can be vulnerable to attacks using sophisticated adversarial inputs. This tutorial provides a comprehensive overview of key guardrail mechanisms developed for LLMs, along with evaluation methodologies and a detailed security assessment protocol – including auto red-teaming of LLM-powered applications. Our aim is to move beyond the discussion of single prompt attacks and evaluation frameworks towards addressing how guardrailing can be done in complex dialogue systems that employ LLMs.

NLP for Counterspeech against Hate and Misinformation
Daniel Russo, Helena Bonaldi, Yi-Ling Chung, Gavin Abercrombie and Marco Guerini

This tutorial aims to bring together research from different fields such as computer science and the social sciences and policy to show how counterspeech is currently used to tackle abuse and misinformation by individuals, activists and organisations, how Natural Language Processing (NLP) and Generation (NLG) can be applied to automate its production, and the implications of using large language models for this task. It will also address, but not be limited to, the questions of how to evaluate and measure the impacts of counterspeech, the importance of expert knowledge from civil society in the development of counterspeech datasets and taxonomies, and how to ensure fairness and mitigate the biases present in language models when generating counterspeech.
The tutorial will bring diverse multidisciplinary perspectives to safety research by including case studies from industry and public policy to share insights on the impact of counterspeech and social correction and the implications of applying NLP to important real-world problems. It will also go deeper into the challenging task of tackling hate and misinformation together, which represents an open research question yet to be addressed in NLP but gaining attention as a stand alone topic.

Human-AI Collaboration: How AIs Augment Human Teammates
Tongshuang Wu, Diyi Yang, Joseph Chang, Kyle Lo and Marti A. Hearst

The continuous, rapid development of general- purpose models like LLMs suggests the theoretical possibility of AI performing any human task. Yet, despite the potential and promise, these models are far from perfect, excelling at certain tasks while struggling with others. The tension between what is possible and a model’s limitations raises the general research question that has attracted attention from various disciplines: What is the best way to use AI to maximize its benefits? In this tutorial, we will review recent developments related to human-AI teaming and collabo- ration. To the best of our knowledge, our tutorial will be the first to provide a more integrated view from NLP, HCI, Computational Social Science, and Learning Science, etc., and highlight how different communities have identified the goals and societal impacts of such collaborations, both positive and negative. We will further discuss how to operationalize these Human-AI collaboration goals, and reflect on how state-of-the-art AI models should be evaluated and scaffolded to make them most useful in collaborative contexts.

Synthetic Data in the Era of Large Language Models
Vijay Viswanathan, Xiang Yue, Alisa Liu, Yizhong Wang and Graham Neubig

Progress in natural language processing has historically been driven by better data, and researchers today are increasingly using “synthetic data” - data generated with the assistance of large language models - to make dataset construction faster and cheaper. However, most synthetic data generation approaches are executed in an ad hoc manner and “reinvent the wheel” rather than build on prior foundations. This tutorial seeks to build a shared understanding of recent progress in synthetic data generation from NLP and related fields by grouping and describing major methods, applications, and open problems. Our tutorial will be divided into four main sections. First, we will describe algorithms for producing high-quality synthetic data. Second, we will describe how synthetic data can be used to advance the general-purpose development and study of language models. Third, we will demonstrate how to customize synthetic data generation to support scenario-specific applications. Finally, we will discuss open questions about the production and use of synthetic data that must be answered to overcome some of their current limitations. Our goal is that by unifying recent advances in this emerging research direction, we can build foundations upon which the community can improve the rigor, understanding, and effectiveness of synthetic data moving forward.

Uncertainty Quantification for Large Language Models
Artem Shelmanov, Maxim Panov, Ekaterina Sergeevna Fadeeva, Artem Vazhentsev, Roman Konstantinovich Vashurin and Timothy Baldwin

Uncertainty quantification (UQ) has gained increasing importance in natural language processing (NLP), providing a framework to address critical issues such as hallucinations in answers of large language models (LLMs), detection of low-quality responses, out-of-distribution detection, and reducing response latency, among others. While UQ for text classification models in NLP has been covered previously, applying UQ to LLMs presents a significantly greater challenge. This complexity arises from the fact that LLMs generate sequences of conditionally dependent predictions with varying levels of importance. As a result, many UQ techniques effective for classification models are either ineffective or not directly applicable to LLMs. In this tutorial, we cover foundational concepts of UQ for LLMs, present cutting-edge techniques, demonstrate practical applications of UQ in various tasks, and equip researchers and practitioners with tools for developing new UQ methods and harnessing uncertainty in various contexts. With this tutorial, we aim to lower the barrier to entry into UQ research and applications for individual researchers and developers.