
[104th TrustML Young Scientist Seminar] Talk by John Robertson (UT Austin) "Language Model Control and Reliability: Understanding Steering Vectors and Agentic Aging"

A RIKEN AIP seminar with UT Austin's John Robertson on activation steering, concept granularity, and the longitudinal reliability of LLM agents.

When
Tue, June 23, 2026 · 14:00–15:00 JST
Where
Online + Meeting Room B at Nihonbashi (AIP researchers only) · Hybrid
Region
Other
Organizer
RIKEN Center for Advanced Intelligence Project
Language
EN
Source
Doorkeeper
Summary
RIKEN AIP's TrustML Young Scientist Seminar series hosts John Robertson, a PhD student at UT Austin, for a talk on controlling and trusting large language models. Robertson opens with activation steering, a lightweight way to adjust model behavior without retraining, and argues that the wide variation in its effectiveness reflects search difficulty rather than a fundamental limit. He shows that the directional alignment of contrastive activations at the prompt boundary predicts where useful interventions emerge, letting geometry-guided optimization find them with roughly 40% fewer evaluations across three model families. The talk then introduces concept granularity, a measure of how much a steering direction rotates across input contexts. Computable from cached activations before any steering runs, it predicts both how hard a concept is to optimize and the quality ultimately achievable. Robertson closes by shifting from control to reliability over time, presenting AgingBench, a longitudinal benchmark that tracks how frozen-weight agents degrade as they compress history, retrieve from growing memory, and revise facts. The seminar runs online and in Meeting Room B at the Nihonbashi office, with the physical room open to AIP researchers only. The session is conducted in English.
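To make the two geometric ideas in the summary concrete, here is a minimal sketch in Python with NumPy. It is not the speaker's method: the listing does not give the actual definitions, so the difference-of-means steering direction, the "1 minus mean pairwise cosine" rotation score, and the function names `steering_direction` and `direction_rotation` are all illustrative assumptions. The activations are synthetic stand-ins for cached model activations.

```python
import numpy as np


def steering_direction(pos_acts, neg_acts):
    """Unit-norm difference of mean activations (positive minus negative prompts).

    Assumption: a common contrastive recipe, not necessarily the one in the talk.
    Both inputs have shape (num_prompts, hidden_size).
    """
    diff = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def direction_rotation(directions):
    """Hypothetical rotation score: 1 minus the mean pairwise cosine similarity.

    Expects at least two unit-norm directions, one per input context.
    Near 0 means the direction is stable across contexts; larger values
    mean it rotates more.
    """
    d = np.stack(directions)
    cos = d @ d.T                           # pairwise cosines of unit vectors
    off_diag = ~np.eye(len(d), dtype=bool)  # drop each vector's self-similarity
    return 1.0 - cos[off_diag].mean()


# Synthetic "cached activations" for four contexts that share one concept offset.
rng = np.random.default_rng(0)
hidden_size = 16
concept = rng.normal(size=hidden_size)

per_context = []
for _ in range(4):
    neg = rng.normal(size=(32, hidden_size))
    pos = neg + concept + 0.1 * rng.normal(size=(32, hidden_size))
    per_context.append(steering_direction(pos, neg))

score = direction_rotation(per_context)
print(f"rotation score across contexts: {score:.4f}")
```

Because the score needs only the cached activations, a sketch like this can run before any steering experiment, which is the property the summary attributes to concept granularity.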
About the community

The TrustML Young Scientist Seminar is a recurring research seminar series focused on the trustworthiness, reliability, and controllability of machine learning systems. Sessions feature early-career researchers presenting recent work, held online and at the Nihonbashi office, and are aimed at researchers and graduate students in machine learning.

#machine-learning #llm #interpretability #activation-steering #ai-agents #research-seminar