Dewi S.W. Gould

Astra Fellow, Redwood Research

I’m an Astra Fellow at Redwood Research, working on AI safety. My research is motivated by understanding and reducing risks from increasingly capable AI systems.

Previously I…

  • Was a post-doc at the Alan Turing Institute on Project Bluebird, working on synthetic scenario generation for autonomous air-traffic control. I gave a Pint of Science talk about it.
  • Did a PhD in mathematical physics at the University of Oxford. My thesis was on Generalized Symmetries in String Theory Realizations of Quantum Field Theories. I wrote a blog post about my research.

news

May 01, 2026 Our paper “A Positive Case for Faithfulness: LLM Self-Explanations Help Predict Model Behavior” was accepted to ICML 2026 🇰🇷.
Apr 27, 2026 Our work “A Positive Case for Faithfulness: LLM Self-Explanations Help Predict Model Behavior” was presented as a poster at the ICLR 2026 Trustworthy-AI Workshop 🇧🇷.
Jan 01, 2026 Started as an Astra Fellow at Redwood Research in Berkeley.

recent research

  1. H. Mayne, J. S. Kang, Dewi S. W. Gould, and 3 more authors
    arXiv preprint, 2026
  2. Dewi S. W. Gould, B. Mlodozeniec, and S. F. Brown
    arXiv preprint, 2025
  3. O. Bajgar, Dewi S. W. Gould, J. Liu, and 3 more authors
    In Reinforcement Learning Conference, 2025