Status
I am a Master of Science in Intelligent Information Systems (MIIS) student at Carnegie Mellon University, focusing on designing and analyzing LLM agents that behave robustly and transparently in human-facing environments.

Current Research
I am currently advised by Prof. Graham Neubig at CMU, focusing on LLMs as coding agents. My recent research includes PR Arena, a platform for evaluating and benchmarking agentic coding assistants through paired pull request (PR) generations, and SYCON Bench, a benchmark for analyzing sycophancy in multi-turn dialogues. I am also working with Prof. Jinho D. Choi at Emory University, designing an educational conversational AI, 🧚 Tinker Tales, and developing an agentic method for long-context instruction following.

Prospective Research Interest
I am interested in:

  • Methodologies that elicit deep reasoning capabilities in language models by controlling latent variables (e.g., Quiet-STaR, Coconut).
  • Coding agents in human-facing environments, such as agents with an effective human-feedback loop or unverifiable metrics for evaluating code patches.

Academic Milestone
I received my Bachelor's degree in Computer Science from the Korea Advanced Institute of Science and Technology (KAIST), where I studied machine learning, NLP, and human-computer interaction. I'm very grateful to Prof. Minsuk Kang, Prof. Jeehoon Kang, and NC Soft mentor Hochang Lee for their guidance.

View my Resume

Agentic Reasoning Systems / LLM Evaluation & Behavior

  • *Bloats in Code Agents: Upcoming! Jan 2026
  • *PR Arena: Upcoming! Jan 2026
  • *GGIF: Upcoming! Jan 2026
  • *SYCON Bench: EMNLP'25
  • KBMC: LREC-COLING'24
  • *Social Bias in State Space Models: N/A

* denotes first-author publications

🔥 News

  • 2025/08: 🎓 Started as a Teaching Assistant (TA) for Introductory NLP
  • 2025/05: 📢 New Paper Alert (Accepted to Findings of EMNLP 2025): "Measuring Sycophancy of Language Models in Multi-turn Dialogues" (SYCON Bench)
  • 2024/09: 👐 Joined the OpenHands team (Directed Study under Prof. Graham Neubig), working on PR Arena and evaluating LLMs as coding agents
  • 2024/08: 🎉 Started my Master's degree in Intelligent Information Systems (MIIS) at CMU!
  • 2024/05: 📢 New Paper Alert (Accepted to LREC-COLING 2024): "Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition"

📝 Publications

Findings of EMNLP'25

Measuring Sycophancy of Language Models in Multi-turn Dialogues

Jiseung Hong*, Grace Byun*, Seungone Kim, Kai Shu, Jinho D. Choi.

  • Introduce SYCON Bench, a novel benchmark for evaluating sycophantic behavior in multi-turn, free-form conversational settings.
  • Show that sycophancy is prevalent in multi-turn settings, especially in instruction-tuned, non-reasoning, and smaller models.
  • Propose a third-person perspective method that reduces sycophancy in the debate setting by 63.8%.
LREC-COLING'24

Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition

Grace Byun, Jiseung Hong, Sumin Park, Dongjun Jang, Jean Seo, Minseok Kim, Chaeyoung Oh, Hyopil Shin.

  • Constructed the Korean Bio-Medical Corpus (KBMC), the first open-source dataset for Korean medical named entity recognition.
  • Contributed primarily to evaluating six different language models on the KBMC dataset.

🎓 Education

Carnegie Mellon University
Master of Science in Intelligent Information Systems (Language Technologies Institute)
2024. 08 - 2025. 12
GPA 3.92 / 4.00
Advised by Prof. Graham Neubig (Neulab)
Studied LLMs as coding agents (hands-on experience with OpenHands and Mini SWE Agent) and LLM evaluation.
Korea Advanced Institute of Science and Technology
B.S. in Computer Science (School of Computing)
2017. 02 - 2024. 02

👨‍🏫 Teaching

Introductory NLP
Teaching Assistant
Fall 2025
Carnegie Mellon University
Software and Societal Systems Department (S3D)
Prof. David Mortensen

💼 Internships

Research Intern
NC SOFT/Narrative Lab
2021. 03 - 2021. 08
Developed a software prototype that organizes clusters of news articles by relevance to a given query, applying the Learning-to-Rank (LTR) technique with a Support Vector Machine (SVM).
Software Engineering Intern
SK Hynix
2019. 12 - 2020. 02
Developed a NAND flash memory simulator.