Status
I am a Master of Science in Intelligent Information Systems (MIIS) student at Carnegie Mellon University, focusing on designing and analyzing LLM agents that behave robustly and transparently in human-facing environments.
Current Research
I am currently advised by Prof. Graham Neubig at CMU, focusing on LLMs as coding agents. My recent research includes PR Arena, a platform for evaluating and benchmarking agentic coding assistants through paired pull request (PR) generations, and SYCON Bench, a benchmark for analyzing sycophancy in multi-turn dialogues. I am also working with Prof. Jinho D. Choi at Emory University, designing an educational conversational AI, 🧚 Tinker Tales, and developing an agentic method for long-context instruction following.
Prospective Research Interest
I am interested in:
- Methodologies that elicit deep reasoning capabilities in language models by controlling latent variables (e.g., Quiet-STaR, Coconut).
- Coding agents in human-facing environments, such as agents with effective human-feedback loops or unverifiable metrics for evaluating code patches.
Academic Milestone
I received my Bachelor's degree in Computer Science from the Korea Advanced Institute of Science and Technology (KAIST), where I studied machine learning, NLP, and human-computer interaction. I'm very grateful to Prof. Minsuk Kang, Prof. Jeehoon Kang, and NC Soft mentor Hochang Lee for their guidance.
🔥 News
- 2025/08: 🎓 Started as a Teaching Assistant (TA) for Introductory NLP
- 2025/05: 📢 New Paper Alert (Accepted by EMNLP 2025 Findings): "Measuring Sycophancy of Language Models in Multi-turn Dialogues" (SYCON Bench)
- 2024/09: 👐 Joined the OpenHands team (Directed Study under Prof. Graham Neubig), working on PR Arena and evaluating LLMs as coding agents
- 2024/08: 🎉 Started my Master's degree in Intelligent Information Systems (MIIS) at CMU!
- 2024/05: 📢 New Paper Alert (Accepted by LREC-COLING 2024): "Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition"
📝 Publications
Measuring Sycophancy of Language Models in Multi-turn Dialogues
Jiseung Hong*, Grace Byun*, Seungone Kim, Kai Shu, Jinho D. Choi.
- Introduce SYCON Bench, a novel benchmark for evaluating sycophantic behavior in multi-turn, free-form conversational settings.
- Show that sycophancy is prevalent in multi-turn settings, especially in instruction-tuned, non-reasoning, and smaller models.
- Propose a third-person perspective method that significantly reduces sycophancy in the debate setting by 63.8%.
Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition
Grace Byun, Jiseung Hong, Sumin Park, Dongjun Jang, Jean Seo, Minseok Kim, Chaeyoung Oh, Hyopil Shin.
- Constructed the Korean Bio-Medical Corpus (KBMC), the first open-source dataset for Korean medical named entity recognition.
- Contributed primarily to evaluating six different language models on the KBMC dataset.
Reducing Bloat in Agent-Generated Code Patches
- In Progress
🎓 Education
Advised by Prof. Graham Neubig (Neulab)
Studied LLMs as coding agents (hands-on experience with OpenHands and Mini SWE Agent) and LLM evaluation.
👨‍🏫 Teaching
Software and Societal Systems Department (S3D)
Prof. David Mortensen