About me
Hi, my name is Eugene. I am a Ph.D. student at Northeastern University, advised by Terra Blevins. I’m interested in the many challenges involved in multilingual NLP.
Right now I’m especially looking at:
- Tokenization: Many model issues can be traced to the tokenizer, a product of many design choices that impact the model in subtle ways. Can we create both better tokenizers and better model-tokenizer interactions?
Pre-Ph.D., I worked on AI for security as a research scientist at S2W Inc. I received my Bachelor’s and Master’s degrees from KAIST. You can find my CV here.
Recent News
2025 Sep - I am now in Boston, starting my Ph.D. at Northeastern!
2025 Aug - Our paper on Byte-level BPE tokenizer vulnerabilities has been accepted to EMNLP 2025!
2024 Nov - Our paper on drug jargon detection was accepted to KDD 2025!
2024 Oct - Our new paper on Byte-level BPE tokenizer vulnerabilities is now on arXiv!
2024 Sep - Our paper on security event detection from Tweets was accepted to NDSS 2025!
2024 Jul - Was interviewed by KBS on the topic of ChatGPT jailbreaks. My first major TV appearance!
Trivia
- My cat’s name is Squash. He has a stepbrother, Pumpkin.
- I used to write articles for the school’s English newspaper, usually complaining about something in the Society section.
- I enjoy playing electric guitar, but like music that’s too technical for my own good. I eagerly await an AI system that can transcribe very fast solos from songs.