About me

Hi, my name is Eugene. I currently work as a researcher on the AI Team at S2W Inc., a Korean data intelligence startup. I have worked on AI for security: applying models to detect and interpret various forms of cybercrime [1,2,3], as well as automatically processing cyber threat intelligence (CTI) [4]. More recently, I’ve been working on the reverse: security for AI.

Right now I’m looking at:

  • LLMs for cybersecurity: The cybersecurity domain expertise of LLMs is a very important dual-use capability that can benefit both attackers and defenders. The workflow of cybersecurity is complicated. Can we demonstrate LLM expertise beyond domain knowledge?
  • Secure and Trustworthy NLP Applications: Models break more often than we want them to, given the right input. Can we understand the mechanisms of these failures and prevent them?
  • Tokenization: Many model issues can be traced to the tokenizer, a product of many design choices that impact the model in subtle ways. Can we create both better tokenizers and better model-tokenizer interactions?

Prior to all of this, I received my Bachelor’s and Master’s degrees at KAIST. During this time I engaged with a number of topics including media bias (Master’s), computer vision (Bachelor’s), and physics (minor).

I am actively looking for PhD programs for Fall 2025. Any advice or discussion is highly appreciated! You can find my CV here.

Recent News

2024 Nov - Our paper on drug jargon detection was accepted to KDD 2025!

2024 Oct - New paper on byte-level BPE tokenizer vulnerabilities is now on arXiv!

2024 Sep - Our paper on security event detection from Tweets was accepted to NDSS 2025!

2024 Jul - Was interviewed by KBS on the topic of ChatGPT jailbreaks. My first major TV appearance!

Trivia

  1. My cat’s name is Squash. He has moved to Florida, where he now lives with his stepbrother Pumpkin.
  2. I used to write articles for the school’s English newspaper, usually complaining about something in the Society section.
  3. I enjoy playing electric guitar, but like music that’s too technical for my own good. I eagerly await an AI system that can transcribe very fast solos from songs.