CV

Education

Ph.D. in Computer Science, University of Maryland, College Park, 2023 – 2027
- Dean’s Fellowship recipient. Advisor: Professor Hal Daumé III. GPA: 3.96/4.0.
- Dissertation focus: Safety, Robustness, and Trustworthiness in AI Systems.
M.S. in Computer Science, University of Southern California, Los Angeles, 2018 – 2021
- Concentration: Data Science. GPA: 3.83/4.0.

Research Experience

Doctoral Student & Research Assistant, University of Maryland, College Park, 2023 – 2027
- AI Oversight and Governance: Design oversight methods to detect and mitigate misaligned or deceptive behaviors in deployed AI systems.
- LLM Safety & Alignment: Develop data-efficient, self-correcting LLM frameworks using preference optimization to improve robustness in content moderation.
- Bias in AI Systems: Mitigate social and systemic biases in LLMs for employment, education, and healthcare applications using causal inference.
- Causal Fairness: Develop counterfactual fairness methods with respect to protected attributes, enhancing transparency in AI decision-making.
- Explainable AI: Leverage causal inference and in-context learning to reduce stereotypes in generative explanations from large language models.
MATS 9.0 Fellow, Berkeley, 2026
- AI + Legal Alignment: Develop audit methods to evaluate and predict how LLMs and AI tools meet legal requirements, and to orient them toward those requirements.
Researcher, Information Sciences Institute, Marina del Rey, 2019 – 2023
- Cryptocurrency Fraud Detection: Develop deep CNN & RNN models achieving an error margin below 6% in detecting pump-and-dump fraud using social and financial data.
- Content Moderation: Create crowdsourced annotation frameworks and improve anti-Asian hate speech detection by 8% in F1 score using BERT & RoBERTa models.
- Meta-Learning for NLP: Design Prototypical-MAML classifiers achieving >70% accuracy in cross-dataset offensive speech detection with only 100 labels.
Data Scientist & Team Lead, Children’s Data Network, Los Angeles, 2017 – 2023
- Predictive Modeling for Social Good: Lead development of Random Forest & XGBoost models to optimize California’s child welfare system, ensuring model transparency through SHAP and LIME interpretability tools.
- Large-Scale Data Mining: Analyze historical administrative databases to build predictive classifiers for child abuse prevention, directly informing state and federal policy.
- Healthcare Analytics: Conduct mental health analyses using ICD-9/10 codes to improve services for Medicaid-insured children.

Industry Experience

GenAI Research Intern, Oracle Cloud Infrastructure (OCI), Burlington, MA, 2025
- Build an iterative alignment framework for conversational medical assistants using preference optimization methods.
- Achieve up to 42% improvement in harmful query detection on the CARES-18K benchmark across multiple LLMs.
Graduate Intern, Capital One, McLean, VA, 2024
- Implement alignment techniques (SFT, preference optimization, KTO) to strengthen LLM safety guardrails, boosting defense against adversarial attacks by 250%.
- Deploy LLM prototypes (Llama 2/3, Mixtral), reducing inference latency by 75%.
- Receive a manager-nominated Leadership Award for initiating impactful research collaborations.
Data Programmer, Edwards Lifesciences, Irvine, CA, 2015 – 2017
- Maintain Oracle-based clinical database systems for FDA-compliant clinical trials.
- Create machine learning models to automate health record analysis and reporting.

Honors & Awards

MATS 9.0 Scholar – Acceptance rate below 7%, 2026
Outstanding Reviewer Award – IJCNLP-ACL 2025
Best Paper Award – ML4H Symposium, 2025
Outstanding Paper Award – AAAI PDLM Workshop, 2025
Leadership Award – Capital One (Manager-nominated), 2024
Dean’s Fellowship – University of Maryland, College Park, 2023 – 2027
Finalist – Data Science for Social Good Fellowship, Carnegie Mellon University, 2021
Awardee – UCSD Summer Training Academy for Research Success, 2021
Head of Programs – USC Graduate and Rising in Information and Data Science, 2020
Finalist – Center for Knowledge-Powered and Interdisciplinary Data Science, 2019
2nd Place – USC–Boeing Inaugural Data Hackathon, 2018

Publications

See my Google Scholar profile for a full list of publications.

Technical Skills

Programming: Python, SQL, SAS, JavaScript, MATLAB
ML/AI Frameworks: PyTorch, TensorFlow, Keras, Hugging Face Transformers
LLM Techniques: Fine-tuning (SFT, preference optimization, KTO), Alignment, RLHF, vLLM serving
ML Methods: Deep Learning (CNN, LSTM, RNN, GAN), Causal Inference, Meta-Learning
Data & Analytics: Spark, AWS, Kubernetes, Git, Tableau, Statistical Analysis
Specialized: SHAP/LIME Interpretability, Crowdsourcing (AMT), Clinical Data Systems

Teaching

Service

Committee

Junior Chair, Bias and Fairness Area, ML4H Symposium, 2025

Reviewing

ACL Rolling Review (ARR) – ACL, EMNLP, NAACL, AACL cycles (2024–2025)
Trustworthy NLP Workshop
ACM Transactions on Intelligent Systems and Technology
Journal of Medical Internet Research
Machine Learning for Health (ML4H)
Expert Systems with Applications

Invited Talks

“You gotta be a Doctor Lin”: The Risk of Bias in LLM, UMD INST 414, Spring 2025
A Tutorial on Trustworthiness and Bias in LLMs, UMD Values-centered AI Institute, Fall 2024
Summer Education Program for High Schools in STEM, Summer 2023