CV
Education
- Ph.D. in Computer Science, University of Maryland, College Park, 2023 – 2027
  - Dean’s Fellowship recipient. Advisor: Professor Hal Daumé III. GPA: 3.96/4.0.
  - Dissertation focus: Safety, Robustness and Trustworthiness in AI Systems.
- M.S. in Computer Science, University of Southern California, Los Angeles, 2018 – 2021
  - Concentration: Data Science. GPA: 3.83/4.0.
Research Experience
- Doctoral Student & Research Assistant, University of Maryland, College Park, 2023 – 2027
  - AI Oversight and Governance: Design oversight methods to detect and mitigate misaligned or deceptive behaviors in deployed AI systems.
  - LLM Safety & Alignment: Develop data-efficient, self-correcting LLM frameworks using preference optimization to improve robustness in content moderation.
  - Bias in AI Systems: Mitigate social and systemic biases in LLMs for employment, education, and healthcare applications using causal inference.
  - Causal Fairness: Develop counterfactual fairness methods with respect to protected attributes, enhancing transparency in AI decision-making.
  - Explainable AI: Leverage causal inference and in-context learning to reduce stereotypes in generative explanations from large language models.
- MATS 9.0 Fellow, Berkeley, 2026
  - AI + Legal Alignment: Develop audit methods to evaluate, predict, and orient LLM and AI tools to legal requirements.
- Researcher, Information Sciences Institute, Marina del Rey, 2019 – 2023
  - Cryptocurrency Fraud Detection: Develop deep CNN & RNN models achieving a <6% error margin in detecting pump-and-dump fraud using social and financial data.
  - Content Moderation: Create crowdsourced annotation frameworks and improve the F1 score of anti-Asian hate speech detection by 8% using BERT & RoBERTa models.
  - Meta-Learning for NLP: Design Prototypical-MAML classifiers achieving >70% accuracy in cross-dataset offensive speech detection with only 100 labels.
- Data Scientist & Team Lead, Children’s Data Network, Los Angeles, 2017 – 2023
  - Predictive Modeling for Social Good: Lead development of Random Forest & XGBoost models to optimize California’s child welfare system, ensuring model transparency through SHAP and LIME interpretability tools.
  - Large-Scale Data Mining: Analyze historical administrative databases to build predictive classifiers for child abuse prevention, directly informing state and federal policy.
  - Healthcare Analytics: Conduct mental health analyses using ICD-9/10 codes to improve services for Medicaid-insured children.
Industry Experience
- GenAI Research Intern, Oracle Cloud Infrastructure (OCI), Burlington, MA, 2025
  - Build an iterative alignment framework for conversational medical assistants using preference optimization methods.
  - Achieve up to 42% improvement in harmful query detection on the CARES-18K benchmark across multiple LLMs.
- Graduate Intern, Capital One, McLean, VA, 2024
  - Implement alignment techniques (SFT, preference optimization, KTO) to strengthen LLM safety guardrails, boosting defense against adversarial attacks by 250%.
  - Deploy LLM prototypes (Llama 2/3, Mixtral), reducing inference latency by 75%.
  - Receive a manager-nominated Leadership Award for initiating impactful research collaborations.
- Data Programmer, Edwards Lifesciences, Irvine, CA, 2015 – 2017
  - Maintain Oracle-based clinical database systems for FDA-compliant clinical trials.
  - Create machine learning models to automate health record analysis and reporting.
Honors & Awards
- MATS 9.0 Scholar – Acceptance rate below 7%, 2026
- Outstanding Reviewer Award – IJCNLP-ACL 2025
- Best Paper Award – ML4H Symposium, 2025
- Outstanding Paper Award – AAAI PDLM Workshop, 2025
- Leadership Award – Capital One (Manager-nominated), 2024
- Dean’s Fellowship – University of Maryland, College Park, 2023 – 2027
- Finalist – Data Science for Social Good Fellowship, Carnegie Mellon University, 2021
- Awardee – UCSD Summer Training Academy for Research Success, 2021
- Head of Programs – USC Graduate and Rising in Information and Data Science, 2020
- Finalist – Center for Knowledge-Powered and Interdisciplinary Data Science, 2019
- 2nd Place – USC–Boeing Inaugural Data Hackathon, 2018
Publications
See my Google Scholar for a full list of publications.
Technical Skills
- Programming: Python, SQL, SAS, JavaScript, MATLAB
- ML/AI Frameworks: PyTorch, TensorFlow, Keras, Hugging Face Transformers
- LLM Techniques: Fine-tuning (SFT, preference optimization, KTO), Alignment, RLHF, vLLM serving
- ML Methods: Deep Learning (CNN, LSTM, RNN, GAN), Causal Inference, Meta-Learning
- Data & Analytics: Spark, AWS, Kubernetes, Git, Tableau, Statistical Analysis
- Specialized: SHAP/LIME Interpretability, Crowdsourcing (AMT), Clinical Data Systems
Service
Committee
- Junior Chair, Bias and Fairness Area, ML4H Symposium, 2025
Reviewing
- ACL Rolling Review (ARR) – ACL, EMNLP, NAACL, AACL cycles (2024–2025)
- Trustworthy NLP Workshop
- ACM Transactions on Intelligent Systems and Technology
- Journal of Medical Internet Research
- Machine Learning for Health (ML4H)
- Expert Systems with Applications
Invited Talks
- “You gotta be a Doctor Lin”: The Risk of Bias in LLM, UMD INST 414, Spring 2025
- A Tutorial on Trustworthiness and Bias in LLMs, UMD Values-centered AI Institute, Fall 2024
- Summer Education Program for High Schools in STEM, Summer 2023