Resume

Basics

Name: Yuan (Gabriel) Zhang
Position: Research Assistant - Computer Vision & Agricultural Robotics at USDA ARS

Education

  • Sep 2023 - Dec 2024

    Evanston, IL

    Master of Science
    Northwestern University
    Machine Learning and Data Science
    • Deep Learning
    • Large Foundation Models
    • Natural Language Processing
    • Predictive Analytics
    • Cloud Engineering
  • Aug 2019 - Jun 2023

    Irvine, CA

    Bachelor of Science
    University of California, Irvine
    Data Science
    • Machine Learning
    • Probability and Statistics
    • Algorithms
    • Big Data Analytics
    • Information Retrieval
    • Database Management

Work

  • May 2025 - Present

    Fargo, ND

    Research Assistant - Computer Vision & Agricultural Robotics
    US Department of Agriculture (USDA) Agricultural Research Service (ARS)
    • Developing a real-time computer vision system enabling autonomous weed detection and precision spray control on the Farm-ng Amiga robotic platform
    • Architecting a multi-process system with IPC and threading to ensure low-latency integration of depth sensing, GPS, and camera feeds
    • Deploying PyTorch models on NVIDIA Jetson Xavier via TensorRT FP16, achieving a roughly 8.5x inference speedup (718ms → 84ms)
    • Engineering a dual-camera GUI integrating weed detection and obstacle avoidance with automated safety-stop functionality
  • Sep 2024 - Dec 2024

    Evanston, IL

    Data Science Consultant
    Kavi Global
    • Developed an LLM-powered lead qualification chatbot to route website visitors across 3 persona flows and drive consultation bookings; in production since February 2025
    • Achieved 81% response accuracy across 180+ test cases by engineering GPT-4o prompts and dynamically leveraging website content as a RAG knowledge base
    • Designed intent classification and prospect qualification workflow from greeting through service matching and case study recommendations; customized human handoff escalation for complex queries
    • Recommended MS Copilot Studio over Dialogflow, Rasa, and Botpress for cost efficiency and Azure OpenAI integration
  • Jul 2024 - Sep 2024

    Waltham, MA

    Data Science Engineer Intern
    SS&C Intralinks
    • Engineered a microservice-ready language identification pipeline for scanned PDFs using CLIP and PyTorch, achieving 98% accuracy across 14 languages (~0.3s/doc), with projected savings of $300K/year on OCR costs
    • Designed image preprocessing to select text-heavy pages, improving top-1 accuracy from 89% to 98% by filtering blank and image-heavy content before inference
    • Benchmarked 30 vision-language models on AWS GPU instances using Hugging Face Transformers for accuracy, latency, and memory trade-offs
    • Implemented adaptive multilingual prediction with configurable confidence thresholds for OCR and RAG pipelines
  • Jun 2022 - Mar 2024

    Irvine, CA

    Full Stack Software Developer (Volunteer)
    Irvine Canaan Christian Community Church
    • Created a full-stack enrollment and attendance management system using Python, Flask, and MySQL for children's ministry, processing 80+ weekly check-ins/check-outs for 3+ years (since December 2022)
    • Developed 8 data management dashboards and 15+ role-based pages serving 4 user roles with real-time validation
    • Refactored REST APIs and reduced codebase by 3,600+ lines via SQLAlchemy optimization and DRY architecture
    • Integrated barcode SDK for automated badge printing, enabling streamlined check-in workflow for staff and guardians
  • Jul 2021 - Aug 2021

    Shanghai, China

    Data Analyst Intern
    Shanghai Daiqian Information Technology Co., Ltd.
    • Conducted consumer research for pre-launch product testing, performing survey analysis and building internal tooling
    • Analyzed 190+ consumer surveys using R; findings: 9/10 satisfaction, 86% recommendation rate, 87% purchase intent
    • Created 20+ visualizations using ggplot2 (satisfaction radar charts, demographic profiles, efficacy heatmaps, usage treemaps), adopted in stakeholder presentations and consumer testing reports
    • Built an internal trial management system on Alibaba Cloud with automated tester registration, feedback reminders, and sample tracking for product development workflows

Projects

  • Oct 2024 - Dec 2024
    paper2summary
    • Developed a scientific paper summarization system by LoRA fine-tuning Llama-3.2-1B-Instruct on 20K arXiv papers, training only 0.07% of parameters (~850K) with 10K-token context support (~28 hours on a single RTX A6000)
    • Achieved +51% ROUGE-2 and +37% ROUGE-3 improvement over base model on 6,440-sample test set
  • Oct 2023 - Dec 2023
    Dillard's Black Friday Return Prediction
    • Built an ML pipeline to predict Black Friday purchase vs. return outcomes, with the goal of reducing return-related costs for Dillard's
    • Queried 160M+ POS records and applied SMOTE for class imbalance; trained a K-means + Logistic Regression ensemble
    • Achieved 78% purchase precision and 58% return recall with 227% projected ROI (~$590K)

Skills

  • Languages: Python, SQL, R, C++
  • ML/DL: PyTorch, TensorFlow, Transformers, Scikit-learn, XGBoost, PySpark
  • Computer Vision: OpenCV, Torchvision, Ultralytics, TensorRT, NVIDIA Jetson
  • NLP: Sentence Transformers, SpaCy, NLTK, BERTopic
  • Data Platforms: PostgreSQL, Spark, Databricks, BigQuery, MongoDB, Neo4j, Pinecone
  • Tools: AWS, Docker, Linux, Git, CI/CD, W&B, Streamlit, Tableau