Annu Sharma

Anupama Sharma

I'm a Data Science grad student at the University of Maryland with a background in Electrical Engineering and Physics from BITS Pilani. Before UMD, I spent two years at Synechron's AI Finlabs building agentic systems, fine-tuning language models, and making sense of messy data. I'm drawn to problems at the intersection of machine learning and the real world, whether that's improving how models reason, making AI systems more reliable, or turning complex datasets into something meaningful.

📍 College Park, MD  |  📬 sharma25@umd.edu  |  🔗 LinkedIn

What I’m Working On


A Few Things I’ve Built

Project | What it does
Multi-Agent RAG System | Production-ready RAG pipeline with semantic caching, token optimization, and a live monitoring dashboard (FastAPI + LangGraph)
DS Agent Benchmark Comparison | Comparative evaluation of 4 data science agents across 4 benchmark datasets, measuring task completion, code correctness, and reasoning quality at scale (PySpark + Airflow)
Floor Plan Symbol Detector | Computer vision pipeline for architectural drawings: Faster R-CNN fine-tuned on the CubiCasa5K dataset, achieving mAP@0.5 of 0.54
YRBSS Survey Analysis | Statistical analysis of CDC national survey data (N ≈ 20,100) with post-stratification reweighting for sexual minority populations
RL Playground | Interactive dashboard for learning Reinforcement Learning from first principles: live MDP visualizer, Gridworld with Value and Policy Iteration, and a Bellman equation explorer

RL Playground — Interactive RL Learning Dashboard



Experience

Jr. Associate, ML Research — Synechron Technologies, Finlabs (July 2024 – July 2025)

Focused on production-grade ML systems for the financial domain. Fine-tuned the Qwen2 Vision-Language Model using LoRA and 4-bit quantization for financial document extraction, improving F1 from 0.68 to 0.85 on a manually labeled test set. Trained RoBERTa-based embeddings and combined them with XGBoost and Random Forest classifiers for customer segmentation across 50K records, evaluating with stratified cross-validation and integrating the pipeline into a production demo. Also built a FastAPI microservice for automated PostgreSQL DDL validation that catches referential integrity and primary key errors before production deployment, and contributed code and technical documentation to an in-house agentic AI framework.

Skills developed: LLM fine-tuning (LoRA, quantization), vision-language models, ensemble classifiers, FastAPI, PostgreSQL, agentic systems, technical documentation


AI/ML Intern — Synechron Technologies, Finlabs (July 2023 – June 2024)

Built foundational agentic and RAG infrastructure from the ground up. Developed multi-agent workflows using LangGraph and Autogen with locally hosted open-source models via Ollama, writing evaluation criteria and performance metrics per agent node to track output consistency and identify failure modes. Built an end-to-end document QA pipeline using ChromaDB, LangChain, and Streamlit to extract structured insights from unstructured financial documents at scale. Presented findings to the broader team and maintained reproducible documentation throughout.

Skills developed: multi-agent orchestration, RAG pipelines, LangGraph, LangChain, ChromaDB, Streamlit, LLM evaluation, vector databases


Skills

Languages   Python · R · SQL · C · HTML

ML / AI   PyTorch · TensorFlow · Scikit-learn · LangChain · LangGraph · Autogen · FastAPI · spaCy · NLTK · LLM Fine-Tuning (LoRA) · Diffusion Models · OpenCV · Embeddings

Data & Big Data   PySpark · Airflow · Pandas · NumPy · SciPy · Matplotlib · Power BI

Tools & Platforms   Git · Docker · Weights & Biases · ChromaDB · MongoDB · PostgreSQL · AWS · GCP

Interests   Generative AI · Computer Vision · Agentic Systems · Statistical Modeling · Responsible AI


Writing & Research


Always open to research collaborations, interesting problems, and good conversations.