HELLO, WORLD. I AM

Sreevarshini Srinivasan.

I build reliable data systems.

I'm fascinated by the mechanics of scale: how do you ingest a terabyte without losing data? What breaks first when your schema changes? How do you keep pipelines reliable when everything upstream is chaotic? I spend a lot of time thinking about these problems, and more time building systems that catch them before they become incidents.

I recently finished a master's in data science, and I'm equally interested in the algorithmic side: designing models and learning systems that actually work at scale, not just in notebooks. But I've learned that a brilliant model is worthless if the data feeding it is broken. So I design with both layers in mind: infrastructure that's observable and reliable, and models that know how to fail gracefully.


SELECTED WORK

Projects

Systems that connect real data to reliable ML outcomes.

SchemaDrift — Resilient Data Pipeline

Python · PostgreSQL · Watchdog · Pandas · JSON Schema · EDA · SQL

Built a self-documenting data pipeline that detects schema drift in real time, maps the blast radius of mutations across Bronze/Silver/Gold layers, and auto-rescues malformed data into a dynamic PostgreSQL architecture.

Impact: Auto-detects schema mutations in real time and visualizes downstream impact across 3 data layers (Bronze, Silver, Gold)

View on GitHub
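The drift check described above can be illustrated with a minimal sketch. This is not the project's actual code: the `EXPECTED_SCHEMA` contract, the `detect_drift` and `route` helpers, and the "quarantine" and "silver" destination names are all hypothetical, chosen only to show the idea of comparing an incoming record against an expected schema and diverting anything that does not match.

```python
from __future__ import annotations

from typing import Any

# Hypothetical contract for one table: column name -> expected Python type name.
EXPECTED_SCHEMA = {"order_id": "int", "amount": "float", "created_at": "str"}


def detect_drift(expected: dict[str, str], record: dict[str, Any]) -> dict[str, list]:
    """Compare one record against the expected schema and report differences."""
    observed = {key: type(value).__name__ for key, value in record.items()}
    added = sorted(set(observed) - set(expected))
    removed = sorted(set(expected) - set(observed))
    # Columns present on both sides whose type no longer matches.
    retyped = sorted(
        (col, expected[col], observed[col])
        for col in set(expected) & set(observed)
        if expected[col] != observed[col]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


def route(record: dict[str, Any], expected: dict[str, str]) -> str:
    """Send clean records onward; divert drifted ones instead of dropping them."""
    drift = detect_drift(expected, record)
    return "quarantine" if any(drift.values()) else "silver"


if __name__ == "__main__":
    drifted = {"order_id": 1, "amount": "12.50", "coupon": "X"}
    print(detect_drift(EXPECTED_SCHEMA, drifted))
    print(route(drifted, EXPECTED_SCHEMA))
```

A real pipeline would likely validate against a JSON Schema document rather than Python type names; the type-name comparison here just keeps the sketch dependency-free.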

TaxFilingChatBot — AI Assistant

Python · LLM · Docker · RAG · PDF Processing

Containerized AI assistant that grounds responses strictly in official IRS tax documents, helping international students navigate 2025 regulations with verified accuracy (no hallucinations).

Impact: Zero-hallucination tax guidance, with 100% of responses grounded in official IRS documents

View on GitHub

ICU Sepsis Prediction — DyT Transformer

Python · PyTorch · Transformers · Time-Series ML

Built a transformer-based model with dynamic normalization (DyT) that predicts sepsis risk 24+ hours in advance using irregular ICU time-series data, handling missing values and temporal gaps in real-time monitoring.

Impact: Achieves early sepsis prediction with adaptive handling of missing ICU data

View on GitHub

BrisT1D Blood Glucose Prediction

Python · Time-Series Forecasting · XGBoost · LightGBM

Data science project forecasting blood glucose 60 minutes ahead for Type 1 diabetes patients using CGM sensors, insulin doses, and activity data—built for early intervention and personalized care.

Impact: 60-minute blood glucose forecasting for T1D patients with real-time personalization

View on GitHub
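Forecasting 60 minutes ahead is usually framed as a supervised problem: a window of recent readings becomes the features, and the reading one horizon later becomes the target. The sketch below shows that framing only. The 5-minute sampling interval (so 12 steps = 60 minutes), the window length, and the `make_supervised` helper are assumptions for illustration, not details taken from the project.

```python
from __future__ import annotations


def make_supervised(
    readings: list[float], n_lags: int = 6, horizon_steps: int = 12
) -> list[tuple[list[float], float]]:
    """Turn a glucose series into (lag-window, future-value) training pairs.

    Assumes evenly spaced readings; with a 5-minute interval,
    horizon_steps=12 corresponds to a 60-minute forecast horizon.
    """
    rows = []
    # t is the index of the most recent reading available at prediction time.
    for t in range(n_lags - 1, len(readings) - horizon_steps):
        features = readings[t - n_lags + 1 : t + 1]
        target = readings[t + horizon_steps]
        rows.append((features, target))
    return rows


if __name__ == "__main__":
    series = [float(v) for v in range(100, 120)]  # 20 synthetic readings
    for features, target in make_supervised(series, n_lags=3)[:2]:
        print(features, "->", target)
```

The resulting pairs can be fed to any tabular regressor, such as the gradient-boosted models listed in the project's stack; insulin and activity signals would be added as extra feature columns in the same way.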

TandC Summarisation — Fine-tuned LLM

Python · Mistral-7B · QLoRA · NLP · Transformers · ROUGE

Fine-tuned Mistral-7B with QLoRA to summarize complex Terms & Conditions into plain English, making legal documents accessible to non-experts.

Impact: ROUGE-based evaluation for summarization quality on legal documents

View on GitHub
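For readers unfamiliar with ROUGE, the simplest variant (ROUGE-1) measures unigram overlap between a generated summary and a reference summary. The following is a simplified sketch of that calculation with whitespace tokenization and no stemming; it is not the evaluation code used in the project, and the `rouge1` helper and example sentences are made up for illustration.

```python
from __future__ import annotations

from collections import Counter


def rouge1(reference: str, candidate: str) -> dict[str, float]:
    """Simplified ROUGE-1: clipped unigram overlap as precision, recall, and F1."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge1(
        reference="you may cancel at any time",
        candidate="you can cancel any time",
    )
    print(scores)
```

Established ROUGE implementations add tokenization rules, optional stemming, and longer n-gram and longest-common-subsequence variants, so their scores will differ somewhat from this sketch.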

EXPERIENCE

Where I've Worked

Data Engineer

@ Loyalytics AI

April 2023 – July 2024 · Remote

Built lakehouse-style data infrastructure for retail analytics at scale, with reliability and governance as defaults.

Built Databricks pipelines on Delta Live Tables, managing 10 TB of historical data and hourly batch updates from 10+ sources into 100+ tables through a medallion architecture.
Built an internal tax calculation platform on Azure for LuLu, a retail chain operating across 7 countries — processing 6 years of historical data plus ongoing batch ingestion, with 100% accuracy.
Ran Apache NiFi on a VM rather than managed cloud specifically to cut compute costs and keep sensitive financial data secure.
Implemented complex sales and inventory transformation logic to power downstream analytics and reporting.
Migrated ~6 TB of data from legacy databases (Oracle, MySQL) into the Azure ecosystem and authored complex SQL and Python queries for robust data processing.
Led migration to Unity Catalog, optimizing data management and governance for a massive retail chain in the Middle East.

Azure Data Factory · Databricks · Delta Live Tables · Apache NiFi · PySpark · Python · SQL · Oracle · Azure Storage

Data Engineer Intern

@ Loyalytics AI

January 2023 – March 2023 · Hybrid

Foundation work in orchestration, transformations, and governance for retail analytics pipelines.

Built ADF pipelines orchestrating ingestion from diverse sources (Oracle, MySQL, XMLs, CSVs) in Azure Storage.
Authored complex SQL and Python queries for efficient data processing and transformation.
Automated validation processes across large datasets, capturing issues early and ensuring integrity.
Led foundational migration efforts to Unity Catalog to improve governance and data discoverability.
Improved data flow efficiency by 40% via incremental load strategies and tuning, contributing to streamlined operations.

Azure Data Factory · SQL · Python · Oracle · MySQL · Unity Catalog
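The incremental load strategy mentioned above is commonly implemented with a high-watermark: each run reads only rows changed since the last recorded timestamp, then advances that timestamp. The sketch below illustrates the pattern in plain Python under assumed names (`incremental_load`, an `updated_at` column holding ISO 8601 strings); it is not the ADF pipeline logic from this role.

```python
from __future__ import annotations


def incremental_load(
    source_rows: list[dict], last_watermark: str
) -> tuple[list[dict], str]:
    """Return rows changed since the watermark, plus the advanced watermark.

    Assumes `updated_at` is an ISO 8601 string in a single time zone,
    so lexicographic comparison matches chronological order.
    """
    new_rows = [row for row in source_rows if row["updated_at"] > last_watermark]
    # Keep the old watermark when nothing new arrived.
    new_watermark = max(
        (row["updated_at"] for row in new_rows), default=last_watermark
    )
    return new_rows, new_watermark


if __name__ == "__main__":
    source = [
        {"id": 1, "updated_at": "2023-01-01T00:00:00"},
        {"id": 2, "updated_at": "2023-01-02T00:00:00"},
        {"id": 3, "updated_at": "2023-01-03T00:00:00"},
    ]
    rows, watermark = incremental_load(source, "2023-01-01T00:00:00")
    print([row["id"] for row in rows], watermark)
```

Because each run touches only the delta rather than the full table, load time scales with the volume of change instead of the size of the source.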

Cloud Infrastructure Engineer (Founding Team)

@ DriverAI, LLC

June 2025 – August 2025 · Peoria, AZ

Built automation-first cloud infrastructure with monitoring, security best practices, and reproducible deployments.

Founding engineer responsible for setting up end-to-end ingestion for structured and unstructured data.
Deployed AWS (RDS, EC2) and Azure infrastructure ensuring high availability and scalable patterns.
Automated resource provisioning with Terraform, enhancing operational efficiency and cutting manual overhead by 60%.
Integrated monitoring via CloudWatch and Grafana to track system performance and resolve incidents early.
Collaborated on CI/CD pipelines and enforced cloud security best practices through IAM and secrets management.

AWS · Azure · Terraform · Grafana · CloudWatch · CI/CD

Lab Ambassador

@ Tinkerspace UMD

Present · College Park, MD

Supporting a collaborative robotics lab by helping users move from ideas → safe, repeatable workflows.

Support students, faculty, and staff using lab tools and equipment in a hands-on environment.
Translate technical steps into clear workflows and documentation for different skill levels.
Guide users through safe, repeatable processes that reduce friction and errors.
Help maintain shared standards for organization, tooling, and usage procedures.

Robotics · Technical Support · Documentation

SKILLS

Technologies & Tools

Data Engineering

Expert: Python • SQL • PySpark • Databricks

Familiar: Snowflake • dbt • Airflow • NiFi • PostgreSQL

Cloud & Infra

Expert: AWS (S3, EC2, Lambda) • Azure (ADF, Storage) • Terraform

Familiar: Glue • Athena • CloudWatch • Azure Databricks

Governance & Reliability

Expert: Data Quality • Lineage • Unity Catalog

Familiar: RBAC • Audit Logs • Freshness SLAs • Monitoring

ML / Deep Learning

Expert: PyTorch • Transformers • Time-Series ML • Scikit-learn

Familiar: NLP • LangChain • Gemini • XGBoost • LightGBM

Dev & Ops

Expert: Git • Linux • Bash • CI/CD

Familiar: Docker • IaC • Grafana • GitHub Actions

Dashboards

Expert: Streamlit • Analytics Reporting

Familiar: Grafana • Power BI

ABOUT

The plumbing that makes big data work.

Hi, I'm Sreevarshini. I build infrastructure for data and AI systems.

I recently finished a master's in data science, and I'm equally interested in the algorithmic side—designing models and learning systems that actually work at scale, not just in notebooks. But I've learned that a brilliant model is worthless if the data feeding it is broken. So I design with both layers in mind: infrastructure that's observable and reliable, and models that know how to fail gracefully.

EDUCATION

Education & Learning

M.S. in Data Science

University of Maryland, College Park

2024 – 2026 · College Park, MD

Coursework across machine learning, deep learning, NLP, and large-scale data systems, with a focus on building reliable ML-ready data pipelines.

Let's build something.

Currently open for new opportunities. Whether you have a question or just want to say hi, my inbox is always open.

Say Hello
