SaiReddy Thatiparthi

Fremont

Summary

  • 5.4 years of professional experience delivering Data Science, AI/ML, Deep Learning, NLP, and GenAI solutions across Aviation, Finance/Banking, and Healthcare
  • Strong background in Big Data processing using Apache Spark, Kafka, and Hadoop, with a focus on delivering impactful business insights and measurable improvements
  • Expertise in Machine Learning and Deep Learning techniques: Classification, Regression, Clustering, Time Series forecasting, Computer Vision, NLP, and cutting-edge GANs, Autoencoders, and Diffusion Models
  • Proven success in MLOps pipeline implementation using MLflow, Airflow, Kubeflow, Jenkins, and Docker, with measurable cost and effort reductions
  • Hands-on experience building production-ready Generative AI and LLM systems, RAG-based GenAI chatbots, and scalable ML pipelines on AWS, with tuning using LangChain, Hugging Face, and vector databases

Overview

6 years of professional experience
1 Certification

Work History

Data Scientist – ML Engineering

American Airlines
Dallas
01.2024 - Current
  • Built and deployed AI/ML models for flight optimization using AWS SageMaker, Bedrock, and GCP Vertex AI, reducing operational costs by 45%
  • Engineered a fast, adaptable machine learning model that improved airline search accuracy by 30%
  • Developed a robust data pipeline that reduced manual data extraction time by 50%
  • Created a CNN-based computer vision pipeline using OpenCV and TensorFlow to detect aircraft damage, boosting accuracy by 45%
  • Utilized NVIDIA GPU acceleration to train and fine-tune deep learning models and LLMs for flight operations and customer experience automation
  • Developed a RAG-based GenAI chatbot using LangChain, OpenAI, Pinecone, and Neo4j, cutting internal manual search time by 80%
  • Led deployment of end-to-end MLOps pipelines using MLflow, Airflow, Kubeflow, Jenkins, Docker and Kubernetes for CI/CD model automation
  • Implemented end-to-end Generative AI solutions to automate ticket rebooking and enhance customer support
  • Collaborated with cross-functional teams to translate business requirements into AI/ML solutions, improving the decision-making process
  • Environment: Python, R, C++, Hive, Kafka, HDFS, AWS SageMaker AI, Bedrock, Databricks, Snowflake, NLTK, SpaCy, Scikit-learn, TensorFlow, PyTorch, Keras, MLflow, Airflow, Kubeflow, Kubernetes, Docker, Jenkins, Matplotlib, Seaborn, RASA, Dialogflow, Hugging Face, Open API, LangChain, Mistral, Transformers, ANN, RNN, CNN, OpenCV, GPT, BERT, NVIDIA GPU, FastAPI, Flask, Streamlit

Data Scientist

Infosys
12.2020 - 08.2022
  • Created an ensemble of gradient boosting and neural network models for fraud detection, with feature extraction techniques to handle large financial datasets, reducing false positives by 20%
  • Developed fraud detection pipelines with real-time data streaming using Apache Kafka and Spark, enhancing detection of anomalies in banking transactions
  • Utilized TensorFlow and Keras to build deep learning models, improving the identification of market anomalies by 20%
  • Conducted A/B testing on predictive models, increasing the conversion rate of financial products by 12%
  • Applied natural language processing to analyze sentiment from social media data, providing actionable insights for investment decisions
  • Implemented a scalable ML infrastructure with AWS SageMaker, GCP and Azure, reducing model deployment time by 40%
  • Created interactive dashboards and reports using Matplotlib, Seaborn, Power BI and Tableau to present model outputs and key metrics
  • Worked closely with business stakeholders and data engineers to understand domain requirements and deliver actionable insights
  • Environment: Python, Spark, Hadoop, Kafka, Hive, Matplotlib, Seaborn, Tableau, Scikit-learn, SQL, AWS SageMaker, GCP, NLTK, SpaCy, TensorFlow, Docker, Jenkins, MLflow, Kubeflow, Airflow, Azure ML, Classification, Regression, Clustering

Data Scientist

Hexaware
01.2019 - 12.2020
  • Assisted in the development of a machine learning model for patient flow optimization, contributing to a 10% reduction in patient wait times through early deployment insights
  • Automated the data processing pipeline, decreasing data readiness time by 45%, which enabled faster analytical turnaround and decision-making
  • Conducted exploratory data analysis and data visualization for healthcare datasets, aiding data-driven decision making and strategic planning
  • Developed a machine learning-driven EHR system that streamlined patient data processing, leading to a 40% reduction in administrative workloads and a 20% increase in data accuracy
  • Implemented a feature engineering pipeline using Python and scikit-learn that enabled the creation of highly predictive models for patient readmission risks, slashing readmission rates by 25% within a year
  • Pioneered the use of unsupervised algorithms for anomaly detection in medical datasets, identifying potential data breaches and health fraud with an efficacy improvement of over 70%
  • Collaborated with external healthcare providers to tailor machine learning solutions, boosting predictive accuracy by adapting models to specific demographic data, resulting in a 30% increase in partnership engagements
  • Environment: Python, C++, TensorFlow, Scikit-learn, SQL, AWS SageMaker, Pandas, Matplotlib, Seaborn, Classification, Regression, Clustering, MLflow, Airflow, Kubeflow, Docker, Flask, Kubernetes, Apache Spark, Hadoop, GCP, Azure

Education

Master of Science - Information Technology, Data Science & Machine Learning

Trine University
Angola, IN, USA
12.2023

Skills

  • Python
  • R
  • C
  • Java
  • SQL
  • EDA
  • ETL
  • Feature Engineering
  • Hypothesis Testing
  • Statistics
  • Mathematics
  • Linear algebra
  • Tableau
  • Power BI
  • Matplotlib
  • Plotly
  • Seaborn
  • Hadoop
  • HDFS
  • Apache Spark
  • Hive
  • Kafka
  • Storm
  • Snowflake
  • Databricks
  • MySQL
  • Oracle
  • Microsoft SQL Server
  • Teradata
  • Regression
  • Classification
  • Clustering
  • Scikit-learn
  • TensorFlow
  • PyTorch
  • Keras
  • Time Series: ARIMA
  • Prophet
  • ANN
  • CNN
  • RNN
  • LSTM
  • GRU
  • GANs
  • Autoencoders
  • OpenCV
  • Transformers
  • AWS SageMaker AI
  • Bedrock
  • Redshift
  • GCP Vertex AI
  • Azure ML
  • Docker
  • MLflow
  • Airflow
  • Kubeflow
  • LLMs
  • GPT
  • BERT
  • T5
  • LLaMA
  • Mistral
  • LangChain
  • Hugging Face
  • DALL-E
  • Gemini
  • Prompt Engineering
  • RAG
  • Pinecone
  • Weaviate
  • Neo4j
  • FAISS
  • Chroma DB
  • Sentiment Analysis
  • Document Classification
  • Word Embeddings: BOW
  • TF-IDF
  • Word2Vec
  • GloVe
  • RASA
  • Dialogflow
  • NLTK
  • SpaCy
  • HTML
  • CSS
  • JavaScript
  • React.js
  • Angular
  • Django
  • FastAPI
  • Flask
  • Streamlit

Certification

  • AWS Certified Machine Learning Engineer – Associate, issued 12/06/24, expires 12/06/27, https://www.credly.com/users/saireddy-thatiparthi
  • AWS Certified Machine Learning – Specialty, issued 12/28/24, expires 12/28/27, https://www.credly.com/users/saireddy-thatiparthi
  • NVIDIA Certified Associate – Generative AI LLMs, issued 03/04/25, expires 03/04/27, https://www.credly.com/users/saireddythatiparthi.9b7c49c0
