- LLM English-SQL Bot: Improved the performance of an Azure OpenAI GPT-3.5 model through Snowflake database tuning, enabling natural-language database search in Snowpark for the BI team.
Tools: Azure OpenAI, Python, Snowpark, Streamlit, Snowflake, GPT-3.5
- Customer Segmentation 360 Dashboard: Developed a customer segmentation algorithm on a 900M-transaction dataset, improving targeted marketing strategies and leading to the creation of the "Customer 360 Dashboard" for Staples US stores.
Tools: Python, SQL, Snowpark, Clustering, Big Data
- A/B Testing: Streamlined the marketing team's A/B testing process with a Python application.
Tools: Python, A/B Test, Bayesian Probability, Flask
- Sentiment Analysis: Performed sentiment analysis on 560K survey responses, achieving 96.7% labeling accuracy with TF-IDF and LSTM models.
Tools: Python, Snowflake, TF-IDF, NLTK, PySpark, Tableau, Power BI
- Time Series Forecasting: Mentored a junior data scientist in time series analysis and implemented a production-ready LSTM model in Snowflake, enabling hourly flash reports and data anomaly detection.
Tools: Python, Snowflake, Neural Network (Keras), PyTorch, Time Series Analysis (SARIMAX), Tableau
- Lifetime Value Model: Migrated the Lifetime Value (LTV) model from pandas to PySpark, cutting training time from 2 days to 40 minutes when modeling 17M customers with XGBoost.
Tools: Python, Snowflake, XGBoost, PySpark