Sathish Punati

Calgary, AB

Summary

Data Scientist with 10 years of experience, familiar with gathering, cleaning, and organizing data for use by technical and non-technical personnel. Advanced understanding of statistical, algebraic, and other analytical techniques.

Overview

9 years of professional experience
1 Certification

Work History

Senior Data Scientist

360 EEC
10.2023 - 01.2024
  • Spearheaded critical data extraction from PDF documents related to environmental safety assessments of oil wells, employing advanced Natural Language Processing (NLP) techniques
  • Leveraged Hugging Face's cutting-edge NLP models and Transformers (BERT) for sentiment analysis on extracted data, contributing pivotal insights for environmental risk evaluation
  • Demonstrated adaptability to industry-specific tasks and AI technologies, showcasing proficiency in Large Language Model (LLM) implementation, Generative Adversarial Networks (GANs), and expert data extraction within the challenging context of oil well safety assessments
  • Implemented generative chat functionality for asking questions about PDFs using the OpenAI ChatGPT API
  • Applied fine-tuning methodologies using LoRA and QLoRA to customize models according to project requirements, optimizing performance and ensuring robust results
  • Orchestrated dataset structuring and library setup for Falcon 7B, incorporating Hugging Face Transformers, Datasets, and WandB for streamlined training progress monitoring
  • Selected and configured the Falcon 7B LLM, defined PEFT parameters for LoRA, and implemented quantization strategies, balancing memory efficiency with acceptable error rates
  • Defined training arguments, including batch size, optimizer, learning rate scheduler, and checkpoints, for the fine-tuning process
  • Executed fine-tuning using the Hugging Face Trainer with PEFT configuration, monitored training progress with WandB, and guarded against overfitting by continuously tracking both training and validation loss.

Practice Lead AI/ML

Infovision Inc.
06.2022 - 06.2023
  • Led a team of 5+ data scientists and data engineers in developing a time series forecasting model for predicting the demand patterns of products at Henry Schein's warehouse over time
  • Utilized Google Cloud Platform tools, including BigQuery, SQL, and Python libraries such as Keras and TensorFlow, to implement an efficient algorithm
  • This significantly improved inventory management by providing accurate forecasts of product demand, aiding in proactive decision-making
  • Conducted Pareto analysis on a comprehensive dataset comprising 400K products, customers, and suppliers
  • Identified high-demand and high-profit items for targeted optimization and combinatorial optimization, enhancing overall warehouse efficiency
  • Investigated influential factors affecting specific variables over different time periods, contributing to a nuanced understanding of the product time series forecasting model
  • Executed regression analysis to refine and optimize the predictive model, emphasizing precision and reliability in forecasting demand patterns for 400K products across various customers and suppliers
  • Analyzed historical data and patterns to comprehend the underlying dynamics influencing product demand over time
  • Aligned current demand situations with patterns derived from historical data, ensuring accuracy by synchronizing trends from past data with real-time information
  • Contributed to a deeper understanding of product demand, streamlined inventory operations, and refined business strategies
  • The project ultimately improved Henry Schein's overall operational efficiency and customer satisfaction through proactive inventory management.

Team Lead/Senior Data Scientist

HTC Global Services
04.2021 - 06.2022
  • Led and coordinated two teams of data engineers and ML engineers in the development of a sophisticated customer segmentation algorithm using K-Means clustering within the Google Cloud Platform (GCP) infrastructure
  • Implemented advanced techniques within the K-Means clustering process to enhance accuracy and granularity in customer segmentation
  • Managed large-scale customer datasets, overseeing projects focused on Motor Insurance cross-selling, Customer Segmentation based on spending patterns, and Revenue Growth prediction through the utilization of various bank products
  • Conducted data preprocessing using Google BigQuery and Vertex AI, implementing Python libraries such as Pandas and NumPy for efficient data handling and manipulation
  • Developed and fine-tuned machine learning algorithms, incorporating regression models for Revenue Growth prediction
  • Applied statistical methods and feature engineering to optimize the accuracy and reliability of the models
  • Spearheaded data migration initiatives to streamline processes and enhance overall data efficiency
  • Designed and implemented an interactive Google Analytics dashboard, providing stakeholders with a visually intuitive platform for data exploration and insights
  • Employed advanced ML techniques to enhance Revenue Growth prediction models, leveraging insights derived from spending patterns and customer segmentation
  • Collaborated with business stakeholders and cross-functional teams to understand objectives and ensure ML algorithms aligned with strategic goals
  • Contributed to Hong Leong Bank Malaysia's data-driven decision-making process, enabling personalized customer engagement, targeted marketing efforts, and sustainable revenue growth.

Senior Data Scientist

Techno Brain Group
03.2020 - 04.2021
  • Developed ML-Spark scripts to process thousands of images for a prediction algorithm
  • Created synthetic data using Keras data augmentation to improve model training
  • Developed a computer vision algorithm for cattle recognition with Convolutional Neural Networks (CNNs) in TensorFlow and Keras, using the SIFT (Scale-Invariant Feature Transform) technique
  • Conducted experimental evaluations, demonstrating the superior performance of the proposed algorithm
  • Developed a generative AI model using GANs that generates high-quality natural images, producing progressively more realistic data by coupling the generator with an adversarial network
  • The framework can generate very high-quality synthetic data and can also be used to enhance pixels in photos and to convert images from one domain to another
  • Achieved a high identification accuracy of 93.3% within a reasonable processing time
  • Outperformed traditional identification approaches, which achieved an identification accuracy of 84%.

Data Scientist

Avows Group
11.2019 - 03.2020
  • Developed a CNN project using Python to identify optimal tree crowns for paper-making in a paper mill
  • Applied convolutional neural network algorithms to determine suitable cutting points, enhancing efficiency in paper production
  • Implemented a precise algorithm, enhancing the paper-making process by automating the identification of ideal tree crown cutting points using convolutional neural networks and TensorFlow in Python
  • Built a Python-based machine learning recommender model using Artificial Neural Networks and the Random Forest algorithm to predict KAPPA values for RGE Group Indonesia, incorporating TensorFlow and Keras frameworks to address the complexity of the task
  • Collected and cleansed data using SQL (ETL) and conducted a decade-spanning data analysis to anticipate optimal KAPPA values based on diverse parameters
  • Enhanced data visualization through a Tableau dashboard, while also crafting a deep learning model for precise KAPPA number predictions
  • Collaborated with business stakeholders and cross-functional teams to understand objectives and ensure ML algorithms aligned with strategic goals
  • Contributed to RGE Group Indonesia's data-driven decision-making process, enabling personalized customer engagement, targeted marketing efforts, and sustainable revenue growth.

Data Scientist

Automotive Robotics India Pvt Ltd
02.2017 - 11.2019
  • In the project "Caterpillar Engine Image Detection Using CNN" (Convolutional Neural Networks), my tasks included exploratory data analysis for pipeline establishment, choosing a network architecture and experimenting with its design, exploring pre-processing techniques to enhance model performance, and using mind maps for process optimization and continuous improvement
  • Created synthetic data (engine images) using Keras data augmentation to improve model training
  • The main achievement of the project was developing an algorithm to predict engine model numbers from provided images
  • For the same client, Caterpillar, I developed an additional machine learning algorithm for predictive maintenance of boat engines
  • This involved collecting data from various Engine Control Units (ECUs) installed on the engine, employing Artificial Neural Networks to address this intricate challenge, and implementing predictive maintenance analytics using machine learning models built with Python.

Data Analyst

Infogem Web Solutions pvt ltd
11.2014 - 01.2017
  • As a data analyst for the project "Demand Planning and Inventory Management" at Siloam Hospitals, my responsibilities included extracting raw data and developing a data discrepancy report across different data sources, then migrating data from MySQL to Microsoft Excel sheets and further processing it in Python for analysis
  • Applied NLP to text data to support forecasting of future demand
  • Used the data discrepancy report to perform Pareto analysis, classifying SKUs into top 70%, mid 20%, and low 10% categories based on their value and profitability, using SQL, Python, Scikit-Learn, Pandas, Matplotlib, Seaborn, and Power BI.

Education

Electrical Engineering

Vellore Institute of Technology
Vellore, TN
08.2010

Maths and Physics

Sri Chaitanya College
Hyderabad, Telangana
05.2006

Skills

  • Machine Learning
  • Deep Learning
  • Python
  • SQL
  • Pandas
  • Statistics
  • Data Management
  • PySpark
  • Artificial Intelligence
  • MLOps
  • NLP
  • Google Cloud Platform
  • TensorFlow
  • Microsoft Azure
  • Intelligence Gathering
  • Advanced data mining

Certification

  • Improving Deep Neural Networks - Coursera
  • Machine Learning A-Z - Udemy
  • Deep Learning A-Z - Udemy
