AJAY SINGH

Data Scientist
Toronto, ON

Summary

Experienced Data Scientist with a proven track record of implementing machine learning techniques and frameworks across diverse industries. Skilled in solving complex business problems using structured and unstructured data. Proficient in a wide range of data science languages, tools, and libraries, with a focus on leveraging Generative AI through innovative Python applications. Demonstrates a forward-thinking approach to integrating AI technology, driving impactful results for organizations.

Overview

10 years of professional experience

Work History

Sr. Principal Engineer - Data Science

Marsh McLennan
02.2023 - 01.2025
  • Played a pivotal role in requirements gathering and in designing a hybrid anomaly detection model that flags anomalous transactional values by applying business rules, Isolation Forest, and interquartile range checks
  • Automated the ingestion of ledger data from Azure Data Lake into Databricks and performed the data transformations required for the anomaly detection solution and Power BI dashboard
  • Developed an additional feature that ranks anomalies by adding weights, letting business stakeholders focus on the most important outliers, reducing review time and improving efficiency by 60%
  • Led the design and development of an Attendance Bot that answers questions from senior management and the Real Estate team with reports and figures, saving time and improving productivity by letting users ask questions in natural language rather than writing SQL queries
  • Prepared the project-specific data through ETL tasks such as data transformation, data cleansing, and defining new columns, then loaded the data into the Synapse platform using Databricks
  • Developed the Attendance Bot using LLM techniques: prompt engineering with few-shot examples in the knowledge base, function calling with the OpenAI GPT-3.5 Turbo model, and RAG over location data to generate customised SQL queries, producing correct responses 80% of the time
  • Hosted the solution on the company's internal server using a CI/CD pipeline, Docker, and GitHub
  • Designed, developed, and iteratively refined interactive Meal Card Power BI dashboards tailored for the admin team, ensuring accurate data representation and actionable insights for effective decision-making
  • Supervised a team of data engineers and interns, providing leadership and guidance to drive performance and achieve organizational goals

Data Scientist - Advanced Analytics

Holcim Global Digital Hub
12.2020 - 02.2023
  • Led model development for cement fineness prediction, predicting cement quality (R45 values) for 40 cement manufacturing mills, and deployed the model on the manufacturing plants' virtual machines
  • The complete project ran on Google Cloud Platform using services such as Dataprep (data cleaning), Tables (model development), Docker (containerization of the developed model and export to the plant edge VM), and Data Studio (visualization dashboard)
  • Performed statistical analysis of each plant mill, highlighted data quality issues, and communicated the results to cement plant mill leaders and management stakeholders
  • Developed a rule-based alert mechanism for the marketing teams in Australia and the Philippines that identifies anomalies using statistical and rule-based algorithms, and deployed the complete solution on AWS
  • Generated alerts are sent to the Gmail and Google Chat IDs of relevant stakeholders
  • Led the development of an interactive Qlik Sense dashboard for senior management on web and mobile, managing the complete ETL flow from plant sensor data to the AWS cloud

Analyst (Data Science) - AI and Advanced Analytics

M&G Global Services Pvt. Ltd.
03.2019 - 11.2020
  • Developed a Call Forecasting solution for Customer Service Team to forecast call volumes of 35 business teams (onshore and offshore)
  • The training set for this univariate analysis project was call volumes from 2015 to 2019
  • Developed Moving Average, Single, Double, and Triple Exponential Smoothing, and Auto ARIMA models to optimize operational efficiency
  • Automated text extraction from scanned documents, pulling the data points stakeholders needed from over 1000 pages of scanned PDFs using a CNN classifier for first-page image classification, followed by Tesseract and regex to extract the relevant data points
  • The tool included a Streamlit frontend and increased stakeholder efficiency by reducing manual work by 80%
  • Collaborated as a team player in a hackathon in Vietnam, delivering a proof-of-concept product recommendation system covering 5 insurance products for the Vietnam market, using customers' demographic profiles as input parameters
  • Built the recommendation system for existing customers using cosine similarity

Software Engineer - Data Science

Rolta BI and Big Data Analytics
07.2018 - 01.2019
  • Developed a Damaged Auto Parts Segregator for the manufacturing industry that classifies parts into 4 classes (Backside, Frontside, Scratch Parts, and Bolt Damaged), helping filter out damaged parts automatically
  • Handled data understanding, image labelling, and image pre-processing (resizing), and developed a deep learning (CNN) model with a softmax classifier in TensorFlow for automatic recognition
  • Handled the data imbalance problem with image augmentation

Intern - Data Science

Intelligence Node
05.2018 - 07.2018
  • Automated product categorization by developing a Python-based tool that transforms raw text using NLTK (stop-word analysis, lemmatization, word embedding) and applies an RNN built with the Keras deep learning framework, reducing the time spent by senior team members by 40%

Sr. Engineer - Data Analyst

Leighton India Contractors Pvt. Ltd.
06.2015 - 06.2017
  • Enhanced the tendering process by designing and implementing a regression solution that predicts the cost of new projects, used as a benchmark score to help business leaders make effective decisions
  • Performed data analysis of commercials, presenting routine reports to senior management, including human resources on different project sites, and productivity charts that brought out insights on manpower, materials, and funds across projects in India

Education

PGP - Business Analytics

Praxis Business School
01.2018

MBA - Project Management

Amity University
01.2015

BE - Electronics Engineering

University of Pune
01.2013

Skills

  • Data science pipeline
  • Cleansing
  • Wrangling
  • Visualization
  • Modeling
  • Interpretation
  • Regression techniques
  • Classification techniques
  • Statistics
  • Time series
  • Function calling
  • RAG
  • LangChain
  • Python
  • TensorFlow
  • PyTorch
  • Pandas
  • NumPy
  • Scikit-learn
  • NLTK
  • Git
  • Docker
  • SQL
  • AWS
  • Lambda
  • Glue
  • S3
  • Azure
  • Databricks
  • Blob Storage
  • ADLS
  • Power BI
  • Plotly
  • Matplotlib

Languages

English: Full Professional
Hindi: Full Professional
