
Rishabh Luthra

Toronto

Summary

  • Data Engineer with 5+ years of experience designing data-centric solutions.
  • Developed pipelines for data engineering, data validation, and feature engineering on structured and unstructured financial and non-financial datasets.
  • Used deep business domain knowledge to independently lead the analytics process and identify valuable, innovative insights.
  • Established and maintained collaboration with research and business teams to converge on the best solutions in industries such as Finance, Telecommunications, and Insurance.
  • Extended prototypes into fully functional, scalable, and polished solutions ready for internal and external use.

Overview

6
years of professional experience

Work History

Technical Lead for Data products for Capital Markets

OMERS
02.2022 - Current
  • Project Description: Technical Lead for delivering data services to OMERS Capital Markets
  • Worked with the business to deliver Investment and Risk data from the book of records directly to the Capital Markets division. Worked closely with data owners and integrated multiple data sources to provide an API and a front end. The product currently provides an outlook on daily investment decisions dating back to 2020 and directly supports portfolio managers, analysts, traders, and developers.
  • Developed a Gen AI RAG pipeline in Databricks to achieve better and more customized performance than Copilot, and integrated the solution with tools such as Azure Document Intelligence.
  • Led the team in setting up a vector database on Azure Kubernetes Service to store T4 filings. The database currently handles 50M vectors with millisecond-level processing times.
  • Delivered a self-serve platform to the Portfolio Analytics team that automated manually run Python jobs, saving 2 FTEs in one year and enabling automation within the team.
  • Support Python libraries that let technical users gather data from various sources and run quantitative functions in one place.

Lead Data Engineer at a Retail Company

EY
08.2021 - 02.2022
  • Project Description: Delivering a cloud migration solution for a major Canadian retail company
  • Led source data analysis to develop data lineage, establishing visibility between project owners and source owners.
  • Coordinated with data stewards to develop triage processes for metadata capture, CDE identification, data quality rule development, and ingestion pattern development.
  • Bridged the gap between business users and the technical team by implementing a custom DQ framework for daily DQ dashboards.

Lead Data Engineer at a Crown Corporation

EY
03.2020 - 08.2021
  • Project Description: Solution design and implementation of a no-fault insurance model for a Crown corporation in British Columbia
  • Owned the delivery of state-of-the-art pipelines in Scala and Apache Spark running on a MapR Hadoop environment, processing over 5 million rows of transactions every day.
  • Worked with key business stakeholders to gather development requirements in an Agile environment.
  • Took ownership of the design and implementation of elements that help end users (actuaries) analyze premium rates more effectively.
  • Reduced pipeline execution time by almost 3 hours (about 60%).
  • Triaged defects in an Agile environment using Continuous Integration/Continuous Deployment (CI/CD) practices.
  • Presented the development outcomes to the business; the model is expected to help reduce insurance premiums for British Columbia drivers by an average of $250 per driver per year.

Senior Data Engineer at a Leading Canadian Pension Plan

EY
06.2019 - 03.2020
  • Project Description: Enabling Natural Language Processing capabilities for a Canadian Pension Plan
  • Developed a pipeline in Azure Databricks to clean unstructured text data, with steps such as stop-word removal and lemmatization.
  • Created features such as n-grams, word2vec embeddings, and sparse matrices of important keywords to support machine learning tasks.
  • Optimized the data engineering and feature generation process by 60% using the parallel processing of Apache Spark.
  • Developed a novel approach to topic modeling that uses business input and Guided LDA to extract the topic distribution of an article. Achieved 80% accuracy in predicting the top 2 topics.
  • Created a sentiment analysis model for text articles that uses retrofitting methods to adjust word embeddings. Achieved 75% accuracy when used with a logistic regression model to classify sentiment.
  • Used an XGBoost classifier to prioritize emails for the FinOps department. These emails consisted of daily trade calls from banks and required manual effort. Achieved 95% accuracy and saved approximately 3 FTEs.

Data Engineer at a Canadian Telecom company

EY
09.2018 - 05.2019
  • Project Description: Reporting engine for IFRS calculations for a Canadian Telecommunication Company
  • Used SQL on Teradata to design and implement the solution.
  • Created key deliverables, including the metrics report and the rolling asset report, a month-to-date summary of all data received. This report was shared with the client and used in all knowledge-sharing meetings and discussions.

Education

Master of Engineering - Electrical And Computer Engineering

Toronto Metropolitan University
05.2018

Bachelor of Engineering -

PEC University of Technology
05.2016

Skills

  • Programming languages: Python, SQL, R, Scala, Java
  • Python packages: NumPy, pandas, pyodbc, NLTK, scikit-learn, TensorFlow, Streamlit
  • Gen AI: OpenAI, Meta Llama
  • Cloud providers: Azure, AWS
  • Cloud tools: Azure Databricks, Azure Data Factory, Azure Logic Apps, Azure Synapse, Azure OpenAI, Azure Document Intelligence, Azure Cognitive Search, Azure AI Studio
  • Databases: SQL Server, MySQL, Hive, Databricks Delta Lakehouse, Azure PostgreSQL
  • Orchestrators: Prefect
  • Visualizations: Matplotlib (Python), Power BI

References

References available on request
