Paramik Dasgupta

Toronto, ON

Summary

Data engineering professional with around 5 years of experience, adept at data warehousing, data ingestion, and data wrangling using SQL, Python, and ETL/ELT processes on both on-premises and cloud platforms. Extensive industry experience in data wrangling and data manipulation to improve workflow systems and build strategies that boost revenue.

Overview

6 years of professional experience

Work History

Data Engineer

Aeonix Research and Innovations LLP
05.2023 - 07.2023
  • Gathered data requirements, developed the data architecture, and deployed an Azure-based image processing pipeline for efficient data handling and ingestion, reducing processing time and increasing scalability by 30%.
  • Integrated YOLOv8 Python models into Azure, enhancing image detection capabilities by 25%.
  • Contributed to data acquisition, used Azure Data Labeling services for test image annotation and preprocessing, and suggested automated steps to support business operations.
  • Played a key role in ensuring data accuracy, completeness, usability, and security, and in user testing, to deliver effective solutions within the Azure environment, resulting in cost savings of $50,000 for the client.

Data Engineer

PricewaterhouseCoopers
Kolkata, India
07.2017 - 01.2022
  • Worked with stakeholders to identify heterogeneous external sources; extracted and integrated data from flat files and IDW sources and loaded it into the staging area and database tables.
  • Analyzed requirements and provided technical solutions. Performed a lift-and-shift migration of tables from an on-premises database and worked on data management, data integrity, and data governance for a banking client.
  • Designed and built jobs for a global electronic appliances client to harmonize large datasets, loading Parquet files into Delta tables and processing data with Azure Databricks and PySpark, reducing processing time by 40%.
  • Connected API-based, file-based, cloud-based, and database-based source systems to ingest source data in CSV, Parquet, and JSON formats into cloud storage.
  • Performed ETL of client data from an on-premises database to ADL by building pipelines with Linked Services, Datasets, and Activities, reducing data transfer time by 30%.
  • Performed data movement, data transformations, and control activities in Azure Data Factory and Azure Databricks.
  • Pre-processed data in the intermediate layer and handled complex transformations using PySpark in the staging layer, improving data quality by 15%.
  • Transformed data into various perspectives using Azure Databricks to build aggregate KPIs, resulting in cost savings of $25,000 for a global electronic appliances client.
  • Performed data quality assurance by designing and executing test cases and validating data post-migration to ensure integrity and accuracy.
  • Triggered copy job pipelines to move data objects from SQL Server to Azure storage.
  • Collaborated with the product owner to develop KPIs for Power BI dashboards, using MySQL for data analysis and benchmarking.
  • Translated business logic into DAX queries and built calculated columns and measures in Power BI.
  • Used the M language in Power Query Editor to build date tables and perform other data transformations.

Education

Master of Science - Data Science and AI

Asian Institute of Technology
Thailand
12-2023

Master of Science - Computer Science and Engineering

University of Connecticut
United States of America
08-2017

Bachelor of Science - Computer Science and Engineering

Maulana Abul Kalam Azad University of Technology
India
07-2015

Skills

Programming Languages: Python, SQL, PySpark, DAX

Relational Databases: Oracle MySQL, Microsoft SQL Server, PostgreSQL, Data warehousing

On-Prem and Cloud Environments:

On-Prem: SSIS, SSAS, SSRS

Amazon Web Services (AWS): EC2, S3

Microsoft Azure: Azure Data Factory (ADF), Azure Databricks, Azure Data Lake Storage Gen2 (ADLS Gen2), Azure Blob Storage, Azure SQL Server, Azure Synapse

Version Control: Git

Data visualization tools: Power BI, Tableau, SQL Server Reporting Services (SSRS), Jupyter Notebook

Operating Systems: Windows, Linux

Microsoft Office Suite: MS Word, MS Excel, MS PowerPoint
