Hi, I’m

Shweta Dwivedi

Toronto, Canada

Summary

Data Engineer with 5+ years of experience in PySpark, SQL, Python, Hive, Airflow, Databricks, DBT, AWS Redshift, S3, Glue, and Snowflake, transforming complex data into clear, strategic business insights.

Overview

5
years of professional experience

Work History

Walmart

Data Engineer
08.2024 - Current

Job overview

  • Pioneered ETL/ELT automation & analytics using PySpark, Python, DBT, Spark SQL, Trino, and AWS Glue, coupled with Airflow for workflow management, streamlining data operations.
  • Migrated petabyte-scale S3 data to Delta Lake for schema enforcement, time travel, and faster analytics.
  • Built low-latency PySpark streaming with Kafka on Delta Lake & S3, driving operational insights.
  • Managed transactional data using Oracle, PostgreSQL, and Teradata, enhancing data storage and manipulation.
  • Enforced data quality via automated checks (AWS Deequ & Databricks DQX) to catch anomalies early.
  • Ingested API, Google Analytics, and external data into Snowflake/Redshift using PySpark and Airflow.
  • Optimized Spark jobs and SQL queries to minimize runtime and resource consumption.
  • Built CI/CD pipelines using GitHub Actions to automate testing and deployment of data workflows.
  • Created AI-driven code generation tools using Ollama-hosted LLMs for automated data modeling and version control.

Shopify

Data Engineer
08.2022 - 07.2024

Job overview

  • Led the development of scalable Databricks pipelines, leveraging Delta Lake for improved data reliability in a 10TB lake, enhancing query performance by 40%.
  • Automated ETL/ELT data extraction, cleaning, and preparation using PySpark, Python, Spark SQL, AWS Glue, and Airflow.
  • Automated data ingestion into Redshift using AWS Glue Crawlers, improving data availability for downstream users.
  • Built and managed secure data lakes on Amazon S3, enabling advanced analytics and machine learning workloads.
  • Designed and implemented a data pipeline that integrates 150 million raw records from 10+ data sources.
  • Successfully migrated legacy database systems to AWS cloud environments using AWS DMS, reducing infrastructure costs by 20% while ensuring zero data loss during the transition.
  • Implemented Hightouch reverse ETL to sync Redshift and Iceberg data into Salesforce and HubSpot.

Delta Air Lines

Data Engineer
09.2021 - 07.2022

Job overview

  • Created a data pipeline to migrate data from Oracle to Redshift, saving $750,000 and improving performance by 23%.
  • Developed and maintained PySpark scripts to automate ETL workflows between AWS S3, Glue, Hive, Redshift, and other data sources.
  • Developed a Python-based automation script for data quality checks, enhancing data integrity across 1M+ records.
  • Created a Python library to parse and reformat data from external vendors, reducing errors in the data pipeline across 1M+ records.

JP Morgan

Data Engineer
09.2020 - 08.2021

Job overview

  • Developed PySpark and Python scripts to reconcile credit card and debit card transaction data using Data Vault.
  • Collaborated with business users to define data requirements for transaction analytics and Data Vault modeling.
  • Engineered scalable ETL pipelines using PySpark to efficiently process large volumes of transaction data, enabling timely and accurate financial reporting for key stakeholders.

Education

VIT University
Pune

Master of Technology
01.2021

University Overview

GPA: 8.3

Skills

  • Python
  • PySpark
  • AWS: S3, Lambda, Redshift, SageMaker, Deequ, Glue, Step Functions, Athena
  • Databricks
  • Airflow
  • Lakehouse: Delta Lake / Iceberg
  • DBT
  • Snowflake
  • Hive
  • Oracle / Postgres
  • Spark Streaming & Kafka

Timeline

Data Engineer
Walmart
08.2024 - Current
Data Engineer
Shopify
08.2022 - 07.2024
Data Engineer
Delta Air Lines
09.2021 - 07.2022
Data Engineer
JP Morgan
09.2020 - 08.2021
VIT University
Master of Technology