Data Engineer with 7+ years of experience in PySpark, SQL, Python, Hive, Airflow, Databricks, DBT, AWS Redshift, S3, Glue, and Snowflake, transforming complex data into clear, strategic business insights.
Overview
7 years of professional experience
Work History
Data Engineer
Walmart
08.2024 - Current
Pioneered ETL/ELT automation & analytics using PySpark, Python, DBT, Spark SQL, Trino, and AWS Glue, coupled with Airflow for workflow management, streamlining data operations.
Migrated petabyte-scale S3 data to Delta Lake for schema enforcement, time travel, and faster analytics.
Built low-latency PySpark streaming with Kafka on Delta Lake & S3, driving operational insights.
Managed transactional data using Oracle, PostgreSQL, and Teradata, enhancing data storage and manipulation.
Enforced data quality via automated checks (AWS Deequ & Databricks DQx) to catch anomalies early.
Ingested API, Google Analytics, and external data into Snowflake/Redshift using PySpark and Airflow.
Optimized Spark jobs and SQL queries to minimize runtime and resource consumption.
Built CI/CD pipelines using GitHub Actions to automate testing and deployment of data workflows.
Created AI-driven code generation tools using Ollama-hosted LLMs for automated data modeling and version control.
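The automated quality checks above used AWS Deequ and Databricks DQx; as a framework-free illustration of the same rule-based idea, a minimal sketch (rule thresholds and field names are hypothetical, not from the actual pipeline) might look like:

```python
# Minimal rule-based data quality checker, a framework-free sketch of
# Deequ/DQx-style checks. Rules and field names are hypothetical examples.

def check_completeness(rows, field, min_ratio):
    """Fraction of rows with a non-null value for `field` must meet min_ratio."""
    non_null = sum(1 for r in rows if r.get(field) is not None)
    ratio = non_null / len(rows) if rows else 0.0
    return ratio >= min_ratio

def check_range(rows, field, lo, hi):
    """Every non-null value of `field` must fall within [lo, hi]."""
    return all(lo <= r[field] <= hi for r in rows if r.get(field) is not None)

def run_checks(rows):
    """Run all checks and return a dict of check name -> pass/fail."""
    return {
        "order_id_complete": check_completeness(rows, "order_id", 0.99),
        "amount_in_range": check_range(rows, "amount", 0, 10_000),
    }
```

In Deequ or DQx, the same rules would be declared as constraints and evaluated inside the Spark job, so anomalies are caught before data lands in downstream tables.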
Data Engineer
Shopify
08.2022 - 07.2024
Led the development of scalable Databricks pipelines, leveraging Delta Lake for improved data reliability in a 10TB lake, enhancing query performance by 40%.
Automated ETL/ELT workflows for data extraction, cleaning, and preparation using PySpark, Python, Spark SQL, AWS Glue, and Airflow.
Automated data ingestion into Redshift using AWS Glue Crawlers, improving data availability for downstream users.
Built and managed secure data lakes on Amazon S3, enabling advanced analytics and machine learning workloads.
Designed and implemented a data pipeline that integrated 150 million raw records from 10+ data sources.
Successfully migrated legacy database systems to AWS cloud environments using AWS DMS, reducing infrastructure costs by 20% while ensuring zero data loss during the transition.
Implemented Hightouch reverse ETL to sync Redshift and Iceberg data into Salesforce and HubSpot.
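Integrating 150M raw records from 10+ sources, as described above, hinges on a merge-and-dedupe step; a toy plain-Python sketch of that logic (record shape and the "latest update wins" policy are illustrative assumptions, the real job ran in Databricks/PySpark) could be:

```python
# Toy merge-and-dedupe step for multi-source ingestion, keyed on record id.
# Record shape and "newest updated_at wins" policy are illustrative assumptions.

def merge_sources(*sources):
    """Merge record lists from several sources, keeping the newest copy per id."""
    merged = {}
    for source in sources:
        for record in source:
            key = record["id"]
            current = merged.get(key)
            if current is None or record["updated_at"] > current["updated_at"]:
                merged[key] = record
    return list(merged.values())
```

At Delta Lake scale, the same policy is typically expressed as a `MERGE INTO` with a dedupe window over the update timestamp rather than an in-memory dict.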
Data Engineer
Delta Airlines
11.2020 - 07.2022
Created a data pipeline to migrate data from Oracle to Redshift, saving $750,000 and improving performance by 23%.
Developed and maintained PySpark scripts to automate ETL workflows between AWS S3, Glue, Hive, Redshift, and other data sources.
Developed and implemented a Python-based automation script for data quality checks, enhancing data integrity across 1M+ records.
Created a Python library to parse and reformat data from external vendors, reducing pipeline errors across 1M+ records.
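A simplified sketch of the kind of vendor-feed parsing library described above (the pipe-delimited layout and field names are hypothetical stand-ins for the actual vendor formats):

```python
import csv
import io

# Simplified vendor-feed parser: normalizes delimited vendor rows into dicts
# with consistent field names and types, collecting malformed line numbers.
# The pipe-delimited sku|qty|price layout is a hypothetical example.

FIELDS = ["sku", "qty", "price"]

def parse_vendor_feed(raw_text):
    """Parse pipe-delimited vendor rows; return (records, error_line_numbers)."""
    records, errors = [], []
    reader = csv.reader(io.StringIO(raw_text), delimiter="|")
    for lineno, row in enumerate(reader, start=1):
        if len(row) != len(FIELDS):
            errors.append(lineno)
            continue
        try:
            records.append({
                "sku": row[0].strip().upper(),
                "qty": int(row[1]),
                "price": float(row[2]),
            })
        except ValueError:
            errors.append(lineno)
    return records, errors
```

Tracking rejected line numbers instead of failing the whole batch is what lets a pipeline quantify and reduce error rates across millions of records.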
Data Engineer
JP Morgan
03.2018 - 11.2020
Developed PySpark and Python scripts to reconcile credit card and debit card transaction data using Data Vault.
Collaborated with business users to define data requirements for transaction analytics and Data Vault modeling.
Engineered scalable ETL pipelines using PySpark to efficiently process large volumes of transaction data, enabling timely and accurate financial reporting for key stakeholders.
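The reconciliation work above ran in PySpark against a Data Vault model; a plain-Python sketch of the core matching step (transaction shape is a hypothetical stand-in) is:

```python
# Plain-Python sketch of transaction reconciliation: match two feeds by
# transaction id and flag amount mismatches and missing records.
# Record shape is a hypothetical stand-in for the PySpark/Data Vault job.

def reconcile(card_feed, ledger_feed):
    """Return ids that mismatch on amount or appear in only one feed."""
    cards = {t["txn_id"]: t["amount"] for t in card_feed}
    ledger = {t["txn_id"]: t["amount"] for t in ledger_feed}
    return {
        "amount_mismatch": sorted(
            k for k in cards.keys() & ledger.keys() if cards[k] != ledger[k]
        ),
        "missing_in_ledger": sorted(cards.keys() - ledger.keys()),
        "missing_in_cards": sorted(ledger.keys() - cards.keys()),
    }
```

At scale the same comparison is a full outer join on the transaction key, with the three buckets derived from null-side and value-difference predicates.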