VENKATESHWAR REDDY KODIGANTI

San Jose

Summary

Dynamic Associate Data Engineer at HCL with expertise in designing ETL pipelines and optimizing data workflows using Azure and Spark. Proven ability to enhance data accuracy and streamline processes through collaboration and advanced SQL skills. Adept at delivering impactful insights and developing RESTful APIs, driving data-driven decision-making.

Overview

5 years of professional experience
1 Certification

Work History

Associate Data Engineer

HCL
Ashburn
06.2024 - Current
  • Designed and developed ETL pipelines in Azure by moving data from Azure Data Lake to various target systems using Azure Data Factory (ADF), Spark, and Scala.
  • Utilized PySpark and Spark SQL to perform complex data transformations and optimize large-scale data processing workflows.
  • Built robust batch-processing workflows for e-commerce and transactional data, ensuring consistency and accuracy across multiple datasets.
  • Engineered data ingestion pipelines from Snowflake, MS SQL, MongoDB, and Teradata into distributed storage, supporting advanced analytics and reporting needs.
  • Optimized dynamic workloads by leveraging Snowflake's performance and auto-scaling capabilities.
  • Implemented end-to-end PySpark-based solutions in Azure Databricks, orchestrating processes via ADF and scheduling with automation tools.
  • Constructed CI/CD pipelines using PowerShell, Bash, YAML, JSON, and Git for deployment via Azure Resource Manager (ARM) templates.
  • Diagnosed and resolved data processing issues using Spark, ensuring SLA adherence and enhanced performance of ETL scripts.
  • Collaborated with business teams and stakeholders to validate data accuracy and support reporting through advanced SQL and Python-based data handling.
  • Delivered visual insights using Power BI, Tableau, and Excel pivot tables to support business decision-making.
  • Developed RESTful APIs with Python (Flask) and built GraphQL endpoints to streamline data access across multiple services.
  • Engineered and deployed Azure-based PaaS and IaaS architectures in cross-functional collaboration to ensure optimal performance and cost efficiency.

Junior Data Engineer

TWINAPPS SOFTWARE
Hyderabad
01.2021 - 08.2022
  • Supported senior engineers in developing ETL pipelines using AWS Glue, Lambda, and S3 to process data from APIs and flat files.
  • Gained practical experience working with Amazon Redshift and RDS for data storage, transformation, and query optimization.
  • Assisted in automating data workflows using Apache Airflow and AWS Step Functions, enhancing overall pipeline reliability and reducing manual effort.
  • Engaged in team discussions, testing cycles, and documentation processes to gain a deeper understanding of full project life cycles.
  • Contributed to creating and testing Spark jobs for log data processing, aiding in the development of real-time analytics use cases.
  • Collaborated on data profiling and cleansing tasks using AWS Glue DataBrew to prepare customer datasets.
  • Developed and executed Python scripts for data extraction and validation between S3 and Redshift environments.
  • Explored Amazon Athena for querying structured data and utilized Terraform for setting up key infrastructure components.

Education

Master of Science - Business Analytics

University of the Pacific
Stockton, CA
05.2024

Skills

  • Programming & Scripting: Python, R, SQL
  • Python Libraries: Pandas, NumPy, Scikit-learn, PySpark, Matplotlib
  • Big Data Ecosystem: Spark, Kafka, Apache Airflow
  • Cloud Platforms: AWS (EMR, EC2, RDS, S3), Azure (Data Factory, Blob Storage, Databricks), GCP
  • Data Warehouses: Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse Analytics, Apache Hive, Teradata, Oracle
  • Data Engineering: Data pipeline design, Data product development, ETL processes, Data Governance, Data modeling
  • Tools & Methodologies: Software Development Life Cycle, Project Life Cycle, Data Systems Understanding, GitLab, GitHub
  • ETL pipeline development
  • Data transformation
  • SQL querying
  • RESTful API development
  • Data governance
  • Data analytics
  • Workflow automation
  • Stakeholder collaboration
  • Data pipeline design
  • Data modeling
  • Data warehousing
  • SQL programming
  • Data migration
  • Data integration
  • Real-time processing
  • Big data analytics
  • Data analysis
  • Database security
  • RDBMS
  • Amazon Redshift
  • Data acquisition

Certification

Deloitte Australia: Data Analytics job simulation

Timeline

Associate Data Engineer

HCL
06.2024 - Current

Junior Data Engineer

TWINAPPS SOFTWARE
01.2021 - 08.2022

Master of Science - Business Analytics

University of the Pacific