
Ramkishore Pulakurthi

Plainsboro, NJ

Summary

Adept at driving data engineering innovations, I leveraged PySpark and strategic problem-solving at Albertsons to enhance data processing efficiency by 30%. My expertise spans ETL development to cloud migrations, underscored by a proven track record of optimizing data solutions and leading high-impact projects across diverse environments.

Overview

13 years of professional experience
1 certification

Work History

Data Engineer

Albertsons
11.2022 - Current
  • Built the datamart according to the requirements of the C360 business team.
  • Performed ETL operations and pushed data to MongoDB using Databricks; fetched birthday data from the Protegrity API and stored the decrypted data in the C360_Customer_Profile table.
  • Worked on cloud migration project transitioning from Azure to GCP.
  • Deploying the code through the CI/CD Pipeline using GitHub Actions.
  • Scheduled tasks using Azure Databricks workflows and Airflow.
  • Monitoring the jobs and resolving the job failures.
  • Implemented data transformations through BigQuery MERGE queries, CTEs, and stored procedures.
  • Developed ETL logic using Databricks (PySpark).
  • Managed version control and deployment of data applications using Git and Jenkins.
  • Involved in the Requirement Gathering, Analysis, Development, Unit Testing, Deployment of code, Bug fixes and Enhancements.
  • Implemented robust data warehousing solutions using tools such as Snowflake and BigQuery.
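The BigQuery MERGE transformations above typically follow an upsert pattern: staging rows update matching target rows and insert new keys. A minimal sketch of how such a statement can be generated (all table and column names here are hypothetical, not taken from the actual project):

```python
# Hypothetical sketch of a BigQuery MERGE (upsert) generator for datamart
# loads. Table and column names are illustrative only.

def build_merge_sql(target: str, staging: str, key: str, cols: list) -> str:
    """Build a BigQuery MERGE statement: update matches, insert new keys."""
    set_clause = ", ".join(f"T.{c} = S.{c}" for c in cols)
    col_list = ", ".join([key] + cols)
    src_list = ", ".join(f"S.{c}" for c in [key] + cols)
    return (
        f"MERGE `{target}` T\n"
        f"USING `{staging}` S\n"
        f"ON T.{key} = S.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({src_list})"
    )

sql = build_merge_sql(
    "proj.c360.customer_profile",     # hypothetical target table
    "proj.staging.customer_updates",  # hypothetical staging table
    "customer_id",
    ["email", "birth_date"],
)
print(sql)
```

Generating the statement from metadata like this keeps one upsert template reusable across many datamart tables.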

Data Engineer

CVS
06.2022 - 11.2022
  • Optimized Datamart structures with performance tuning methodologies, reducing query latency by 15%.
  • Designed ETL processes utilizing Azure Databricks and PySpark.
  • Performed a proof of concept using Delta Lake for ACID-compliant table operations.
  • Facilitated code deployment via CI/CD pipeline with GitHub Actions.
  • Involved in the Requirement Gathering, Analysis, Development, Unit Testing, Deployment of code, Bug fixes and Enhancements.
  • Worked with Unity Catalog in Databricks.
  • Spearheaded the development of scalable ETL pipelines using Azure Databricks (PySpark), enabling real-time data processing and achieving a 30% increase in data processing efficiency.

Data Engineer

Swissre
05.2021 - 01.2022
  • Worked on Swissre Reinsurance project to migrate the ongoing functionality from RDBMS to Azure Cloud using Big Data Technologies.
  • Developed automated pipelines in Azure Data Factory to facilitate seamless migration of on-premise RDBMS data to Azure Data Lake.
  • Implemented ETL processes using Azure Databricks with PySpark, optimizing data processing efficiency.
  • Developed pipelines in Azure Data Factory and scheduled them with scheduled triggers.
  • Chained multiple pipelines to run in series or in parallel using Azure Data Factory.
  • Maintained config tables in Azure SQL Database.
  • Utilized Python lists, sets, dictionaries, and pandas.
  • Implemented a robust retry strategy (RETRYDAT) for failed pipeline executions with minimal human input, leveraging Azure Data Factory.
  • Maintained pipeline status in Azure SQL tables and tracked the reason for any pipeline failure in config tables.
  • Handled data ingestion and storage using Parquet, CSV file formats, and optimized storage efficiency and query performance.
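The retry-with-status-tracking pattern described above can be sketched in a few lines: each run records its status and failure reason in a config store, here simulated with an in-memory dict rather than the Azure SQL config tables the project used. Function and step names are hypothetical.

```python
# Minimal sketch of retry-with-status-tracking. In the real pipeline the
# status rows live in Azure SQL config tables; a dict stands in here.
import time

def run_with_retry(name, fn, status_table, max_retries=3, delay=0.0):
    """Run a pipeline step, recording status and failure reason per attempt."""
    for attempt in range(1, max_retries + 1):
        try:
            result = fn()
            status_table[name] = {"status": "SUCCEEDED",
                                  "attempts": attempt, "reason": None}
            return result
        except Exception as exc:
            status_table[name] = {"status": "FAILED",
                                  "attempts": attempt, "reason": str(exc)}
            time.sleep(delay)
    return None

# Usage: a step that fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky_copy():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("source not ready")
    return "copied"

status = {}
print(run_with_retry("copy_customers", flaky_copy, status))  # copied
print(status["copy_customers"])
```

Recording the failure reason alongside the status is what lets an operator diagnose a failed run without rereading pipeline logs.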

Senior Big Data Engineer

Axis Bank
04.2019 - 05.2021
  • Transitioned jobs from RDBMS/ETL to PySpark for increased performance.
  • Worked on Axis Bank internal services: developed calculator API scripts in Python, and built a sentiment analysis project that fetched data from Twitter, Bing, and blogs, then transformed and loaded it into Hive tables.
  • Tuned Spark jobs to enhance processing speed.
  • Developed application code in Python using pandas, lists, tuples, sets, dictionaries, and functions, and scheduled jobs via shell scripts.

Hadoop/Spark Developer

Daimler
02.2018 - 04.2019
  • Imported diverse automotive datasets into Hadoop Hive platforms.
  • Processed comprehensive automobile-related datasets including claims and manufacturing information.
  • Prepared datasets tailored to DataScience team requirements.
  • Conducted POCs on Azure for optimal component selection during migration from On-premise.

Hadoop/Spark Developer

PPG
04.2016 - 02.2018
  • Executed seamless integration of increasingly large datasets into the Hadoop framework.
  • Created scalable Hive scripts for data analysis.
  • Transformed ETL logic from RDBMS to Spark using Spark RDDs, DataFrames, higher-order functions, traits, and error-handling techniques.
  • Worked on Sqoop to import and export data from RDBMS.
  • Optimized performance of Hive tables through strategic partitioning and bucketing techniques.
  • Utilized multiple data formats such as CSV, ORC, and Parquet.
  • Executed data ingestion from local file systems to HDFS.
  • Constructed ETL pipelines to facilitate data loading into Hive tables using Spark jobs.
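The partitioning and bucketing optimizations above rest on one idea: rows are assigned to a fixed number of buckets by hashing the bucket key, so joins and filters on that key touch only the relevant bucket. A plain-Python sketch of the concept (Hive's internal hash function differs; `zlib.crc32` and the claim data are purely illustrative):

```python
# Sketch of Hive-style bucketing: deterministic hash of the bucket key
# maps each row to one of a fixed number of buckets. Illustrative only.
import zlib
from collections import defaultdict

def bucket_for(key: str, num_buckets: int) -> int:
    """Deterministically map a key to one of num_buckets buckets."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets

rows = [("claim_1001", 250.0), ("claim_1002", 80.5), ("claim_1001", 12.0)]
buckets = defaultdict(list)
for claim_id, amount in rows:
    buckets[bucket_for(claim_id, 4)].append((claim_id, amount))

# Rows sharing a key always land in the same bucket, which is what makes
# bucketed joins and sampled scans cheap.
print({b: [r[0] for r in rs] for b, rs in buckets.items()})
```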

Oracle Developer

Veolia
11.2014 - 02.2016
  • Developed reusable procedures and functions in PL/SQL.
  • Performed DML operations on Oracle tables and optimized their performance.
  • Developed Unix Shell Scripts to automate processes.

Oracle PL/SQL Developer

Vodafone
06.2013 - 02.2014
  • Developed Complex database objects like Stored Procedures, Functions, Packages and Triggers using SQL and PL/SQL.
  • Developed intricate SQL queries to extract data from databases.
  • Established partitions and indexes for tables.

Oracle Developer

TCS
10.2011 - 05.2013
  • Analyzed module requirements and facilitated peer discussions on necessary database objects.
  • Increased efficiency by refining indexing and partitioning techniques.
  • Developed advanced SQL queries and shell scripts to retrieve data from tables.

Education

B.Tech - Electronics and Communication Engineering

Jawaharlal Nehru Technological University
Hyderabad
01.2011

Skills

  • Python
  • Scala
  • AWS
  • Azure Blob Storage
  • Performance tuning
  • Data pipelines
  • Azure Data Factory
  • Azure Data Lake
  • Azure Key Vault
  • Azure SQL Database
  • GCP (Google Cloud Platform)
  • Airflow
  • BigQuery
  • Azure DevOps
  • GitHub Actions
  • Oracle
  • MySQL
  • Snowflake
  • HDFS
  • YARN
  • PySpark
  • Azure Databricks
  • Apache Spark
  • Data warehousing
  • ETL development
  • MapReduce
  • Hive
  • Pig
  • Sqoop
  • HUE UI
  • Cloudera
  • Git version control
  • Data governance
  • SQL expertise
  • Hadoop ecosystem
  • Data migration
  • Big data processing
  • Data quality assurance
  • API development
  • SQL and databases
  • Relational databases
  • Indexing strategies
  • Database optimization
  • PL/SQL Programming
  • Query optimization
  • SQL programming
  • Cloud migration

Certification

  • Azure Data Engineer
