Phanindra P Babu

Fremont

Summary

Results-driven Cloud and Data Engineer with 8+ years of experience in risk analytics and enterprise data platforms. Builds, manages, and optimizes scalable distributed systems and cloud-native architectures on GCP and AWS to support enterprise data and analytics workloads. Hands-on experience with core GCP services including Compute Engine, IAM, Cloud Storage, VPC Networking, and Cloud Monitoring, with development work done directly in GCP environments. Contributed to end-to-end cloud migration and application modernization programs by refactoring legacy applications, remediating Python code, updating dependencies, tuning performance, and resolving compatibility issues. Skilled in Python, PySpark, SQL, and automation scripting for data engineering, including restructuring and optimizing data pipelines for scalability and performance. Strong working knowledge of Docker and Kubernetes orchestration (pods, deployments, services, scaling), with exposure to Red Hat OpenShift for enterprise containerized deployments. Experienced with CI/CD pipelines using Jenkins, GitHub, and Docker, and with deploying and managing applications in containerized production environments. Strong troubleshooting skills for code-level and platform-level issues during cloud migration, deployment, and workload scaling, backed by experience in incident management, root cause analysis (RCA), and production support. Works effectively with cross-functional teams including DevOps, infrastructure, and risk analytics. Hands-on experience with Hugging Face Transformers and LLM-based solutions for NLP, automation, and analytics use cases.

Overview

10
years of professional experience

Work History

Senior Data Engineer

MetLife
01.2025 - Current
  • Designed, built, and maintained scalable cloud-native and distributed data systems for enterprise risk and analytics workloads.
  • Applied distributed systems and Google Cloud Platform (GCP) skills to deliver scalable data storage and processing solutions.
  • Worked extensively on GCP services including Compute Engine, IAM, Cloud Storage, VPC Networking, Cloud Monitoring, and BigQuery.
  • Participated in cloud migration initiatives, including modernization of legacy applications, performance optimization, and resolving compatibility issues.
  • Performed Python code remediation and modernization, including dependency upgrades, refactoring legacy code, and restructuring pipelines.
  • Designed and optimized data pipelines and ETL workflows using Python, PySpark, SQL, and BigQuery for large-scale data processing.
  • Developed automation scripts using Python, SQL, Pandas, and Shell scripting for data validation, reconciliation, and operational efficiency.
  • Worked in containerized environments using Docker and Kubernetes, managing pods, deployments, services, and scaling workloads.
  • Supported application deployments and troubleshooting in OpenShift environments for enterprise platform stability.
  • Built and maintained CI/CD pipelines using Jenkins, GitHub, and Docker for automated build, test, and deployment processes.
  • Implemented monitoring and observability using GCP Cloud Monitoring, Logging, and Alerting tools to ensure system reliability.
  • Participated in incident management, including production support, root cause analysis (RCA), and preventive solutions.
  • Collaborated with cross-functional teams including DevOps, infrastructure, data engineering, and risk analytics teams.
  • Improved system performance through pipeline optimization, workload tuning, and cloud resource efficiency improvements.
  • Supported data governance and risk analytics workflows ensuring compliance with enterprise standards.
  • Actively contributed to real-time troubleshooting of cloud issues during deployment, migration, and scaling activities.
  • Enhanced system observability and operational stability through improved dashboards, logging, and monitoring alerts.
  • Environment: GCP (Compute Engine, IAM, Cloud Storage, VPC Networking, BigQuery, Cloud Monitoring, Cloud Logging), Python, PySpark, SQL, Pandas, Shell Scripting, Docker, Kubernetes, OpenShift, Jenkins, GitHub, CI/CD, Apache Spark, Hadoop Ecosystem, Kafka, Airflow.

Hadoop Developer

Deutsche Bank
India
11.2018 - 01.2023
  • Contributed to large-scale distributed data platforms handling critical banking data with strong focus on accuracy, consistency, and reliability.
  • Participated in cloud migration initiatives, supporting transition of legacy batch systems to scalable distributed and cloud-ready architectures.
  • Developed and maintained batch and streaming ETL pipelines using Apache Spark, Hive, and SQL for financial and risk reporting datasets.
  • Supported data processing workflows for regulatory and risk reporting requirements, ensuring timely and accurate data delivery.
  • Assisted in modernization of legacy applications, including refactoring batch jobs and improving processing efficiency.
  • Automated operational tasks using Python and Shell scripting, improving workflow efficiency and reducing manual intervention.
  • Supported production data pipelines in 24x7 banking operations, ensuring job stability and SLA adherence.
  • Performed incident support, troubleshooting, and root cause analysis (RCA) for data pipeline failures and performance issues.
  • Worked closely with infrastructure and DevOps teams to support deployment and stability of data applications in enterprise environments.
  • Contributed to performance optimization of Spark and Hive jobs, reducing execution time and improving system efficiency.
  • Assisted in implementing controlled deployment processes using CI/CD pipelines (Jenkins, Git, Bitbucket) aligned with banking governance standards.
  • Collaborated with business, risk, and analytics teams to deliver high-quality datasets for decision-making and reporting.
  • Ensured adherence to data governance, compliance, and audit requirements across all data processing workflows.
  • Worked in a highly regulated banking environment supporting enterprise risk analytics and financial data processing systems.
  • Environment: Hadoop Ecosystem (HDFS, YARN, Hive, MapReduce), Apache Spark, SQL, Python, Shell Scripting, Kafka, Sqoop, Flume, Oozie, Oracle, Linux/Unix, Jenkins, Git, Bitbucket, Autosys.

Big Data Engineer / Hadoop Developer

Hathway Cable & Datacom
India
12.2015 - 04.2018
  • Worked on enterprise data processing and reporting systems in the telecom domain.
  • Developed and supported ETL workflows for customer, billing, and operational data using SQL and PL/SQL.
  • Built and maintained batch data processing jobs using Hadoop ecosystem tools (Hive, MapReduce) for large-scale data handling.
  • Worked on data extraction and integration processes using Sqoop to move data between relational databases and Hadoop.
  • Assisted in designing and optimizing relational database structures in Oracle and MySQL for reporting and analytics use cases.
  • Developed SQL queries and stored procedures, and applied performance tuning to improve data retrieval efficiency.
  • Wrote UNIX Shell scripts to automate data loads, scheduling, and file processing activities.
  • Supported production batch jobs and scheduled workflows, ensuring timely execution and resolving job failures.
  • Worked on data validation and reconciliation processes to ensure accuracy between source and target systems.
  • Participated in basic debugging and troubleshooting of ETL failures in production environments.
  • Assisted senior engineers in Hadoop-based data processing tasks and job monitoring activities.
  • Contributed to report generation and business data extracts for operational teams.
  • Maintained technical documentation for ETL processes, workflows, and database objects.
  • Gained hands-on exposure to big data technologies and distributed systems fundamentals early in career.
  • Environment: Hadoop (HDFS, Hive, MapReduce), SQL, PL/SQL, Oracle, MySQL, UNIX Shell Scripting, Sqoop, Informatica (basic exposure), Linux, Autosys.

Education

Master of Science - Information Technology Management

Golden Gate University
San Francisco, CA
04.2025

Skills

  • Apache Spark
  • Apache Kafka
  • Hadoop
  • Hive
  • Impala
  • Sqoop
  • Amazon Web Services
  • AWS Lambda
  • AWS EC2
  • AWS EMR
  • AWS S3
  • AWS Glue
  • AWS Step Functions
  • AWS SNS
  • AWS CloudWatch
  • AWS IAM
  • AWS Athena
  • AWS Redshift
  • Google Cloud Platform
  • GCP BigQuery
  • Microsoft Azure
  • Azure Data Factory
  • SQL
  • PL/SQL
  • HiveQL
  • Python
  • Java
  • Scala
  • Oracle
  • MySQL
  • Microsoft SQL Server
  • UNIX Shell Scripting
  • Apache Airflow
  • Jenkins
  • Git
  • Bitbucket
  • JIRA

Timeline

Senior Data Engineer

MetLife
01.2025 - Current

Hadoop Developer

Deutsche Bank
11.2018 - 01.2023

Big Data Engineer / Hadoop Developer

Hathway Cable & Datacom
12.2015 - 04.2018

Master of Science - Information Technology Management

Golden Gate University