JAHNAVI PUNURU
Data Engineer

Summary

  • 4+ years of experience with big data platforms such as Hadoop, Spark, and Hive for processing large-scale datasets, covering both structured and unstructured data
  • Designs and implements scalable, reliable data architectures on cloud platforms including AWS, Google Cloud, and Azure, with a focus on performance optimization and cost reduction
  • Skilled in data visualization with Tableau, Power BI, and D3.js to communicate insights and findings to technical and non-technical stakeholders
  • Experienced in data quality management and data governance, including developing data quality standards and processes and identifying and resolving data quality issues to ensure accuracy, completeness, and compliance for downstream analytics and reporting
  • Strong understanding of machine learning algorithms and techniques, with experience implementing data-driven solutions for predictive modeling and deploying models to production environments
  • Proven ability to collaborate in cross-functional teams with data scientists, analysts, software developers, and business stakeholders to understand requirements and deliver solutions that meet their needs
  • Knowledgeable in data security and privacy regulations, and able to implement appropriate security measures to protect sensitive data
  • Experienced with Jira for agile project management and issue tracking, including configuring workflows, issue types, custom fields, and project permissions; proficient in creating and organizing Confluence spaces, pages, and templates for documentation and knowledge sharing
  • Proficient with Git-based version control (GitHub, Bitbucket), including managing repositories, branches, tags, and pull requests, and with CI/CD tools such as Travis CI and Jenkins to automate software build and deployment
  • Experienced in designing relational and non-relational databases (MySQL, PostgreSQL, MongoDB); creates and optimizes SQL queries and stored procedures, and performs administration tasks such as backup and recovery, performance tuning, and security management
  • Proficient in data warehousing concepts and technologies, including data integration, aggregation, and governance, and in building and optimizing data pipelines and storage solutions for large-scale data processing and analysis

Overview

5 years of professional experience

Work History

Data Engineer

RBC
04.2021 - Current
  • Built and maintained data pipelines for real-time and batch processing using tools such as Apache Kafka, Apache Airflow, and AWS Lambda
  • Worked with data scientists to develop and implement machine learning models and deploy them into production using technologies such as TensorFlow, PyTorch, and scikit-learn
  • Developed and maintained data validation and reconciliation processes to ensure data accuracy and consistency across multiple systems
  • Designed and implemented data encryption and data masking techniques to protect sensitive data and comply with data privacy regulations such as GDPR and CCPA
  • Conducted performance testing and benchmarking of data processing systems using tools such as JMeter and Gatling to ensure scalability and efficiency
  • Built and maintained data processing infrastructure using containerization technologies such as Docker and Kubernetes, and orchestrated data processing workflows using tools such as Apache Beam and Apache Flink
  • Designed and implemented data archiving and retention policies to ensure proper historical data management and compliance with legal and regulatory requirements
  • Developed and maintained data integration solutions using tools such as Talend and Informatica and conducted data mapping and data transformation activities to ensure seamless integration with external systems
  • Developed and implemented data governance policies and procedures to ensure the accuracy, integrity, and availability of data, and conducted data profiling and data quality assessments to identify and resolve data quality issues
  • Designed and implemented ETL workflows using Python and SQL to process and store large volumes of structured and unstructured data, and conducted performance tuning and optimization of data processing pipelines using Spark and Hadoop
  • Developed and maintained data models and architectures for various projects such as customer profiling and personalization, optimized query performance using SQL and NoSQL databases, and developed and maintained automated data quality checks to ensure data accuracy and completeness
  • Developed and maintained data visualization dashboards using Tableau and Power BI, integrated them with the data processing pipelines to provide real-time insights, and conducted data modeling and schema design for projects such as customer segmentation and churn prediction
  • Designed and implemented data warehousing systems using Snowflake and Redshift, optimized query performance using indexing and partitioning techniques, and developed and maintained custom Python libraries for data processing and machine learning
  • Conducted data profiling and analysis to identify trends and insights, developed custom Python scripts and SQL queries to automate data analysis, and implemented error handling and recovery mechanisms for data processing pipelines using tools such as Apache NiFi
  • Deployed data processing and analytics systems on cloud platforms such as AWS using services such as EMR, EC2, and S3, implemented security measures such as VPCs and IAM roles, and developed and maintained disaster recovery and business continuity plans for data processing and analytics systems.
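The data masking mentioned above can be illustrated with a minimal sketch. The function name, salt handling, and the choice of email as the masked field are hypothetical, not taken from the actual RBC pipeline:

```python
import hashlib

def mask_email(email: str, salt: str = "demo-salt") -> str:
    """Deterministically pseudonymize the local part of an email,
    keeping the domain so aggregate reporting still works.

    Deterministic hashing lets joins across systems survive masking;
    in real use the salt would come from a secrets manager.
    """
    local, _, domain = email.partition("@")
    digest = hashlib.sha256(f"{salt}:{local}".encode()).hexdigest()[:12]
    return f"{digest}@{domain}"

masked = mask_email("jane.doe@example.com")
```

Deterministic pseudonymization preserves join keys across systems; for GDPR-style erasure, rotating or deleting the salt renders the pseudonyms unlinkable.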

Data Engineer

HDFC Life
01.2019 - 03.2021
  • Analyzed complex data sets using SQL, Excel, and Python to identify trends, patterns, and insights to inform business decisions
  • Developed and maintained dashboards and reports using visualization tools like Tableau and Power BI to provide actionable insights to stakeholders
  • Conducted data profiling and data quality assessments to ensure data accuracy and completeness and identified data quality issues
  • Designed and implemented data models and schemas to support data analysis and reporting requirements
  • Collaborated with cross-functional teams to develop and implement data-driven strategies to improve business operations and drive growth
  • Developed and maintained ETL processes to extract, transform, and load data from various sources into the data warehouse
  • Identified and analyzed key performance indicators (KPIs) to monitor business performance and made data-driven recommendations to improve KPIs
  • Developed and maintained predictive models using machine learning techniques to forecast business outcomes and identified potential risks and opportunities
  • Conducted ad-hoc analysis and data mining to support business initiatives and answered complex business questions
  • Communicated insights and findings to stakeholders through presentations, reports, and visualizations, ensuring clear and concise messaging for both technical and non-technical audiences
  • Conducted statistical analysis on large datasets to identify patterns and trends, using techniques such as regression, clustering, and classification
  • Developed and maintained data dictionaries and data catalogs to ensure the consistency and accuracy of data across the organization
  • Collaborated with business stakeholders to understand their requirements and translate them into data analysis and reporting solutions
  • Monitored data quality metrics to ensure compliance with established data quality standards and identified opportunities for improvement
  • Developed and maintained automated data pipelines to streamline data collection, processing, and analysis
  • Developed and maintained machine learning models to predict customer behavior, forecast sales, and identify potential risks and opportunities
  • Monitored industry trends and best practices in data analysis and reporting to ensure continuous improvement of data-related processes and capabilities
  • Identified data sources and conducted data discovery to ensure that all relevant data was available for analysis.
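Automated data-quality checks like those described above can be sketched in a few lines of plain Python; the rule (completeness) and the field names are illustrative only:

```python
def completeness_failures(rows, required_fields):
    """Return (row_index, missing_fields) pairs for rows that violate
    a completeness rule: every required field present and non-empty."""
    failures = []
    for i, row in enumerate(rows):
        missing = [f for f in required_fields if not row.get(f)]
        if missing:
            failures.append((i, missing))
    return failures

records = [
    {"customer_id": "C1", "email": "a@x.com"},
    {"customer_id": "C2", "email": ""},   # empty email -> flagged
    {"email": "c@x.com"},                 # missing customer_id -> flagged
]
bad = completeness_failures(records, ["customer_id", "email"])
```

In a pipeline, a check like this would typically run as a gate before loading into the warehouse, with failures routed to a quarantine table rather than silently dropped.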

Education

Bachelor of Engineering in Computer Science and Engineering

04.2019

Skills

  • SQL
  • Hadoop
  • Spark
  • Kafka
  • Python
  • Java
  • Apache NiFi
  • AWS
  • Azure
  • Power BI
  • Tableau
  • Data Warehousing
  • Data Modeling
  • ETL
  • Amazon Redshift
  • Google BigQuery
  • Snowflake
  • GDPR
  • HIPAA
  • GIT
  • Agile methodologies

Projects

Social Media Analytics Project, 12/2018

Analyzed social media data to gain insights into customer behavior and preferences. Used Python to collect and preprocess data from platforms such as Twitter, Instagram, and Facebook. Applied data visualization techniques to create engaging dashboards and reports highlighting trends and patterns in customer behavior. Used sentiment analysis to understand customer sentiment toward products and services, and provided insights to business stakeholders on improving customer experience and engagement on social media platforms.
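As a sketch of the sentiment-analysis step, here is a toy lexicon-based scorer. The word lists are invented for illustration; a real project would more likely use an off-the-shelf library such as NLTK's VADER or TextBlob:

```python
POSITIVE = {"love", "great", "excellent", "happy", "amazing"}
NEGATIVE = {"bad", "hate", "poor", "terrible", "angry"}

def sentiment_score(text: str) -> float:
    """Score text in [-1, 1]: +1 if all sentiment hits are positive,
    -1 if all are negative, 0 if no sentiment words are found."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(pos + neg, 1)
```

Scores per post can then be aggregated by product or campaign and fed into the same dashboards as the other engagement metrics.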


Health Care Analytics Project, 06/2017

Analyzed healthcare data to identify potential areas for improvement in patient care and outcomes. Collected and preprocessed data from electronic health records (EHRs) using SQL. Developed statistical models and predictive analytics to identify risk factors for certain diseases and health conditions. Used data visualization tools such as Tableau to present findings and insights to stakeholders, and provided recommendations to healthcare providers on improving patient outcomes and reducing healthcare costs through data-driven insights.
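The SQL extraction step might have looked something like the following self-contained sketch. The table and column names are hypothetical, and an in-memory SQLite database stands in for the real EHR store:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ehr (patient_id INTEGER, age INTEGER, readmitted INTEGER)"
)
conn.executemany(
    "INSERT INTO ehr VALUES (?, ?, ?)",
    [(1, 72, 1), (2, 45, 0), (3, 81, 1), (4, 33, 0)],
)

# Readmission rate by age band -- the kind of aggregate that feeds
# a risk-factor model or a Tableau dashboard.
rows = conn.execute(
    """
    SELECT CASE WHEN age >= 65 THEN '65+' ELSE 'under 65' END AS age_band,
           AVG(readmitted) AS readmission_rate
    FROM ehr
    GROUP BY age_band
    ORDER BY age_band
    """
).fetchall()
```

Swapping the connection string for the production database would be the only change needed to run the same query against the real EHR tables.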
