APURVAKUMAR PATEL

Windsor, ON

Summary

Over 1 year of work experience in big data and data engineering, including building data pipelines, partitioning data in HDFS, and migrating data with big data tools and technologies. Industry experience in data processing, data streaming, and big data analytics. Strong communication, teamwork, and problem-solving skills developed while working collaboratively with project teams, colleagues, and clients.

Overview

1 year of professional experience
1 Certification

Work History

Data Engineer

LTIMindtree
01.2022 - 12.2022
  • Implemented automation through Apache Airflow to streamline data processing workflows and reduce manual intervention by 40%, improving overall reliability and minimizing errors in data pipelines.
  • Developed and executed a comprehensive data processing strategy using PySpark, Apache Hive, and Apache Pig in the Microsoft Azure cloud environment, resulting in a 30% increase in data pipeline efficiency.
  • Utilized Azure Cosmos DB, Azure SQL Storage, and Azure Data Lake to effectively manage large volumes of data, resulting in a 20% reduction in storage costs while maintaining high performance and scalability.
  • Managed and maintained Apache Hive for data warehousing, improving query performance by 40%.
  • Streamlined data transformations and ETL tasks with Apache Pig, reducing processing errors by 25%.
  • Designed and automated data workflows with Apache Airflow, leading to a 50% increase in data processing efficiency.

Education

Master of Applied Computing

University of Windsor
Windsor, ON
06.2024

Bachelor of Engineering - Computer Engineering

Gujarat Technological University
Gujarat, India
06.2021

Skills

Technical Skills

  • Big Data Technologies: Apache Spark, Apache Hadoop, Hive, Apache Pig, Apache Airflow
  • Programming Languages: Python (expert), Java (intermediate), C, and Bash Scripting
  • Databases: MS SQL Server, MySQL, PostgreSQL, Azure Cosmos DB, Azure SQL
  • Frameworks: Pandas, NumPy, and PySpark
  • Cloud Platform: Microsoft Azure, Azure Databricks
  • Project Management Tools: Jira, Git
  • Microsoft Office Suite: Word, Excel, PowerPoint
  • Soft Skills: Strong work ethic, ability to work in a team, problem-solving skills, analytical/quantitative skills, written communication

Projects

Sentiment Analysis of Tweets of Top 10 US Airlines

  • Developed a Python-based dashboard using Streamlit to analyze customer sentiment in tweets about travel experiences with leading US airlines.
  • Employed multiple machine learning classification algorithms to categorize tweet content into positive, neutral, or negative sentiments.
  • Visualized geographical data to identify states and airports where customers expressed varying sentiments, enabling airlines to target specific areas for service improvements.
  • Created word clouds for positive, neutral, and negative words found in tweets, aiding in prioritizing and strategizing actionable improvements based on customer feedback.

Academic Team Project - Client-Server File Transfer with Load Balancer

  • Developed a C++ client-server file transfer system for a master's program.
  • Allowed clients to request and receive compressed files (tar.gz) from the server.
  • Collaborated in a team with specific responsibilities:
      • Apurvakumar Patel: Server and mirror (a second server) implementation.
      • Sagar Vivek Pandya: Client development.
  • Supported Linux and Windows Subsystem for Linux.
  • Server handled concurrent connections with forked child processes.
  • Client featured user-friendly commands for file requests, searches, and filtering.
  • Ensured server and mirror ran on distinct machines/terminals.
  • Managed client distribution between server and mirror, processing the first six connections with the server and the next six with the mirror.
  • Implemented alternating client connections between the server and mirror for subsequent requests.

Certification

  • Python for Everybody Specialization - University of Michigan, June 2020
  • Continuous Integration and Continuous Delivery with GitLab - LinkedIn Learning, April 2023
  • Learning Hadoop - LinkedIn Learning, April 2023
  • Linux Server Management and Security - University of Colorado, April 2023
