Senior Data Engineer with 14+ years of experience designing and optimizing scalable data pipelines using Big Data (CDP), Spark, Spark Streaming, PySpark, Hadoop, Java/J2EE, Scala, Python, Kafka, Apache NiFi, AWS, GCP, Akka, and ZIO. Led successful projects at top financial institutions that improved data-processing efficiency and delivered significant performance gains. Skilled in data modeling, ETL processes, and cross-functional collaboration, delivering impactful data solutions using agile methodologies.

Proficient in applying AI/ML algorithms, statistical methods, and data-visualization techniques to uncover insights and optimize processes. Adept at working with large datasets using Python and SQL, and with tools such as PySpark, Hadoop, and Tableau. Strong analytical and problem-solving skills with a track record of delivering actionable insights. Committed to continuous learning and staying current with the latest trends in data science and artificial intelligence to support data-driven decision-making.

Expertise in building real-time data-streaming solutions using Spark Streaming, Kafka Streams, Akka Streams, Apache Flink, Apache NiFi, and Flume.

Designed and implemented high-performance, scalable solutions using Hadoop ecosystem tools such as Pig, Hive, Sqoop, Spark, ZooKeeper, Solr, and Kafka.

Designed, configured, and deployed Amazon Web Services (AWS) infrastructure for multiple applications using the AWS stack (EMR, EC2, S3, RDS, Redshift, CloudFormation, Glue, CloudWatch, SQS, and IAM), focusing on high availability, fault tolerance, and auto-scaling.

Experienced in application design and implementation on the GCP stack (virtual machines, Cloud Functions, Cloud Run, Cloud Prod, Cloud SQL, BigQuery, Airflow, STS, Apigee, Databricks, Google Storage, and Cloud Logger).

Experienced in implementing modern architecture solutions such as Lakehouse, event streaming, microservices, and domain-driven design architecture patterns.
Java 8