Experienced Senior Data Engineer with 7+ years in Data Architecture, Modeling, ETL, and Database Management, proficient in Statistical Analysis and Scalable Solutions for large datasets.
Experience in the entire Software Development Life Cycle (SDLC), including Requirements Analysis, Design, Development, Testing, Deployment, and Support, with proficiency in Agile methodologies.
Proficient in implementing Data Warehouse solutions, with demonstrated expertise in migrating data from on-premises databases to Confidential Redshift, RDS, S3, and Azure Data Lake.
Experience in Data Integration and Data Warehousing using various ETL tools: Informatica PowerCenter, AWS Glue, SQL Server Integration Services (SSIS), and Talend.
Strong knowledge in designing and developing dashboards by extracting data from different sources such as SQL Server, Oracle, SAP, flat files, Excel files, and XML files.
Hands-on experience with the AWS Snowflake cloud data warehouse and AWS S3 buckets for integrating data from multiple source systems, including loading nested JSON-formatted data into Snowflake tables.
Expertise in installation, configuration, migration, troubleshooting, and maintenance of Splunk.
Expertise in AWS Lambda and API Gateway, submitting data via API Gateway that is processed by Lambda functions.
Developed a Python automation script for consuming data subject requests from AWS Snowflake tables and posting the data to the Adobe Analytics Privacy API.
Extensive experience in designing DataStage server and parallel jobs, data profiling, UNIX shell scripting, and SQL/PL SQL development.
Experience in developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats.
Created applications on Splunk to analyze Big Data.
Proficient in AWS services such as Lambda, Kinesis, DynamoDB, S3, and CloudWatch.
Acted as build and release engineer; deployed services via VSTS (Azure DevOps) pipelines.
Created and maintained pipelines to manage the Infrastructure as Code (IaC) for all applications.
Extensive experience in DataStage/QualityStage development projects for data cleansing and data standardization (name and address standardization, US postal address verification, geocoding, etc.).
Exposure to implementation and operations of data governance, data strategy, data management, and solutions.
Expertise in Cloudera, Hortonworks Hadoop, and Azure systems handling massive data volumes in distributed environments.
Strong understanding of data technologies for big data processing.
Created an Azure SQL database, monitored it, and restored it. Migrated Microsoft SQL Server to Azure SQL Database.
Experience with Azure Cloud, Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Azure Analysis Services, Big Data technologies (Apache Spark), and Databricks.
Used Lambda functions and Step Functions to trigger Glue jobs and orchestrate the data pipeline (see the sketch after this list).
Extensive experience developing and implementing cloud architecture on Microsoft Azure.
Hands-on working capability with MuleSoft components, Mule Expression Language (MEL) workflows, Anypoint Studio, Enterprise Service Bus (ESB), API Manager, RAML, REST, and SOAP.
Designed and developed solutions using C#, ASP.NET Core, Web API, and Microsoft Azure techniques.
Excellent understanding of connecting Azure Data Factory V2 with a range of data sources and processing the data using pipelines, pipeline parameters, activities, activity parameters, and manual/window-based/event-based task scheduling.
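The Lambda/Step Functions orchestration pattern noted above can be illustrated with a minimal sketch. It assumes a hypothetical Glue job name (raw_to_curated_job) and a placeholder S3 path, not the actual production code.

    import boto3

    glue = boto3.client("glue")

    def lambda_handler(event, context):
        """Invoked by Step Functions (or an S3 event) to kick off a Glue ETL run."""
        # JobName and the S3 path below are placeholders for illustration only.
        response = glue.start_job_run(
            JobName="raw_to_curated_job",
            Arguments={
                "--source_path": event.get("source_path", "s3://example-raw-bucket/incoming/"),
            },
        )
        # The returned JobRunId can be passed back to Step Functions for status polling.
        return {"JobRunId": response["JobRunId"]}

Alternatively, a Step Functions state machine can start the same Glue job through its native glue:startJobRun integration and wait for completion, keeping the Lambda function free for lighter routing logic.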
Experience in developing enterprise-level solutions using batch processing (Apache Pig) and streaming frameworks (Spark Streaming, Apache Kafka, and Apache Flink).
Developed automated migration scripts using Unix shell scripting, Python, Oracle/Teradata SQL, and Teradata macros and procedures.
Worked on ETL migration services by creating and deploying AWS Lambda functions to provide a serverless data pipeline that writes to the Glue Catalog and can be queried from Athena.
Understand the latest features introduced by Microsoft Azure (Azure DevOps, OMS, NSG rules, etc.) and utilize them for existing business applications.
Developed ETL pipelines in and out of the data warehouse using a mix of Python and Snowflake's SnowSQL, and wrote SQL queries against Snowflake.
Experience in Python programming with packages such as NumPy, Matplotlib, SciPy, and Pandas.
Extensive experience creating web services with Python, including implementation of JSON-based RESTful and XML-based SOAP web services.
Experience in writing complex Python scripts with object-oriented principles such as class creation, constructors, overloading, and modules.
Proficient with BI tools like Tableau and Power BI, data interpretation, modeling, data analysis, and reporting, with the ability to assist in directing planning based on insights.
Designed and implemented end-to-end data pipelines, integrating diverse data sources seamlessly into Business Intelligence tools such as Tableau and Power BI, facilitating real-time analytics and reporting.
Proficient with Azure Data Lake Storage (ADLS), Databricks and Python notebooks, Databricks Delta Lake, and Amazon Web Services (AWS).
Utilized Azure Functions and event-driven architectures using Azure Event Grid, Azure Event Hubs, and Azure Service Bus for building scalable, event-driven data processing workflows.
Worked on data migration from Teradata to the AWS Snowflake environment using Python and BI tools.
Proficient in Spark architecture, Spark Core, Spark SQL, and Spark Streaming. Skilled in PySpark for interactive analysis, batch processing, and stream processing applications (see the streaming sketch after this list).
Developed shell scripts for job automation that generate a log file for every job.
Extensive Spark architecture experience in performance tuning, Spark Core, Spark SQL, DataFrames, Spark Streaming, deployment modes, fault tolerance, and execution hierarchy for enhanced efficiency.
Expertise in using Kafka for log aggregation with low-latency processing and distributed data consumption, and widely used Enterprise Integration Patterns (EIPs).
Designed and developed Flink pipelines to consume streaming data from Kafka and applied business logic to massage, transform, and serialize raw data.
Translated Java code to Scala code as part of the InfoSum pipeline build.
Used Spark Streaming APIs to perform real-time transformations and actions for the common learner data model, ingesting data from Kinesis in near real time.
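As a companion to the PySpark stream-processing experience above, here is a minimal Structured Streaming sketch that reads JSON events from Kafka, parses them against an assumed schema, and writes the results to S3 with checkpointing. The broker address, topic, schema fields, and S3 paths are hypothetical placeholders, not details from any specific project.

    # Requires the spark-sql-kafka-0-10 connector on the Spark classpath.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

    spark = SparkSession.builder.appName("kafka-events-stream").getOrCreate()

    # Hypothetical event schema for the incoming JSON payload.
    event_schema = StructType([
        StructField("event_id", StringType()),
        StructField("user_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("event_ts", TimestampType()),
    ])

    # Read raw events from a Kafka topic (broker and topic names are placeholders).
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")
           .option("subscribe", "events")
           .option("startingOffsets", "latest")
           .load())

    # Kafka delivers the payload as bytes; cast to string and parse the JSON body.
    events = (raw.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), event_schema).alias("e"))
              .select("e.*"))

    # Append parsed events to S3, using a checkpoint location for fault tolerance.
    query = (events.writeStream
             .format("parquet")
             .option("path", "s3://example-bucket/curated/events/")
             .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
             .outputMode("append")
             .start())

    query.awaitTermination()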