Accomplished Data Engineer with 6+ years of experience designing and implementing scalable, secure, cloud-native data architectures across AWS, Azure, and GCP. Strong expertise in building, optimizing, and maintaining ETL/ELT pipelines for batch and real-time ingestion using Azure Data Factory, AWS Glue, Databricks, and Apache Airflow. Hands-on experience developing enterprise-grade data warehouses with Snowflake, Redshift, BigQuery, and Azure Synapse Analytics to support large-scale analytics and reporting. Skilled in distributed processing and advanced analytics using Apache Spark, PySpark, Databricks, and Hadoop ecosystem components. Proficient in real-time data streaming technologies, including Kafka, Azure Event Hub, and Amazon Kinesis, for transactional, IoT, and log data pipelines. Strong programming and scripting expertise in Python, SQL, and Shell, with proven ability to implement custom UDFs and complex transformation logic. Experienced in designing and managing modern data lakes (Azure Data Lake Gen2, Amazon S3, Google Cloud Storage) that support structured, semi-structured, and unstructured datasets. Adept at developing OLAP/OLTP systems, data models, and data marts that improve business intelligence, reporting accuracy, and decision-making speed. Demonstrated success automating pipeline deployment, scheduling, and monitoring with CI/CD tools such as Jenkins, Azure DevOps, and GitLab. Strong knowledge of data governance, quality, and security best practices, including IAM, encryption, partitioning strategies, and cost optimization. Experienced with visualization and BI tools such as Power BI, Tableau, and Kibana to enable self-service analytics and enterprise-wide insights. Proficient with relational databases (Oracle, MySQL, SQL Server, PostgreSQL) and NoSQL stores (MongoDB, Cassandra, DynamoDB).
Collaborative communicator with proven ability to partner with cross-functional teams and stakeholders to deliver high-impact, business-aligned data solutions. Committed to delivering cost-efficient, scalable, and future-ready data engineering solutions with a focus on performance, maintainability, and operational excellence.
Environment: Azure Data Factory, Databricks, Snowflake, SQL, Python, PySpark, Azure Logic Apps, Azure Event Hub, Azure Stream Analytics, Azure DevOps, Oracle, Teradata, MongoDB, Elasticsearch, Kibana, OLAP, OLTP.
Environment: AWS (Redshift, S3, Glue, EMR, Lambda, Kinesis, Athena, CloudWatch, EC2, VPC), Python, PySpark, SQL, Hive, Spark, ETL/ELT Frameworks, Power BI, Tableau, Jenkins, Unix/Linux, Data Warehousing, Data Governance & Security.
Environment: Azure Data Lake Gen2, Azure Data Factory (ADF), Azure Databricks, Azure Synapse Analytics, Azure SQL Database, Azure Event Hub, Azure Stream Analytics, PySpark, Spark SQL, Python, Power BI, SQL Server, Delta Lake.