
Data Platform Engineer with over four years of experience designing and building scalable, cloud-native data platforms on AWS. Experienced in real-time and batch data processing with Apache Kafka, Spark, and Snowflake, supporting high-volume distributed systems that handle over 10 million events per day. Proficient in DataOps, ETL/ELT pipelines, and Infrastructure as Code (Terraform), with a focus on performance optimization, data governance, and reliability (99.9% uptime). Enables analytics and AI/ML workloads through robust data architecture, collaborates with cross-functional teams, and mentors engineers to deliver enterprise-grade data solutions.
Data Engineering & Architecture: Data Platform Engineering, Data Mesh (Data-as-a-Product), ETL/ELT Pipelines, Data Warehousing, Data Lakes, Distributed Systems
Programming & Querying: Python, SQL (Advanced Query Optimization), Shell Scripting
Big Data & Streaming: Apache Kafka, Apache Spark (Structured Streaming), Hadoop
Cloud & Infrastructure: AWS (S3, Lambda, IAM), Terraform (IaC), CI/CD Pipelines, Kubernetes, Docker
Data Platforms & Tools: Snowflake, dbt, Apache Airflow
Data Governance & Quality: Data Validation, Data Monitoring & Alerting, Data Security & Compliance
DevOps & DataOps: GitHub Actions, Infrastructure Automation, Observability (Prometheus, Grafana)
AI/ML Enablement: ML Pipelines, Predictive Analytics, Data Preparation for AI/ML Systems
Additional Skills: Serverless Computing, Collaboration and Communication, Infrastructure as Code, Cloud Infrastructure Management, Linux System Administration, DevOps Methodologies