Cloud and Data Engineering Architect with extensive experience in big data processing, performance optimization, and modern data frameworks. Proficient in building scalable, fault-tolerant systems on Kubernetes, Hadoop, and Spark for distributed processing and storage. Demonstrated expertise in telecommunications systems, including DOCSIS specifications, network analytics, and horizontal scaling strategies. Adept at resolving complex memory and performance issues in real-time and distributed systems.
• Optimized analytics jobs using vectorization in Pandas and NumPy, reducing query processing time by 50% and resolving memory usage issues.
• Improved ML model performance by addressing ensemble model weaknesses and applying hyperparameter tuning, K-fold cross-validation, and other optimization techniques.
• Developed a Python library to auto-generate HTML model cards for various use cases.
• Backend/ML developer and architect at Guavus with expertise in Kafka, Elasticsearch, Postgres, HBase, Apache Spark, Kubernetes, and Spring Boot.
• Led the exploratory data analysis (EDA) of NTT Alarms data, driving the analytics team to create models with 90% accuracy and 80% precision/recall for alarm prediction.
• Spearheaded Scala-based ML data pipelines for AlarmIQ and OPSIQ, enabling network service providers to monitor and predict alarms and user experience degradation, handling 5K+ records per second.
• Developed a Scala-based load testing application using Gatling, integrated with Jenkins, with support for Keycloak authentication and custom field generation, for use across teams.
• Developed a management VM on the Citrix Xen hypervisor for centralized control of all VMs in a NetScaler/CloudBridge instance.
• Designed the multicast control plane and led the implementation of the cable modem registration FSM for Motorola CCAPs as a DOCSIS 3.0 expert and C++ developer.
Scala, Java, Python, C, C++
Apache Spark, Kafka, HBase, Elasticsearch, Postgres
Kubernetes / Helm, Docker
CI/CD, Jenkins
Network Analytics, DOCSIS, TCP/IP, NetFlow, BGP
Machine Learning, Support Vector Machines, Logistic Regression, Random Forests, K-Means Clustering
OCI Cloud, AWS
NumPy, Pandas, SHAP, Dropwizard, Spring Boot
Profiling, tuning CPU/memory-intensive workloads, horizontal scaling