14+ years of overall experience in the IT industry.
Extensive experience in system design and architecture.
A passionate leader with proven experience in technical leadership, coaching and developing people.
Strategically guided teams in customer-facing roles to deliver results.
Successfully implemented ITIL/ITSM, DevOps, CI/CD, and Site Reliability Engineering (SRE) practices.
Experienced in hiring and coaching a team of talented engineers to build highly available and scalable solutions in AWS and Azure clouds to achieve 99.99% uptime and reduce operational costs by 30%.
Overview
14 years of professional experience
6 Certifications
Work History
Support Engineer, Azure Monitoring
Microsoft
11.2021 - Current
As a Support Engineer in the Azure Monitoring Enterprise team, I am responsible for coaching and developing 8 Support Engineers (7 Delivery Partner Engineers in Costa Rica and 1 FTE in the US) by conducting PCMS reviews, providing technical guidance to increase case resolution speed, and suggesting improvements to deliver an outstanding customer support experience.
Serve as a Technology Evangelist to our customers by empowering them to achieve more with Microsoft's Azure Monitoring products, including Application Insights, Log Analytics Workspace, Azure Monitor Alerts, Azure Monitoring Essentials, Container Insights, and Azure Managed Grafana.
Collaborated with Support Engineers globally to troubleshoot issues on multiple Azure services, including but not limited to Azure API Management (APIM), Azure Web Apps, Function Apps, Logic Apps, Blob Storage, Azure Kubernetes Service (AKS), Azure Virtual Desktop (AVD), Azure Key Vault, Entra ID (formerly Azure Active Directory), Application Gateway, and Load Balancer.
Created a global impact by contributing to the timely review and update of Troubleshooting Guides (TSGs) and public-facing documentation in the Azure Monitoring space to improve customer satisfaction and reduce case volume (case deflection approach).
Consistently sought product feedback from customers and collaborated with the Product Engineering teams to improve the UI experience by introducing new features in the Azure Monitoring products.
Empowered customers to implement cost-effective monitoring solutions on Azure by following best practices.
Partnered with Technical Advisors, Escalation Engineers, and my direct manager to review the quality of case work by Support Engineers and identify areas for improvement.
Participated in daily triages/case bashes and provided technical guidance to peers to drive faster resolution.
Led team meetings and delivered presentations to improve overall customer expectation management and case handling strategies.
Participated in the Executive Challenges in Hackathon 2023 by proposing an AI/Copilot tool for the Application Insights product that uses OpenAI to create Kusto queries and human-readable summaries for users.
Link: HackBox (microsoft.com)
Lead Site Reliability Engineer
Royal Bank of Canada (RBC)
06.2017 - 10.2021
Designed and formed the SRE team from the ground up. Led a team of 5 SREs and 4 developer interns that supported the SRE adoption at the bank.
Developed SRE framework, process, and procedures and conducted training sessions for various LOBs to adopt SRE practices.
Coached the team to build PoC dashboards using Grafana to monitor the SLI & SLO metrics for Windows and Linux systems hosted on hybrid cloud.
Developed and implemented chaos engineering strategies that increased system reliability, performance, and fault tolerance through tooling, automation, and process optimization.
Developed strategies for application teams to ensure 99.99% uptime of all mission-critical applications hosted on Windows, Linux, and OpenShift systems.
Created and managed fully automated CI/CD pipelines to improve software delivery with DevOps and Site Reliability Engineering (SRE) practices.
Used tools like Git, Azure DevOps, Jenkins, and Maven to manage deployments to public cloud and OpenShift environments.
Supported the deployment of monolithic applications to Docker containers, broke a monolith into microservices, and deployed them with zero downtime using cloud VMs, Elastic Container Service (ECS), Azure Container Registry (ACR), GitHub, and Docker.
Assisted the Cyber Security teams to scan the system and applications for potential vulnerabilities.
Saved 165+ hours of manual work by automating the remediation steps to mitigate the risk of cyber-attacks on Windows Server 2012/2016/2019, Red Hat Enterprise Linux 6/7, and Solaris.
Partnered with application managers across various Lines of Business to strategically develop capacity requirements for new infrastructure on public clouds and assessed their current environments using monitoring tools such as Zabbix and Prometheus.
Provided business cases for compute requests and introduced resizing and auto scaling strategies that saved nearly $500,000 of the infrastructure budget in 2020-21.
Coached and mentored a team of Engineers to consolidate Data Centers in Canada. Migrated over 20% of the physical servers to VMware with minimal downtime and assisted the server build teams in building custom OS images.
Evaluated monitoring tools such as Datadog, Splunk, Dynatrace, Zabbix, and Prometheus. Built a monitoring stack based on “Predictive Analysis” that analyzes past incidents using AIOps and Data Analytics to predict future system outages.
Led the implementation of Dynatrace and Grafana on AWS, PCF, and Azure. Created dashboards for operations and application teams.
Implemented synthetic monitoring across 20+ applications that sends notifications to the teams.
Built a Global Server Automation Dashboard (GSAD) using React (Node.js and JavaScript), developing PowerShell scripts and exposing them as APIs on the dashboard to perform routine systems administration and maintenance tasks. This has improved the efficiency of support and application teams by 32% since 2019.
Azure Cloud Engineer
Rogers Communications
11.2016 - 06.2017
Built, maintained, and tuned the Azure Cloud Infrastructure (including hybrid PaaS/IaaS) with an emphasis on improving the system’s reliability and performance and maintaining 99.99% availability.
Seamlessly provisioned Windows and Linux VMs on Microsoft Azure using industry-standard IaC tools like ARM Templates and Ansible to improve performance and stability for developers.
Set up application performance monitoring (APM) tools and developed customized dashboards for monitoring and troubleshooting performance issues on Windows and Linux servers in production.
Proactively reviewed code and scheduled production deployments using Team Foundation Server (TFS), and performed regular application maintenance activities such as upgrading IIS, creating and renewing SSL certificates, and setting up applications on load balancers to increase availability.
Executed web application load and performance testing with Microsoft Visual Studio 2017.
Implemented Windows Virtual Desktop (WVD) from the ground up and delivered knowledge-sharing sessions to peers.
Collaborated with the application development and enterprise operations teams to optimize system and application performance by implementing F5 load balancers to scale and adapt to modern high-availability application practices.
Migrated 1200 user mailboxes from Exchange Server to Office 365 using the Exchange Admin Center (EAC) and used a lift-and-shift approach to migrate on-premises VMs to Azure.
Senior Windows and Virtualization Engineer
Union Gas
10.2014 - 11.2016
I worked as a Systems Administration consultant for companies such as Union Gas, Unisys Sentia, and Group of Gold Line, and implemented Data Center migration, AD migration, and Exchange upgrade projects.
Collaborated with Infrastructure Architects to plan, design, test and implement migration strategies for transition of infrastructure from physical to virtual.
Facilitated change management activities with cross-functional team members and managers and ensured RFCs were effectively communicated and documented.
Collaborated with sales and technical teams to offer solutions to customers’ business problems and reshape their existing infrastructure to reduce capital and operational costs.
Led the systems administrators to design and implement infrastructure solutions in complex project environments – some projects include Hyper-V Failover Clustering, Desktop/Server Virtualization, File and Print Sharing, Active Directory, Exchange Server migration, SQL Clustering, GPO Administration, and IIS implementation projects.
Developed a thorough migration strategy (P2V and V2V) for Windows and Linux servers, following ITIL practices and using regular system monitoring to identify and avoid intrusions.
Leveraged industry-standard practices to migrate the existing infrastructure to VMware and reduce Capital Expenses (CapEx) and Operational Expenses (OpEx) by 15%.
Architected Windows 7 rollout/upgrade projects using System Center Configuration Manager (SCCM).
Configured System Center Operations Manager (SCOM) to achieve overall data center monitoring, including assessing the performance, availability, and health of the IT systems using a streamlined management console.
Lead Systems Engineer
Ray Labs Technologies
11.2009 - 11.2012
This was my first role in IT, where I worked for public-sector organizations in India to install new and rebuild existing Windows and Linux infrastructure.
Wrote PowerShell scripts to create Active Directory groups, users, and role assignments.
Configured periodic OS snapshots using Volume Shadow Copy Service (VSS) and monitored system health using Nagios and SCOM.
Coached peers on adoption of best practices to improve overall system build efficiency by designing custom OS templates to automate server and desktop deployments using Microsoft Deployment Toolkit (MDT).
Developed a migration strategy to reduce the hardware footprint and saved 30% of operational costs with the adoption of VMware and Hyper-V platforms.
Configured, deployed, and troubleshot IIS and Oracle WebLogic application servers on HP-UX 11i, Solaris 6/7, and Windows operating systems.