Seasoned IT professional with over 15 years of experience in production application support, project management, and team leadership. Proven track record in managing large-scale IT operations, ensuring system reliability, and leading cross-functional teams to deliver high-quality support. Adept at troubleshooting complex issues, optimizing performance, and implementing strategic improvements to enhance service delivery.
Cloud Platforms: AWS
Technical Skills: Application Support, Incident Management, Problem Resolution, Root Cause Analysis, ITIL, Monitoring Tools, Scripting (Python, Bash), SQL, Oracle, Windows/Linux Administration
Monitoring, Alerting & Scheduling Tools: Prometheus, Grafana, Datadog, Splunk, Centreon, ITRS Geneos, PagerDuty, AutoSys, Tidal Scheduler
Automation & Orchestration: Kubernetes, Docker, Terraform
Ticketing Tools: ServiceNow, Jira
Site Reliability: Incident Management, Problem Management, Change Management, Monitoring Systems, On-call Experience, Service Level Objectives (SLOs), Fault Tolerance, Disaster Recovery
Management Skills: Team Leadership, Project Management, Strategic Planning, Vendor Management, SLA Negotiation, Performance Management, Continuous Improvement, Training and Development
Soft Skills: Excellent Communication, Problem-Solving, Analytical Thinking, Customer Service, Stakeholder Management, Conflict Resolution