Results-driven professional with a progressive career of 9+ years in SRE, DevOps, and AdminOps
Knowledgeable in common SCM practices, such as branching and code merging
Experience implementing CI/CD in a microservices architecture
Solid experience with Linux (Ubuntu, CentOS, Red Hat) administration
Experience with container orchestration systems such as AWS ECS or Kubernetes
Provisioned infrastructure using Terraform, Puppet, and Pulumi
Strong knowledge of current Microsoft environments, including desktop and server operating systems, Remote Desktop Services, and server, system, and network administration (Active Directory).
Experience working with SaaS applications and administering Unix- and Linux-based systems.
Proficient in automating various tasks using PowerShell, Bash, and Python.
Experience with AWS, Azure, Google Cloud, and other cloud-based solutions.
Experience implementing security measures, such as configuring hardware and software firewalls; worked with Cisco and Fortinet systems in an IT environment.
Experience deploying VMware server farms, SAN/storage, and backup systems such as Veeam and Acronis.
Excellent communication skills, with the strong critical-thinking and analytical skills required for problem solving
Successful at optimizing security standards, improving planning processes, and managing systems implementation. Knowledgeable about disaster recovery planning, roadmapping, and team development.
Administer 40+ servers in a mixed Windows and Linux environment with 99.999% SLA uptime while implementing routing changes
Overview
10
years of professional experience
2
years of post-secondary education
Work History
Site Reliability Engineer
GFL Environmental
11.2022 - Current
Reduced downtime for critical applications by proactively addressing potential issues detected through Dynatrace monitoring and by performing regular maintenance and updates.
Onboarded all applications to the Dynatrace monitoring environment
Created separate sets of Dynatrace dashboards for business users, Tier 1 support, and SREs
Improved application availability from 96% to 99.99% SLA
Defined SLOs and SLIs for every application
Implemented cost-saving measures by optimizing resource utilization across cloud-based infrastructure environments, monitored through Cloud Custodian
Improved incident management workflows in ServiceNow by creating comprehensive documentation on troubleshooting procedures and resolution steps for common issues.
Conducted root-cause analyses after major incidents to identify areas for process improvement or technical enhancement opportunities.
Mentored junior engineers, sharing knowledge of best practices for site reliability engineering methodologies.
Optimized infrastructure performance by conducting thorough analyses of KPIs and data.
Ensured compliance with relevant industry regulations regarding data privacy standards by actively participating in SOC 2 and SOX audit assessments.
Developed a system that automatically detects problematic processes within an environment and restarts them, improving application availability
Implemented automation to reduce repetitive manual processes by 50%, using Ansible and Terraform.
Dev/IT Operations Professional
Thoughtwire
05.2021 - 09.2022
Developed and implemented performance improvement strategies and plans to promote continuous improvement.
Automated deployment using Terraform and Ansible
Deployed, configured, and maintained compute resources on Azure Cloud
Worked within applicable standards, policies and regulatory guidelines to promote safe and secure working environment.
Implemented code repositories in GitHub and set up automated CI/CD pipelines for all product lines
Provisioned resources in Azure using Terraform templates
Implemented Azure DevOps solutions to improve CI/CD pipelines
Optimized network performance by conducting regular audits and adjusting configurations as needed.
Streamlined IT operations processes for enhanced efficiency and reduced downtime.
Developed IT policies and procedures, promoting a secure and compliant computing environment.
Collaborated with cross-functional teams to design and implement company-wide technology solutions.
AWS Dev Operations Engineer
Amazon
04.2017 - 01.2019
Created a cloud-based monitoring system using AWS CloudWatch to track the health of all servers
Developed automations for provisioning compute instances and storage using Terraform
Provisioned resources in AWS using CloudFormation templates
Built workflows to create immutable infrastructure in AWS
Developed Ansible playbooks to automate the execution of routine Linux scripts
Designed and deployed web applications on the AWS stack, including VPC, EC2, ELB, Security Groups, S3, and IAM
Maintained the cloud environment by performing continuous patching, upgrading OS kernels, and remediating security vulnerabilities
Provided training and mentored new systems administrators and cloud engineers
Conducted cost analysis exercises for identifying opportunities to optimize spending across various AWS resources without compromising performance or functionality.
Maintained compliance with industry regulations by conducting regular audits of AWS configurations and implementing necessary remediation measures.
Streamlined code delivery processes by implementing containerization strategies using Docker and Kubernetes.
Improved system efficiency by automating routine tasks and implementing continuous integration pipelines.
IT Operations Analyst
Dell Technologies
04.2014 - 03.2017
Migrated 70+ application instances from Oracle Cloud to VMware and EMC cloud
Consulted on, planned, designed, and implemented cloud solutions with various customers
Established continuous build environments to speed up the development process
Monitored the performance of systems in a cloud-based computing environment, including overall system health, reliability, performance, and cost
Ensured 100% of project documentation was created and kept up to date, including design, development, and deployment documentation
Shared best practices and guided engineers while implementing infrastructure as code, using CloudFormation
Enhanced data integrity through implementation of rigorous backup strategies and periodic validation checks.
Assisted in budget planning activities by tracking expenses related to hardware procurement, software licensing, and vendor service contracts.
Education
Master of Science - Technology Innovation Management
Carleton University
Ottawa, ON
01.2019 - 08.2020
Accomplishments
Process Flows
ITIL incident management service processes; Jira ITSM; ServiceNow; Remedy; Zendesk; Freshdesk; Desk.com; Azure DevOps
Software
VMware applications, Hyper-V, Salesforce, CrowdStrike, Jenkins, Terraform, Kubernetes, Docker, AWS, Azure, Ansible, Puppet, Python, Dynatrace, Grafana, Prometheus, Datadog, New Relic, PowerShell and Bash scripting, HTML, CSS, MySQL, Orange, ServiceNow, Atlassian, Git, GitHub, Windows Server 2012, 2016, 2019, 2022