Surya Koritala

SRE / DevOps / IT Ops
Toronto, ON

Summary

  • Results-driven professional with a progressive 9+ year career in SRE / DevOps / Admin Ops
  • Knowledgeable in common SCM practices, such as branching and code merging
  • Experience implementing CI/CD in a microservices architecture
  • Solid experience with Linux (Ubuntu, CentOS, Red Hat) administration
  • Experience with container orchestration systems such as AWS ECS or Kubernetes
  • Provisioned infrastructure using Terraform, Puppet, and Pulumi
  • Strong knowledge of current Microsoft environments including desktop and server operating systems, remote desktop services and server, system and network administration (Active Directory).
  • Experience working with SaaS applications and administering Unix- and Linux-based systems.
  • Proficient in automating tasks using PowerShell, Bash, and Python.
  • Experience with AWS, Azure, Google Cloud, and/or other cloud-based solutions.
  • Experience implementing security measures, including configuring hardware and software firewalls; worked with Cisco and Fortinet systems in an IT environment.
  • Experience deploying VMware server farms, SAN/storage, and backup systems such as Veeam and Acronis.
  • Excellent communication skills, with the strong critical thinking and analytical skills required for problem solving.
  • Successful at optimizing security standards, improving planning processes, and managing systems implementation. Knowledgeable about disaster recovery planning, roadmapping, and team development.
  • Administered 40+ servers in a mixed Windows and Linux environment with 99.999% SLA uptime while implementing routing changes.

Overview

10 years of professional experience
2 years of post-secondary education

Work History

Site Reliability Engineer

GFL Environmental
11.2022 - Current
    • Reduced downtime for critical applications by proactively addressing potential issues detected through Dynatrace monitoring and by performing regular maintenance and updates.
    • Onboarded all applications to the Dynatrace monitoring environment
    • Created separate sets of Dynatrace dashboards for business users, Tier 1 support, and SREs
    • Improved application availability from a 96% SLA to a 99.99% SLA
    • Defined SLOs and SLIs for every application
    • Implemented cost-saving measures by optimizing resource utilization across cloud-based infrastructure environments, monitored through Cloud Custodian
    • Improved incident management workflows in ServiceNow by creating comprehensive documentation on troubleshooting procedures and common issues resolution steps.
    • Conducted root-cause analyses after major incidents to identify areas for process improvement or technical enhancement opportunities.
    • Mentored junior engineers, sharing knowledge of best practices for site reliability engineering methodologies.
    • Optimized infrastructure performance by conducting thorough analyses of KPIs and data.
    • Ensured compliance with relevant industry regulations regarding data privacy standards by actively participating in SOC 2 and SOX audit assessments.
    • Developed a system that automatically detects problematic processes within an environment and restarts them, improving application availability
    • Implemented automation using Ansible and Terraform, reducing repetitive manual processes by 50%.

Dev/IT Operations Professional

Thoughtwire
05.2021 - 09.2022
    • Developed and implemented performance improvement strategies and plans to promote continuous improvement.
    • Automated deployment using Terraform and Ansible
    • Deployed, configured, and maintained compute resources on Azure Cloud
    • Worked within applicable standards, policies and regulatory guidelines to promote safe and secure working environment.
    • Deployed and maintained AKS and GKE
    • Implemented DRP/BCP plans
    • Managed existing Azure cloud environments, including automation, monitoring, metrics, disaster recovery/backups, and capacity planning
    • Implemented code repositories in GitHub and set up automated CI/CD pipelines for all product lines
    • Provisioned resources in Azure using Terraform templates
    • Implemented Azure DevOps solutions for better CI/CD pipelines
    • Optimized network performance by conducting regular audits and adjusting configurations as needed.
    • Streamlined IT operations processes for enhanced efficiency and reduced downtime.
    • Developed IT policies and procedures, promoting a secure and compliant computing environment.
    • Collaborated with cross-functional teams to design and implement company-wide technology solutions.

AWS Dev Operations Engineer

Amazon
04.2017 - 01.2019
  • Created a cloud-based monitoring system using AWS CloudWatch to track the health of all servers
  • Developed automation for provisioning compute instances and storage using Terraform
  • Provisioned resources in AWS using CloudFormation templates
  • Built workflows to create immutable infrastructure in AWS
  • Developed Ansible playbooks and automated the execution of routine Linux scripts
  • Designed and deployed web applications using the AWS stack, including VPC, EC2, ELB, Security Groups, S3, and IAM
  • Maintained the cloud environment through continuous patching, OS kernel upgrades, and remediation of security vulnerabilities
  • Trained and mentored new systems administrators and cloud engineers
  • Conducted cost analysis exercises for identifying opportunities to optimize spending across various AWS resources without compromising performance or functionality.
  • Maintained compliance with industry regulations by conducting regular audits of AWS configurations and implementing necessary remediation measures.
  • Streamlined code delivery processes by implementing containerization strategies using Docker and Kubernetes.
  • Improved system efficiency by automating routine tasks and implementing continuous integration pipelines.

IT Operations Analyst

Dell Technologies
04.2014 - 03.2017
  • Migrated 70+ application instances from Oracle cloud to VMware and EMC cloud
  • Consulted on, planned, designed, and implemented cloud solutions with various customers
  • Established continuous build environments to speed up the development process
  • Monitored systems in a cloud-based computing environment, including overall system health, reliability, performance, and cost
  • Ensured 100% of project documentation was created and kept up to date, including design, development, and deployment documentation
  • Shared best practices and guided engineers while implementing infrastructure as code, using CloudFormation
  • Enhanced data integrity through implementation of rigorous backup strategies and periodic validation checks.
  • Assisted in budget planning activities by tracking expenses related to hardware procurement, software licensing, and vendor service contracts.

Education

Master of Science - Technology Innovation Management

Carleton University
Ottawa, ON
01.2019 - 08.2020

Accomplishments

    Process Flows

  • ITIL incident management service processes; JIRA ITSM; ServiceNow; Remedy; Zendesk; Freshdesk; Desk.com; Azure DevOps

    Software

  • VMware applications, Hyper-V, Salesforce, CrowdStrike, Jenkins, Terraform, Kubernetes, Docker, AWS, Azure, Ansible, Puppet, Python, Dynatrace, Grafana, Prometheus, Datadog, New Relic, PowerShell and Bash scripting, HTML, CSS, MySQL, Orange, ServiceNow, Atlassian, Git, GitHub, Windows Server 2012, 2016, 2019, 2022

    Browsers

  • Chrome; Safari; Firefox; MS Edge; IE; Opera

    Hardware

  • PCs, Laptops, Telephony Systems, Printers, Routers, Modems

    Networking

  • LAN & VPN/Remote Connectivity, TCP/IP, DNS, DHCP, Network Switches

    Platforms

  • Windows, Unix, NetWare Servers, Citrix, Linux (Red Hat, CentOS)

Additional Information

Achievements & Certifications

  • Received the "Dell Champion" award for 2016, the highest award for an individual contributor
  • Received 2 Gold and 1 Platinum awards and multiple accolades from stakeholders

Timeline

Site Reliability Engineer

GFL Environmental
11.2022 - Current

Dev/IT Operations Professional

Thoughtwire
05.2021 - 09.2022

Master of Science - Technology Innovation Management

Carleton University
01.2019 - 08.2020

AWS Dev Operations Engineer

Amazon
04.2017 - 01.2019

IT Operations Analyst

Dell Technologies
04.2014 - 03.2017