
Mrunalini Sara

Sunnyvale, CA

Summary

Knowledgeable SRE Lead/Manager with a solid background in maintaining and enhancing system reliability. Demonstrated success in implementing automated solutions to streamline processes and improve system performance and security posture. Proven ability to troubleshoot complex issues and optimize infrastructure through proactive monitoring and collaboration.

Overview

11 years of professional experience
1 Certification

Work History

Site Reliability Engineer

KlearNow
01.2020 - Current
  • Managed and optimized cloud infrastructure across AWS services such as EC2, S3, VPC, IAM, RDS, and Lambda to enhance scalability, reliability, and security.
  • Evaluated new technologies and tools to enhance overall system performance, stability, and security.
  • Led and implemented enterprise-wide DevOps initiatives, resulting in 40% improved scalability and 25% cost reduction through automated resource management.
  • Contributed to the ongoing refinement of internal processes and procedures within the site reliability engineering discipline through regular reviews, updates, and knowledge sharing activities.
  • Established a comprehensive monitoring and observability framework using Prometheus, Grafana, and Datadog, implementing custom dashboards and alerting policies that improved incident response time by 25% and enabled proactive system health monitoring.
  • Proficient in Kubernetes cluster management, including designing and maintaining production-grade clusters with high availability and auto-scaling.

Integration Engineer II

Kaiser Permanente
01.2018 - 01.2020
  • Monitored system performance, optimized processes, and implemented fixes ranging from restarts to preventative solutions to reduce downtime and improve stability.
  • Possessed in-depth knowledge of inbound and outbound integration point data flows, supporting seamless system operations.
  • Utilized AWS CloudWatch, Prometheus, and Grafana for monitoring cloud services and clusters, ensuring efficient operations.
  • Provided strategic and tactical management of the continuous integration, deployment, delivery, monitoring, maintenance, development, upgrade, and support of all product systems.
  • Implemented security measures within integrations to protect sensitive data from unauthorized access or use.
  • Strong sense of accountability and commitment to problem solving, backed by a curiosity to dig deep and identify root causes.

Unix/Linux Administrator

Safeway
01.2015 - 01.2018
  • Resolve operational problems within defined schedules while maintaining excellent customer satisfaction and meeting service level agreements
  • Analyze root causes of operational malfunctions and provide resolutions
  • Hold lead responsibility for remediating compliance vulnerabilities within the Data Platform Management Service
  • Handle escalated issues and follow-up on outstanding issues promptly
  • Develop preventive measures and document issue resolution procedures
  • Recommend process improvements to improve operational efficiency, cost, and effectiveness
  • Evaluate current operational processes and recommend improvements

WebSphere/Infrastructure Administrator

AT&T
01.2014 - 01.2015
  • Engineer systems administration solutions for various projects and operational needs
  • Configure system performance and capacity alerts for resources such as CPU, RAM, and storage
  • Apply OS patches and upgrades on a regular basis and upgrade administrative tools and utilities
  • Install new server hardware and software per standard operating procedures and support infrastructure applications
  • Provision and support servers for the development and QA of applications created by other teams and divisions
  • Monitor systems and provide or act on information and statistics
  • Perform daily system monitoring, verifying the integrity and availability of all hardware, server resources, systems, and key processes, reviewing system and application logs, and verifying completion of scheduled jobs such as backups

Education

Master of Science - Computer Engineering

Southern Illinois University, Carbondale
Carbondale, IL
12-2013

Skills

  • AWS/Infrastructure design
  • Elastic Kubernetes Service - Microservices architecture
  • Log analysis, Prometheus-Grafana, Observability tools, Datadog
  • System monitoring
  • Infrastructure automation, deployment strategies and rollout plans
  • Capacity planning
  • Cost control/Cost-benefit analysis
  • Linux administration

Certification

AWS Certified Solutions Architect
