
Sarah Ki

Corona

Overview

3 years of professional experience

Work History

System Engineer - Reliability & Analytics

The Boeing Company
05.2024 - Current
  • Analyzed large-scale 777 operator fleet datasets to evaluate reliability performance and deliver data-driven insights informing operational decisions.
  • Built predictive regression and time-series models in Python to forecast climate-related impacts and support strategic planning.
  • Automated data extraction, transformation, and analysis workflows using Python, reducing processing time and improving analytical efficiency.
  • Developed automated data quality checks in Python to validate operator datasets, improve data integrity, and enable high-quality downstream analysis.
  • Collaborated with customers and cross-functional stakeholders to ingest operational data, compare fleet performance against Boeing reliability recommendations, and translate insights into actionable recommendations.

Capstone: Transparent ICD-9 Coding Assistant

UC Berkeley
Berkeley
05.2024 - 12.2025
  • Designed and built an end-to-end ICD-9 code recommendation system that automatically assigns top-k diagnosis codes from clinical discharge summaries (MIMIC-III), reducing manual medical coding effort while improving auditability.
  • Implemented a hybrid retrieval–reranking pipeline that combines fine-tuned medical text embeddings (MedCPT) with large language model rerankers (Gemini Flash) to improve ICD-9 code relevance and ranking quality.
  • Developed evidence span extraction to surface concise, human-readable clinical justifications for each predicted code, enabling transparency for medical coders and patients.
  • Evaluated system performance using Precision@k, Recall@k, Micro-F1, and Macro-F1, beginning with a focused diabetes ICD case study and extending evaluation to the full ICD code set.
  • Built an interactive Streamlit web application supporting coder workflows (review, accept/reject codes) and a patient-facing view for code transparency and trust.

System and Data Analyst - Flight Operations Support

The Boeing Company
11.2022 - 05.2024
  • Created a Python tool to automate BCS Service Request analysis across operators.
  • Built interactive operational reports in a Jira dashboard, using MDX to query and visualize data on Flight Operations Support product quality escapes.
  • Identified and analyzed quality metrics (M3s), improving Flight Operations Support products by up to 80%.
  • Integrated application components and databases across computing platforms using Teradata to mine real-time Flight Operations data.
  • Managed projects by communicating business requirements and test results to offshore developers.
  • Developed and executed tests to validate system functionality against specifications.
  • Coordinated with Flight Operations teams to develop Python tools that automate workflow tracking via the Jira REST API.
  • Researched processes, applications, systems, and data to identify functional requirements for application and system design.
  • Interpreted and translated application operational requirements into functional specifications.

Education

Master of Science - Information and Data Science

University of California, Berkeley
Berkeley
12.2025

Bachelor of Science - Bioengineering, Bioinformatics

University of California, San Diego
San Diego
01.2021

Skills

  • Programming: Python, SQL
  • Machine learning: regression, classification, CNNs, time series
  • NLP & LLM systems: embeddings, retrieval, reranking, seq-to-seq models (T5, BART), LLM APIs (Gemini) for relevance scoring
  • Evaluation & experimentation: Precision@k, Recall@k, Micro/Macro-F1, A/B testing
  • Visualization & Communication: Tableau, Matplotlib, ggplot2
  • Tools & Platforms: AWS, Docker, Git, Streamlit

Projects

  • ICD-9 / ICD-10 Recommendation System: Built an end-to-end medical coding system using hybrid retrieval and LLM-based reranking to recommend relevant ICD codes from clinical notes, scaling from a diabetes-focused prototype to all ICD-9 codes and evaluating performance with Precision@k, Recall@k, and F1.
  • Data Visualization & Policy Analytics: Developed interactive dashboards and visual narratives using Congressional bill data to analyze legislative activity across sessions, regimes, and policy areas, translating complex political data into accessible insights for non-technical audiences.
  • Korean NLP & Cross-Lingual Processing: Researched and implemented Korean NLP models for grammatical error correction and cross-lingual retrieval, leveraging large-scale Korean text datasets and sequence-to-sequence architectures to address gaps in informal Korean language tooling.
