Sarah Ki

Corona

Overview

years of professional experience

The Boeing Company

05.2024 - Current

Analyzed large-scale 777 operator fleet datasets to evaluate reliability performance and deliver data-driven insights informing operational decisions.
Built predictive regression and time-series models in Python to forecast climate-related impacts and support strategic planning.
Automated data extraction, transformation, and analysis workflows using Python, reducing processing time and improving analytical efficiency.
Developed automated data quality checks in Python to validate operator datasets, improve data integrity, and enable high-quality downstream analysis.
Collaborated with customers and cross-functional stakeholders to ingest operational data, compare fleet performance against Boeing reliability recommendations, and translate insights into actionable recommendations.

UC Berkeley

Berkeley

05.2024 - 12.2025

Designed and built an end-to-end ICD-9 code recommendation system that automatically assigns top-k diagnosis codes from clinical discharge summaries (MIMIC-III), reducing manual medical coding effort while improving auditability.
Implemented a hybrid retrieval–reranking pipeline that combines fine-tuned medical text embeddings (MedCPT) with large language model rerankers (Gemini Flash) to improve ICD-9 code relevance and ranking quality.
Developed evidence span extraction to surface concise, human-readable clinical justifications for each predicted code, enabling transparency for medical coders and patients.
Evaluated system performance using Precision@k, Recall@k, Micro-F1, and Macro-F1, beginning with a focused diabetes ICD case study and extending evaluation to the full ICD code set.
Built an interactive Streamlit web application supporting coder workflows (review, accept/reject codes) and a patient-facing view for code transparency and trust.

The Boeing Company

11.2022 - 05.2024

Created a Python tool to automate BCS Service Request analysis for different operators.
Constructed an interactive operational report in a Jira dashboard, using MDX to query and visualize data on Flight Operations Support product quality escapes.
Analyzed and identified quality metrics (M3s) to improve Flight Operations Support products by up to 80%.
Integrated application components and databases across computing platforms using Teradata to mine real-time Flight Operations data.
Communicated business requirements and test results to offshore developers to manage projects.
Developed and executed tests to validate system functionality against specifications.
Coordinated with Flight Operations teams to develop Python tools that automate workflow tracking through the Jira REST API.
Researched processes, applications, systems, and data to help identify functional requirements for application and system design.
Interpreted and translated operational requirements for applications into functional specifications.

University of California, Berkeley

Berkeley

12.2025

University of California, San Diego

San Diego

01.2021

Programming: Python, SQL
Machine learning: regression, classification, CNNs, time series
NLP & LLM systems: embeddings, retrieval, and reranking; seq-to-seq models (T5, BART); LLM APIs (Gemini) for relevance scoring
Evaluation & experimentation: Precision@k, Recall@k, Micro/Macro-F1, A/B testing
Visualization & Communication: Tableau, Matplotlib, ggplot2
Tools & Platforms: AWS, Docker, Git, Streamlit

ICD-9 / ICD-10 Recommendation System: Built an end-to-end medical coding system using hybrid retrieval and LLM-based reranking to recommend relevant ICD codes from clinical notes, scaling from a diabetes-focused prototype to all ICD-9 codes and evaluating performance with Precision@k, Recall@k, and F1.
Data Visualization & Policy Analytics: Developed interactive dashboards and visual narratives using Congressional bill data to analyze legislative activity across sessions, regimes, and policy areas, translating complex political data into accessible insights for non-technical audiences.
Korean NLP & Cross-Lingual Processing: Researched and implemented Korean NLP models for grammatical error correction and cross-lingual retrieval, leveraging large-scale Korean text datasets and sequence-to-sequence architectures to address gaps in informal Korean language tooling.
