Aspiring Data Scientist skilled in compiling, transforming, and analyzing complex data. Well-versed in machine learning and large-dataset management, with a strong ability to identify patterns and solve business problems.
Content Moderation Final Project, Link, 04/2024
Conducted comparative analysis of Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers (BERT), with BERT demonstrating superior performance.
Developed Flask web app for text and image classification.
Integrated the Google Cloud Vision API for text extraction from images.
Provided real-time classification results as probabilities of appropriateness or inappropriateness based on the BERT model.
Deployed the Flask application on Google App Engine for global access.

Data Processing Framework Evaluation, 12/2023
Conducted comprehensive evaluation and comparison of Apache Beam and Amazon Kinesis.
Designed and executed experiments to assess performance, scalability, and latency of both frameworks by monitoring CPU and memory utilization while measuring processing speed.
Systematically increased data volume and complexity to evaluate scalability under varying workloads.

R Shiny Project, Link, 03/2023
Developed an R Shiny project analyzing the relationship between immigration trends and earnings, featuring separate components for Earnings and Immigration.
Visualized US immigration (2011-2020) and Median Household Earnings (2009-2021).
Customizable features: scale adjustments, PDF/CSV dataset downloads.