Software Developer (Applied AI Research Team)
Western AI
09.2025 - 03.2026
- Building a source-code plagiarism detection system with a team of applied AI developers, combining token-level similarity, semantic embeddings, and output-based behavioral comparison
- Implementing TF-IDF and CodeBERT embedding pipelines to detect similarity across structurally transformed code (e.g., recursion ↔ iteration, list comprehensions ↔ loops)
- Developing a multilayer perceptron (MLP) meta-classifier that combines these similarity signals into a single plagiarism risk score
- Streaming and filtering the Hugging Face “The Stack” dataset to preprocess large-scale, permissively licensed code corpora efficiently
- Co-designing a Streamlit UI where users upload code files and receive similarity breakdowns and risk assessments
- Collaborating through weekly design reviews, paired debugging sessions, and shared Git workflows to maintain code quality and accountability
- Preparing to present the project at the Canadian Undergraduate Conference on Artificial Intelligence (CUCAI) 2026
- In Progress
