Andrew Marete

Senior Data Scientist
Ottawa, ON

Summary

Seasoned data science researcher specializing in Computational Biology, with strengths in machine learning, statistical programming, and advanced coding in R and Python. Actively manages and mentors junior data scientists, driving their professional growth. Passionate about leveraging data science and business analytics to achieve impactful results, and dedicated to advancing a career in this dynamic field.

Overview

14 years of professional experience

Work History

Senior Data Scientist

Canadian Transportation Agency (CTA)
09.2021 - Current
  • Collaborated with cross-functional teams to identify data sources and improve data quality.
  • Established standardized methodologies for reproducible research within the data science team.
  • Maintained the development cycle for the Agency's risk assessment tool (RT), including configuration and updates using UML standards.
  • Managed the design and adaptation of data collection tools for risk software configuration.
  • Liaised with IT developers, managers, and DBAs to communicate business requirements, facilitating collaboration across departments.
  • Developed data and risk tool model diagrams, effectively communicating complex technical concepts to non-technical stakeholders.
  • Created and maintained enforcement and consumer protection dashboards using R and Microsoft Power BI.
  • Used consumer protection dashboards as status update tools for the Agency Chairperson.
  • Collaborated with NRC Data Scientists to create a machine learning solution that reduced complaints processing time by two weeks.
  • Delivered comprehensive reports highlighting key trends and anomalies, presenting findings to senior management.
  • Analyzed large datasets to identify trends, patterns, and hidden stories.
  • Designed and deployed automated ETL processes to transform raw data into usable formats, reducing manual data entry errors.
  • Developed and presented customized reports in visually appealing formats.

Senior Statistical Programmer and Software Developer

Lactanet Canada
01.2021 - 06.2024
  • Optimized statistical programming processes by implementing advanced coding techniques, resulting in more efficient data analysis and reporting.
  • Developed custom macros for repetitive tasks, significantly reducing manual efforts required from team members.
  • Provided expert consultation on statistical methodology for various projects, enhancing overall study designs and analytical approaches.
  • Fostered a positive work environment by promoting collaboration, open communication, and continuous learning among team members.
  • Developed in-house applications designed for company needs.

Postdoctoral Fellow

Agriculture and Agri-Food Canada
01.2018 - 08.2021
  • Created a spreadsheet mapping data from clinically infected bovines to regions across Canada.
  • Created and maintained a genomic database in SQL to hold structured DNA data.
  • Cleaned, validated, imputed, and analyzed genomic data with millions of data points to identify genetic markers associated with Bovine Johne's disease.
  • Applied linear regression and machine learning approaches to millions of data points obtained from DNA data.
  • Used Python programming language to create and maintain statistical pipelines.
  • Designed robust experiments that produced reliable data, enabling accurate interpretation of results.

Ph.D. Research Fellow

Aarhus University
01.2014 - 01.2018
  • Conducted data mining and data visualization assignments.
  • Conducted case-control studies on economically essential phenotypes (traits) by applying linear regression models and restricted maximum likelihood estimations.
  • Created and published an algorithm for detecting genes associated with clinical mastitis disease in dairy cattle.
  • Used Bayesian methods to estimate gene interactions that may eventually cause illness due to environmental modifications (i.e., epigenetics).
  • Enhanced research quality by implementing rigorous methodologies and data analysis techniques.
  • Mentored junior researchers, providing guidance on study design, data analysis, and manuscript preparation.

Senior Research Technician

International Livestock Research Institute
01.2011 - 01.2014
  • Assisted in gap analysis for a USD 40 million project identifying the highest yielding breed for medium-income farmers in Kenya, funded by the Bill and Melinda Gates Foundation.
  • Created questionnaires for data collection using Open Data Kit and trained questionnaire administrators to administer them.
  • Created and managed an SQL livestock database, generating production reports for stakeholders.
  • Cleaned, categorized, and analyzed genomic data to identify the best cross-bred genotype.
  • Mentored junior research technicians, fostering a collaborative and supportive work environment.
  • Conducted literature reviews to inform experimental design and stay current on industry advancements.

Education

Ph.D. - Statistical Genetics (Computational Biology)

Aarhus University
Denmark
01.2018

Master of Science - Quantitative Genetics

Nairobi University
Kenya
01.2012

Bachelor of Science - Range Management (Agricultural Economics/Statistics)

Nairobi University
Kenya
01.2009

Skills

  • Multivariate Analysis
  • Transfer Learning
  • Neural Networks
  • Feature Engineering
  • Data acquisition
  • Data preparation
  • Machine learning
  • Statistical modeling
  • Advanced programming
  • Data visualization
  • Model deployment
  • Software development
  • Bioinformatics
  • DNA/RNA-seq Analysis

Publications

  • A Meta-Analysis Including Pre-Selected Sequence Variants Associated with Seven Traits in Three French Dairy Cattle Populations (2018). doi.org/10.3389/fgene.2018.00522. In this study, we performed an initial linear regression on three populations, then compared the results using a meta-analysis to identify genomic regions private to one population or common to all three. This study highlighted the importance of combining data from multiple sources.
  • A system-based analysis of the genetic determinism of udder conformation and health phenotypes across three French dairy cattle breeds (2018). doi.org/10.1371/journal.pone.0199931. In this study, we looked for gene ontologies affecting udder health in five bovine populations. Using weighted gene co-expression networks based on partial correlation and information theory approaches, we identified ten candidate genes that can be used to improve mammary gland health.
  • Identification of long non-coding RNA associated with bovine Johne's disease using a combination of neural networks and logistic regression (2021). Frontiers in Veterinary Science 8 (2021): 209. This study identified Johne's disease genomic signatures that environmental changes might activate and that may, in the future, be targeted for removal to reduce the prevalence of the disease. For the data modelling, I used a combination of linear regression and recurrent neural networks to train the model.

Languages

English
Full Professional
French
Professional Working

Software

Python Programming

R Statistical Programming

C

Microsoft Power BI

SQL

Bash

Unix/Linux

Work Preference

Work Type

Part Time, Contract Work

Work Location

Remote, Hybrid

Important To Me

Work-life balance, Career advancement, Healthcare benefits, Company culture

Work Availability

Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
Morning, Afternoon, Evening
