Publication Date

3-1-2021

Journal

PLOS Computational Biology

DOI

10.1371/journal.pcbi.1008671

PMID

33661899

PMCID

PMC7932115

PubMedCentral® Posted Date

3-4-2021

PubMedCentral® Full Text Version

Post-print

Published Open-Access

yes

Keywords

Computational Biology; Data Science; Humans; Machine Learning; Models, Biological; Models, Statistical; Software

Abstract

Overfitting is one of the critical problems in developing models by machine learning. With machine learning becoming an essential technology in computational biology, we must include training about overfitting in all courses that introduce this technology to students and practitioners. We here propose a hands-on training for overfitting that is suitable for introductory-level courses and can be carried out on its own or embedded within any data science course. We use workflow-based design of machine learning pipelines, experimentation-based teaching, and a hands-on approach that focuses on concepts rather than underlying mathematics. We here detail the data analysis workflows we use in training and motivate them from the viewpoint of teaching goals. Our proposed approach relies on Orange, an open-source data science toolbox that combines data visualization and machine learning and that is tailored for education in machine learning and explorative data analysis.

Included in

Medical Education Commons, Medical Sciences Commons, Medical Specialties Commons

Faculty and Staff Publications

Hands-On Training About Overfitting
