name: inverse
layout: true
class: center, middle, inverse
---
# Introduction to Machine learning
Anup Kumar
PURL: gxy.io/GTN:S00136

Tip: press `P` to view the presenter notes | Use arrow keys to move between slides
???

Presenter notes contain extra information which might be useful if you intend to use these slides for teaching.

Press `P` again to switch presenter notes off.

Press `C` to create a new window where the same presentation will be displayed. This window is linked to the main window. Changing slides on one will cause the slide to change on the other. Useful when presenting.

---

## Requirements

Before diving into this slide deck, we recommend you have a look at:

- [Introduction to Galaxy Analyses](/training-material/topics/introduction)

---

### <i class="far fa-question-circle" aria-hidden="true"></i><span class="visually-hidden">question</span> Questions

- What is machine learning?
- Why is it useful?
- What are its different approaches?

---

### <i class="fas fa-bullseye" aria-hidden="true"></i><span class="visually-hidden">objectives</span> Objectives

- Provide the basics of machine learning and its variants.
- Learn how to do classification using the training and test data.
- Learn how to use Galaxy's machine learning tools.

---

# Contents

- What is machine learning?
- Types of machine learning
- Techniques for
  - Hyperparameter optimisation
  - Learning and evaluation of models
- Various applications of machine learning

---

# Machine learning

.pull-left[
- Learns patterns from data
- Draws on several fields
  - Linear algebra, statistics and probability
  - Programming
  - Data analysis
  - Visualization
- Applicable to data from multiple fields
  - protein and DNA sequences, weather data, stock and house prices, images ...
]

.pull-right[
]

---

# Variants of ML

.center[
]

---

# Classification

.pull-left[
- Supervised learning
- Learn/predict classes or targets
- Find decision boundary
  - Linear and non-linear boundaries
- Algorithms are classifiers
- Examples
  - Tumor or no tumor
  - Rain or no rain
  - ...
]

.pull-right[
]

---

# Classification dataset

- Breast tumor dataset
- Features and target

.center[
]

---

# Regression

.pull-left[
- Supervised learning
- Targets are real numbers
- Find fitting curve
  - Linear or non-linear curves
- Algorithms are regressors
- Examples:
  - Temperature forecast
  - Stock/house prices
  - ...
]

.pull-right[
]

---

# Regression dataset

- Body fat dataset
- Features and target

.center[
]

---

# Hyperparameter optimisation

.pull-left[
- Grid search
- Random search
]

.pull-right[
]

---

# Learning and evaluation

.pull-left[
- K-fold cross-validation
  - Dataset is split into K equal parts
  - Part == fold
  - Learn on training set
  - Evaluate on validation set
]

.pull-right[
]

---

# Learning and evaluation

.pull-left[
- Training and test sets
  - Learn on training set
  - Evaluate on test set
]

.pull-right[
]

---

# Applications of machine learning

.pull-left[
- Bioinformatics
  - Protein structure prediction
  - Drug response prediction
  - Biological age prediction
  - Biomedical image analysis
  - ...
- Computer vision/image recognition
- Natural language processing
- Speech recognition
- ...
]

.pull-right[
]

---

# References

- Machine learning for everyone
  - https://vas3k.com/blog/machine_learning/
- Breast cancer dataset
  - https://archive.ics.uci.edu/dataset/15/breast+cancer+wisconsin+original
- Body fat dataset
  - https://rstudio-pubs-static.s3.amazonaws.com/65314_c0d1e5696cdd4e93a3784ea67f9e3d34.html

---

# For additional references, please see the tutorial's References section

---

- Galaxy Training Materials ([training.galaxyproject.org](https://training.galaxyproject.org))

???

- If you would like to learn more about Galaxy, there are a large number of tutorials available.
- These tutorials cover a wide range of scientific domains.
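
---

# Learning and evaluation: a code sketch

The training/test split, K-fold cross-validation and grid search named in the earlier slides can be sketched in plain Python. This is only an illustration under stated assumptions: the two-class toy data is made up (it is not the breast tumor dataset), the classifier is a simple k-nearest-neighbours vote chosen for brevity, and all function names (`k_fold_indices`, `knn_predict`, `accuracy`, `cross_val_accuracy`, `make_toy_data`) are hypothetical helpers, not Galaxy tools.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors):
    """Return the majority label among the n_neighbors closest training samples."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    votes = [label for _, label in ranked[:n_neighbors]]
    return max(sorted(set(votes)), key=votes.count)


def accuracy(train_X, train_y, eval_X, eval_y, n_neighbors):
    """Fraction of evaluation samples whose predicted label is correct."""
    hits = sum(
        knn_predict(train_X, train_y, x, n_neighbors) == y
        for x, y in zip(eval_X, eval_y)
    )
    return hits / len(eval_y)


def cross_val_accuracy(X, y, k, n_neighbors):
    """Mean validation accuracy over k folds."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, val_idx in enumerate(folds):
        # Every fold except fold i forms the training set for this round.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(
            accuracy(
                [X[j] for j in train_idx], [y[j] for j in train_idx],
                [X[j] for j in val_idx], [y[j] for j in val_idx],
                n_neighbors,
            )
        )
    return sum(scores) / k


def make_toy_data(n_per_class=30, seed=1):
    """Two synthetic 2-D classes (made-up data for illustration only)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (4.0, 4.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y


X, y = make_toy_data()

# Training and test sets: hold out 20% of the samples for the final evaluation.
order = list(range(len(X)))
random.Random(2).shuffle(order)
cut = int(0.8 * len(order))
X_train, y_train = [X[i] for i in order[:cut]], [y[i] for i in order[:cut]]
X_test, y_test = [X[i] for i in order[cut:]], [y[i] for i in order[cut:]]

# Grid search: score every candidate value with 5-fold cross-validation.
grid = [1, 3, 5]
cv_scores = {n: cross_val_accuracy(X_train, y_train, k=5, n_neighbors=n) for n in grid}
best_n = max(cv_scores, key=cv_scores.get)

# Evaluate once on the test set with the selected hyperparameter.
test_accuracy = accuracy(X_train, y_train, X_test, y_test, best_n)
print(f"best n_neighbors={best_n}, test accuracy={test_accuracy:.2f}")
```

The test set is touched only once, after the hyperparameter has been chosen on the validation folds, so the reported accuracy is not biased by the search.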
---

# Getting Help

- **Help Forum** ([help.galaxyproject.org](https://help.galaxyproject.org))
- **Gitter Chat**
  - [Main Chat](https://gitter.im/galaxyproject/Lobby)
  - [Galaxy Training Chat](https://gitter.im/Galaxy-Training-Network/Lobby)
  - Many more channels (scientific domains, developers, admins)

???

- If you get stuck, there are ways to get help.
- You can ask your questions on the help forum.
- Or you can chat with the community on Gitter.

---

# Join an event

- Many Galaxy events across the globe
- Event Horizon: [galaxyproject.org/events](https://galaxyproject.org/events)

???

- There are frequent Galaxy events all around the world.
- You can find upcoming events on the Galaxy Event Horizon.

---

### <i class="fas fa-key" aria-hidden="true"></i><span class="visually-hidden">keypoints</span> Key points

- Machine learning algorithms learn features from data.
- Machine learning is used for multiple tasks such as classification, regression and clustering.
- Multiple learning tasks can be performed using Galaxy's machine learning tools.
- For the classification and regression tasks, data is divided into training and test sets.
- Each sample/record in the training data has a category/class/label.
- A machine learning algorithm learns features from the training data and makes predictions on the test data.

---

## Thank You!

This material is the result of a collaborative work. Thanks to the [Galaxy Training Network](https://training.galaxyproject.org) and all the contributors!

Author(s): Anup Kumar

Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.