Tidy data & Visualisation with R | CCMAR
 

Wednesday, October 26, 2022 to Friday, October 28, 2022
 
Course Description 

In research, a significant amount of effort is spent cleaning data to get it ready for visualisation and analysis. This course will teach you how to convert a messy dataset into tidy data using R, so that it can be explored visually and analysed with ease.

You will be introduced to the concept of tidy data and guided through the programmatic steps in R required to structure datasets according to its principles. We will also show how tidy data works well with the graphical functions of the R package ggplot2, which facilitates initial exploration and analysis of the data. Finally, a “bring-your-own-data day” (optional day 3) will allow a restricted group of participants to obtain individual feedback and advice on how to tidy up and visually inspect their own research datasets.

Pre-requisites 

1. Attendees should bring their own laptop with R and RStudio installed.

2. You need a basic understanding of R programming. If you need to brush up on your R skills, a hands-on introductory R tutorial will be made available for self-learning, so that you can familiarise yourself with basic R before the course.

3. Attendees wishing to bring their own data on the third day for tidying and brief graphical visualisation will be subject to pre-selection (limited places available). These participants should provide a short motivation letter (around 250 words), which must include a detailed description of the dataset and the scientific problem under study.

Agenda 

Day 1 | Introduction to data wrangling
- Raw and processed data
- Components of tidy data
- Messy datasets
- Converting messy data to tidy data using the tidyverse packages
- Hands-on exercises.
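
As an illustration of the kind of conversion covered on Day 1, here is a minimal sketch using the tidyverse packages tidyr and dplyr. The dataset, its column names (site, year, temperature) and its values are hypothetical and are not taken from the course material. The idea is that a messy table storing one variable (year) across its column headers is reshaped so that each variable has its own column and each observation its own row.

```r
library(dplyr)
library(tidyr)

# Hypothetical messy dataset (an assumption for illustration):
# one row per site, with the years spread across column headers.
messy <- tibble(
  site   = c("A", "B"),
  `2020` = c(1.2, 3.4),
  `2021` = c(1.5, 3.1)
)

# Reshape to tidy form: one row per site-year observation.
tidy <- messy %>%
  pivot_longer(
    cols      = c(`2020`, `2021`),  # columns whose names hold values of "year"
    names_to  = "year",             # new column receiving the old column names
    values_to = "temperature"       # new column receiving the cell values
  ) %>%
  mutate(year = as.integer(year))   # column names arrive as text; convert them
```

The resulting tibble has four rows (two sites times two years) and three columns, which is the shape that ggplot2 and most tidyverse functions expect.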

 

Day 2 | Visualising tidy data using ggplot2
- Overview of available data visualisation depending on the variable type (numeric continuous, numeric discrete, categorical nominal, categorical ordinal).
- Using ggplot2 to create data visualisations: 
   - how to manipulate/select data to supply to ggplot2, using the pipe operator ( %>% );
   - how to map aesthetics ( aes() );
   - how to add on layers (e.g. geom_histogram() or geom_point() );
   - how to use scales (e.g. scale_fill_brewer() );
   - faceting specifications (e.g. facet_grid() );
   - coordinate systems (e.g. coord_flip() ).
- Hands-on exercises.
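
The Day 2 building blocks listed above can be combined as in the following sketch. It is only an illustration under stated assumptions: the mpg dataset that ships with ggplot2, the filter condition and the colour palette were chosen here for convenience and are not necessarily what the course uses.

```r
library(dplyr)
library(ggplot2)

# mpg is an example dataset bundled with ggplot2 (an assumption for illustration).
p <- mpg %>%
  filter(year == 2008) %>%                 # select data before plotting, via the pipe
  ggplot(aes(x = class, fill = drv)) +     # map variables to aesthetics
  geom_bar() +                             # add a layer: counts per vehicle class
  scale_fill_brewer(palette = "Set2") +    # choose a scale for the fill colours
  facet_grid(rows = vars(cyl)) +           # one panel per number of cylinders
  coord_flip()                             # swap the x and y axes
```

Printing `p` (for example by typing `p` in the RStudio console) draws the plot. Because the data are tidy, each aesthetic and each facet refers directly to one column.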

 

Day 3 | Bring your own data day (optional & limited places available)
- Individual consulting for data wrangling and visual exploration. 

Scientific Organisation 
Ramiro Magno
Isabel Duarte
Instructors 

Ramiro Magno

CINTESIS, UAlg
R. Magno is a computational biologist with experience in the fields of developmental biology and biomedical sciences.

Isabel Duarte

CINTESIS, UAlg
I. Duarte is a computational biologist specialised in sequence analysis and omics data analysis using R. She has developed several R packages published on GitHub.
Venue 

University of Algarve | Gambelas Campus | Building 7 | Room 1.39d

Type of Training 
Advanced Training
Presentation Language  
English