
Advanced data management & manipulation using R - in coll. with CUSO EE

23 & 30 September 2021

Speakers

  • Dr Jan Wunder, Wunder Consulting Wald ZH
  • Dr Tina Cornioley, University of Zurich

Objectives

Participants will be able to apply R as a powerful tool to manage, manipulate and analyse their own data sets. In particular, they will learn:

  • the basic concepts of data structures & data management in R
  • the application of fast and efficient libraries specifically designed for the analysis of large data sets
  • how to connect R to databases and access them using SQL queries

Content

The analysis of large data sets (“big data”) is becoming increasingly important in science and elsewhere. In this course you will learn how to use R to manage and manipulate large data sets, i.e. to sort, merge, subset, aggregate and reshape data, including outlier detection and gap-filling algorithms.
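
As a rough illustration of the operations named above (merge, subset, sort, aggregate), here is a minimal base R sketch. The data frames `plots` and `trees` and their columns are hypothetical, invented for this example; they are not course material.

```r
# Hypothetical toy data, invented for illustration only
plots <- data.frame(
  plot_id = c(1, 2, 3),
  site    = c("A", "A", "B")
)
trees <- data.frame(
  plot_id = c(1, 1, 2, 3, 3),
  height  = c(12.5, 15.0, 9.8, 20.1, 18.4)
)

# Merge: attach the site of each plot to every tree record
merged <- merge(trees, plots, by = "plot_id")

# Subset: keep only trees taller than 10 (units assumed to be metres)
tall <- subset(merged, height > 10)

# Sort: order the remaining trees by decreasing height
tall_sorted <- tall[order(-tall$height), ]

# Aggregate: mean height per site over all trees
mean_by_site <- aggregate(height ~ site, data = merged, FUN = mean)
```

Base R is used here only to keep the sketch free of dependencies; the packages covered in the course offer faster and often more readable alternatives for the same steps.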

For advanced data manipulation, we are going to use recent developments such as dplyr (“A Grammar of Data Manipulation”), the pipe operator (%>%) for simpler R code and data.table for the fast aggregation of large data sets. Furthermore, we will have a closer look at connections between R and databases, SQL queries and the creation of new databases from R.
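
To give a flavour of how dplyr verbs chain together with the pipe operator, here is a small sketch. It assumes the dplyr package is installed; the `trees` data frame and its columns are hypothetical and serve only as an example.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
trees <- data.frame(
  site   = c("A", "A", "A", "B", "B"),
  height = c(12.5, 15.0, 9.8, 20.1, 18.4)
)

# Each step passes its result to the next via %>%
summary_by_site <- trees %>%
  filter(height > 10) %>%                 # keep trees taller than 10
  group_by(site) %>%                      # one group per site
  summarise(
    mean_height = mean(height),           # mean height within each site
    n_trees     = n()                     # number of trees kept per site
  ) %>%
  arrange(desc(mean_height))              # tallest site first
```

Read from top to bottom, the pipeline mirrors the order in which the operations are applied, which is the main readability argument for the pipe.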

Depending on the course progress, there will be scope for individuals to work on small projects and/or their own data sets.

 

Course outline:

  1. Data structures
  2. Data management (merge, sort, reshape, ...)
  3. “The data.table way” (data.table)
  4. “The grammar of data manipulation” (dplyr)
  5. Tidying up messy data (tidyr, NAs & outliers)
  6. Databases (ODB)
  7. Reporting (knitr)

 

Participants will be asked to complete a homework assignment after the end of the course.

Requirements for attending and completing the workshop

Familiarity with R, or previous attendance of an introductory R course, is required.

For information: An Introduction to R, 7-10 June 2021, University of Lausanne

Bring your own laptop to the workshop with recent versions of R and RStudio installed. Make sure that your laptop connects properly to the University of Neuchâtel's wifi or the eduroam WLAN.

 

Course completion requirements:

  • Attendance – Presence and active participation are required during the entire course.
  • Homework – Participants are required to hand in homework consisting of several exercises before 29 October.

Please reserve a day after the course to complete the homework!

1.0 ECTS will be attributed only after the completion of the homework.

General information

Date: 23 & 30 September 2021 (2 days)

Schedule: 9.30-16.30 - more information on the CUSO E&E website

Venue: online with Zoom

ECTS: 1.0 (Research tools) - only after completion of the homework

Evaluation: Full attendance, active participation and completion of the homework

Information: Please contact the Doctoral Program Coordinator, Pauline Fritsch, or see the CUSO E&E website

Registration fee: free

Registration

  • This course is free and open to all PhD students. However, until 29 August priority is given to PhD students enrolled in the CUSO Doctoral Program E&E and the "Interuniversity doctoral program in organismal biology".
  • Post-docs are welcome as long as places are available.
  • Maximum number of participants: 16 (minimum 8 participants)

Registration through the web only: closed.

Cancellation deadline: 8 September 2021. The cancellation policy (CHF 50) applies if you cancel your registration after this date.