from £399


15, 16 & 18 November 2021

'The course had a good balance between "lecture" and "workshop" and very well organised on Zoom! Both speakers were very available, friendly and helpful. Thank you all (speakers and organisers) for this great course!' (Estelle Moubarak, Sussex University, 22 Sep 2020)

Dates and Timings:

15 November, 2 pm - 4 pm (optional Introduction to R session); 16 & 18 November, 9:30 am - 4:30 pm.


Please continue to express your interest in a face to face course here 




Students £399 / Professionals £499

Early Bird prices, available until 30 September (£50 more after that date).

Short Course Description:

This 2-day interactive online course will help you understand the benefits of data transformation tools (such as R). The course also includes an optional "Introduction to R" session for those not familiar with the software.

You will learn how these tools support auditable workflows, repeatability, time savings, improved efficiency and a reduced risk of data loss. In practical exercises with real environmental datasets, you will combine and manipulate data in different formats from different sources to produce analysis-ready data. The course also covers cleaning and validating datasets, and best practice for documenting scripts and workflows.

We will do many practical exercises. For these, you will work in virtual breakout rooms in pairs or small groups. The facilitators will move from room to room to help you with the exercises.

Learning outcome:

By the end of the course, you will have gained sufficient data transformation skills and knowledge to apply them to your own datasets and projects.

Course objectives:

The course will help you to improve your knowledge and skills in the following areas:

  1. The benefits of data transformation tools (DTTs)
  2. How DTTs help users meet quality standards
  3. How to read in multiple datasets from sources in different formats
  4. How DTTs can be used to clean and validate data
  5. How to tidy data and get it 'analysis-ready'
  6. Common data transformation operations
  7. How to combine and integrate datasets from different sources
  8. How to write data to different formats
  9. How to combine the use of different DTTs (e.g. R and Python)
  10. Data management issues and best practice when working with data
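As a rough illustration of objectives 3 to 8, the sketch below reads two small datasets in different formats, cleans and validates them, combines them and writes the result to a third format. It is not taken from the course materials: the course exercises use R, whereas this sketch uses Python's standard library, and the site names, field names, values and the 0 to 14 pH validity range are all hypothetical and chosen only for illustration.

```python
import csv
import io
import json

# Hypothetical inputs: two small in-memory "files" in different formats.
SITES_CSV = """site_id,site_name
A1, North Pond
B2,South Stream
"""

READINGS_JSON = """[
  {"site_id": "A1", "date": "2021-11-01", "ph": "6.8"},
  {"site_id": "B2", "date": "2021-11-01", "ph": "n/a"},
  {"site_id": "A1", "date": "2021-11-02", "ph": "7.1"},
  {"site_id": "C3", "date": "2021-11-02", "ph": "15.2"}
]"""


def read_sites(text):
    """Read the CSV lookup table into {site_id: site_name}, trimming whitespace."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["site_id"].strip(): row["site_name"].strip() for row in reader}


def clean_readings(text):
    """Parse the JSON readings and split them into valid and rejected rows.

    A row is valid when its pH parses as a number in the assumed range 0-14.
    Rejected rows are returned rather than silently dropped, so the step
    stays auditable.
    """
    valid, rejected = [], []
    for row in json.loads(text):
        try:
            ph = float(row["ph"])
        except ValueError:
            rejected.append(row)
            continue
        if 0.0 <= ph <= 14.0:
            valid.append({**row, "ph": ph})
        else:
            rejected.append(row)
    return valid, rejected


def combine(readings, sites):
    """Attach the site name to each reading (a left join on site_id)."""
    return [{**r, "site_name": sites.get(r["site_id"], "")} for r in readings]


def to_csv(rows, fieldnames):
    """Write the rows to CSV text with a fixed column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


sites = read_sites(SITES_CSV)
valid, rejected = clean_readings(READINGS_JSON)
tidy = combine(valid, sites)
print(to_csv(tidy, ["site_id", "site_name", "date", "ph"]))
```

Because every step is a scripted function, the whole transformation can be re-run unchanged whenever the source data are updated, which is the repeatability benefit described in the course description.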

Hardware/Software requirements:

You will need a laptop or desktop computer. A second external screen will be an advantage (but is not essential).

We will use Zoom to deliver the training course. There are 5 ways to join Zoom (and at least one of them will work for you!). We will provide more information about Zoom with the joining instructions and at the start of the course. You can find more information about Zoom on our FAQ page.

We will do lots of practical exercises, so you can continue working on your skills immediately after the training course. You will need to install the required software before the course starts.

We will explain in the joining instructions how to download and install the software.


18 places


Beginner – Intermediate (some basic knowledge of R will be an advantage, but not essential)

We expect you to have basic data management skills in MS Excel.

If you are an absolute beginner, we invite you to join a free R familiarisation session on Monday 1 Feb at 3 pm. This session will give you confidence that you have installed everything correctly.

We will also provide some guidance for semi-structured, self-paced learning (an introduction to RStudio and R). You can do this in your own time before the course starts.

Target Audience:

Anyone who wants to work with data in a reproducible manner and who currently works mainly in spreadsheets, or who wants to prepare data for analysis in R.

e.g. MSc / PhD students, Early Career Researchers, Ecologists, Environmental Scientists, Environmental Consultants

Previous Course Participants said:

'I liked how thoroughly learners' questions were dealt with. Training style was very good, created a good learning environment with a detailed manual, and was well organised throughout the two days.' (online learner, 22 Sep 2020)

Course leader:

David Leaver, Environmental Data Scientist and Informatics Liaison Officer, UKCEH

David has a background in Chemistry and Atmospheric Sciences and works with scientists, application developers and data managers to improve data management, dissemination and science capabilities in UKCEH. He has developed tools to organise, transform and analyse data from UK-wide pollutant monitoring, ensuring the quality and traceability of results submitted to stakeholders. David has developed and delivered successful courses in relational databases and data transformation in UKCEH over a number of years.