Location

Online

Cost

from £429

Date

autumn/winter 2023

The online course in September 2022 had 91% positive feedback.

Location:

Interactive online course using Zoom.

Future courses may run in person (location TBC).

Cost: 

Students from £429

Professionals from £479

The prices above include the early-bird discount; they increase by £70 thereafter.

Date:

Please express your interest here for the next in-person or online course, so we can fix more dates!

Short course description:

This 2-day interactive online course will help you understand the benefits of data transformation tools (such as R). The course also includes an "Introduction to R" session for those not familiar with the software.

You will learn about the benefits these tools bring, such as auditable workflows, repeatability, time savings, improved efficiency and a reduced risk of data loss. In practical exercises with real environmental datasets, you will combine and manipulate data in different formats from different sources to produce analysis-ready data. The course also covers cleaning and validating datasets, and best practice for documenting scripts and workflows.

We will do many practical exercises. For these, you will be in virtual breakout rooms in pairs or small groups. The facilitators will move from room to room to help you with your exercises. 

Learning outcome:

By the end of the course, you will have gained sufficient data transformation skills and knowledge to apply them to your own datasets and projects.

Course objectives:

The course will help you to improve your knowledge and skills in the following areas:

  1. The benefits of data transformation tools (DTTs)
  2. How DTTs help users meet quality standards
  3. How to read in multiple datasets in different formats from source
  4. How DTTs can be used to clean and validate data
  5. How to tidy data and get it 'analysis-ready'
  6. Common data transformation operations
  7. How to combine and integrate datasets from different sources
  8. An introduction to data management issues and best practice when working with data
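
To give a flavour of what a scripted workflow looks like, here is a minimal sketch of objectives 3, 4 and 7: reading two sources in different formats, cleaning and validating the values, and combining the results. It is not course material. The course itself uses R, and this sketch is written in Python purely for illustration. The sample data, the column names and the function names (read_sites, clean_readings, combine) are all invented for this example.

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
SITES_CSV = """site_id,site_name
S1, Site A
S2,Site B
S3,Site C
"""

READINGS_JSON = """[
  {"site": "S1", "value": "12.5"},
  {"site": "S2", "value": ""},
  {"site": "S3", "value": "-4.0"},
  {"site": "S1", "value": "14.1"}
]"""


def read_sites(text):
    """Read the CSV source into a dict of site_id -> tidied site name."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["site_id"].strip(): row["site_name"].strip() for row in reader}


def clean_readings(readings, min_value=0.0):
    """Keep readings with a numeric value >= min_value; record why others fail."""
    kept, rejected = [], []
    for rec in readings:
        try:
            value = float(rec["value"])
        except (TypeError, ValueError):
            rejected.append((rec, "not a number"))
            continue
        if value < min_value:
            rejected.append((rec, "below minimum"))
            continue
        kept.append({"site_id": rec["site"].strip(), "value": value})
    return kept, rejected


def combine(sites, readings):
    """Join the cleaned readings to the site names on site_id."""
    return [
        {"site_id": r["site_id"], "site_name": sites[r["site_id"]], "value": r["value"]}
        for r in readings
        if r["site_id"] in sites
    ]


sites = read_sites(SITES_CSV)                                 # source 1: CSV
kept, rejected = clean_readings(json.loads(READINGS_JSON))    # source 2: JSON
analysis_ready = combine(sites, kept)

for row in analysis_ready:
    print(row)
for rec, reason in rejected:
    print("rejected:", rec, "-", reason)
```

Because every step is written down in the script, the same transformation can be re-run and checked later, and the rejected records are listed with a reason instead of being silently dropped.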

Target audience:

Anyone who wants to work with data in a reproducible manner and who currently works mainly in spreadsheets, or who wants to prepare data for analysis in R.

e.g. MSc / PhD / Early career researchers / Ecologists / Environmental scientists / Environmental consultants

Level:

Beginner – Intermediate (some basic knowledge of R will be an advantage)

We expect you to have basic data management skills in MS Excel.

If you are a beginner, we invite you to join a free R familiarisation session on Monday 15 May, 2-4 pm. This session will give you confidence that you have installed everything correctly.

We will also provide some guidance for semi-structured, self-paced learning (an introduction to RStudio and R). You can do this in your own time before the course starts.

Places:

18 places

Hardware and software requirements:

You will need a laptop or desktop computer. A second external screen will be an advantage (but is not essential). Having a webcam is desirable (but not essential). If you plan to participate from an open-plan office or noisy environment, please wear headphones with a built-in microphone.

We will use Zoom to deliver the training course. There are 5 ways to join Zoom (and at least one of them will work for you!). We will provide more information about Zoom with the joining instructions and at the start of the course. You can find more information about Zoom on our FAQ page.

We will do lots of practical exercises, so you can continue working on your skills immediately after the training course. You will need to install the required software before the course starts.

We will explain in the joining instructions how to download and install the software.

Course leader:

David Leaver, Environmental data scientist and data steward, UKCEH

David has a background in Chemistry and Atmospheric Sciences and works with scientists, application developers and data managers to improve data management, dissemination and science capabilities in UKCEH. He has developed tools to organise, transform and analyse data from UK-wide pollutant monitoring, ensuring the quality and traceability of results submitted to stakeholders. David has developed and delivered successful courses in relational databases and data transformation in UKCEH over a number of years.

Co-trainer:

Edward Carnell, Spatial Data Analyst, UKCEH 

Ed is a Spatial Data Analyst, specialising in the modelling of atmospheric emissions and their effects on human health and sensitive habitats. His work includes producing high-resolution emission maps of air pollutants and greenhouse gases for the UK National Atmospheric Emission Inventory, as well as collaborating with international partners. He uses a code-based approach for data analysis and is a keen advocate of data transparency and quality assurance. He has taught training courses in QGIS, R and transforming environmental data.

Previous course participants said:

“Clear explanations. easy to follow structure. all the instructors were very helpful and approachable. Thank you!” (Course participant, September 2022)

“I liked the fact that there were lots of exercises to do which gave us the chance to apply some of the techniques we'd learnt throughout the course” (Course participant, September 2022)

“The whole course was very interesting and well composed. I liked the combination of lecture and engaging practical exercises.” (Course participant, September 2022)

'The course had a good balance between "lecture" and "workshop" and very well organised on Zoom! Both speakers were very available, friendly and helpful. Thank you all (speakers and organisers) for this great course!' (Estelle Moubarak, Sussex University, 22 Sep 2020)

'I liked how thoroughly learners' questions were dealt with. Training style was very good, created a good learning environment with a detailed manual, and was well organised throughout the two days.' (online learner, 22 Sep 2020)