Playing with data in R

It’s coming up to that time of year again when the students get introduced to R and the joys of the command line, which means that they can’t simply click and type a new value the way that they can when working with a spreadsheet.

R uses a variety of data types and one (data frames) has the grid structure familiar from spreadsheets. I’m going to cover some of the ways we can manipulate data frames within R, as well as one of the ‘gotchas’ that can catch you out.

First of all, we need some data to play with.

sample_id <- c(1:20)
length <- c(rnorm(10, mean=22.1, sd=3), rnorm(10, mean=18.2, sd=3))
weight <- c(rnorm(10, mean=900, sd=125), rnorm(10, mean=700, sd=125))
site <- c(rep(1:2, each=5),rep(1:2, each=5))
sex <- c(rep("M", each=10), rep("F", each=10))
my_sample <- data.frame(sample_id, length, weight, site, sex)

The rnorm function selects data from a normal distribution, so the length line gives us ten samples around a mean of 22.1 with a standard deviation of 3 and ten samples around a mean of 18.2 also with a standard deviation of 3. The rep function is a quick way of entering repeating values. The result of all these commands is a data frame called my_sample, which we can examine by typing in the name:

sample_id length weight site sex
1 1 23.09771 899.9570 1 M
2 2 19.51399 819.5591 1 M
3 3 21.79052 893.0299 1 M
4 4 24.84175 822.7836 1 M
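Because rnorm draws random numbers, the values you get will not match the ones printed above. Here is a minimal sketch, assuming you want the same 'random' data on every run. The set.seed call is not part of the original commands, and the seed value of 42 is an arbitrary choice. The last two lines show the difference between the each and times arguments of rep.

```r
# Fix the random number generator so the same values come out every run.
# The seed value (42) is arbitrary and only here for illustration.
set.seed(42)
first_draw <- rnorm(3, mean = 22.1, sd = 3)

set.seed(42)
second_draw <- rnorm(3, mean = 22.1, sd = 3)

identical(first_draw, second_draw)  # TRUE: same seed, same values

# 'each' repeats every element in turn...
each_version <- rep(1:2, each = 3)    # 1 1 1 2 2 2
# ...while 'times' repeats the whole vector.
times_version <- rep(1:2, times = 3)  # 1 2 1 2 1 2
```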

There are a number of ways we can look at a column. We can use the $ notation (my_sample$weight), which is my preferred method for referring to a single column. I tend not to use attach(my_sample) because if I have two data frames and the same variable name in both then things start to get confusing if both have been attached at the same time.
The second way is my_sample[, 3], which gives me the third column. Note that in R, indexing starts at one rather than at zero as is the case in many programming languages. This second notation has the form dataframe[row, col] and by leaving the row part blank R defaults to giving us all rows.
The third method (and my least favourite) is the double bracket method: my_sample[[3]]. Any of these three methods give exactly the same result.
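To check that the three notations really do return the same thing, here is a small sketch using a stand-in data frame called df. The name and the values are made up for illustration and are not the my_sample data.

```r
# A small stand-in data frame (values are made up for illustration).
df <- data.frame(sample_id = 1:3,
                 length = c(23.1, 19.5, 21.8),
                 weight = c(899.9, 819.6, 893.0))

by_name     <- df$weight  # $ notation
by_position <- df[, 3]    # [row, col] with the row part left blank
by_double   <- df[[3]]    # double bracket

identical(by_name, by_position)  # TRUE
identical(by_name, by_double)    # TRUE

# One difference worth knowing: single brackets with no comma
# return a one-column data frame rather than a plain vector.
class(df[3])    # "data.frame"
class(df[[3]])  # "numeric"
```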

If we want to look at rows we use the dataframe[row, col] notation but this time with the column part left blank, so for example

my_sample[4,]

gives us the fourth row and

my_sample[1:10,]

gives us the first ten rows.

This technique of selecting only particular rows or columns is called slicing and gives a lot of control over manipulating our data. What about if we want rows or (more likely) columns that aren’t next to each other in the data frame? The [row, col] format expects a single object on each side of the comma. We can use the c() function to create a variable that contains more than one value:

my_sample[,c(2,3,5)] #gives us length, weight and sex columns.

and these slices can themselves be saved as data frames.

subset_of_sample <- my_sample[,c(2,3,5)]
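Column numbers can shift if columns are added or reordered, so it may be worth knowing that the same slice can be written with column names. This is a sketch on a made-up data frame (df and its values are assumptions for illustration), not a replacement for the numbered version above.

```r
# Stand-in data with the same column layout as the example data frame.
df <- data.frame(sample_id = 1:3,
                 length = c(23.1, 19.5, 21.8),
                 weight = c(899.9, 819.6, 893.0),
                 site = c(1, 1, 2),
                 sex = c("M", "M", "F"))

by_number <- df[, c(2, 3, 5)]                    # columns by position
by_name   <- df[, c("length", "weight", "sex")]  # same columns by name

names(by_number)  # "length" "weight" "sex"
names(by_name)    # "length" "weight" "sex"
```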

Suppose I want all the male samples as one subset and all the females as another. Because the males and females are grouped within the data set I could use row slicing to do it:

female_sample <- my_sample[11:20,] #rows 11-20

but what if that wasn’t the case? We can use the value in the column to decide whether to select it.

males <- my_sample$sex == "M"

This gives a variable that is the same length as the column and consists only of FALSE or TRUE values depending on whether the value was “M” or not, and we can use this variable when we slice the data frame. The logical value means that we select the row if the value in ‘males’ is TRUE.

male_sample <- my_sample[males,]
head(male_sample) # head displays just the top few rows.
head(male_sample, 3) # or we can be specific about how many
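The same logical slicing extends to more than one condition. The sketch below uses a made-up data frame (df, its values and the 800 cut-off are all assumptions for illustration) to show the TRUE/FALSE vector itself, a combined condition, and the base R subset() function as an alternative spelling.

```r
df <- data.frame(sample_id = 1:6,
                 weight = c(900, 820, 650, 710, 880, 690),
                 sex = c("M", "F", "M", "F", "M", "F"))

males <- df$sex == "M"
males       # TRUE FALSE TRUE FALSE TRUE FALSE
sum(males)  # 3, because TRUE counts as 1 and FALSE as 0

# Combine conditions with & (and) or | (or).
heavy_males <- df[df$sex == "M" & df$weight > 800, ]
heavy_males$sample_id  # 1 5

# subset() does the same job without repeating the data frame name.
heavy_males2 <- subset(df, sex == "M" & weight > 800)
heavy_males2$sample_id  # 1 5
```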

We can look at a particular row and column using:

my_sample[[3]][4] # third column (weight), fourth row

which is good if we suddenly discover from our lab notebook that this particular value is incorrect, because we can read a single value directly into that place without affecting the rest of the data frame.

my_sample$weight[4] <- 900.2154
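There is more than one way to reach a single cell, and any of them can sit on the left of the assignment arrow. This is a small sketch with made-up values (df is a stand-in, not the my_sample data).

```r
df <- data.frame(sample_id = 1:4,
                 length = c(23.1, 19.5, 21.8, 24.8),
                 weight = c(899.9, 819.6, 893.0, 822.8))

# Three equivalent ways to reach the weight in the fourth row.
df[[3]][4]    # 822.8
df[4, 3]      # 822.8
df$weight[4]  # 822.8

# Overwrite just that one value, here using the column name.
df[4, "weight"] <- 900.2
df$weight     # 899.9 819.6 893.0 900.2
```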

The next stage is that I want to add the weight:length ratio to the data. I can calculate the values using the existing columns and then put the results into a new column that I create at the same time:

my_sample$ratio <- my_sample$weight / my_sample$length

I can also do the same thing if I need to change an existing column. Suppose I have multiple samples and I need to renumber the sample IDs to avoid confusion.

my_sample$sample_id <- my_sample$sample_id+100

Now I want to add some new data to the bottom of the data frame. First I’m going to add a row using a method that works, and then I’ll cover one of the surprises that R can spring on you.

mean(my_sample$length) # just testing
new_row <- list(121, 18.72391, 710.1846, 1, "F", 710.1846/18.72391)
my_sample <- rbind(my_sample, new_row)
mean(my_sample$length) # still works
tail(my_sample) # bottom few rows

The two lines doing the work here are the second and third. First I create the data for a new row using the list() function, and then use the rbind() function (which stands for row bind) to add it to the bottom of the data frame. The list() function is the critical part as we’ll see later. Now for the gotcha. I try to add a new row:

new_row2 <- c(122, 17.99321, 698.7815, 1, "F", 698.7815/17.99321)
my_sample <- rbind(my_sample,new_row2)

This time I’m using the c() function. It seems to have worked, but if I try to find the mean of my length column R gives me a warning and returns NA instead of a number:

mean(my_sample$length) # returns NA with a warning

So we might think we can remove the ‘bad’ row and have a rethink:

my_sample <- my_sample[1:21,]

but that still doesn’t work. We then take a look at the row we added and find that R has converted the values to text.

class(new_row2) # class tells what data type the variable is

That’s because the c() function needs data that’s all of the same type. R got around the problem by converting the other values into text without telling us, although we should at least be grateful it did the calculation for the ratio before the conversion. That’s called ‘type coercion’ by R or ‘casting’ by programmers, but why can’t we calculate the mean even though we’ve deleted the bad row?

str(my_sample) # str shows the structure of the data frame, including the type of each column

See where it says ‘chr’? R wants all the values in a single column to be the same data type so it’s converted each of the columns to text as well. So that little ‘c’ in the second example of adding a row has had the effect of basically converting our entire data frame to text values rather than numbers. Depending on the size and importance of our data set that’s either an ‘oopsie’ or a moment of sheer blind panic. The good news is that we can switch it back.
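The coercion can be reproduced on a small scale. The sketch below uses made-up values (mixed, as_list and df are names chosen for illustration) to compare c() with list(), and then shows a numeric column turning into text after a character vector is bound on as a row.

```r
# c() builds a vector, and a vector can only hold one type.
mixed <- c(122, 17.99, "F")
mixed         # "122" "17.99" "F"
class(mixed)  # "character"

# list() lets each element keep its own type.
as_list <- list(122, 17.99, "F")
sapply(as_list, class)  # "numeric" "numeric" "character"

# Binding the character vector onto a data frame drags the numeric
# columns along with it.
df <- data.frame(id = c(101, 102), length = c(23.1, 19.5), sex = c("M", "F"))
class(df$length)  # "numeric"
df <- rbind(df, c(103, 18.7, "F"))
class(df$length)  # "character"
```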

my_sample$sample_id <- as.integer(my_sample$sample_id)
my_sample$length <- as.numeric(my_sample$length)
my_sample$weight <- as.numeric(my_sample$weight)
my_sample$site <- as.factor(my_sample$site)
my_sample$ratio <- as.numeric(my_sample$ratio)

These ‘as.type‘ functions convert from one data type to another (where possible), and can get us out of trouble if R does some type coercion in the background without us noticing. R has its quirks, but it’s a powerful package and well worth spending some time on if you need to move beyond basic statistics.
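One way to sidestep the problem, sketched here under the assumption that you are happy to type out the column names, is to build the new row as a one-row data frame. The names df and new_row and the values are made up for illustration. The last line shows what ‘where possible’ means for as.numeric: text that isn’t a number becomes NA.

```r
df <- data.frame(id = c(101, 102), length = c(23.1, 19.5), sex = c("M", "F"))

# A one-row data frame keeps a separate type for each column.
new_row <- data.frame(id = 103, length = 18.7, sex = "F")
df <- rbind(df, new_row)

class(df$length)  # "numeric"
mean(df$length)   # still works after adding the row

# as.numeric converts text that looks like a number and gives NA otherwise
# (normally with a warning, silenced here to keep the output tidy).
converted <- suppressWarnings(as.numeric(c("18.7", "n/a")))
converted         # 18.7 NA
```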


BYOD4L Day Four: Collaborating

The topic for day four of BYOD4L is collaboration. We were given two scenarios: one of a student finding it difficult to contribute to a group project because of other commitments, and one of a lecturer who wants to connect to other academics and give her students a wider exposure to the subject.


Photo: Alan Cann (CC BY-NC-SA 2.0)

The student can’t make the meetings that are being arranged because they are in the week and she works full time. I see this problem as having two parts – her need to take part in the meetings (discussion) and her need to share and access various project documents (resources). The obvious solution for discussion is to use asynchronous techniques, for example, a closed Google+ or Facebook group. However, synchronous techniques are also possible. She could see if an evening meeting could be arranged instead, using a scheduling tool to check availability. If she could argue that her course is of benefit to her employer she could ask for a degree of flexible working and attend the weekday meetings either face-to-face or via a Google hangout. The downside is that this requires a degree of self-confidence that she may not feel, particularly as she feels unable to ask her fellow students for a weekend meeting. Documents can be worked on collaboratively through Google docs, or shared through services such as Dropbox or Google drive.

Google hangouts are one area I would like to explore in my own practice. Despite being invited to many, their timing always seems to clash with something else. I’ve also taken part in many webinars although it’s arguable how ‘collaborative’ they are. A lot depends on the facilitator and the number of participants and their inputs.

Collaboration, I think, is an emergent property of the other activities (connecting and communicating) in a similar way to offline networking where you have to build a relationship and level of trust before you can work together.

BYOD4L Day Three – Curating

The topic for day three of #BYOD4L is curating, and I thought I’d reflect on it through considering my sources, my stores, my outputs and finish with other potential channels.

Sources screenshot

Photo: info_grrl (CC BY-SA 2.0)

I use RSS feeds a lot, and currently have almost 450 feeds organised into thirteen categories, with around half of those feeds being directly related to work. The categories include work, web development and coding, personal, environment, journals and trial. I use Feedly at the moment, although over the past couple of years I’ve tried quite a few readers. I used Google Reader, then moved to the Old Reader when Google shut their service down. Then I resurrected my old Bloglines account, before finally moving to Feedly. So one of the important things for me about a reader is that it needs really good import and export facilities.

Twitter is another main source. I’m following nearly 1300 people on my @DuncanGreenhill account (in nine lists), plus a number who are on lists but not followed (as I explained in my day two reflection). My other account (@ourCommonOceans) has only been set up for about eight weeks and is following 135 people, many of whom were on the ‘Marine Issues’ list on my named account.

In Google+ I’ve circled around 1400 people in eighteen circles, and have a couple of other circles where I’m the only member. That seems strange, to circle yourself, but I have a ‘read later’ circle so that I can temporarily bookmark a post but keep it within the Google+ service.

In Facebook I’ve joined a number of groups concerned with marine conservation, including groups set up to help with identification. So what I have there is access to a group of very knowledgeable people that can identify organisms down to species level, which is an incredible learning opportunity for me (I live quite a distance from the sea so my ID skills are quite rusty).

My final source is email. I get some information from mailing lists, but I also get a lot of journal contents alerts through email.


Stores

I use Diigo for bookmarking, having migrated from Delicious a couple of years ago. Although I did use the network features in Delicious, I tend not to use them in Diigo, but I couldn’t really give you a reason why. Perhaps because while bookmarks can be mobile, networks are much less so.

I use a wiki running on my PC to ‘store’ my notes and thoughts. There will be project documentation, course notes, notes I’ve made on reading particular papers, and many others. It’s an unusual approach, but one that works for me. It makes the process active – I have to physically collect and order my thoughts before I can enter them into the wiki. There is a time penalty to this, but I’ve found the advantages outweigh the costs. In effect, it’s an external sum of knowledge, actively added to, searchable, and easily amended. I know some use blogs for the same purpose, but I prefer to have a more private space where I don’t have to worry about how it appears to others while I work through something.

And of course I use the usual network/local folders for document storage, and for citation management I use JabRef, an open source Java application.


Outputs

My outputs are Twitter, Google+ and my two blogs. My named Twitter account has over 600 followers and my marine account has 24. On Google+, I’m in the circles of 667 people. The blogs are relatively low traffic, although that’s improving. For this blog, I think that’s down to the frequency of posts, and for Our Common Oceans I think it’s down to only really becoming active with that blog over the last couple of months.

Other channels and final thoughts

I may consider more overtly curated outputs, but they would need to be carefully used. I get quite a few notifications on Twitter and often they’re not focussed enough for me – there’s simply not enough signal to outweigh the noise. If I were to use one, I would have to take great care choosing and filtering my sources, and I’d probably steer away from a daily publication towards a weekly one. The aim would be to give decent content once a week, rather than a daily dose of (mostly) irrelevance. I’ve found that the scoop.its that I get via RSS feeds tend to be of a higher quality.

One thing that I’ve noticed from my posts over the past three days is that I categorise the bejebus out of everything – Twitter lists, Google circles, Feedly categories – mainly I think to create smaller contexts that I can switch between as and when I need to. It reminds me a lot of David Allen’s Getting Things Done, where the list of next actions (the to-do list) is organised not by project but by context or location. And a final reflection for today – if information had calories, I think my health would be in serious trouble.

BYOD4L Day Two: Communicating – Channels and Contexts

The topic for day two of BYOD4L is communicating, and we were asked to visualise our own patterns and channels of communication. I used Freemind to create a mind map. The first question was how do I organise this? By channel or by context? I chose to organise by context, which gave me personal, professional and oCo. oCo stands for ‘our common oceans’ and is a marine conservation blog I’ve recently started. It sits between the two because it’s more than a personal interest (I studied the subject at university), but neither is it my employment. I think where to place it depends on a sense of identity i.e. is it a professional context because you have expertise in the area, or is it only professional if you’re being paid for doing it?

My communication – channels and contexts


Next, I came to the conclusion that it’s a mess, with a lot of overlap between different contexts. I have two twitter accounts. The original is @DuncanGreenhill, which has been for both personal and professional use, and @ourCommonOceans for the marine issues. I don’t see a conflict between the two because I only tweet what I’d be happy saying to (or discussing with) someone face-to-face. I am interested in politics so I may retweet things that others may disagree with. That’s fine. I know some would argue against a dual-purpose account because a future employer might google me, look at my profile and decide I’m not for them based on what they see. That’s entirely within their rights. I’d argue that by using the account for both purposes responsibly I’m demonstrating attitudes and characteristics that they might find valuable within their organisation.

I think it’s just a historical accident that there is so much overlap. As the number of people I follow has increased I’ve turned to lists to keep things manageable. Everyone I follow is on a list, but not everyone on a list is someone I follow. So for example all the accounts I’m interested in for work are on the education list, but except for a few, I don’t follow the individual accounts. That means at work, I can focus on the work lists in Tweetdeck without the timeline on my phone becoming too swamped out of work hours.

I use circles on Google+ in much the same way as lists on twitter – to compartmentalise interests so that I can deal with them and switch between them as I need to. If I see something on one of the other, more personally-focused lists, I can favourite it for later. So, for example, when lunch time comes around I can read those I’ve bookmarked (by favouriting) or switch to the ‘politics and news’ list to catch up with what’s going on. Facebook remains resolutely in the personal zone, although some pages and groups feed into the marine area.

So, what do I need to change? That’s a tricky one. Although I’m happy with the twitter account under my own name being used for two purposes, it might be worth thinking about how I more clearly differentiate between the personal and professional contexts, and hopefully without going to the extreme of having a third account to keep on top of.

Turning to the scenario of the busy student with work, family and study commitments, I’d say look to connect outside the campus using whatever channels the other students are using, and compartmentalise where possible. So if his colleagues are on Twitter, use Twitter. If they have a Facebook group, join that. Meet them where they are, and look for other more interactive opportunities to meet online, for example, Google hangouts to make up for the face-to-face contact he feels he’s missing out on.

BYOD4L Day One – Connecting

This week the Bring Your Own Device for Learning (#BYOD4L) course is running. I missed it last time so it’s great to have a chance to participate this time around. It’s a little unusual in that it only lasts for five days, with a new topic each day, and there is no registration for the course – people turn up as and when to take part in the activities.

The topic for the first day is connecting, and we’re given two scenarios on video. The first is a final year student wanting to get an up-to-date perspective on wellbeing for her final year project, and the second is a lecturer who doesn’t see the point of using mobile technology in their teaching and learning. I’ve decided to explore the first scenario, because it links to my recent experience of starting again from scratch when I started a marine conservation blog and Twitter account.

Joined hands:

Photo: Lotus Caroll CC BY-NC-SA 2.0

The student is not happy that the traditional academic sources such as books and journals are showing her what the reality of the field is. It’s not just that she wants to find out what the cutting edge research is, but also who is doing the research, and how that research is being applied in real initiatives to real people. She has a real opportunity here, not only to make her project thesis fresh and relevant, but also to develop some very useful transferable skills.

First, find a few key people or organisations. Look at who is publishing at the moment. What research groups do they work in? Where do they publish? If there are key papers that are a couple of years old, then look at who cited those papers. She wants an international perspective so she can go beyond academia by looking at national health provision in a few key countries – what are they doing around wellbeing? What about the World Health Organisation – what are they doing?
Once this shortlist has been identified then she can start to build her network. Are any of them on Twitter? If so, she can follow them and start to connect and collaborate. If they’re prominent in their field, then it can be very helpful to look at who they follow, because you’re seeing who or what they consider influential. Google+ and LinkedIn are also useful avenues for networking, but LinkedIn tends to be more restrictive in who you can connect to.

Networking is a key skill in most careers, and will be very valuable once she graduates, but can be difficult or scary in the early stages. She could talk to her university’s career service and see if they could offer any advice or training, but the important thing is to practise (especially for those of us who aren’t natural extroverts). An interesting point I came across recently around networking is that it is often people on the periphery of our ‘network’ that prove the most useful when trying to gather information outside our normal circle, precisely because they have access to people that are outside our circle.
So, in summary, I’d advise her to use the resources she does have access to in order to identify who she wants to connect with, and then actively network through social media or blogs.

Assessment with portfolios – a neglected opportunity?

I finally got around to reading Tony Bates’ blog post on the outlook for online learning in the short and medium term. It’s an in-depth post, but I want to concentrate on one particular aspect. Tony puts forward the idea that the traditional lecture-based course will gradually disappear as learning shifts from the transmission of information to knowledge management. Later, he talks about an increase in the use of portfolios for assessment, and to my mind these two are a natural fit. This is because as teaching shifts from the presentation of a package of content to students meeting a common set of criteria, traditional written assessment becomes increasingly less fit for purpose.

Clipboard and pen

Photo credit: Dave Crosby (CC BY-SA)

The idea of assessment by portfolio in higher education isn’t new. A few years ago, I saw Lewis Elton give the plenary at the teaching and learning conference at the University of Birmingham, where he suggested using portfolios as a replacement for the UK system of degree classifications. The plenary covered much of the ground in his 2004 paper ‘Should classification of the UK honours degree have a future?’ and provoked much discussion. I believe there is a case to be made for portfolio assessment to have a greater role in higher education. As Elton says, the “measurement of achievement has become more important than achievement itself”. In a recent article discussing competition in education, Natalie Bennett (the leader of the Green Party here in the UK) quoted the Education Reform Act 1988, which said that education should: “promote the spiritual, moral, cultural, mental and physical development of pupils … and prepare pupils … for the opportunities, responsibilities and experiences of adult life.” The quote relates specifically to the context of secondary education (11-18 years old), but it’s still a pretty powerful quote. And if that’s the case for school children then surely it should hold for university and higher-level study when their capacity to engage in activities that promote those values is arguably higher? We may want assessment that is both valid (measures what it is designed to measure) and reliable (gives consistent results) for very good reasons (such as quality control and comparisons across institutions), but that restricts the types of assessment we can do. What we get (across the education system as a whole) is assessment that comes down to a number or grade, and that steers the testing towards that which is easily measured, for example, a 1,500 word essay or a three hour exam.

So what does assessment by portfolio give us that conventional assessment doesn’t? First, it’s an active process and in a well designed assessment the learner has to engage over a period of time, evaluating and reflecting on their learning. Students can revisit their work and re-purpose it, for example, for use within a presentation to a potential employer. This does bring in issues of ownership and access after the end of the course, but I’ll leave those aside for now, and in any case those criticisms aren’t unique to portfolios.

Secondly, higher order and transferable skills can be assessed more effectively than they could through a conventional written assessment. For example, selecting items for inclusion will involve evaluating individual items, reflecting on their purpose and value, and synthesising the collection into a coherent whole. This does mean that the learners need to have a high level of independent learning skills, which may not be the case. A supportive pedagogical design with clear scaffolding and direction can help develop these skills. Another point is that this form of assessment is authentic – it assesses in a direct analogue of how we use these higher order skills within a workplace. Support and direction need to be explicit, not only on the process of the assessment, but also on its purpose. My daughter used an eportfolio as a record of achievement during her degree, but didn’t see the point and questioned why they couldn’t simply submit their assignments and leave it at that. Another acquaintance is studying for a primary PGCE on a course that uses the Mahara eportfolio and said that it’s almost universally hated, mainly it seems because they find the user interface unintuitive.

The portfolio can be an integrating influence that draws the rest of the course together for students. They have been used very successfully on a Master’s level distance learning course by the Open University (Mason et al, 2004). The course consisted of four modules with two items being selected from each module for inclusion. Two thirds of students were positive about the role of the portfolio as an integrating element in the course.

So what are the downsides? Well, most criticisms centre around the issues of reliability and validity, but that brings us back to Elton’s statement that achievement is second place to the measurement of achievement. Elton also said that the “prime purpose of assessment should be to encourage good learning” (Elton, 2004), and that there should be a bias towards success and not failure. He’s not referring to grade inflation, but that we should move away from the deficiency model of traditional assessment. This brings to mind the perennial debate around standards, ‘dumbing down’, and whether assessments should act as a gate-keeper (norm-referenced) or as a marker of achievement (criteria-referenced). Should everyone on a course be able to get a first class degree? Absolutely, if they’ve met the standards for a first-class degree, but I can imagine the outcry if a department were to award first class honours to an entire cohort, supposing they were lucky enough to get such a cohort in the first place.

Elton recommends that if something can be assessed reliably and validly then grade it conventionally. If something can be assessed validly but not reliably then it should be graded pass/fail using experienced examiners. If it can’t be assessed either reliably or validly then it should be reported on (again by experienced examiners) rather than graded. Knight (2002) had a similar argument, stating that summative assessment should be restricted to “that which can be reliably, affordably and fairly assessed”. Skills development and other aspects of the curriculum should be formatively assessed. Portfolios, then, blur the lines between formative and summative assessment.

Portfolios have great potential for assessment, provided that they are used wisely and that significant effort is made to change the students’ focus away from the product. It’s like getting an essay for homework at school – the essay isn’t the homework, the homework is the process of research, drafting, revision and synthesis, and the printed essay is just the evidence you did it. The sticking point is that portfolios exist within an educational ecosystem that functions to support and validate conventional assessment, and that is likely to change only slowly.


Bates, T. (2014). 2020 Vision: Outlook for online learning in 2014 and way beyond.

Bennett, N. (2014). Let’s Get Heretical on Education: Competition Has Failed.

Elton, L. (2004). Should classification of the UK honours degree have a future? Assessment and Evaluation in Higher Education, 29(4), 415–422.

Knight, P. (2002). Summative assessment in higher education: practices in disarray. Studies in Higher Education, 27, 275–286.

Mason, R., Pegler, C. and Weller, M. (2004). E-portfolios: an assessment tool for online courses. British Journal of Educational Technology, 35(6), 717–727.

Two fishy MOOCs

A few weeks ago, I completed two MOOCs that ran at the same time and covered similar subject areas (at least at first glance), so I thought I’d ‘compare and contrast’ the two. One was the University of Southampton’s Exploring Our Oceans course on Futurelearn, and the other was Duke University’s Marine Megafauna course, which ran on Coursera. I do have a background in the subject – I did a degree in Marine Biology and Zoology at Bangor University – so my aim was to look at the courses from a professional (educational technology) viewpoint while refreshing my knowledge of a subject I love.

Photo credit: Strobilomyces


Although both courses involved the oceans they did focus on different disciplines. Southampton’s course was more of an oceanography course while the marine megafauna course, as the name suggests, used the enigmatic big beasties to draw in and hold the students’ attention. Both courses could be described as xMOOCs although, as Grainne Conole has pointed out recently, there are much more nuanced ways of describing and classifying MOOCs. Any comparisons have to take the platform into account because it isn’t a neutral actor, as we can see in the way video is used on Coursera and assessment is done on Futurelearn.

Who are the students?

The marine megafauna course largely replicates a standard model of undergraduate education placed online, and doesn’t seem to assume any existing knowledge, although with a background in the subject I might be missing something. The Southampton course also doesn’t assume existing knowledge but here the approach is different, with the target demographic being what I’ll call the ‘curious amateur’. In other words, someone who comes to the subject with curiosity and passion, but who may have little experience of the subject or of studying recently. As well as not assuming existing knowledge, Exploring Our Oceans also had material explicitly marked as advanced and optional so that participants could explore a particular area in more depth.

Video. And more video.

Both courses make frequent use of video. Marine Megafauna, like many of the courses on Coursera, uses video as its primary way of delivering content. There were five to eight videos per week, mostly as video lectures with other video clips, simulations, and audio embedded within them. Futurelearn delivers learning materials in a very linear manner so, for example, in week three there will be items 3.1, 3.2, etc. Some of these were videos (complete with pdf transcript), but some were text-based where that was more appropriate. And that’s as it should be – video, useful as it is, is not the one medium to ‘rule them all’. In fact, one way that I’ll catch up on a MOOC is to read the video transcript and skip to particular points if I need to see any graphics to help with my understanding. Video needs to be appropriate and offer something that the participant can’t get more easily or faster through different media, and for the majority of the time Exploring Our Oceans did that. Production values were high. We saw staff filmed on the quayside, on ships and in labs explaining the issues and the science from authentic environments. Related to this, here’s an example of poor practice with video. I’m enrolled on another Futurelearn MOOC with a single academic as the lead educator. At the start of every video the academic introduces themselves and their academic affiliation as though we’ve never met them before. It’s week five. There are multiple videos each week – it’s not like we’re going to forget who they are between step 5.2 and step 5.5.

What didn’t I like?

I felt Marine Megafauna was a little heavy on taxonomy initially as we had introductions to each group of animals. Taxonomy is important. For example, the worms that live around hydrothermal vents (and who made appearances on both courses) have moved phylum since I did my degree, and major groupings within the gastropods were also revised in 2005 and later. I would have preferred an introduction to group X (including taxonomy) followed by exploring that group’s ecology, conservation issues and adaptations to life in the ocean in more detail. You could compare to other groups at that point or have a summary/compare and contrast section later in the course, which would serve as a good synthesis of the course so far. As it was, it felt like we were marking time until we got to the interesting parts, and course retention might have suffered at that point. For the Southampton course, the parts I disliked were outside the control of the staff. Futurelearn uses a commenting system at the bottom of the page, similar to that of blogs, rather than the forums found on other platforms. In one way that’s good, in that it keeps the comments within context, but it’s bad in that it prevents participants from starting their own discussions, and searching comments is a non-starter. The other thing I didn’t like about the Southampton course was the assessment, which I’ll come back to later.

What did I like?

In Exploring Our Oceans I liked the range of other activities that we were asked to do. We shared images, planned an expedition, and did a practical. Yes, a real-life, who-made-that-mess-in-the-kitchen practical on water masses and stratification using salt and food dye. In Marine Megafauna, I enjoyed the three peer assessments and the fact that scientific papers were an explicit part of each week’s activities. We would have between one and three PLoS ONE papers each week, and the material within them was assessed through the weekly quizzes. There were supporting materials for those unused to making sense of journal articles. Exploring Our Oceans did use some journal articles when discussing how new species were described and named, but not as an integral part of the course.


Assessment was the area in which I found the biggest difference between the two courses, partly, I think, due to the different target participants (‘undergraduate-ish’ versus ‘curious amateur’), but largely due to the restrictions of the platform. Marine Megafauna had weekly quizzes with between 20 and 25 multiple choice questions, including questions that (unusually for MOOCs) went beyond factual recall. There were three attempts allowed per quiz with the best result counting. Each quiz contributed 10% to the final course mark. There were also three peer assessments – a Google Earth assignment, a species profile, and a report on a conservation issue for a particular species. The Google Earth assignment was largely quantitative and functioned as the peer marker training for the following two.

Exploring Our Oceans had quizzes of five to six multiple choice questions, with three attempts per question and a sliding scale of marks (three marks for a correct answer on the first attempt down to one mark for a correct answer on the last attempt). But this is a platform issue. At a recent conference, someone who had authored Futurelearn quizzes gave their opinion on the process, the polite version of which was “nightmare”. I have seen peer assessment used successfully on other Futurelearn courses so it is possible, but it wasn’t used within this course.

Personally, I preferred the longer assessment for a number of reasons. First, it tests me and gives me a realistic idea of how I’m doing, rather than getting a good mark for remembering something from lecture one and guessing the other four questions. Secondly, more questions means fewer marks per question, so one area of difficulty or confusion doesn’t drag my score down. Thirdly, and regardless of how it contributes to the final course mark, I see it as formative, something to help me. I want to be tested. I want to know that I ‘got it’; I also want to know that my result (formative or not) actually means something and that means rigorous assessments. This may not be the same for everyone and a more rigorous assessment may discourage those participants who only see assessment as summative and lead them to believe that they are ‘failing’ rather than being shown what they need to work on.

Some final thoughts

If I didn’t already know the subject, what would I prefer? I think I’d prefer the approach of Exploring Our Oceans but with the assessment of Marine Megafauna, with a clear explanation of why that form of assessment is being used. I really enjoyed both courses, so if you’re interested in marine science then I’d say keep an eye out for their next run.

P.S. Santa? Put one of these on my Christmas list please. Ta.