Striving for a personal best – who judges?

In this post I’m going to look at some of the issues surrounding a form of assessment you’re probably familiar with, even if the terminology is new to you: ipsative assessment. So what is ipsative assessment? It’s a means of assessment that measures the student’s progress, not by comparing it to some external (and often arbitrary) standard, but by comparing the student’s achievement now to their achievement in the past. It asks the questions: Did I improve? By how much? It avoids the questions: What mark did I get? Did Fred do better than me?

This shift from external validation to internal comparison is an important one, and has a number of implications. Firstly, when done well it moves the focus towards personal growth and development. This, to me, is what education should be about. Arguably it’s the only measure that really matters long term, although the secondary effects of gaining qualifications and meaningful employment are also important. Education in this sense vastly exceeds the education-as-job-training model, developing citizenship as well as knowledge. My model of education always reminds me of a line from the film Dances with Wolves, when Kicking Bird (Graham Greene) is talking to John Dunbar (Kevin Costner): “I was just thinking that of all the trails in this life there is one that matters most. It is the trail of a true human being. I think you are on this trail and it is good to see.” This idea is not new and goes back to Ancient Greece, where the responsibilities of citizenship were deeply connected to everyday life.

Coaching word cloud

Ipsative assessment is widespread in sports, and is often practised in conjunction with a coach – it’s the process behind the concept of a ‘personal best’, where achieving a personal best is a cause for celebration regardless of the result compared to others. I have my own boat (a GP14) and sail at a local club. It’s not a competitive boat. That’s not an excuse – the boat was built in 1973 and still has the same sails (which were old then) as when I bought the boat around 2000, so I’m not going to be winning any club races, even if I were a better sailor 🙂 . But it doesn’t matter: what matters is did I have a good race? How well did I spot the wind shifts? How well did I round the marks? In short, how am I sailing in this race compared to my own skills? Which, of course, is the only thing I can actually control. In this way, ipsative assessment is ‘performance in practice’, and as such is a prime example of authentic assessment.

The school system here in the UK uses a measure called ‘value-added’, which looks at the change in scores between various stages of schooling, with examination or test scores being the primary accountability measure. The downside is that if this measure is used to judge schools rather than simply record their performance, then there is pressure to game the system, which means that value-added ends up not measuring what it’s supposed to. In addition, I recall reading a blog recently in which a teacher criticised value-added because it assumed that the class tested in year 6 contained the same children as the class tested in year 1. Their particular school had a high turnover of children because of local circumstances, so the assumption didn’t hold. How on earth can you measure progress without tying the data to an individual? Surely without that link value-added has no value at all?

What I like about ipsative assessment is that the attention is focused on what has been achieved rather than what has not. It also gives us another chance to engage learners in taking responsibility for their own learning, which is crucial for ipsative assessment to be effective, although achieving it can be problematic. When my daughter was taking her degree, each assignment had a self-assessment sheet that was a copy of the one used by the tutors. The students were supposed to judge their own performance and then compare it to the tutor’s sheet when the work was returned. My daughter, despite many conversations about why self-evaluation was useful and what the educational purpose was, would simply tick the middle box all the way down. In effect, she deferred to an external authority to judge her work.

Conceptually, there is also a link to assessment for learning (A4L). While A4L allows the teacher to gauge student knowledge it can also be seen as ipsative assessment for the teacher that then feeds into their reflective practice.

A key question is how we can formalise ipsative assessment without losing the ethos behind it. We need structures and procedures to mediate and support the process, but the last thing education needs (especially schools-based education in the UK) is another accountability stick with which to beat the teachers. Firstly, if the process is simply a tick-box exercise then it isn’t being given the importance it needs, and neither student nor teacher will take it seriously. Secondly, it’s vital that the process is student-owned. The student must take an active part in evaluating and processing their ipsative feedback if they are to get the gains it offers. As Hughes has pointed out, the student needs to be the main participant, not the teacher.

In a previous post I described Jeremy Levesley’s approach to assessment, and this could fit quite nicely with ipsative assessment. Suppose we use Prof. Levesley’s approach so that we get to a mark or grade quickly, and then use the freed time to put more effort into the ipsative process? We get a mark to meet our accreditation needs (and to satisfy those students who will still fixate on the mark above all else), and we get to develop learner independence and the self-assessment capabilities of our students. It seems like a win-win, but would a hybrid approach work, or are we just contaminating the ipsative process? I believe it could be done if the academic systems we choose to adopt within our course reinforce the practices we wish to see in our students.

The reason I think ipsative assessment isn’t more prominent in education at the moment is the relentless focus on education (and students) as a product, rather than education as a process, as Cory Doctorow recently observed in his wide-ranging video on openness, privacy and trust in education. And that’s the wrong focus. Why train students to perform in systems that are so unrepresentative of the world beyond the campus when we could teach them to judge themselves and extend beyond their personal best?

Assessment: it’s a variable not a constant

The second post in this short series is going to be about assessment in general. How reliable is it? What are we actually trying to measure? Are we more concerned with how we measure rather than what we measure?

“Assessment” – You keep using that word. I do not think it means what you think it means.

Reliability (assessing consistently) and validity (correctly assessing the thing we want to assess) are key concepts in assessment, but how well are we achieving these aims in practice? We pretend that assessment and marking are objective, repeatable processes, but, like much of the rest of education, they’re messier and less tidy than that.

Tom Sherrington points out that marks aren’t uniform in size. Some marks are easier and some harder to achieve, so that everyone has a chance to demonstrate their learning. We can also use ‘harder’ marks to distinguish between grades, as was the case with Hannah’s sweets in a previous post. The aggregate scores are also not directly comparable. If two students have a score of 36 out of 50 then this implies equal achievement and level of knowledge, but this may not be the case when we take into account where within the assessment those marks came from. If that’s the case when we’re talking about a science exam, then when it comes to more essay-based disciplines is it any surprise that “there is no correct score for an essay”? And if you think marking is variable at school, then marking in higher education may come as a shock.

There is a big difference between assessment of learning (summative) and assessment for learning (formative), and as I mentioned in the first post in this series we tend to ignore the formative in favour of the summative. Interestingly, assessments often try to combine the two, which raises the question: are we doing either well? Assessment is typically summative, with some feedback included in the hope that it ‘feeds forward’. The downside is that, from the perspective of the student, the point at which that feedback would be useful has already passed because they see this assessment as having ended – they focus on the mark since that tells them ‘how they did’. The feedback isn’t really assessment for learning, but rather an explanation of the assessment of learning, with a few added implications for future practice.

Jeremy Levesley recently gave a presentation describing his approach to assessment in a third year mathematics module. His aim is to spend as much time as possible in communication initiated by students, and to engineer situations in which that is promoted. Different assessment activities are targeted to the grade boundaries so that, for example, the question ‘is this first-class work?’ can be answered yes/no/maybe. Basic skills tests decide pass or fail, unseen exam questions decide 2i or 2ii, and student topic work decides whether the work is first-class or not. I really like this approach because it’s an example of how assessment can straddle both the formative and summative assessment camps. Because the conversations are student-initiated they can serve as assessment for learning, involving students in taking responsibility for managing their own learning. The summative nature of the assessment explicitly recognises the inherent variability in marking, and takes advantage of it by targeting it to a single answer (yes/no/maybe). This gives a corresponding reduction in workload for the tutor, freeing up time to be used more productively elsewhere.

Assessment can be misused. There can be a change of emphasis towards what can be measured relatively easily rather than what is valuable. For example, character education is an emerging area in UK school education and I can see its value (provided that by character we don’t mean conformity), but should we be ‘measuring’ it? This emphasis on measurement has two impacts. First, the valuable (but difficult to measure) gets replaced with some easier to measure proxy, so character might be measured by how many days students are involved in community volunteering. Secondly, the proxy becomes the measurement. Originally, SATs were supposed to sample student knowledge and skills at a particular point in time. What has happened is that SATs have come to dominate teaching activities as the scores have become the means by which schools are deemed to be failing or succeeding. What started as a useful benchmark has become an accountability stick with which to beat schools and teachers. Also, the pseudo-statistical judgements made about school performance are highly dubious if we’re looking for robust and valid measurements, as the Icing on the Cake blog frequently documents.

I love Carol Dweck’s work on mindset. Judgements of a student (whether from inside a school or outside) have a significant effect on student achievement because students can see themselves as failing in comparison to others, or failing because the school is failing, but this ignores the internal achievement that may be occurring. A runner may come, say, 439th in a marathon, but might have knocked ten minutes off their previous best time. They didn’t win the race, but their performance was the best of their life, so how can that be judged as a ‘failure’?
A policy being floated at the moment here in the UK is that of ‘secondary readiness’. Students take SATs at the end of key stage two (age 11). Previously, their scores would have been recorded and passed to their secondary (high) school, but the proposal now is to introduce the idea of ‘failing’ these tests. If the scores aren’t high enough then they are to retake them. This has the potential for children to be labelled (by themselves or others) as failures at a time of major social upheaval (the move to secondary school) and just before they hit puberty. Now, what could possibly go wrong with that? 🙂

I understand the need for accountability and having commonly understood standards of achievement, but the greatest educational impacts are not those projected outwards, but those reflected inwards. We like to know how we’re doing in relation to others, but more importantly how we’re doing in relation to our own ‘personal best’. That’s the subject for the next post.

Hannah got the sweets, who got indigestion?

Last Thursday in the UK around half a million 15- and 16-year-olds took a GCSE maths exam, specifically the second paper in the non-calculator exam. By Friday the exam was trending on Twitter (#EdexcelMaths), with one particular question attracting attention:

There are n sweets in a bag.
6 of the sweets are orange.
The rest of the sweets are yellow.

Hannah takes at random a sweet from the bag.
She eats the sweet.

Hannah then takes at random another sweet from the bag.
She eats the sweet.

The probability that Hannah eats two orange sweets is 1/3.

(a) Show that n² – n – 90 = 0

I had a quick attempt and after one unproductive sidetrack I got the answer. So why am I writing about this? Because it fits in with the other posts on assessment I’m doing, and to explore some of the issues around it.

First, the actual  mathematical content is pretty straightforward – you only need to know how to do three things: calculate a probability without replacement, multiply fractions and rearrange an equation. This is hardly Sheldon Cooper territory.

The exam board has two tiers for the qualification (foundation and higher) and probability without replacement is only explicitly mentioned for the higher tier. The exam board has been quoted as saying the question was aimed at those students who would achieve the highest grades (A and A*), and I think grade discrimination is a fair approach. I did ask my daughter (who’s currently revising and taking A-level maths) and she said she wasn’t sure she would have been able to answer it at 16. For non-UK readers, GCSE exams are taken at the end of compulsory schooling and A-levels are taken at 18, typically as a route to studying at university.

So why my unease at students finding it difficult? There’s always the charge of dumbing down levelled at exams, but I don’t think that’s it. True, when I did my maths exam at that age the syllabus included calculus of polynomials and their applications, which is now only introduced at A-level, but they were different qualifications – GCSEs were only introduced after my school career had ended. I think my unease comes from the fact that this shouldn’t have been seen as a difficult question. Donald Clark has blogged seven reasons why he agrees with the children and thinks it wasn’t a fair question, some of which I agree with and most of which I don’t.

There are a couple of factors involved here. First, I recall reading a study that compared questions with the same mathematical content written in different ways. It found that questions posed as word problems rather than equations were consistently harder to answer, even though the underlying mathematics was identical. Secondly, I think it’s the way that maths is taught as rules and recipes to follow rather than as a creative problem-solving activity. This is not a criticism of the teachers, because I think it’s taught that way precisely because of the pressures that have (politically) been placed on education. As I’ve mentioned before, I’m a big fan of Jo Boaler’s approach and its emphasis on flexibility and application of technique rather than stamp-collecting formulae. Donald Clark makes the distinction between functional maths (maths for a practical purpose such as employment) and the type of maths typically found in exams, but I think that’s a false dichotomy in this case. As Stephen Downes said, “… what this question tells me is the difference between learning some mathematics and thinking mathematically.” The difference between functional and theoretical maths (at this level) starts to disappear when we think mathematically – maths becomes a toolbox of skills to be applied to the problem at hand, rather than a particular formula in a particular topic to be remembered.

And if you’re wondering what the answer was:

The solution to Hannah's sweets
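
For anyone reading without the image, here’s a sketch of the working – the same three steps mentioned above:

P(two orange sweets) = 6/n × 5/(n – 1) = 1/3
so 30 / (n(n – 1)) = 1/3
so n(n – 1) = 90
so n² – n – 90 = 0
Factorising gives (n – 10)(n + 9) = 0, and since n must be positive, n = 10.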

Playing with data in R

It’s coming up to that time of year again when the students get introduced to R and the joys of the command line, which means that they can’t simply click and type a new value the way that they can when working with a spreadsheet.

R uses a variety of data types and one of them (the data frame) has the grid structure familiar from spreadsheets. I’m going to cover some of the ways we can manipulate data frames within R, as well as one of the ‘gotchas’ that can catch you out.

First of all, we need some data to play with.

sample_id <- c(1:20)
length <- c(rnorm(10, mean=22.1, sd=3), rnorm(10, mean=18.2, sd=3))
weight <- c(rnorm(10, mean=900, sd=125), rnorm(10, mean=700, sd=125))
site <- c(rep(1:2, each=5),rep(1:2, each=5))
sex <- c(rep("M", each=10), rep("F", each=10))
my_sample <- data.frame(sample_id, length, weight, site, sex)

The rnorm function draws random values from a normal distribution, so the length line gives us ten samples around a mean of 22.1 with a standard deviation of 3 and ten samples around a mean of 18.2, also with a standard deviation of 3. The rep function is a quick way of entering repeating values. The result of all these commands is a data frame called my_sample, which we can examine by typing in the name:

my_sample
sample_id length weight site sex
1 1 23.09771 899.9570 1 M
2 2 19.51399 819.5591 1 M
3 3 21.79052 893.0299 1 M
4 4 24.84175 822.7836 1 M
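
Because rnorm draws random values, the numbers in your data frame will differ from the ones shown above each time you run the code. If you want a reproducible example, you can set the random seed before creating the data (a small aside, not part of the original walkthrough):

set.seed(42) # any integer will do; makes the subsequent rnorm() draws repeatable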

There are a number of ways we can look at a column. We can use the $ notation (my_sample$weight), which is my preferred method for referring to a single column. I tend not to use attach(my_sample) because if I have two data frames with the same variable name in both, then things start to get confusing if both have been attached at the same time.
The second way is my_sample[, 3], which gives me the third column. Note that in R, indexing starts at one rather than at zero as is the case in many programming languages. This second notation has the form dataframe[row, col] and by leaving the row part blank R defaults to giving us all rows.
The third method (and my least favourite) is the double bracket method: my_sample[[3]]. All three methods give exactly the same result.
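
As a quick check (using the my_sample data frame created above), all three forms return the same vector:

my_sample$weight                            # dollar notation
my_sample[, 3]                              # all rows of the third column
my_sample[[3]]                              # double bracket notation
identical(my_sample$weight, my_sample[, 3]) # TRUE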

If we want to look at rows we use the dataframe[row, col] notation but this time with the column part left blank, so for example

my_sample[4,]

gives us the fourth row and

my_sample[1:10,]

gives us the first ten rows.

This technique of selecting only particular rows or columns is called slicing and gives a lot of control over manipulating our data. What if we want rows or (more likely) columns that aren’t next to each other in the data frame? The [row, col] format expects a single object for the rows and a single object for the columns, separated by a comma. We can use the c() function to create a variable that contains more than one value:

my_sample[,c(2,3,5)] #gives us length, weight and sex columns.

and these slices can themselves be saved as data frames.

subset_of_sample <- my_sample[,c(2,3,5)]
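
A related option, not used above but worth knowing: columns can also be selected by name rather than by position, which keeps working even if the column order changes later:

subset_by_name <- my_sample[, c("length", "weight", "sex")] # same idea, columns picked by name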

Suppose I want all the male samples as one subset and all the females as another. Because the males and females are grouped within the data set I could use row slicing to do it:

female_sample <- my_sample[11:20,] #rows 11-20

but what if that wasn’t the case? We can use the value in the column to decide whether to select it.

males <- my_sample$sex == "M"
males

This gives a variable that is the same length as the column and consists only of TRUE or FALSE values depending on whether the value was “M” or not, and we can use this variable when we slice the data frame. The logical values mean that we select a row if the corresponding value in ‘males’ is ‘TRUE’.

male_sample <- my_sample[males,]
head(male_sample) # head displays just the top few rows.
head(male_sample, 3) # or we can be specific about how many
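
Base R also has the subset() function, which does the comparison and the row slicing in a single step; which you prefer is largely a matter of taste:

male_sample2 <- subset(my_sample, sex == "M") # keep only the rows where sex is "M"
identical(male_sample, male_sample2)          # TRUE if the two approaches agree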

We can look at a particular row and column using:

my_sample$weight[4]
my_sample[[3]][4] # third column (weight), fourth row

which is good if we suddenly discover from our lab notebook that this particular value is incorrect, because we can write a single corrected value directly into that cell without affecting the rest of the data frame.

my_sample$weight[4] <- 900.2154

The next stage is that I want to add the weight:length ratio to the data. I can calculate the values using the existing columns and then put the results into a new column that I create at the same time:

my_sample$ratio <- my_sample$weight / my_sample$length
head(my_sample)

I can also do the same thing if I need to change an existing column. Suppose I have multiple samples and I need to renumber the sample IDs to avoid confusion.

my_sample$sample_id <- my_sample$sample_id+100
head(my_sample)

Now I want to add some new data to the bottom of the data frame. First I’m going to add a row using a method that works, and then I’ll cover one of the surprises that R can spring on you.

mean(my_sample$length) # just testing
new_row <- list(121, 18.72391, 710.1846, 1, "F", 710.1846/18.72391)
my_sample <- rbind(my_sample, new_row)
mean(my_sample$length) # still works
tail(my_sample) # bottom few rows

The two lines doing the work here are the second and third. First I create the data for a new row using the list() function, and then use the rbind() function (which stands for row bind) to add it to the bottom of the data frame. The list() function is the critical part as we’ll see later. Now for the gotcha. I try to add a new row:

new_row2 <- c(122, 17.99321, 698.7815, 1, "F", 698.7815/17.99321)
my_sample <- rbind(my_sample,new_row2)

This time I’m using the c() function. It seems to have worked, but if I try to find the mean of my length column I get an error:

mean(my_sample$length) # gives error

So we might think we can remove the ‘bad’ row and have a rethink:

my_sample <- my_sample[1:21,]
mean(my_sample$length)

but that still doesn’t work. We then take a look at the row we added and find that R has converted the values to text.

new_row2
class(new_row2) # class tells what data type the variable is

That’s because the c() function needs data that’s all of the same type. R got around the problem by converting the other values into text without telling us, although we should at least be grateful it did the calculation for the ratio before the conversion. That’s called ‘type coercion’ by R or ‘casting’ by programmers, but why can’t we calculate the mean even though we’ve deleted the bad row?
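
As a side note, you can watch the same coercion happen in isolation, away from the data frame:

c(122, "F")        # both elements become text: "122" "F"
class(c(122, "F")) # "character"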

str(my_sample)

See where it says ‘chr’? R wants all the values in a single column to be the same data type so it’s converted each of the columns to text as well. So that little ‘c’ in the second example of adding a row has had the effect of basically converting our entire data frame to text values rather than numbers. Depending on the size and importance of our data set that’s either an ‘oopsie’ or a moment of sheer blind panic. The good news is that we can switch it back.

my_sample$sample_id <- as.integer(my_sample$sample_id)
my_sample$length <- as.numeric(my_sample$length)
my_sample$weight <- as.numeric(my_sample$weight)
my_sample$site <- as.factor(my_sample$site)
my_sample$ratio <- as.numeric(my_sample$ratio)

These ‘as.type‘ functions convert from one data type to another (where possible), and can get us out of trouble if R does some type coercion in the background without us noticing. R has its quirks, but it’s a powerful package and well worth spending some time on if you need to move beyond basic statistics.
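
As a final aside, one way to avoid the problem altogether (a sketch, using made-up values and assuming the column names match the data frame above) is to build the new row as a one-row data frame rather than a vector, so each column keeps its own type when it’s bound on:

new_row3 <- data.frame(sample_id = 123L, length = 19.5, weight = 705.3,
                       site = "2", sex = "F", ratio = 705.3 / 19.5,
                       stringsAsFactors = FALSE) # keep the text columns as plain text
my_sample <- rbind(my_sample, new_row3) # no silent conversion to text this time
str(my_sample)                          # column types are unchanged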

BYOD4L Day Four: Collaborating

The topic for day four of BYOD4L is collaboration. We were given two scenarios: one of a student finding it difficult to contribute to a group project because of other commitments, and one of a lecturer who wants to connect to other academics and give her students a wider exposure to the subject.

Collaboration

Photo: Alan Cann (CC BY-NC-SA 2.0)

The student can’t make the meetings that are being arranged because they are during the week and she works full time. I see this problem as having two parts – her need to take part in the meetings (discussion) and her need to share and access various project documents (resources). The obvious solution for discussion is to use asynchronous techniques, for example, a closed Google+ or Facebook group. However, synchronous techniques are also possible. She could see if an evening meeting could be arranged instead, using a tool like Doodle.com to check availability. If she could argue that her course is of benefit to her employer, she could ask for a degree of flexible working and either attend the weekday meetings face-to-face or join via a Google hangout. The downside is that this requires a degree of self-confidence that she may not feel, particularly as she feels unable to ask her fellow students for a weekend meeting. Documents can be worked on collaboratively through Google docs, or shared through services such as Dropbox or Google drive.

Google hangouts are one area I would like to explore in my own practice: despite being invited to many, I find their timing always seems to clash with something else. I’ve also taken part in many webinars, although it’s arguable how ‘collaborative’ they are. A lot depends on the facilitator, the number of participants and their inputs.

Collaboration, I think, is an emergent property of the other activities (connecting and communicating), in a similar way to offline networking, where you have to build a relationship and a level of trust before you can work together.

Two fishy MOOCs

A few weeks ago, I completed two MOOCs that ran at the same time and covered similar subject areas (at least at first glance), so I thought I’d ‘compare and contrast’ the two. One was the University of Southampton’s Exploring Our Oceans course on Futurelearn; the other was Duke University’s Marine Megafauna course, which ran on Coursera. I do have a background in the subject – I did a degree in Marine Biology and Zoology at Bangor University – so my aim was to look at the courses from a professional (educational technology) viewpoint while refreshing my knowledge of a subject I love.

Photo credit: Strobilomyces

Although both courses involved the oceans, they focused on different disciplines. Southampton’s course was more of an oceanography course, while the marine megafauna course, as the name suggests, used the enigmatic big beasties to draw in and hold the students’ attention. Both courses could be described as xMOOCs although, as Grainne Conole has pointed out recently, there are much more nuanced ways of describing and classifying MOOCs. Any comparison has to take the platform into account because it isn’t a neutral actor, as we can see in the way video is used on Coursera and assessment is done on Futurelearn.

Who are the students?

The marine megafauna course largely replicates a standard model of undergraduate education placed online, and doesn’t seem to assume any existing knowledge, although with a background in the subject I might be missing something. The Southampton course also doesn’t assume existing knowledge, but the approach is different: the target demographic is what I’ll call the ‘curious amateur’. In other words, someone who comes to the subject with curiosity and passion, but who may have little experience of the subject or of recent study. As well as not assuming existing knowledge, Exploring Our Oceans also had material explicitly marked as advanced and optional, so that participants could explore a particular area in more depth.

Video. And more video.

Both courses make frequent use of video. Marine Megafauna, like many of the courses on Coursera, uses video as its primary way of delivering content. There were five to eight videos per week, mostly video lectures with other video clips, simulations and audio embedded within them. Futurelearn delivers learning materials in a very linear manner so, for example, in week three there will be items 3.1, 3.2, etc. Some of these were videos (complete with pdf transcript), but some were text-based where that was more appropriate. And that’s as it should be – video, useful as it is, is not the one medium to ‘rule them all’. In fact, one way that I’ll catch up on a MOOC is to read the video transcript and skip to particular points if I need any graphics to help with my understanding. Video needs to be appropriate and offer something that the participant can’t get more easily or faster through different media, and for the majority of the time Exploring Our Oceans did that. Production values were high. We saw staff filmed on the quayside, on ships and in labs, explaining the issues and the science from authentic environments. Related to this, here’s an example of poor practice with video. I’m enrolled on another Futurelearn MOOC with a single academic as the lead educator. At the start of every video the academic introduces themselves and their academic affiliation as though we’ve never met them before. It’s week five. There are multiple videos each week – it’s not like we’re going to forget who they are between step 5.2 and step 5.5.

What didn’t I like?

I felt Marine Megafauna was a little heavy on taxonomy initially, as we had introductions to each group of animals. Taxonomy is important. For example, the worms that live around hydrothermal vents (which made appearances on both courses) have moved phylum since I did my degree, and major groupings within the gastropods were also revised in 2005 and later. I would have preferred an introduction to group X (including taxonomy) followed by exploring that group’s ecology, conservation issues and adaptations to life in the ocean in more detail. You could compare to other groups at that point, or have a summary/compare-and-contrast section later in the course, which would serve as a good synthesis of the course so far. As it was, it felt like we were marking time until we got to the interesting parts, and course retention might have suffered at that point. For the Southampton course, the parts I disliked were outside the control of the staff. Futurelearn uses a commenting system at the bottom of the page, similar to that of blogs, rather than the forums found on other platforms. In one way that’s good, in that it keeps the comments within context, but bad in that it prevents participants from starting their own discussions, and searching comments is a non-starter. The other thing I didn’t like about the Southampton course was the assessment, which I’ll come back to later.

What did I like?

In Exploring Our Oceans I liked the range of other activities that we were asked to do. We shared images, planned an expedition, and did a practical. Yes, a real-life, who-made-that-mess-in-the-kitchen practical on water masses and stratification using salt and food dye. In Marine Megafauna, I enjoyed the three peer assessments and the fact that scientific papers were an explicit part of each week’s activities. We would have between one and three PLoS ONE papers each week, and the material within them was assessed through the weekly quizzes. There were supporting materials for those unused to making sense of journal articles. Exploring Our Oceans did use some journal articles when discussing how new species were described and named, but not as an integral part of the course.

Assessment

This was the area in which I found the biggest difference between the two courses, partly I think due to the different target participants (‘undergraduate-ish’ versus ‘curious amateur’), but largely due to the restrictions of the platform. Marine Megafauna had weekly quizzes with between 20 and 25 multiple choice questions, including questions that (unusually for MOOCs) went beyond factual recall. There were three attempts allowed per quiz with the best result counting. Each quiz contributed 10% to the final course mark. There were also three peer assessments – a Google Earth assignment, a species profile, and a report on a conservation issue for a particular species. The Google Earth assignment was largely quantitative and functioned as the peer marker training for the following two.

Exploring our Oceans had quizzes of five to six multiple choice questions, with three attempts per question and a sliding scale of marks (three marks for a correct answer on the first attempt down to one mark for a correct answer on the last attempt). But this is a platform issue. At a recent conference, someone who had authored Futurelearn quizzes gave their opinion on the process, the polite version of which was “nightmare”. I have seen peer assessment used successfully on other Futurelearn courses so it is possible, but it wasn’t used within this course.

Personally, I preferred the longer assessment for a number of reasons. First, it tests me and gives me a realistic idea of how I’m doing, rather than getting a good mark for remembering something from lecture one and guessing the other four questions. Second, more questions mean fewer marks per question, so one area of difficulty or confusion doesn’t drag my score down. Third, and regardless of how it contributes to the final course mark, I see it as formative, something to help me. I want to be tested. I want to know that I ‘got it’; I also want to know that my result (formative or not) actually means something, and that means rigorous assessments. This may not be the same for everyone, and a more rigorous assessment may discourage those participants who only see assessment as summative and lead them to believe that they are ‘failing’ rather than being shown what they need to work on.

Some final thoughts

If I didn’t already know the subject, what would I prefer? I think I’d prefer the approach of Exploring our Oceans but with the assessment of Marine Megafauna, with a clear explanation of why that form of assessment is being used. I really enjoyed both courses so if you’re interested in marine science, then I’d say keep an eye out for their next run.

P.S. Santa? Put one of these on my Christmas list please. Ta.

Plagiarism in MOOCs – whose loss?

I’m enrolled on a few MOOCs at the moment (no surprise there), some for work and some for personal interest. The two for personal interest are the Marine Megafauna course from Duke University on Coursera, and the University of Southampton’s Exploring Our Oceans course on Futurelearn, which has just finished. I’ll compare the two approaches and platforms in another post, but what I want to talk about here is the issue of plagiarism that was flagged up in an email for the marine megafauna course recently.

The Marine Megafauna course uses peer review on written assignments, the workflow being: student X submits in the first week, marks five other assignments and self-assesses their own submission the following week, and then gets their marks the week after that. The assignment we had was to write a profile of a species for a general audience. There were a number of sections to the profile and the marking criteria were explicit, so it was relatively easy to get high marks provided you followed the criteria and didn’t pick an obscure species with little published research. I picked the leatherback turtle, partly because its range extends into UK waters, and partly because the largest leatherback ever recorded washed ashore at Harlech in North Wales in 1988.

While I hadn’t been concerned with whether the assignments I evaluated were plagiarised or not, a forum thread on plagiarism became quite animated and led to the course email. The position stated in the email was that “plagiarism is legally and morally a form of fraud”, but that “we wish to keep student evaluations focused on the substance of the assignment”. The email also states “students are not required to evaluate the plagiarism status of the assignments they receive” but then goes on to give advice about when it would be appropriate to award a mark of zero if plagiarism is found. Initially, this made me feel uneasy, and I’ve yet to finalise my thoughts on the issue, so what follows is a little ‘thinking out loud’.

First of all, I’m talking specifically about plagiarism in MOOCs, not within higher education in general, where I have more conventional views. I have a number of questions:

  • If plagiarism is fraud, then who is being defrauded here and of what?
  • Is it appropriate to punish for plagiarism in a learning environment where there is no qualification or credential on offer (leaving aside the issue of signature track)?
  • Is it appropriate to punish for plagiarism with little or no training or guidance on what constitutes plagiarism?

The approach on Marine Megafauna mimics the processes of traditional higher education, but I would question if that’s appropriate. In traditional HE, there is a clear power structure and demarcation of roles. Students cede authority to academics and receive rewards (grades and qualifications) in return for their academic labour. A useful (although imperfect) analogy would be that of employer and employee. The employee conforms to the demands of the employer in expectation of the reward (salary) that they will receive later. In a MOOC that all goes out of the window, because the analogy is closer to that of someone doing voluntary work, and it becomes a lot more difficult (and ethically dubious) for the ‘employer’ to criticise the ‘worker’ for something such as turning up late. Likewise in MOOCs, the student is a free agent studying for reasons other than gaining a formal qualification. In the academic-student scenario there is an implied contract, and breaking the terms of that contract by presenting the work of another as your own carries penalties and punishments. But where is the contract in the MOOC? The only thing I’m receiving is the knowledge and skills I gain from the course, and if I cheat, I only end up cheating myself (assuming I’m not signed up for something like specialisms or signature track). True, there is the honour code and a declaration that the work is the student’s own, but still: if plagiarism is fraud, then who is being defrauded here and of what? And what of the case where the plagiarism consists of content from Wikipedia, where the content is explicitly licensed for re-use?

There is also the issue that the students had not been given any guidance on what constitutes plagiarism, either as a submitting student or as a marker, probably I suspect because the course team weren’t expecting students to consider it. Student attitudes varied, with some unconcerned (“We’re not supposed to hunt for plagiarism”) while others were using online services to check for plagiarism. In fact, one of the reviewers of my submission gave the final feedback of “I’ve checked your text in … and had 90% originality.” But an originality score is meaningless without context, and there were some cases where students had very little idea of what was plagiarism and what was not. One student questioned whether their work would show as plagiarised because they’d typed it up in a Word file beforehand. Another explicitly asked if finding a match to a source that gave the size and dimensions of the animal counted as plagiarism. In other words, was quoting the basic biological facts of the animal plagiarism or not? With this level of awareness amongst students, how can it be reasonable to use students to police plagiarism, however informally? And why should students have knowledge about the issue – they’re doing the course for fun or interest, with perhaps little recent experience of educational settings.

The third assignment is still to be marked. Personally, I won’t be checking for plagiarism – as one of the students on the forum said: “That’s not my call”. If a student wants to cheat themselves, that’s their loss. If the student is on signature track (which I won’t know) then they’ve paid a fee to the institution, and checking for plagiarism is the institution’s job. E-learning is not an offline course put online, and that applies to the culture as well as the learning materials themselves.