Turnitin wins important victory in fight to combat plagiarism (and the bloat of copyright)

To the relief of many a high school, college and university administrator, Turnitin's system for helping teachers identify possible cases of plagiarism got a pass from the judge earlier this month. AV v. iParadigms (District Court, Eastern District of Virginia).

If you are not familiar with Turnitin, it's an application that teachers can use to compare their students' papers with Turnitin's database of previously compared papers and papers available from other sources to detect instances of suspicious similarity. Turnitin enables teachers to investigate originality, and at the teacher's option, take action as warranted. Students have to agree to a set of terms and conditions when they submit their papers, among which is a term that relieves Turnitin from any liability for anything resulting from the use of the system (a pretty vanilla disclaimer of liability, actually).

I have been asked on many occasions to opine about the legality of the "archive" feature, that is, the feature that saves a copy of each submitted paper to become a part of the comparative database, so one fact in this case was of particular interest to me: the school district had authorized Turnitin to archive its students' papers, and the students had to agree to use the service or get a zero on the assignment requiring it. Thus, the students were not given a real choice about whether to agree to have their papers archived. I always thought that it was important (and so advised) to give the students a choice up front, when they signed up for the class, so that they understood that use of Turnitin was a condition of taking the class and that they would have to agree to the terms of the Turnitin user agreement. Students confronted with this choice really have a choice in our higher ed environment anyway, where use of the application is rarely across the board (i.e., only some faculty elect to use it). This case tested a tougher proposition, from my perspective: whether a student without a real choice about using the service can agree to the terms of the user agreement (having had to in order to get a grade) but then *modify* those terms by writing on the paper at the time of submission that the student did not authorize archiving. That's what the plaintiffs in this case did, and their attorney argued that Turnitin's archiving of the papers in violation of this attempt to change the user agreement terms infringed the students' copyrights.

No way, says Judge Hilton. (Ok, he didn't really say that. That's what I am saying.)

The court determined that the parties had entered into valid agreements (clickwraps are enforceable agreements), that the limitation on liability was enforceable and that the attempt to modify the terms of the contract failed because the user agreement indicated immediately (in its first line) that use of the service was conditioned upon the acceptance of the terms without modification. A number of other claims and defenses were all rejected by the court, and I'll leave it to the really curious to read the rest of the case, but I do want to note that the court also undertook a fair use analysis.

It should be noted that iParadigms pled fair use as an alternative defense in the event that its contract terms had failed to protect it from liability. Because the court found that the contract did in fact protect iParadigms from liability, it would seem that the fair use analysis was dicta. It was unnecessary for the court to undertake the analysis to dispose of the case. But it did the analysis anyway. Thus, while I would hesitate to cite this analysis, it does give us some insight into how this court views the four-factor test. The analysis leans heavily on recent cases like Perfect 10 v. Google that compare speculative harms to copyright owners with the enormous public benefit of transformative uses like indexing and come to the entirely unremarkable conclusion that such uses are pretty much exactly what fair use is supposed to be all about. Let's see, Virginia is in which circuit.... the 4th circuit. So we now have a very nice representation among the circuits (9th, 2nd, 4th) of recent fair use analyses that find that massive copying and using in their entirety, even creative works, for new commercial uses that provide significant public benefit, is a fair use 1) when there is no or only speculative market harm to the market for the original (all of the Google search cases so far) and 2) even in the case of a mature market for licensing the works (the Grateful Dead poster case, Bill Graham Archives v. Dorling Kindersley). Lookin' good for creative fair use.

I have heard some folks gripe about these types of cases, that uses involving the Web and indexing and such are not really transformative. Courts don't seem to buy that right now. Maybe it's just that transformative is the only label we have to clearly identify uses we just can't afford to subject to the control of individual copyright owners. There simply are many more uses in this digital era that benefit the public without seriously interfering with incentives to create, uses that need to be free from transaction costs, permission fees, holdouts, etc. I am quite convinced, in fact, that the numbers of uses that really ought to be outside the control of a creator of a work are much larger than even these cases suggest. But, it's expensive to broaden the range of free uses one fair use case at a time. I guess we can thank these 4 students and their attorney for taking one on the chin for the greater good. Or, put another way, the lower courts are doing what Congress seems incapable of doing -- ratcheting down instead of up.

Comments (10)

David Bozak:

The question I have (not having used turnitin.com) has to do with requesting the original paper. Faculty member A has an originality report that shows a text match with a paper from faculty member B's course. A requests a copy of the paper from turnitin.com (that is where the request goes, right?). They then forward that to faculty member B (correct?). What do they ask? Is faculty member B authorized to allow a copy of the original paper sent to A? Is B signing off that the original paper can be sent to A? Does the paper's author have any say in the matter?

If the author doesn't have any control, what controls B's answer? If there is no issue governing sending the original paper, then why is B even asked? Why wouldn't turnitin.com just send the paper?

I don't understand what this process controls for and wonder why it is done at all. Or is it cover for iParadigms in the case of privacy violations (possible violations)?

I'd appreciate hearing some background on this portion of the use of turnitin.com.

Thanks

Joe Clark:

The choice argument does provide an out, regardless of the legality of Turnitin's policies. It doesn't matter if it's legal if I am not bound to it, I suppose, no matter how reprehensible the practice may seem to some.

But can you clarify further the options I have as a student when Turnitin is required by the only instructor who teaches a course I need to graduate, or teaches the only offering available in the next 5 semesters? I can submit (in both senses) to Turnitin, or I can not graduate? Is that the choice I am offered? Further, if I am a minor student in a school district that uses it, would you also clarify some of my options there?

This recent ruling does not seem to speak to the legality of Turnitin at all, but instead only to the propriety of a procedure used by some attempting to resist the conditions under which it is used. As such it is only a "victory" from a corporate cost-containment perspective.

Further, the application of fair use vs. "the greater good" seems specious in this case when the beneficiary is, primarily, a private for-profit corporation. Plagiarism detection and prevention is indeed a common good, but like many forms of criminal activity its reduction need not come at the expense of individual rights in every case, and the battle against plagiarism does not inherently require that a private party maintain a database of prior work.

Disclaimer: I am not an attorney. :-)

David, I haven't used Turnitin either, but from what I had heard earlier from those who asked my advice about it, and from reading the description in the case, you are mistaken about the idea of a request for a look at the "match" document going to the teacher of the class where the match was produced. To my knowledge, no such request is given, nor would it need to be given, for three reasons.

Once the student has agreed to the terms of Turnitin's clickwrap agreement, Turnitin probably has all the authority it needs in its contract (that is, the contract probably includes among its terms a right to supply this copy, if requested). Secondly, as the court determined, even if the contract was inadequate to give Turnitin this right, the court found that the use was a fair use. Finally, I would note that even if someone *did* need to be asked, it wouldn't be the teacher -- it would be the student, because the student is the one who owns the copyright. Similarly to the copyright issues, all issues of privacy are obviated by the student's agreement to the clickwrap license.

Sandy Thatcher:

Georgia, you seem to be applauding the "dicta" here about fair use, but you have elsewhere been critical of courts' making fair-use decisions on grounds of alleged public benefit and then going through the four-factor analysis just to justify the decision already reached on utilitarian grounds. But there can be no limit to what any court perceives to be "public benefit": the dissenting judge in the Perfect 10 case pointed out how his colleagues just decided what U.S. public policy should be with regard to credit card companies and then created an argument to reach that conclusion. It is NOT the role of the courts to make decisions about public policy, as this dissenting judge complained. Construing such functional uses as inherently "transformative" seems to me a dangerous way of interpreting fair use. Remember what Judge Newman said in the Texaco case (in the 2nd circuit): photocopying may be of great social utility, but it has nothing to do with fair use as traditionally construed. I think the Kindersley case can be seen as falling within a traditional fair use analysis; not so the Ninth Circuit cases. So we'll all wait eagerly to see how the 2nd Circuit comes out on the Google case.

David Bozak:

Georgia,

Thanks for the response. But in the recent UMCP online course on integrity, the lawyer in this case argued differently. As I understand it, when the originality report comes to me (as a faculty member), matching text is linked to a website via URL or to a previously submitted paper. I gather I have the ability to click on that link, requesting the original paper. The lawyer did that and received a paper, the original paper in its entirety. (The argument here had to do with privacy info and FERPA, as the paper had the student's name, school, grade, and other personal info still on it).

Now, a major part of iParadigms' defense is that they take a paper and digitally "fingerprint" it. That is what is stored and used in the matching service. This is the source, I believe, of the comparison with Perfect 10, Inc. v. Google, Inc. - the thumbnail *isn't* the original and cannot replace the original. But if iParadigms also maintains the original paper, to be distributed upon request (and approval by whom?), then the comparison is not as appropriate, no?

And I can swear I've read (though I can't find it quickly right now) that iParadigms says they don't disclose to 3rd parties (perhaps that is qualified by "without permission"?), yet in the amended complaint, it is clear that the request of an original paper (source of a match in an Originality Report) took place over a weekend and within hours of a request, making it less likely or unlikely that the original instructor or the original author had given permission. If true, doesn't that muddy the waters a bit?

This is a very grey area, for me, and we're working on a policy statement for the use of plagiarism detection services (whether Turnitin, myDropbox, etc.) and at this point I'm very strongly in favor of disallowing the archiving of papers. Expert testimony in the case argued that very few matches in originality reports were from previously submitted papers - almost all were from online sources. If true then archiving gets you little increase in detecting matches and eliminates the entire legal muddle this lawsuit raised.

But then, I'm not a lawyer. I'm just a faculty member concerned about academic integrity...

Thanks for letting me respond.

-David

Joe Clark:

The manual is publicly available online at http://turnitin.com/static/training.html, meaning those of us who've never seen the software under discussion don't have to guess what it does.

It says:

"If an area of submission text is matched to a source in the student paper database on Turnitin, it will be listed as student papers. Direct Source Comparison is not available to students for student paper matches. Instructor users are able to send an e-mail request to the instructor who received the matching paper. If one instructor user profile controls the class containing both papers, that instructor user is able to see the paper in direct source comparison."

Note that the students are not consulted on either end.

Instructors can prevent TII from putting a submitted paper into the database, but if so, the paper cannot be compared with existing submissions - only with the other sources that TII compares to. That's also pointed out in the manual.

I hope that helps.

This post has generated more comments than a whole year's worth of posts...

I wish I were a Turnitin expert, but I am not. I did visit the site that Joe linked to above (minus the comma) and saw the text he cited, but can only note that it does not comport with the facts as reported in the case. Assuming that the court was not misled about the nature of what Turnitin does when it returns to a teacher a match for comparison, I would have to assume that the court is pronouncing as either contractually permitted or as fair use this practice as well as the practice of making the copy to begin with. The distinction between thumbnails and any other copy is not discussed in the case.

As for the concern that many share about the school putting the students over a barrel, that, says the court, is a matter to take up with the school, not with Turnitin. If one is considering a policy about whether to use Turnitin, if, as David says, most matches are from the Web and not from the student database, one has little to lose by opting out of participation in using and building the student database.

And to Sandy's point about courts making policy by deciding how things should be and then justifying what their guts tell them with the squishy fair use analysis: yes, I do think that the search engine cases, the Bill Graham Archives case and this one do that, but I think they really don't have a choice. It is the nature of fair use that this is what it requires. The idea that the four factors tell you something about how to proceed *divorced* from what you think the result should be is a fallacy, in my opinion.

I know that courts are not supposed to decide policy, but fair use seems invariably to invite it. So we cheer the policy we like (the fair use outcomes we like) and we boo the ones we don't like. Congress, well, there is much that can be said about Congress having abdicated its policy making role to ... don't get me started. Maybe the S.Ct. tosses the ball back in their court, but these lower courts are not doing that. And I am not currently unhappy that they are not. I think there is a principled distinction being made in these cases, around the issue of functional markets for permission, for sale, for license, etc., and lack thereof on the one hand, and creative uses (Bill Graham) or transformative duplicative (massive search engine copying) uses, on the other. I don't think courts are having a difficult time with that distinction. I don't think it runs a risk of being abused.

As a blogger this interests me because people constantly scrape my site. Plagiarism in the blogosphere is rampant!

This article was a bit over my head, I am afraid; however, I support the notion that you can do something about it. "Students confronted with this choice really have a choice in our higher ed environment anyway,"..yes they do!

This is an interesting article. For some reason, a lot of people feel copying content from another website somehow is not plagiarism. Content on my site has been copied word for word by another colleague in my area.

About

This page contains a single entry from the blog posted on March 23, 2008 11:12 AM.

Creative Commons License
This weblog is licensed under a Creative Commons License.