Here we'll explore the nexus of legal rulings, Capitol Hill
policy-making, technical standards development, and technological
innovation that creates -- and will recreate -- the networked world as we
know it. Among the topics we'll touch on: intellectual property
conflicts, technical architecture and innovation, the evolution of
copyright, private vs. public interests in Net policy-making, lobbying
and the law, and more.
Disclaimer: the opinions expressed in this weblog are those of the authors and not of their respective institutions.
USC/Berkeley Report: over 30% of DMCA take-down notices are improper
Posted by Jason Schultz
Jennifer Urban of USC's Intellectual Property Legal Clinic and Laura Quilter of UC Berkeley's Boalt Hall have released a summary report examining over 900 DMCA take-down notices collected from the Chilling Effects project. The report finds that nearly one-third of all notices are improper and potentially illegal. The full report will be out in March 2006.
1. Bruce on November 23, 2005 12:46 PM writes...
The 30% figure in your headline assumes that the defendant would win every single contested claim, which seems unlikely. Also, the report notes the obvious unrepresentativeness of the data set, but seems not to attribute much importance to it; as a result, it's unclear that even the figure of 30% for the number of contestable claims can be relied upon.
2. Jennifer Urban on November 23, 2005 5:07 PM writes...
Thanks, Bruce,
Not speaking to Copyfight's headline, but to your larger point:
Actually, we tried to be very careful about both of those things.
The 30% does NOT assume the alleged infringer can win. Note, however, that it also does not count every claim that an alleged infringer might win. We coded only for clear defenses or other clear problems--in other words, for situations where the ex ante takedown is an obvious problem, because the defendant would have a clearly good argument. Because of the fact-specific nature of copyright cases (particularly fair use), there is no way to know absolutely who would win in any case. That said, no one was more surprised by the 30% number than we were. Despite the fact that we were conservative in our count (for example, counting obvious favored uses such as parody as "fair use defense," but excluding situations where a fair use defense depended only on the amount taken--say a couple of paragraphs), we got this number. There will be more on the methodology in the longer paper.
On the sample, again, we tried to be very careful. If more OSPs would give us their notices, then we could compare Google notices more effectively with the universe of notices, but broadly, we only pointed out that this is _some_ evidence that 512 notices are used in questionable cases--we did not try to make any claims for all 512 notices. We did our best with some confidential interview information, and noted that Google may very well differ from other providers in a number of ways. The non-Google, self-reported notices create a separate issue because people may very well self-report when they are right (or, at least think they are in the right). For this reason, we used these mainly to compare to Google; we noted in particular that p2p complaints turn up much more outside of Google. (Makes sense; Google is not a transmission/routing provider.) We hope for more data, and the fact that Chilling Effects now has notices from a small ISP means that some more comparison can be done in future.
All of that said--the Google set, in and of itself, is very good: Google forwards everything to Chilling Effects, good, bad, or in-between. (It is possible that Google simply refuses to accept some highly flawed notices--I don't know--but that would indicate _more_ flawed notices than we counted, rather than fewer.) Finally, we compared Google and self-reported notices and found that each group, separately, was also 30% flawed. (Guess people don't always know when they're right, either.)
In any case, fears about the ex ante nature of the takedown are not disproven by our study, and at least to a limited extent, are borne out. That seems clear.
Hope this helps; the full paper will have more discussion.
3. Jonathan on November 23, 2005 6:05 PM writes...
Jennifer,
I might have found a flaw with your sampling then. If Google submits everything they get, then I can't explain to you why only one of my notices is in the Chilling Effects database.
I know for a fact that I've submitted at least half a dozen notices to them, all for their Blogger service, yet only one (an old one at that) appears in the database.
I'll double check my facts in a second, but I know that I've submitted more than that. Obviously, for whatever reason, Google isn't submitting everything that they get.
4. Jonathan on November 23, 2005 6:32 PM writes...
Jennifer,
I was apparently a bit quick with my criticism of Google. As you can see in my post, I'm excited about the study.
I checked my records and, though I've handled other problems on Blogspot, none involved DMCA notices. I only use those in worst-case scenarios so that's actually a good thing.
I'm very sorry for jumping the gun. I will, if you wish, track my next Google DMCA notice and see when it gets added to the database. I have over 60 incidents of plagiarism I need to handle and, sadly, I feel one of them will involve a notice to Google.
Again, I'm very sorry for the confusion. I hope you can forgive me; my memory isn't what it used to be.
5. Ashley Bowers on November 24, 2005 12:24 AM writes...
That is very interesting, considering that Yahoo and MSN take your site down once a DMCA notice is filed. What if your competition does this with no real facts?
6. Jim on November 24, 2005 12:13 PM writes...
Are you sure that Google sends all of their DMCA notices to Chilling Effects (and who at Google confirmed that)?
By comparison, the paper itself notes that The Planet, a Texas ISP, has received 1,600 notices in the last year, while Google has submitted only 734 notices in more than three years, which seems out of scale with Google's much larger online presence. It also is inconsistent with the level of notices I know are received by other online service providers, which are more in line with The Planet's experience. Perhaps this difference can be attributed to the fact that Google doesn't really host much content (except for services like Blogger), but only indexes it; but that too calls into question whether the Google sample is representative of the typical OSP experience.
7. Dr. Neal Krawetz on November 24, 2005 6:26 PM writes...
This is a very interesting report.
(Not trying to be funny here, but)
Could you please date it and specify the copyright holder/publisher? I'd like to include it in a citation.
The "last modified" date in the PDF is 21-Nov-2005.
8. Bruce on November 24, 2005 11:00 PM writes...
Thanks Jennifer, I look forward to the whole report. My concern about the Google sample is that DMCA notices sent to Google, which is not a host or ISP, may not be representative of the majority of takedown notices sent; it seems plausible that copyright owners attempting to remove links from search engines (as opposed to, or in addition to, removing the actual content from hosts or the networks of ISPs) skew towards the overly aggressive. That could artificially inflate the number of "flawed" notices, however defined. The data from The Planet may be more helpful in this regard, but I don't know enough about The Planet to know if it is a typical host and ISP.