Corante



Copyfight


August 11, 2009

Source linking back from browser copy-paste


Posted by Alan Wexelblat

I can't decide if this is cool, creepy, or both. It's best to do the experiment yourself to see what's going on, so follow these steps:

  1. Go to http://www.dailymail.co.uk/news/article-1205737/Man-killed-shards-glass-hurling-girlfriend-shop-window.html.

  2. In your browser (I've tried in Firefox and others report it works in IE, Chrome, and other desktop browsers) select a passage of text, say a paragraph, and "Copy" it.

  3. Bring up a text editor such as Notepad on a PC or similar (even works in Emacs) and Paste using whatever operation that editor uses for pasting text.

Now if you're like me and my friends you see the text that you copied and also this:
Read more: http://www.dailymail.co.uk/news/article-1205737/Man-killed-shards-glass-hurling-girlfriend-shop-window.html#ixzz0NuRbqTSe

The amount of text copied that is necessary to trigger this seems to vary by which browser you start in.

Viewing the page source doesn't give any immediate clues as to what's going on, so I'm guessing it's some kind of JavaScript hook. On the one hand I think it's a fairly clever way to encourage people to link back to the original content and seems to be much more in keeping with what I think of as the "spirit" of the Web than wrapping up content in passwords or DRM. On the other hand, silently adding text into people's copy buffers strikes me as creepy and probably a good way to manufacture a code injection hack.
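To make the observed effect concrete, here is a minimal Python sketch that models what lands in the copy buffer and how a reader could strip it back out. It is not the site's actual script, which runs as JavaScript in the browser. The function names, the blank line before "Read more:", and the example.com URL in the usage note are assumptions for illustration; only the "Read more: URL#ID" shape comes from the paste shown above.

```python
from typing import Optional, Tuple
from urllib.parse import urldefrag

# Wording taken from the pasted text above; the exact spacing is an assumption.
ATTRIBUTION_PREFIX = "Read more: "


def append_attribution(selection: str, page_url: str, tracking_id: str) -> str:
    """Model the copy buffer: the selected text plus a link-back line."""
    return f"{selection}\n\n{ATTRIBUTION_PREFIX}{page_url}#{tracking_id}"


def split_attribution(pasted: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (original text, page URL, tracking ID) from pasted text.

    If the last line is not a link-back line, the text is returned unchanged
    and the URL and ID are None.
    """
    body, sep, last_line = pasted.rstrip().rpartition("\n")
    if not sep or not last_line.startswith(ATTRIBUTION_PREFIX):
        return pasted, None, None
    # urldefrag separates the URL from the part after "#".
    url, fragment = urldefrag(last_line[len(ATTRIBUTION_PREFIX):])
    return body.rstrip(), url, fragment or None
```

For example, split_attribution(append_attribution("Some text", "http://example.com/a.html", "ixzz0NuRbqTSe")) gives back the original selection along with the URL and the ID after the "#".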

Comments (10) + TrackBacks (0) | Category: Tech


COMMENTS

1. kdawson on August 11, 2009 4:11 PM writes...

Hmm, didn't notice this before: the code added at the end of the URL may be a UID of some sort; perhaps the ID in the cookie established for this session? It's beginning to shade towards "creepy."


2. kdawson on August 11, 2009 4:26 PM writes...

Just looked at the 6 cookies that [www.]dailymail.co.uk dropped on me, and none has any obvious relation to the value at the end of my paste buffer. Except: yours is ixzz0NuRbqTSe, mine just now is ixzz0NuaOYxPW. A serial number perhaps, the number of pastes done from this article? Or a hash, more likely.


3. Dave Parker on August 11, 2009 4:50 PM writes...

Doesn't seem to work in Linux.

It's Tynt tracer (http://www.tynt.com/)

Creative Commons blog uses it too. (http://creativecommons.org/weblog/entry/16060)


4. Scott Ellerman on August 11, 2009 7:10 PM writes...

Not that this is a problem we should have to solve, but this skulduggery is an excellent reason for Firefox users to use the NoScript extension which denies all JavaScript by default unless the user specifically authorizes it on a domain-by-domain basis. Just marked tynt.com as "Untrusted" on mine.


5. Bryan Price on August 11, 2009 8:21 PM writes...

The first time, I was using NoScript with Firefox. I didn't get the additional text. I allowed everything on the page, and got the extra bit.

I also loaded the new link. It highlighted what I highlighted to start with. But then I had to allow even more additional sites to load javascript.

YMMV.


6. Josh Froelich on August 11, 2009 10:20 PM writes...

Unable to reproduce on Vista32 using Chrome. You know, modern Windows browsers copy the HTML out of the browser. This is then stored in the Windows Clipboard. When another app accesses the clipboard to paste, it can either paste the HTML or extract and paste only the text or do whatever it wants in pasting. But, as a security measure, nothing is "live". There is no connect back to the server. There is no javascript evaluation. That said, if you do paste HTML that contains Javascript into another tool that can render HTML and evaluate Javascript, then it would phone home and have the ability to insert arbitrary information.

The random digits are most likely a session identifier generated by the CGI program used to generate the page. This is generated by the CGI, before the page is served, unless there is an AJAX call via JavaScript, but that would be rather ridiculous.

In other words, I think this is some paranoia.


7. Gary Stock on August 11, 2009 11:40 PM writes...

I've unpacked thousands of different cryptic URL schemes for Nexcerpt. That string is a *very* uncommon format for CGI identifiers (at least among news sources), and is not random.

kdawson said: "Except: yours is ixzz0NuRbqTSe, mine just now is ixzz0NuaOYxPW." Taking the UC/lc mix as base 62 (which the javascript implies) my late night quick and dirty says the base 10 values are:

ixzz0NuRbqTSe = 145077917733655000000000
ixzz0NuaOYxPW = 145077917733663000000000

My bleary eyes and skepticism suggest that nine rightmost zeroes fall out of two randomly chosen base 62 strings approximately... never. So, it tracks something very specific -- of which eight have occurred recently (page views? copies?)

There have NOT been 145 trillion of much that humans would measure -- yet there have been zero of something else (for which they've reserved nine digits). Thus, this is a string of strings. My guess is a date is packed in there, along with more data that may vary, and room for still more beyond what is now being encoded.

Hmmm... I doubt we could disentangle today's unixtime (~1250000000) without knowing more. Who else has examples?
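The base-62 arithmetic in this comment can be rechecked with exact integers. The sketch below is only an illustration: the digit order (0-9, then A-Z, then a-z) is an assumption, since the comment does not state one, and the helper names are made up here. Under that assumed order, rounding the exact values to 15 significant digits reproduces both figures quoted above, so the trailing zeros may be a precision artifact of the tool used for the conversion rather than a property of the ID itself.

```python
import string

# Assumed digit order: 0-9, then A-Z, then a-z. The comment does not state it.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_decode(text: str) -> int:
    """Decode a base-62 string to an exact (arbitrary-precision) integer."""
    value = 0
    for char in text:
        value = value * 62 + ALPHABET.index(char)
    return value


def round_to_significant(value: int, digits: int) -> int:
    """Round a positive integer to the given number of significant digits."""
    scale = 10 ** max(len(str(value)) - digits, 0)
    return (value + scale // 2) // scale * scale


if __name__ == "__main__":
    for token in ("ixzz0NuRbqTSe", "ixzz0NuaOYxPW"):
        exact = base62_decode(token)
        # Print the exact value next to its 15-significant-digit rounding.
        print(token, exact, round_to_significant(exact, 15))
```

Python integers do not lose precision at this size, which is why the exact and rounded values can be compared side by side.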


8. DrWex on August 12, 2009 8:07 AM writes...

Thanks for all the input. It does look like Dave Parker is correct and the source is Tynt Tracer (http://tracer.tynt.com/faq-general-product-info#axzz0NyPkFFCE) - notice the format of that URL, which I copied from the navigation window of Firefox once I had visited their FAQ page.

Their claim is that the UID is a non-personal identification number, tracking activity. Given the variety and specificity of the UID, though, it does appear that individual-level tracking would be possible.

Presumably the javascript that creates the extra content and the UID also logs activity in Tynt's databases, so they know things like how you got to the page, whatever info your browser can reveal about you, and so on.


9. leesean on August 14, 2009 4:04 PM writes...

I'm a blogger, and I tried using Tynt Tracer on my blog for a couple of weeks. But then the more I thought about it, the more it sort of creeped me out. I didn't want extraneous JavaScript on my blog, so I stopped using it. Interesting idea, but I think I will have to pass for now.


10. david sanger on August 16, 2009 12:31 PM writes...

An interesting method for providing the sources of quotes which could be useful in making references. It might also give them insight into what pieces of the story are being found interesting and being quoted.

Trying to paste a selection as plaintext, however, gives you blanks; they must have inserted something into the richtext format.

If you drag and drop text to another document (at least in Safari) it doesn't have this appended source URL.

They show much less concern, for some reason, for photos which also can be dragged and dropped. They don't even put identifying IPTC/XMP metadata in the photos, an easy and standard practice.

