Here we'll explore the nexus of legal rulings, Capitol Hill
policy-making, technical standards development, and technological
innovation that creates -- and will recreate -- the networked world as we
know it. Among the topics we'll touch on: intellectual property
conflicts, technical architecture and innovation, the evolution of
copyright, private vs. public interests in Net policy-making, lobbying
and the law, and more.
Disclaimer: the opinions expressed in this weblog are those of the authors and not of their respective institutions.
In your browser (I've tried in Firefox and others report it works in IE, Chrome, and other desktop browsers) select a passage of text, say a paragraph, and "Copy" it.
Bring up a text editor such as Notepad on a PC or similar (even works in Emacs) and Paste using whatever operation that editor uses for pasting text.
Now if you're like me and my friends you see the text that you copied and also this:
The amount of text copied that is necessary to trigger this seems to vary by which browser you start in.
Viewing the page source doesn't give any immediate clues as to what's going on, so I'm guessing it's some kind of JavaScript hook. On the one hand I think it's a fairly clever way to encourage people to link back to the original content and seems to be much more in keeping with what I think of as the "spirit" of the Web than wrapping up content in passwords or DRM. On the other hand, silently adding text into people's copy buffers strikes me as creepy and probably a good way to manufacture a code injection hack.
1. kdawson on August 11, 2009 4:11 PM writes...
Hmm, didn't notice this before: the code added at the end of the URL may be a UID of some sort; perhaps the ID in the cookie established for this session? It's beginning to shade towards "creepy."
2. kdawson on August 11, 2009 4:26 PM writes...
Just looked at the 6 cookies that [www.]dailymail.co.uk dropped on me, and none has any obvious relation to the value at the end of my paste buffer. Except: yours is ixzz0NuRbqTSe, mine just now is ixzz0NuaOYxPW. A serial number perhaps, the number of pastes done from this article? Or a hash, more likely.
3. Dave Parker on August 11, 2009 4:50 PM writes...
Doesn't seem to work in Linux.
It's Tynt tracer (http://www.tynt.com/)
Creative Commons blog uses it too. (http://creativecommons.org/weblog/entry/16060)
4. Scott Ellerman on August 11, 2009 7:10 PM writes...
Not that this is a problem we should have to solve, but this skulduggery is an excellent reason for Firefox users to use the NoScript extension which denies all JavaScript by default unless the user specifically authorizes it on a domain-by-domain basis. Just marked tynt.com as "Untrusted" on mine.
5. Bryan Price on August 11, 2009 8:21 PM writes...
The first time, I was using NoScript with Firefox. I didn't get the additional text. I allowed everything on the page, and got the extra bit.
I also loaded the new link. It highlighted what I highlighted to start with. But then I had to allow even more additional sites to load javascript.
YMMV.
6. Josh Froelich on August 11, 2009 10:20 PM writes...
Unable to reproduce on Vista32 using Chrome. You know, modern Windows browsers copy the HTML out of the browser. This is then stored in the Windows Clipboard. When another app accesses the clipboard to paste, it can either paste the HTML or extract and paste only the text or do whatever it wants in pasting. But, as a security measure, nothing is "live". There is no connection back to the server. There is no JavaScript evaluation. That said, if you do paste HTML that contains JavaScript into another tool that can render HTML and evaluate JavaScript, then it would phone home and have the ability to insert arbitrary information.
The random digits are most likely a session identifier generated by the CGI program that generates the page. It would be generated by the CGI before the page is served, unless there is an AJAX call via JavaScript, but that would be rather ridiculous.
In other words, I think this is some paranoia.
7. Gary Stock on August 11, 2009 11:40 PM writes...
I've unpacked thousands of different cryptic URL schemes for Nexcerpt. That string is a *very* uncommon format for CGI identifiers (at least among news sources), and is not random.
kdawson said: "Except: yours is ixzz0NuRbqTSe, mine just now is ixzz0NuaOYxPW." Taking the UC/lc mix as base 62 (which the javascript implies) my late night quick and dirty says the base 10 values are:
ixzz0NuRbqTSe = 145077917733655000000000
ixzz0NuaOYxPW = 145077917733663000000000
My bleary eyes and skepticism suggest that nine rightmost zeroes fall out of two randomly chosen base 62 strings approximately... never. So, it tracks something very specific -- of which eight have occurred recently (page views? copies?)
There have NOT been 145 trillion of much that humans would measure -- yet there have been zero of something else (for which they've reserved nine digits). Thus, this is a string of strings. My guess is a date is packed in there, along with more data that may vary, and room for still more beyond what is now being encoded.
Hmmm... I doubt we could disentangle today's unixtime (~1250000000) without knowing more. Who else has examples?
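The base 62 conversion described in comment 7 can be sketched in a few lines of Python. This is only an illustration, not Tynt's code: the digit ordering (0-9, then A-Z, then a-z) is an assumption, since the comment does not say which ordering was used, and the helper names `decode_base62` and `round_to_significant` are made up for this sketch. Python integers are arbitrary precision, so the conversion is exact.

```python
import string

# Assumed digit order: 0-9, then A-Z, then a-z. The comment does not state
# the ordering; this is a guess, not something confirmed by Tynt.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
DIGIT_VALUE = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_base62(token: str) -> int:
    """Interpret token as a base-62 numeral using exact integer arithmetic."""
    value = 0
    for ch in token:
        value = value * 62 + DIGIT_VALUE[ch]
    return value


def round_to_significant(n: int, sig: int) -> int:
    """Round a positive integer half-up to `sig` significant digits."""
    excess = len(str(n)) - sig
    if excess <= 0:
        return n
    scale = 10 ** excess
    return (n + scale // 2) // scale * scale


if __name__ == "__main__":
    # The two tokens quoted in the thread.
    for token in ("ixzz0NuRbqTSe", "ixzz0NuaOYxPW"):
        exact = decode_base62(token)
        print(token, exact, round_to_significant(exact, 15))
```

Under that assumed ordering, the exact values do not end in nine zeros, while rounding them to 15 significant digits reproduces the two figures posted above. That suggests the trailing zeros may simply be lost precision in whatever tool did the conversion (spreadsheets and double-precision floats keep roughly 15 to 17 significant digits) rather than a reserved field in the identifier.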
8. DrWex on August 12, 2009 8:07 AM writes...
Thanks for all the input. It does look like Dave Parker is correct and the source is Tynt Tracer (http://tracer.tynt.com/faq-general-product-info#axzz0NyPkFFCE) - notice the format of that URL, which I copied from the navigation window of Firefox once I had visited their FAQ page.
Their claim is that the UID is a non-personal identification number, tracking activity. Given the variety and specificity of the UID, though, it does appear that individual-level tracking would be possible.
Presumably the javascript that creates the extra content and the UID also logs activity in Tynt's databases, so they know things like how you got to the page, whatever info your browser can reveal about you, and so on.
9. leesean on August 14, 2009 4:04 PM writes...
I'm a blogger, and I tried using Tynt Tracer on my blog for a couple of weeks. But then the more I thought about it, the more it sort of creeped me out. I didn't want extraneous JavaScript on my blog, so I stopped using it. Interesting idea, but I think I will have to pass for now.
10. david sanger on August 16, 2009 12:31 PM writes...
An interesting method for providing the sources of quotes which could be useful in making references. It might also give them insight into what pieces of the story are being found interesting and being quoted.
Trying to paste a selection as plaintext, however, gives you blanks; they must have inserted something into the richtext format.
If you drag and drop text to another document (at least in Safari) it doesn't have this appended source URL.
They show much less concern, for some reason, for photos which also can be dragged and dropped. They don't even put identifying IPTC/XMP metadata in the photos, an easy and standard practice.