[#10749] POST full text of the web page being bookmarked with the bookmarklet
Summary POST full text of the web page being bookmarked with the bookmarklet
Queue Trean
Type Enhancement
State Accepted
Priority 1. Low
Owners chuck@horde.org
Requester chuck@horde.org
Created 2011-11-12
Updated 2013-01-21
Patch No

Chuck Hagenbuch <chuck@horde.org> 2011-11-12 23:04:56
So that Trean can still save the full content of a bookmarked page
when that page is password protected, have the bookmarklet grab the
page's HTML with JavaScript and send it along to Trean. From the
answer to "What part of Instapaper's infrastructure are you most
proud of?":

The bookmarklet has a mechanism to save pages from sites that require 
logins for full content, such as the Wall Street Journal and Harper's, 
by sending a copy of the page's HTML from the customer's browser to 
the server. It's like automating the "Save as..." menu item: if you 
have your own account for these sites and can see the page in your 
browser, you can save it to Instapaper.

The way it does this is ridiculous: instead of calling a simple GET 
request to save the page, since an entire page's contents would 
quickly overrun any URL-length limits in the stack, it injects a FORM 
with a POST action and populates a hidden value with the page contents.

But form-data requests from browsers aren't Gzip-compressed, so the 
resulting data is huge and needs to be sent over people's (often slow, 
often mobile) upstream connections. So I found an open-source DEFLATE 
implementation in Javascript - really - and the bookmarklet compresses 
the page data right there in the browser before sending it.

The whole procedure is hideously complex, but works incredibly well.
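
A minimal sketch of the two steps described in the quote, as they 
might look in a Trean bookmarklet. Everything named here is an 
assumption for illustration: the endpoint URL, the field names "url" 
and "body_deflate", and the helper names are hypothetical, not 
Trean's or Instapaper's actual API. It also swaps in the browser's 
built-in Compression Streams API for the JavaScript DEFLATE library 
mentioned in the quote, so it only applies to browsers that provide 
CompressionStream.

```javascript
// Compress text with the built-in Compression Streams API.
// Note: the "deflate" format here is zlib-wrapped DEFLATE (RFC 1950),
// not the raw DEFLATE stream a standalone library might produce.
async function deflateText(text) {
  const source = new Blob([text]).stream();
  const compressed = source.pipeThrough(new CompressionStream("deflate"));
  const buffer = await new Response(compressed).arrayBuffer();
  return new Uint8Array(buffer);
}

// Base64-encode the bytes so they survive transport in a form field.
function toBase64(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Build (but do not submit) a hidden POST form carrying the payload.
// `fields` maps hypothetical field names to string values.
function buildSaveForm(doc, endpoint, fields) {
  const form = doc.createElement("form");
  form.method = "post";
  form.action = endpoint;
  form.style.display = "none";
  for (const name of Object.keys(fields)) {
    const input = doc.createElement("input");
    input.type = "hidden";
    input.name = name;
    input.value = fields[name];
    form.appendChild(input);
  }
  return form;
}

// In a browser, the bookmarklet body could then look roughly like this
// (the endpoint is a placeholder):
//
//   const html = document.documentElement.outerHTML;
//   const payload = toBase64(await deflateText(html));
//   const form = buildSaveForm(document, "https://example.com/trean/add",
//                              { url: location.href, body_deflate: payload });
//   document.body.appendChild(form);
//   form.submit();
```

The server would have to base64-decode and inflate the "body_deflate" 
field before storing the page. Base64 adds roughly a third to the 
compressed size, so the saving depends on how well the page's HTML 
compresses.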

Chuck Hagenbuch <chuck@horde.org> 2013-01-21 04:05:31