Nginx module by Google for rewriting web pages to reduce latency and bandwidth

Here are some of the most useful PageSpeed filters. Each one has
a simple HTML example attached; click “before” to see the
original file, and “after” to see what PageSpeed produces
with that filter (and only that filter) enabled. The two
versions should look exactly the same, but the “after” one will
be (slightly) speedier. Use “view source” to see
the PageSpeed difference!

Original URL:  

Original article

Wasavi – a browser extension that transforms TEXTAREA elements into a VI editor

wasavi is an extension for Chrome, Opera and Firefox. wasavi transforms the TEXTAREA element of any page into a VI editor, so you can edit the text in VI. wasavi supports almost all VI commands and some ex commands.

wasavi is under development. Any bug report or feature request is welcome.

And we also welcome a donation to continue development:


  • Here is a native TEXTAREA

    native textarea

  • Focus the TEXTAREA, and press Ctrl+Enter to launch wasavi

    wasavi running

Salient Features

  • wasavi supports some ex commands. This is the output of :set all

    set all

  • Vim’s incremental search

    incremental search

  • wasavi online app. Open this link in a browser that has the wasavi extension installed, and wasavi will launch automatically. You can then read and write files in your Dropbox/Google Drive account.

    stand alone

Currently, wasavi is available for the following browsers only. Select your browser and click the link. The standard extension installation procedure of your browser will follow. These extensions are hosted at the add-on store of their respective browsers.

Source code and the latest development releases are hosted on GitHub:

A note for Opera users

Presto-based Opera does not support DOM3 Composition Events, so input via IME is not guaranteed. It also does not allow extensions to manipulate the system clipboard, so the register * is unavailable on Opera.

A note for Chrome users

Chrome has reserved some fundamental shortcuts, such as Ctrl+T, Ctrl+W and Ctrl+N. Although these keys cannot be used in wasavi, you can use Alt+T, Alt+W and Alt+N.

How to launch wasavi

Focus TEXTAREA and press Ctrl+Enter.

How to quit wasavi

To quit wasavi press ZZ or :q or :wq or any other VI quit command.

Which options are accepted by the :set command?

See this table.

Note: there are also options which are accepted but don’t have any effect yet.

How to use wasavi with Vimperator on Firefox

Put wasavi_mediator.js in your Vimperator plugin directory, for example, ~/.vimperator/plugin or %HOME%\vimperator\plugin.

This plugin will control the pass-through mode of Vimperator according to the state of wasavi.

How to use wasavi with Keysnail on Firefox

Put wasavi_mediator.ks.js in your Keysnail plugin directory.

This plugin will control suspend mode of Keysnail according to the state of wasavi.

How to use wasavi as an independent text editor

Install the wasavi extension and open the link to the wasavi online app. wasavi will start automatically. You can use the ex commands :read, :write, :edit or :file to access your Dropbox/Google Drive files. You will have to authorize wasavi via OAuth to access these storage services.

About automatic setting override

The :set commands you enter while wasavi is running are stored in the extension’s persistent storage and are restored the next time you launch wasavi.

This setting override mechanism works independently for each URL (up to 30 URLs). If you think this is unnecessary, put :set nooverride in your exrc, and overriding will be skipped.

I have noticed a bug

Please create an issue on the wasavi issue tracker.

Tips and tricks

  • to maximize wasavi: :set fullscreen or :set fs
  • to restore wasavi: :set nofullscreen or :set nofs
  • to change the color theme: :set theme=blight or :set theme=charcoal or :set theme=matrix
  • to modify initial settings:
    open the preferences of the wasavi extension (or enter :options in wasavi), and edit the “exrc” textbox.
  • abbreviate syntax is
    • :abbreviate displays all the abbreviations currently registered.
    • :abbreviate [clear] clears all the abbreviations.
    • :abbreviate lhs displays the abbreviation corresponding to lhs.
    • :abbreviate lhs rhs registers an abbreviation which expands lhs to rhs.
    • :abbreviate [noremap] lhs rhs also registers an abbreviation, but it is not affected by the remap mechanism.
  • map syntax is
    • :map displays all the mappings currently registered.
    • :map [clear] clears all the mappings.
    • :map lhs rhs registers a rule which translates lhs to rhs. The translation is recursive. For the syntax of key stroke descriptors in lhs and rhs, see this page.
    • :map [noremap] lhs rhs also registers, but it is non-recursive.
    • :map targets the normal mode mappings. On the other hand,
      :map! targets the insert mode. This is equivalent to vim’s :imap.
  • j k ^ $ move the cursor by physical row, whereas
    gj gk g^ g$ move by wrapped row. To swap the behavior: :set jkdenotative
  • f/F/t/T extension for Japanese: these commands recognize the reading (romaji
    representation) of hiragana, katakana, and kanji. For example, fk will place
    the cursor on ‘か’, ‘カ’, ‘漢’ and so on.
  • f/F/t/T extension for Latin script: these commands recognize the base letter
    of a letter with diacritical marks. For example, fa will place the cursor on
    ‘å’, ‘ä’, ‘à’, ‘â’, ‘ā’ and so on. Also see the mapping table.
  • use an online storage service as a file system:
    • :filesystem status shows all file systems currently available.
    • :filesystem default shows the default file system. You can set the default file system
      via :filesystem default dropbox or :filesystem default gdrive.
    • :filesystem reset discards the access token for online storage.
    • You can place the file system name at the head of a file name explicitly:
      for instance, :read dropbox:/hello.txt.
  • When you read from the registers A to Z, some registers return special content:
    • B register: user agent string
    • D register: current date time string (formatted by using datetime option as template of strftime(3))
    • T register: title string
    • U register: URL string
    • W register: version string of wasavi
  • To return a setting to default state:
    • :set & or :set &default
  • To return all settings to default state:
    • :set all& or :set all&default
  • To return a setting to the state just after evaluation of exrc:
  • To return all settings to the state just after evaluation of exrc:
  • To submit a form automatically after writing text and closing wasavi:
    • :wqs
    • :submit (this can be shortened to :sub )

Commands implemented

  • [count] operation [count] motion
  • [count] operation [count] range-symbol
  • [count] surround-operation [count] motion surround-string
  • [count] surround-operation [count] range-symbol surround-string
  • [count] de-surround-operation [count] surround-identifier
  • [count] re-surround-operation [count] surround-identifier surround-string
  • [count] operation-alias
  • [count] surround-operation-alias surround-string
  • [count] motion
  • [count] scroll-command
  • [count] edit-command
  • [count] : ex-command


Operations

c y d > < gq gu gU

Operation Aliases

cc yy dd >> << C Y D gqq guu gUU yss ySS

A count can be inserted in front of the last character.

Surround Operations

  • to surround: ys yS
  • to remove a surround: ds
  • to change a surround: cs


Motions

- + ^ $ % | , ;
_ / ? ' ` ( ) { } [[ ]] 0
j k h l ^N ^P ^H

w W b B e E gg gj gk g^ g$ G H M L f F t T n N

Range symbols (Vim text objects)

  • a" a' a` a[ a] a{ a} aB a< a> a( a) ab aw aW ap as at
  • i" i' i` i[ i] i{ i} iB i< i> i( i) ib iw iW ip is it

Scroll commands

^U ^D ^Y ^E ^B ^F z z. zz z-

Edit commands

x X p P J . u ^R ~ ^L ^G ga gv m @ q r R a A i I o O & s S v V ZZ gi

ex commands

abbreviate cd chdir copy delete edit file filesystem global join k map mark marks move options print put pwd quit read redo s & ~ set sort submit registers to unabbreviate undo unmap version v write wq wqs xit yank > < @ *

Addressing in ex commands is fully supported:

  • whole buffer: %s/re/rep/
  • current line: .p
  • the last line of buffer: $p
  • absolute line number: 1,2p
  • relative line number: +1,+2p
  • regular expression: /re/p ?re?p
  • mark referencing: 'a,'bp

In addition, wasavi accepts offsets, for example: /re/+1p.
Two addresses are usually connected by a comma (,), but wasavi also supports a semicolon (;).

Input mode commands

  • ^@: input the most recently input text, and exit input mode. This key stroke is actually Ctrl+Space.
  • ^D: unshift. However, if the last input character is 0 or ^, delete all indentation
  • ^H: delete a character
  • ^R: paste register’s content
  • ^T: shift
  • ^U: delete all the characters entered in the current input session
  • ^V: literal input
  • ^W: delete a word

Line input mode commands

  • ^A: move cursor to beginning of line
  • ^B: back
  • ^E: move cursor to end of line
  • ^F: forward
  • ^H: delete a character
  • ^N: next history
  • ^P: previous history
  • ^R: paste register’s content
  • ^U: delete whole line
  • ^V: literal input
  • ^W: delete a word
  • tab: complete ex command name, set option name, file name argument of read/edit/write/file

Bound mode commands

Bound mode is similar to vim’s visual mode.

  • c: delete the bound, and switch to insert mode
  • d: delete the bound
  • y: yank the bound
  • <: unshift the bound
  • >: shift the bound
  • C: delete the line-wise bound, and switch to insert mode
  • S: surround the bound
  • R: same as C
  • D: delete the line-wise bound
  • X: same as D
  • Y: yank the line-wise bound
  • g prefix commands
  • a, i prefix range symbols
  • ~: swap lower case and upper case in the bound
  • :: switch to line input mode
  • J: join the bound
  • p: delete the bound, and paste a register’s content
  • P: same as p
  • r: fill the bound with the letter you type
  • s: same as c
  • u: convert the bound to lower case
  • U: convert the bound to upper case
  • v: character wise bound mode
  • V: line wise bound mode
  • x: same as d

Surrounding identifiers

  • quotations: one of !#$%&*+,-.:;=?@^_|~"'`
  • brackets: one of abBrt[]{}()

Surrounding string

  • quotations: one of !#$%&*+,-.:;=?@^_|~"'`
  • brackets: one of abBr[]{}()
  • tags: one of ^T ,<Tt

Vim features in wasavi

  • multiple level undo/redo
  • incremental search
  • range symbols (aka, Vim text objects)
  • registers ", :, *, / (* is used to access the system clipboard)
  • auto-reformat in input mode, and the reformat operator (gq command), when textwidth > 0
  • bound mode (aka, Vim visual mode)
  • options: iskeyword, incsearch, smartcase, undolevels, quoteescape, relativenumber, textwidth, expandtab, cursorline, cursorcolumn
  • writing to the register of A to Z
  • gu / gU + motion: convert a region to lower case or upper case
  • partial functionality of Surround.vim
  • partial functionality of :sort (regex pattern, r and i options)


Free, easy, automated HTTPS for Node.js

SSL certificates using SNI with almost zero configuration, for free with Let’s Encrypt!
This module has yet to be thoroughly tested, but feel free to give it a shot!

If you have any questions, throw them up on gitter.


  • Fetch SSL certificates from letsencrypt.
  • Automatically renew certificates.
  • If creating a certificate fails it will fall back to a simple self signed certificate.
  • Forward all incoming http requests to https.

var createServer = require("auto-sni");

var server = createServer({
    email: ..., // Emailed when certificates expire.
    agreeTos: true, // Required for letsencrypt.
    debug: true, // Add console messages.
    domains: ["", "(dev|staging|production)"], // Optional list of allowed domains (uses path-to-regexp)
    forceSSL: true, // Make this false to disable auto http->https redirects (default true).
    ports: {
        http: 80, // Optionally override the default http port.
        https: 443 // Optionally override the default https port.
    }
});

// Server is a "https.createServer" instance.
server.once("listening", ()=> {
    console.log("We are ready to go.");
});

Usage with express.

var createServer = require("auto-sni");
var express      = require("express");
var app          = express();

app.get("/test", ...);

createServer({ email: ..., agreeTos: true }, app);

Usage with koa.

var createServer = require("auto-sni");
var koa          = require("koa");
var app          = koa();


createServer({ email: ..., agreeTos: true }, app.callback());

Usage with rill.

var createServer = require("auto-sni");
var rill         = require("rill");
var app          = rill();

app.get("/test", ...);

createServer({ email: ..., agreeTos: true }, app.handler());

Usage with hapi.

// Untested (Let me know in gitter if this doesn't work.)
var createServer = require("auto-sni");
var hapi         = require("hapi");
var server       = new hapi.Server();
var secureServer = createServer({ email: ..., agreeTos: true });

server.connection({ listener: secureServer, autoListen: false, tls: true });

Usage with restify.

// Untested (Let me know in gitter if this doesn't work.)
var createServer = require("auto-sni");
var restify      = require("restify");

// Override the https module in AutoSNI with restify.
createServer.https = restify.createServer;

var app = createServer({ email: ..., agreeTos: true });
app.get("/test", ...);

AutoSNI requires access to low level ports 80 (http) and 443 (https) to operate by default.
These ports are typically restricted by the operating system.

In production (on Linux servers) you can use the following command to give Node access to these ports.

sudo setcap cap_net_bind_service=+ep $(which node)

For development it’s best to set the “ports” option manually to something like:

    ports: {
        http: 3001,
        https: 3002
    }

// Access server on localhost:3002

Currently LetsEncrypt beta imposes some rate limits on certificate creation.
These will likely increase over time. There are also talks of LetsEncrypt creating a form to increase the amount.

The current rates are:

  • 10 new registrations every 3 hours.
  • 5 certificates per domain every 7 days.


Please feel free to create a PR!


How to Build a TimesMachine

At the beginning of this year, we quietly expanded TimesMachine, our virtual microfilm reader, to include every issue of The New York Times published between 1981 and 2002. Prior to this expansion, TimesMachine contained every issue published between 1851 and 1980, which consisted of over 11 million articles spread out over approximately 2.5 million pages. The new expansion adds an additional 8,035 complete issues containing 1.4 million articles over 1.6 million pages.

Creating and expanding TimesMachine presented us with several interesting technical challenges, and in this post we’ll describe how we tackled two. First, we’ll discuss the fundamental challenge with TimesMachine: efficiently providing a user with a scan of an entire day’s newspaper without requiring the download of hundreds of megabytes of data. Then, we’ll discuss a fascinating string matching problem we had to solve in order to include articles published after 1980 in TimesMachine.

The Archive, Pre-TimesMachine

Before TimesMachine was launched in 2014, articles from the archive were searchable and available to subscribers only as PDF documents. While the archive was accessible, two major problems in implementation remained: context and user experience.

Isolating an article from the surrounding content removes the context in which it was published. A modern reader might discover that on July 20, 1969, a man named John Fairfax became the first man to row across the Atlantic Ocean. However, a reader absorbed in The New York Times that morning might have been considerably more impressed by the front page news that Apollo 11, whose crew contained Neil Armstrong, had just swung into orbit around the moon in preparation for the first moon landing. Knowing where that John Fairfax article was published in the paper (bottom left of the front page) as well as what else was going on that day is much more interesting and valuable to a historian than an article on its own without the context of other news of the day.

We wanted to present the archive in all its glory as it was meant to be consumed on the day it was printed — one issue at a time. Our goal was to create a fluid viewing experience, not to force users to slowly download high resolution images. Here’s how we did that.

Publisher Pipeline

Our digitized print archive is big, containing petabytes of high-resolution page scans. Even for a single issue, the storage requirements are appreciable. The May 22, 1927 issue announcing the success of Charles Lindbergh’s pioneering trans-Atlantic flight consists of 226 pages which require nearly 200 megabytes of storage. When we built TimesMachine, we knew that there was no way we could expect users to sit through multi-hundred-megabyte downloads in order to browse a single issue. We needed a way to load just the parts of an issue that a user is looking at. We found an answer from a somewhat unexpected quarter and now, when you load that 200 megabyte Lindbergh issue in your browser, the initial page load requires the transmission of just a couple of megabytes.

We achieve this by using mapping software to display each issue. Like the pages of a scanned newspaper, a digital map is just a really big image. The technique most often used to display digital maps (and the same technique we employ for TimesMachine) is image tiling. With image tiling, a large image is broken down into a myriad of small square images, or “tiles,” computed at a variety of zoom levels.

Clever software then runs in the browser and loads only those tiles that correspond to the region of the image the user wants to see. Numerous open source software libraries have been created to make and display such tiles (we used GDAL for tile generation and leaflet.js for display). All we had to do was adapt these libraries to show you a newspaper. To do this we created a processing pipeline called The TimesMachine Publisher. Here’s how it works.

For a given issue, the pipeline takes in three inputs: high-resolution scans of pages from microfilm, XML files of article metadata, and INI files describing the geometric boundaries of every article on every page. The pipeline first stitches the pages together into one large virtual image. The coordinates of every article on every page are then projected from cartesian (x, y) coordinates into geographic (latitude and longitude) coordinates. These projected coordinates are combined with article metadata into a large JavaScript object describing the contents of a complete issue. The large virtual image is then cut up into thousands of 256×256 pixel tiles computed at several zoom levels. All of this data is uploaded to a content distribution network (CDN).
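
To get a feel for the tiling step, here is a minimal sketch that counts the 256×256 tiles needed at each zoom level for one stitched image. The image dimensions and the assumption that each lower zoom level halves the width and height are illustrative only; they are not figures or details taken from the Publisher pipeline.

```javascript
// Hypothetical sketch: count 256x256 tiles per zoom level for a large image.
// Assumes each lower zoom level halves the image's width and height,
// a common convention for tiled map pyramids.
const TILE_SIZE = 256;

function tilePyramid(widthPx, heightPx) {
    // Highest zoom level: number of halvings needed to fit the longer
    // side of the image into a single tile.
    const longerSide = Math.max(widthPx, heightPx);
    const maxZoom = Math.max(0, Math.ceil(Math.log2(longerSide / TILE_SIZE)));
    const levels = [];
    for (let zoom = 0; zoom <= maxZoom; zoom++) {
        const scale = Math.pow(2, maxZoom - zoom); // downsampling factor at this level
        const cols = Math.ceil(widthPx / scale / TILE_SIZE);
        const rows = Math.ceil(heightPx / scale / TILE_SIZE);
        levels.push({ zoom, cols, rows, tiles: cols * rows });
    }
    return levels;
}

// Example: a made-up 4096 x 2048 pixel stitched image.
const pyramid = tilePyramid(4096, 2048);
pyramid.forEach((level) => {
    console.log(`zoom ${level.zoom}: ${level.cols} x ${level.rows} = ${level.tiles} tiles`);
});
```

The total number of tiles grows quickly with zoom, which is why only the tiles a reader actually looks at should ever be transferred.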

Whenever a user requests a day’s paper in TimesMachine, the client-side software downloads the JSON object describing the paper’s contents and requests only those tiles necessary to display the portion of the paper that fits into the user’s viewport. Additional data is loaded only when the user pans or zooms. Using this approach, TimesMachine delivers any day’s newspaper to the client quickly and efficiently.
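
The idea of requesting only the tiles under the viewport can be sketched in a few lines. A display library such as leaflet.js does this work internally; the function name, the viewport shape and the grid size below are assumptions made up for illustration.

```javascript
// Hypothetical sketch: which tiles cover the user's viewport at one zoom level?
const TILE_SIZE = 256;

// viewport is given in pixel coordinates of the image at the current zoom level.
// cols and rows describe the tile grid at that zoom level.
function visibleTiles(viewport, cols, rows) {
    const firstCol = Math.max(0, Math.floor(viewport.x / TILE_SIZE));
    const firstRow = Math.max(0, Math.floor(viewport.y / TILE_SIZE));
    const lastCol = Math.min(cols - 1, Math.floor((viewport.x + viewport.width - 1) / TILE_SIZE));
    const lastRow = Math.min(rows - 1, Math.floor((viewport.y + viewport.height - 1) / TILE_SIZE));
    const tiles = [];
    for (let row = firstRow; row <= lastRow; row++) {
        for (let col = firstCol; col <= lastCol; col++) {
            tiles.push({ col, row });
        }
    }
    return tiles;
}

// A 1024 x 768 window positioned at (300, 100) over a 16 x 8 tile grid.
const needed = visibleTiles({ x: 300, y: 100, width: 1024, height: 768 }, 16, 8);
console.log(`${needed.length} of ${16 * 8} tiles requested`);
```

In this made-up case the window touches columns 1 to 5 and rows 0 to 3, so only a fraction of the grid has to be downloaded.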

Content Expansion

We encountered a fascinating issue in our attempt to expand the number of issues in TimesMachine. Initially, TimesMachine contained only those articles published between 1851 and 1980. The exclusion of data from after 1980 stems from an interesting historic quirk of our archive. Starting around 1981, The Times began keeping an archive of the complete digital text of every article published in print. In order to expand TimesMachine beyond 1980 and include links to the full text, we needed to know how our scanned print archive and our digital text archive aligned. Here is how we figured this out.

The first step was to run optical character recognition (OCR) on articles in the scanned print archive to transcribe the text as cleanly as possible. We used tesseract-ocr for this.

Here’s an example of some nicely-OCRed text:

After doing this for every article in a single day’s issue, we ended up with a bucket of scanned print articles OCRed with tesseract, and a bucket of articles from the full text archive. We then had to figure out which articles matched up between these two buckets, which was an interesting process.

Because an OCRed article is seldom an exact match for its full text counterpart, we could not align articles by simply testing for string equality. Instead, we used fuzzy string matching. Our approach was applied one issue at a time and relied on a technique known as “shingling.” Using shingling, we transformed the text of articles in both datasets into a list of tokens, and then turned the list of tokens into a list of n-token sequences called “shingles.”

We’ll illustrate with this quote by Abraham Lincoln:

This is our full text. We tokenize it by splitting it into a list of words separated by spaces. The string “secret” is considered a token in the full text.

Now we convert the list of tokens into a list of “shingles,” which are sequences of tokens. If we use a shingle size of 4, we end up with the following: 5 lists of tokens. (As you can see, the contents of the lists overlap like shingles on a roof.)
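
As a rough sketch of the tokenizing and shingling steps, the snippet below uses a placeholder eight-word sentence (not the quote from the article) and a shingle size of 4, which yields five overlapping shingles. The function names are hypothetical.

```javascript
// Split text into whitespace-separated tokens.
function tokenize(text) {
    return text.split(/\s+/).filter((token) => token.length > 0);
}

// Turn a token list into overlapping n-token shingles.
function shingles(tokens, size) {
    const result = [];
    for (let i = 0; i + size <= tokens.length; i++) {
        result.push(tokens.slice(i, i + size));
    }
    return result;
}

// Placeholder sentence, chosen only to have eight tokens.
const tokens = tokenize("the quick brown fox jumps over lazy dogs");
const fourShingles = shingles(tokens, 4);
fourShingles.forEach((shingle) => console.log(shingle.join(" ")));
```

Each shingle shares three tokens with its neighbour, which is the roof-like overlap the name refers to.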

When we generate the list of shingles for every article in the full text digital archive, we get something that looks like this:

It is a reasonable hypothesis that sequences of words from an OCRed article will overlap a fair amount with sequences of words in that same article in the full text archive. We want a list of articles that contain each shingle so we can narrow down our options. Iterating through the above list, we can transform our data into the following hash table:
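
One way to build such a table is an inverted index from each shingle to the set of articles that contain it. The sketch below is an assumption about the data structure, not the code used for TimesMachine; the two miniature articles and all function names are invented for illustration.

```javascript
// List the shingles of a text as strings, so they can be used as map keys.
function shingleKeys(text, size) {
    const tokens = text.split(/\s+/).filter((token) => token.length > 0);
    const keys = [];
    for (let i = 0; i + size <= tokens.length; i++) {
        keys.push(tokens.slice(i, i + size).join(" "));
    }
    return keys;
}

// Build a table: shingle string -> Set of ids of articles containing it.
function buildShingleIndex(articles, size) {
    const index = new Map();
    for (const [articleId, text] of Object.entries(articles)) {
        for (const key of shingleKeys(text, size)) {
            if (!index.has(key)) {
                index.set(key, new Set());
            }
            index.get(key).add(articleId);
        }
    }
    return index;
}

// Made-up miniature "issue" with two full text articles.
const fullText = {
    article_1: "the mayor opened the new bridge today",
    article_2: "crowds gathered as the mayor opened the new library",
};
const index = buildShingleIndex(fullText, 3);
console.log([...index.get("the mayor opened")]); // both article ids
console.log([...index.get("the new bridge")]);   // only article_1
```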

Now that we have a mapping of all the shingles appearing in a given issue to all the full text articles from that issue containing each shingle, we repeat the first part of the process with the OCRed text, getting a list of shingles for each article.

Let’s say OCRed article_A consists of shingle_2 and shingle_5. We can use the table above to generate a list of article candidates that might be a “match” with article_A. By looking up shingle_2 and shingle_5 in the table, we conclude that article_1, article_2, article_2 and article_5 are all potential matches for article_A.
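
The lookup step might be sketched as follows. The table contents here are invented stand-ins, and counting how many shingles each candidate shares is an optional extra that the article does not describe.

```javascript
// Hypothetical shingle table: each shingle maps to the full text
// articles that contain it. The entries are made up for illustration.
const shingleTable = new Map([
    ["shingle_1", ["article_3"]],
    ["shingle_2", ["article_1", "article_2"]],
    ["shingle_5", ["article_2", "article_5"]],
]);

// Collect every full text article that shares at least one shingle with
// the OCRed article, counting how many shingles they have in common.
function candidateCounts(ocrShingles, table) {
    const counts = new Map();
    for (const shingle of ocrShingles) {
        for (const articleId of table.get(shingle) || []) {
            counts.set(articleId, (counts.get(articleId) || 0) + 1);
        }
    }
    return counts;
}

const counts = candidateCounts(["shingle_2", "shingle_5"], shingleTable);
console.log([...counts.entries()]);
```

Articles that share no shingle with the OCRed text, such as article_3 in this toy table, never enter the expensive comparison stage.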

This greatly reduces the problem space. Now, instead of having to compare every OCRed article in an issue to every full text article in an issue, which could involve tens of thousands of computationally expensive comparisons, we need only compare a short list. This ends up reducing the number of comparisons by several orders of magnitude.

To quantify the difference between the OCRed data and the full text articles, we used the Python difflib library. It gave us a nice, clean result:

From this particular example, it is clear that OCRed article_A is most likely the same article as full text article_1.
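
To illustrate the scoring idea without Python, the sketch below computes a similarity ratio of the form 2 × matches / total length, the same formula shape that difflib reports. It is a stand-in, not difflib: the match count here comes from a longest common subsequence over word tokens, and the sample texts are invented.

```javascript
// Length of the longest common subsequence of two token arrays,
// computed with a row-by-row dynamic programming table.
function lcsLength(a, b) {
    let prev = new Array(b.length + 1).fill(0);
    for (let i = 1; i <= a.length; i++) {
        const curr = new Array(b.length + 1).fill(0);
        for (let j = 1; j <= b.length; j++) {
            curr[j] = a[i - 1] === b[j - 1]
                ? prev[j - 1] + 1
                : Math.max(prev[j], curr[j - 1]);
        }
        prev = curr;
    }
    return prev[b.length];
}

// Ratio in [0, 1]: 2 * matching tokens / total tokens in both texts.
function similarity(textA, textB) {
    const a = textA.split(/\s+/).filter(Boolean);
    const b = textB.split(/\s+/).filter(Boolean);
    if (a.length + b.length === 0) {
        return 1; // two empty texts are treated as identical
    }
    return (2 * lcsLength(a, b)) / (a.length + b.length);
}

// Made-up OCR output with one misread word, scored against two candidates.
const ocr = "the mayor opened the new bridqe today";
const scores = {
    article_1: similarity(ocr, "the mayor opened the new bridge today"),
    article_2: similarity(ocr, "crowds gathered as the mayor opened the new library"),
};
console.log(scores);
```

The candidate with the clearly higher score is taken as the match; when the top scores are close together, the decision is ambiguous.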

Using this process, we could match approximately 80 percent of the articles. The remaining 20 percent did not have clear enough distinctions in scores, which required us to be a little more clever. In a perfect world, the relationship between our two buckets of articles would have been one-to-one, but in this world, it was actually many-to-many. Some full text articles were represented as multiple regions in the scanned archive, and some single regions in the scanned archive corresponded to multiple items in the full text archive. We reconciled the disparity by splitting the data into paragraphs and carrying out a similar process to the one described above, on the paragraph level.

We ended up with a near-perfect, many-to-many matching of zones to the full text archive which is wonderfully searchable. You can check it out by exploring the entire Times archive at


Turn the Chromecast Into a Standalone Media Player, No Internet Required

The Chromecast is a pretty awesome media player that pays for itself. If you want to use it without the internet, though, you’re fresh out of luck. This custom ROM can change that.
