Blade Runner re-encoded using neural networks

Last week, Warner Bros. issued a DMCA takedown notice to the video streaming website Vimeo. The notice concerned a pretty standard list of illegally uploaded files from media properties Warner owns the copyright to — including episodes of Friends and Pretty Little Liars, as well as two uploads featuring footage from the Ridley Scott movie Blade Runner.

Just a routine example of copyright infringement, right? Not exactly. Warner Bros. had just made a fascinating mistake. Some of the Blade Runner footage — which Warner has since reinstated — wasn’t actually Blade Runner footage. Or, rather, it was, but not in any form the world had ever seen.

Instead, it was part of a unique machine-learned encoding project, one that had attempted to reconstruct the classic Philip K. Dick android fable from a pile of disassembled data.

Sample reconstruction from the opening scene of Blade Runner.

In other words: Warner had just DMCA’d an artificial reconstruction of a film about artificial intelligence being indistinguishable from humans, because it couldn’t distinguish between the simulation and the real thing.

Deconstructing Blade Runner using artificial intelligence

Terence Broad is a researcher living in London and working on a master’s degree in creative computing. His dissertation, “Autoencoding Video Frames,” sounds straightforwardly boring, until you realize that it’s the key to the weird tangle of remix culture, internet copyright issues, and artificial intelligence that led Warner Bros. to file its takedown notice in the first place.

Broad’s goal was to apply “deep learning” — a fundamental piece of artificial intelligence that uses algorithmic machine learning — to video; he wanted to discover what kinds of creations a rudimentary form of AI might be able to generate when it was “taught” to understand real video data.

As a medium, video contains a huge amount of visual information. When you watch a video on a computer, all that information has usually been encoded/compressed and then decoded/decompressed to allow a computer to read files that would otherwise be too big to store on its hard drive.

Normally, video encoding happens through an automated electronic process using a compression standard developed by humans who decide what the parameters should be — how much data should be compressed into what format, and how to package and reduce different kinds of data like aspect ratio, sound, metadata, and so forth.

Broad wanted to teach an artificial neural network how to achieve this video encoding process on its own, without relying on the human factor. An artificial neural network is a machine-built simulacrum of the functions carried out by the brain and the central nervous system. It’s essentially a mechanical form of artificial intelligence that works to accomplish complex tasks by doing what a regular central nervous system does — using its various parts to gather information and communicate that information to the system as a whole.

Broad hoped that if he was successful, this new way of encoding might become “a new technique in the production of experimental image and video.” But before that could happen, he had to teach the neural network how to watch a movie — not like a person would, but like a machine.

Do encoders dream of electric sheep? (Or, how do you “teach” an AI to watch a film?)

Broad decided to use a type of neural network called a convolutional autoencoder. First, he set up what’s called a “learned similarity metric” to help the encoder identify Blade Runner data. The metric had the encoder read data from selected frames of the film, as well as “false” data, or data that’s not part of the film. By comparing the data from the film to the “outside” data, the encoder “learned” to recognize the similarities among the pieces of data that were actually from Blade Runner. In other words, it now knew what the film “looked” like.

Once it had taught itself to recognize the Blade Runner data, the encoder reduced each frame of the film to a 200-number representation of itself and reconstructed those 200 numbers into a new frame intended to match the original. (Broad chose a small file size, which contributes to the blurriness of the reconstruction in the images and videos I’ve included in this story.) Finally, Broad had the encoder resequence the reconstructed frames to match the order of the original film.
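To make the "squeeze a frame into 200 numbers, then rebuild it" idea concrete, here is a minimal sketch in Rust. It is not Broad's method: his encoder and decoder were learned by a neural network, whereas this toy swaps in a fixed rule (averaging chunks of pixels, then repeating each average). The function names and the 8,000-pixel frame are assumptions made for illustration only.

```rust
/// Compress a frame of grayscale pixels into `code_len` numbers by averaging
/// equal-sized chunks. A learned encoder would replace this fixed rule.
fn encode(frame: &[f32], code_len: usize) -> Vec<f32> {
    assert!(code_len > 0 && frame.len() >= code_len);
    let chunk = frame.len() / code_len;
    frame
        .chunks(chunk)
        .take(code_len)
        .map(|c| c.iter().sum::<f32>() / c.len() as f32)
        .collect()
}

/// Rebuild a frame of `frame_len` pixels by repeating each code value.
fn decode(code: &[f32], frame_len: usize) -> Vec<f32> {
    let chunk = frame_len / code.len();
    (0..frame_len)
        .map(|i| code[(i / chunk).min(code.len() - 1)])
        .collect()
}

/// Mean squared error between the original and the reconstruction.
fn mse(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>() / a.len() as f32
}

fn main() {
    // A hypothetical 8,000-pixel "frame" holding a smooth gradient.
    let frame: Vec<f32> = (0..8000).map(|i| i as f32 / 8000.0).collect();
    let code = encode(&frame, 200);
    let rebuilt = decode(&code, frame.len());
    println!("code length: {}, error: {:.6}", code.len(), mse(&frame, &rebuilt));
}
```

The reconstruction is the same size as the input but carries only what survived the 200-number bottleneck, which is one reason the rebuilt frames look blurry.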

In addition to Blade Runner, Broad also “taught” his autoencoder to “watch” the rotoscope-animated film A Scanner Darkly. Both films are adaptations of famed Philip K. Dick sci-fi novels, and Broad felt they would be especially fitting for the project (more on that below).

Broad repeated the “learning” process a total of six times for both films, each time tweaking the algorithm he used to help the machine get smarter about deciding how to read the assembled data. Here’s what selected frames from Blade Runner looked like to the encoder after the sixth training. Below we see two columns of before/after shots. On the left is the original frame; on the right is the encoder’s interpretation of the frame:

Autoencoding Video Frames
Real and generated samples from the first half of Blade Runner in steps of 4,000 frames, alternating real and constructed images.

During the six training rounds, Broad only used select frames from the two films. Once he finished the sixth round of training and fine-tuning, Broad instructed the neural network to reconstruct the entirety of both films, based on what it had “learned.” Here’s a glimpse at how A Scanner Darkly turned out:

Broad told Vox in an email that the neural network’s version of the film was entirely unique, created based on what it “sees” in the original footage. “In essence, you are seeing the film through the neural network. So [the reconstruction] is the system’s interpretation of the film (and the other films I put through the models), based on its limited representational ‘understanding.’”

Why Philip K. Dick’s work is perfect for this project

Dick was a legendary science fiction writer whose work frequently combined a focus on social issues with explorations in metaphysics and the reality of our universe. The many screen adaptations his works have inspired include Minority Report, Total Recall, The Adjustment Bureau, and the Amazon TV series The Man in the High Castle.

And then there’s his famous novel Do Androids Dream of Electric Sheep?, which formed the basis of Blade Runner, a dystopian sci-fi masterpiece and one of the greatest films ever made. In the film, Harrison Ford’s character Rick Deckard has a job that involves hunting down and killing “replicants” — an advanced group of androids that pass for humans in nearly every way. The film’s antagonist, Roy Batty, is one of these replicants, famously played by a world-weary Rutger Hauer. Batty struggles with his humanity while fighting to extend his life and defeat Deckard before Deckard “retires him.”

Dick was deeply concerned with the gap between the “only apparently real” and the “really real.” In his dissertation, Broad said that he felt using two of Dick’s works for his simulation project was only fitting:

[T]here could not be a more apt film to explore these themes [of subjective rationality] with than Blade Runner (1982)… which was one of the first novels to explore the themes of artificial subjectivity, and which repeatedly depicts eyes, photographs and other symbols alluding to perception.

The other film chosen to model for this project is A Scanner Darkly (2006), another adaption of a Philip K. Dick novel (2011 [1977]). This story also explores themes of the nature of reality, and is particularly interesting for being reconstructed with a neural network as every frame of the film has already been reconstructed (hand traced over the original film) by an animator.

In other words, using Blade Runner had a deeply symbolic meaning relative to a project involving artificial recreation. “I felt like the first ever film remade by a neural network had to be Blade Runner,” Broad told Vox.

A copyright conundrum

These complexities and nuances of sci-fi culture and artificial learning were quite possibly lost on whoever decided to file the takedown claim for Warner Bros. Perhaps that’s why, after Vox contacted Warner Bros., the company conducted an investigation and reinstated the two videos it had initially taken down.

Still, Broad noted to Vox that the way he used Blade Runner in his AI research doesn’t exactly constitute a cut-and-dried legal case: “No one has ever made a video like this before, so I guess there is no precedent for this and no legal definition of whether these reconstructed videos are an infringement of copyright.”

But whether or not his videos continue to rise above copyright claims, Broad’s experiments won’t just stop with Blade Runner. On Medium, where he detailed the project, he wrote that he “was astonished at how well the model performed as soon as I started training it on Blade Runner,” and that he would “certainly be doing more experiments training these models on more films in future to see what they produce.”

The potential for machines to accurately and easily “read” and recreate video footage opens up exciting possibilities both for artificial intelligence and video creation. Obviously there’s still a long way to go before Broad’s neural network generates earth-shattering video technology, but we can safely say already — we’ve seen things you people wouldn’t believe.

Original URL:

Original article

Microsoft’s Official Guide for a DIY, Raspberry Pi-Powered Magic Mirror with Face Detection

Smart mirrors have been all the rage this year, and it looks like Microsoft’s getting into the game too. While Microsoft’s mirror is teased as a commercial product, they’ve released the source code if you’re interested in making one for yourself.

Carte Blanche – isolated development space with integrated fuzz testing

Carte Blanche is an isolated development space with integrated fuzz testing for your components. See them individually, explore them in different states and quickly and confidently develop them.

Setting up Carte Blanche is an easy two-step process:

  1. Install the plugin with npm install --save-dev carte-blanche

  2. Add it to the plugins in your development webpack configuration, specifying a relative path to the folder with your components in the componentRoot option:

    var CarteBlanche = require('carte-blanche');
    /* … */
    plugins: [
      new CarteBlanche({
        componentRoot: './src/components'
      })
    ],

That’s it, now start your development environment and go to /carte-blanche to see your Carte Blanche!


You can specify some options for the webpack plugin:

  • componentRoot (required): Folder where your component modules are.

      plugins: [
        new CarteBlanche({
          componentRoot: 'src/components'
        })
      ]

  • dest (default: 'carte-blanche'): Change the location of your Carte Blanche. Needs to be a path.

      plugins: [
        new CarteBlanche({
          componentRoot: 'src/components',
          dest: 'components'
        })
      ]

  • plugins (default: ReactPlugin): An array of plugins to use in your Carte Blanche. (Want to write your own? See for more information!)

      var ReactPlugin = require('carte-blanche-react-plugin');
      var SourcePlugin = require('carte-blanche-source-plugin');
      plugins: [
        new CarteBlanche({
          componentRoot: 'src/components',
          plugins: [
            new SourcePlugin({ /* …options for the plugin here… */ }),
            new ReactPlugin()
          ]
        })
      ]

  • filter (default: matches uppercase files and uppercase folders with an index file): Regex that matches your components in the componentRoot folder. We do not recommend changing this, as it might have unintended side effects.

      plugins: [
        new CarteBlanche({
          filter: /.*\.jsx$/ // Matches all files ending in .jsx
        })
      ]

This project has a custom plugin system to make it as extensible as possible. By default, we include the ReactPlugin, which has options of its own. (To pass these in, you’ll have to specify it explicitly with the plugins option.)

ReactPlugin Options

  • variationFolderName (default: variations): The name of the folder that stores the variation files.

    new ReactPlugin({
      variationFolderName: 'examples'
    })

  • port (default: 8082): The port the variations server runs at.

    new ReactPlugin({
      port: 7000
    })

  • hostname (default: localhost): The URL the variations server runs at.

    new ReactPlugin({
      hostname: ''
    })


This is a list of endorsed plugins that are useable right now:

Want to write your own plugin? Check out!


Copyright (c) 2016 Nikolaus Graf and Maximilian Stoiber, licensed under the MIT License.

Varnish Website’s IT Infrastructure

One of the major reasons for the website upgrade the Varnish Project
has been going through this month was to eat more of our own
dogfood.

The principle of eating your own dogfood is important for software
quality: that is how you experience what your users are dealing with,
and I am not the least ashamed to admit that several obvious improvements
have already appeared on my TODO list as a result of this transition.

But it is also important to externalize what you learn doing so, and
therefore I thought I would document here how the project’s new “internal
IT” works.


Who cares?

Yes, we use some kind of hardware, but to be honest I don’t know what
it is.

Our primary site runs on a RootBSD ‘Omega’
virtual server somewhere near CDG/Paris.

And as backup/integration/testing server we can use any server,
virtual or physical, as long as it has an internet connection and
contemporary performance, because the entire install is scripted
and under version control (more below).

Operating System

So, dogfood: Obviously FreeBSD.

Apart from the obvious reason that I wrote a lot of FreeBSD and
can get world-class support by bugging my buddies about it, there
are two equally serious reasons for the Varnish Project to run on
FreeBSD: Dogfood and jails.

Varnish Cache is not “software for Linux”, it is software for any
competent UNIX-like operating system, and FreeBSD is our primary
“keep us honest about this” platform.


You have probably heard about Docker and Containers, but FreeBSD
has had jails
since I wrote them in 1998
and they’re a wonderful way to keep your server installation

We currently have three jails:

Script & Version Control All The Things

We have a git repository with shell scripts which create these jails
from scratch and also a script to configure the host machine.

That means that the procedure to install a clone of the server
is, unabridged:

# Install FreeBSD 10.3 (if not already done by hosting)
# Configure networking (if not already done by hosting)
# Set the clock
service ntpdate forcestart
# Get git
env ASSUME_ALWAYS_YES=yes pkg install git
# Clone the private git repo
git clone ssh://
# Edit the machine's IP numbers in /etc/pf.conf
# Configure the host
sh |& tee
# Build the jails
foreach i (Tools Hitch Varnish)
        (cd $i ; sh build* |& tee

From bare hardware to ready system in 15-30 minutes.

It goes without saying that this git repository contains stuff
like ssh host keys, so it should not go on github.


Right now there is nothing we need to back up.

When I move the mailserver/mailman/mailing lists over, those will
need to be backed up, but here the trick is to only back up the
minimal set of files, and in an “exchange” format, so that future
migrations and upgrades can slurp them in right away.

The Homepage

The new homepage is built with Sphinx
and lives in its own
github project (Pull requests
are very welcome!)

We have taken snapshots of some of the old webproperties, Trac, the
Forum etc as static HTML copies.

Why on Earth…

It is a little bit tedious to get a setup like this going: whenever
you tweak some config file, you need to remember to pull the change
back out and put it in your Admin repository.

But that extra effort pays off so many times later.

You never have to wonder “who made that change and why” or even try
to remember what changes were needed in the first place.

For us as a project, it means that all our sysadmin people
can build a clone of our infrastructure, if they have a copy of
our “Admin” git repository and access to github.

And when FreeBSD 11
comes out, or a new version of sphinx or something else, mucking
about with things until they work can be done at leisure without
guesswork.

For instance I just added the forum snapshot, by working out all
the kinks on one of my test-machines.

Once it was as I wanted it, I pushed the changes to the live machine and then:

varnishadm vcl.use backup
# The 'backup' VCL does a "pass" of all traffic to my server
cd Admin
git pull
cd Tools
sh |& tee
varnishadm vcl.load foobar varnish-live.vcl
varnishadm vcl.use foobar

For a few minutes our website was a bit slower (because of the
extra Paris-Denmark hop), but there was never any interruption.

And by doing it this way, I know it will work next time also.

2016-04-25 /phk

All that buzz about “reproducible builds” ? Yeah, not a new idea.

GCC 5.4

  • From: Richard Biener
  • To: gcc-announce at gcc dot gnu dot org, gcc at gcc dot gnu dot org, info-gnu at gnu dot org
  • Date: Fri, 3 Jun 2016 16:02:37 +0200 (CEST)
  • Subject: GCC 5.4 Released
  • Reply-to: gcc at gcc dot gnu dot org

The GNU Compiler Collection version 5.4 has been released.

GCC 5.4 is a bug-fix release from the GCC 5 branch
containing important fixes for regressions and serious bugs in
GCC 5.3 with more than 147 bugs fixed since the previous release.
This release is available from the FTP servers listed at:

Please do not contact me directly regarding questions or comments
about this release.  Instead, use the resources available from

As always, a vast number of people contributed to this GCC release
-- far too many to thank them individually!

Hoofbeatz Audio announces ‘i Rock N Ride’ horseback riding Bluetooth speaker

When I think of horseback riding, my mind drifts to simpler times. It also conjures thoughts of cowboys, farming, and the Amish. Since the invention of motorized vehicles, equestrian travel just seems a bit old fashioned. With all of that said, there is an apparent need to bring technology to horseback riding. How, you ask? With the Hoofbeatz Audio ‘i Rock N Ride’ Bluetooth speaker, currently on Kickstarter. You can now listen to music and answer telephone calls from the convenience of your saddle! “Riders can send or receive phone calls, ride to music and use Siri and Android voice prompts, all…

How to Listen to and Delete Everything You’ve Ever Said to Google

Here’s a fun fact: Every time you do a voice search, Google records it. And if you’re an Android user, every time you say “Ok Google,” the company records that, too. Don’t freak out, though, because Google lets you hear (and delete) these recordings. Here’s how.

Packet Capturing MySQL with Rust


By: Paul LaCrosse

Recently, AgilData launched the Gibbs MySQL Scalability Advisor, a free self-service tool that allows users to capture a live stream of queries to be uploaded to Gibbs and analyzed by AgilData’s experts.  Spyglass is the database traffic capture tool for Gibbs. Built using the Rust programming language, it provides exceptional performance for profiling your MySQL database. Running Spyglass on your application or database server allows you to pick up all kinds of information about your queries, performance and database health.

Spyglass watches interactions between your MySQL servers and client applications unobtrusively, without any changes to either. By logging statistics from live interactions between clients and MySQL instances, Spyglass supplies the data for later analysis that charts query performance, database schema, and table indexes, and returns a report of optimization opportunities.

User information is protected by replacing actual values, such as dates, numbers, and text, with the ‘?’ question-mark character.  It’s cool stuff, especially for a free service.

Spyglass is Open Source

Spyglass has inherent security concerns associated with packet capturing of production data.  In order to demonstrate confidence that nothing suspicious or malicious is under the hood we felt it was important to allow everyone access to the source. Spyglass is open-source software licensed under the GPL3 license.  The source code, as well as some pre-built binaries can be found on AgilData’s GitHub account.

To run Spyglass, you need extra permissions above that of a normal user in order to capture network traffic at the data-link layer, below IP, and without having to alter or interfere with the regular data flow between the client app and database servers.  We recommend running it using “sudo.”

As a Spyglass user, you have the opportunity to examine the file and verify if you want to send the contents to Gibbs for full analysis, prior to it being uploaded. You can use the Spyglass command-line utility on 64-bit Linux, OS X and Windows operating systems after cloning from GitHub and building it.  Or use the pre-built musl-based Linux image, which has no shared-library requirements, on any current 64-bit distro, from the GitHub repo releases page for Spyglass.

What it “Must Have”

Gibbs required the ability to run on production systems and sit on an app server without disruption to production processes.  If it altered anything, caused a noticeable slowdown, or required a tedious setup process, nobody would use it.  Our “absolute must-haves” list included:

• No reconfiguration of the client or app server
• No reconfiguration of the MySQL database server
• A small, single executable of the pre-built binary
• Run on any modern 64-bit Linux distribution
• Production friendly – Minimal use of the system resources
• Does not require any restarts of the app or MySQL servers
• User controlled “start and stop” at any time
• Easy User Interface on the command line, allowing it to run on headless servers
• User control to examine the data file captured prior to transmitting
• And most importantly have the performance necessary to keep up in real-time

We want to provide you a tool built with the memory and type safety commonly found in garbage-collected languages, but also with the truly high performance that they lack.


We needed a reliable and trustworthy systems language to produce a high-performance executable that could also examine network packets at a level below the typical TCP layer. While this could be done with the venerable C language, Rust provides all of the same low-level control but with the memory safety you would typically get from a managed, garbage-collected language.  For us, the key was the combination of low-level performance, direct compatibility with other native C code, and modern, higher-level functional constructs backed by “zero-cost” abstractions.

Teams pushing the boundaries for today’s newest infrastructure have correctly characterized the language as “cutting edge, highly sophisticated hybrid functional language … which is uniquely high-performance, low-footprint and reliable.”  The Rust team themselves state “Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.”

Programmers familiar with Scala, Haskell, OCaml, and other similar languages will find much to like here.  Rust provides “zero-cost abstractions” allowing manipulation of collections using map, filter, folds, and other familiar paradigms; the compiler does most of the work, resulting in high-performance executables.  For example, many languages implement Option using a wrapper object.  In Rust, that will compile down to be as efficient as if the programmer manually checked for “null values” everywhere, without having to write that code.  Rust allows idiomatic functional code, without the run-time penalty. (As an added bonus, it does away with the dreaded null).
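As a small illustration of the points above, the following sketch chains filter, map, and fold over a slice of Option values and then checks that wrapping a reference in Option costs no extra memory. The function name and the table names are made up for the example; only standard-library calls are used.

```rust
use std::mem::size_of;

/// Sum the lengths of the non-empty names, skipping missing entries.
fn total_name_len(names: &[Option<&str>]) -> usize {
    names
        .iter()
        .filter_map(|n| *n) // drop the `None` entries, no null checks needed
        .filter(|n| !n.is_empty())
        .map(|n| n.len())
        .fold(0, |acc, len| acc + len)
}

fn main() {
    let names = [Some("orders"), None, Some(""), Some("users")];
    println!("total = {}", total_name_len(&names));

    // An Option around a reference needs no wrapper object:
    // `None` is represented by the otherwise impossible null pointer value.
    println!(
        "Option<&u32>: {} bytes, &u32: {} bytes",
        size_of::<Option<&u32>>(),
        size_of::<&u32>()
    );
}
```

The iterator chain compiles to a plain loop, and the size check shows one concrete case where the abstraction really is free.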

Is Rust > Java?

Many people will point out that the Java Virtual Machine is an awesome piece of engineering.  We agree; its twenty-plus year history has allowed it to evolve into something used effectively by millions of programmers. But has anyone else noticed it has reached its limits?  While Scala came on the scene over ten years ago, bringing functional programming to Java and vastly improving the experience, it is still saddled with the JVM as its run-time.  For example, in the Big Data space, JVM-based products are regularly using the “unsafe” interface within the JVM to directly invoke native code.  Using things such as memory-mapped files and custom collections written in ‘C’, numerous database vendors are pushing the boundary of performance you can get from the JVM.

The issue is, once you’ve done this, you’re really no longer completely within the Java ecosystem.  If you’re going to write and link with native code, it is far easier to do it in languages like Go or Rust. And, unlike using C, programmers still benefit from using modern functional programming features such as map, filter, and fold over collections.  And use native code you must, if you want to stray beyond the Java standard library in the realm of networking.

Rust’s protection features actually make it safer than the JVM with “unsafe calls” via JNI.  It’s even shielded better, in my opinion, than straight, plain Java in regards to memory leaks.  Borrow-checking by the Rust compiler generally assures not only that no “use after free” memory access occurs, but also avoids the programmer forgetting to free.  Long-time C developers know keeping track of your mallocs and frees is a tedious, but critical task.  Java promised, and delivered, freedom from much of this.  It was a key enabler for millions of developers to create reliable software, without the strict memory accounting.  It could not do it all, as memory leaks, and how to find and fix them, became an issue for JVM software.  While the segfaults became almost extinct, the garbage collection costs soared, and JVM profilers joined the toolbox to help count objects.  While not having to explicitly drop class instances and their associated memory saved programmers plenty of time coding and debugging, the opposite problem of holding onto too much replaced it.  And the cost of garbage-collection, including the large variance in when it occurs, and how long it takes, limited the practical heap size for production Java apps on standard JVMs if you have a maximum response time you’re attempting to adhere to (and who doesn’t?).

Borrow checking in Rust is a compile-time, instead of run-time operation.  There is no equivalent to the Java Virtual Machine, or any other large runtime requirement, thereby enabling small self-contained executables, as you would produce with straight C.  Unlike C, you still get safe memory access, without having to explicitly handle it yourself in most instances, just as you do with Java.  And the compiler tends to aggressively, and automatically, generate the code to drop your constructs and free their memory, resulting in a much smaller set of situations where you may accidentally hold onto things longer than actually needed, which does tend to occur in the Java ecosystem more often than we would hope.
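The deterministic cleanup described above can be observed with a short sketch. The Capture type below is a hypothetical stand-in for any resource (a buffer, a socket, a capture file); its Drop implementation records the moment the compiler-generated cleanup runs, so the order of events can be inspected afterwards.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A stand-in for any resource that must be released.
struct Capture {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Capture {
    // Called automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

/// Returns the events in the order they happened.
fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Capture { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Capture { name: "inner", log: Rc::clone(&log) };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_inner` is dropped here, with no explicit free
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_outer` is dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```

Each value is released at the closing brace of its scope rather than at some later collection pause, and using either value after that point would be rejected at compile time.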

“Rust”proofing Spyglass – How It’s Made

Spyglass had to just work without any reconfiguration of the systems being studied using  packet-capturing.  For an unobtrusive means to silently gather payloads from the data-link layer, the excellent pnet crate was used (a crate is the Rust equivalent of a library or package in other languages).  Rust’s community repository of crates already offers over 5,000 libraries (as of this writing), and has topped over 42 million crate downloads.  With a large following of performance-oriented systems developers contributing new crates daily, it usually isn’t difficult to locate what you need.

The bulk of the new work was in implementing code to handle the MySQL wire protocol, without actually receiving the TCP-layer stream.  Since TCP handles missing and out-of-order packets, the programmer is normally allowed to ignore these details.  But merely listening in on the data-link layer does not.  While ignoring that for expediency in early iterations of Spyglass produced acceptable results in a simple and lightly loaded test harness, it wasn’t adequate where real systems were involved.  Implementation of a minimalist Finite State Machine was necessary to monitor out-of-order and missing packets for each individual MySQL connection, and sync-up with the stream if too much divergence occurs.
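The following is a minimal sketch of what such a per-connection state machine might look like. It is not the Spyglass implementation: the type names, the three states, and the max_pending threshold are assumptions, and sequence-number wraparound is ignored in the comparisons to keep the example short.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum State {
    InSync,   // everything seen so far has been delivered in order
    Waiting,  // later segments are buffered while an earlier one is missing
    Resynced, // gave up on the missing data and jumped ahead
}

struct Stream {
    next_seq: u32,                    // sequence number expected next
    pending: BTreeMap<u32, Vec<u8>>,  // out-of-order segments by sequence number
    max_pending: usize,               // how much divergence is tolerated
    out: Vec<u8>,                     // reassembled byte stream
}

impl Stream {
    fn new(first_seq: u32, max_pending: usize) -> Self {
        Stream { next_seq: first_seq, pending: BTreeMap::new(), max_pending, out: Vec::new() }
    }

    fn deliver(&mut self, payload: &[u8]) {
        self.out.extend_from_slice(payload);
        self.next_seq = self.next_seq.wrapping_add(payload.len() as u32);
    }

    /// Deliver every buffered segment that has become contiguous.
    fn drain_ready(&mut self) {
        while let Some(p) = self.pending.remove(&self.next_seq) {
            self.deliver(&p);
        }
    }

    fn state(&self) -> State {
        if self.pending.is_empty() { State::InSync } else { State::Waiting }
    }

    fn on_segment(&mut self, seq: u32, payload: &[u8]) -> State {
        if seq == self.next_seq {
            self.deliver(payload);
            self.drain_ready();
        } else if seq > self.next_seq {
            self.pending.insert(seq, payload.to_vec());
            if self.pending.len() > self.max_pending {
                // Too much divergence: skip the gap and continue from the
                // earliest segment we actually captured.
                self.next_seq = *self.pending.keys().next().unwrap();
                self.drain_ready();
                return State::Resynced;
            }
        } // seq < next_seq: a retransmission of data already delivered, ignored
        self.state()
    }
}

fn main() {
    let mut s = Stream::new(100, 2);
    println!("{:?}", s.on_segment(100, b"ab"));
    println!("{:?}", s.on_segment(104, b"ef")); // arrives early, buffered
    println!("{:?}", s.on_segment(102, b"cd")); // fills the gap
    println!("{}", String::from_utf8_lossy(&s.out));
}
```

A real capture tool would keep one such tracker per client/server connection and feed the reassembled bytes to the protocol decoder.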

Gibbs needs the schema for tables involved in the queries being captured, and waiting around for someone to “show tables” and “show columns” wouldn’t cut it.  Luckily, a MySQL client crate already existed, providing not only a place to corroborate our interpretation of the MySQL wire protocol, but also easy client code for the portion of Spyglass which doesn’t rely upon data-link snooping.  So we just used this client to query those items directly, and acquire the results in the usual manner.  Look how short the code is to gather the create table information from a Vector of Strings (table names), including formatting and timestamping the results:

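    let timespec = time::get_time();
    // Milliseconds since the Unix epoch, used to timestamp every DDL record.
    let millis = timespec.sec * 1000 + timespec.nsec as i64 / 1000 / 1000;
    // `tables` is an assumed name for the Vector of table names; the excerpt omits the loop header.
    for t in tables {
        pool.prep_exec(format!("show create table {}", t), ())
            .map(|res| { res
                .map(|x| x.unwrap())
                .fold((), |_, row| {
                    let (_, c,): (String, String) = mysql::from_row(row);
                    let msg =
                        format!("--GIBBS\tTYPE: DDL\tTIMESTAMP: {}\tSCHEMA: {}\tSQL:\n{};\n",
                                millis, db, c);
                    write_cap(&mut cap, &msg);  // write to capture file
                    printfl!(".");  // use custom macro which flushes to console
                })
            })
            .unwrap();  // assumption: the excerpt does not show how errors are handled
    }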
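For readers who want to try the timestamping and record formatting without the time and mysql crates, here is a standard-library-only sketch. The helper names now_millis and ddl_record are hypothetical; the record layout follows the format string in the excerpt above.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Milliseconds since the Unix epoch, computed with std only.
fn now_millis() -> i64 {
    let since_epoch = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before 1970");
    since_epoch.as_secs() as i64 * 1000 + i64::from(since_epoch.subsec_nanos()) / 1_000_000
}

/// Format one tab-separated DDL record for the capture file.
fn ddl_record(millis: i64, schema: &str, create_sql: &str) -> String {
    format!(
        "--GIBBS\tTYPE: DDL\tTIMESTAMP: {}\tSCHEMA: {}\tSQL:\n{};\n",
        millis, schema, create_sql
    )
}

fn main() {
    // The schema name and statement are placeholder values.
    print!("{}", ddl_record(now_millis(), "shop", "CREATE TABLE t (id INT)"));
}
```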

Nobody would approve of uploading their actual captured data values over the Internet, and they are not necessary for the Gibbs performance analysis anyway.  So our next step was to find a way to handle it.  While it might be tempting to include a full SQL parser, or other means of tokenizing and classifying each substring in the capture, a far simpler method presented itself.  How about the venerable regular expressions (regex)?  Would that work?  Back to the crate repository to see if a library for regex exists; sure enough it does.  Isn’t the use of regular expressions something that programmers seeking “type safety” frown upon, since, like a string containing SQL, you won’t find out about your typos until you’re actually executing the program?  Enter regex macros!  While it is presently slower, and requires a Rust nightly, it has the very appealing property that if your regex is not a correct expression, your program won’t compile!  Instead of having to find some interactive regex testing UI, regular expressions which are improper will be flagged by your regular compiler, allowing you to continue your workflow with your standard code editor and build tools.  What we essentially end up with is a custom compiled set of code geared explicitly for our use case, redacting literal strings, ranges, numbers, and booleans without destroying the SQL statement surrounding them.  And the amount of source code needed to accomplish this is astonishingly small:

    let redact = regex!(r#"(?x)( (?P<p>[\s=\(\+\-/\*]) (
            '[^'\\]*((\\.|'')[^'\\]*)*' |
            "[^"\\]*((\\.|"")[^"\\]*)*" |
            [\.][\d]+ | [\d][\.\d]*
        ) )"#);
    let cr = redact.replace_all(&x, "$p?");

Where x is the String containing the captured statement, and the “$p?” exchanges all the captured literals with a question mark, a single call against the compiled regex suffices.  You may also notice some of the other Rust features used. Multi-line raw string literals eliminated the string escaping, thereby preventing the obfuscation of the regex escaping.  Nobody likes dealing with multiple layers of escapes within a single string.  Especially with regex, already a syntax which looks like someone scrambled random punctuation about like a kid playing with a box of Alpha-Bits cereal spilled on the floor.
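To experiment with the redaction idea without the nightly-only regex macros, the sketch below does a similar job with a hand-written scanner that uses only the standard library. It is a swapped-in technique, not the Spyglass code: the function name redact is an assumption, it handles quoted strings and plain numbers only (not booleans), and it decides whether a digit starts a literal by checking that the previous character is not part of an identifier.

```rust
/// Replace string and numeric literals in a SQL statement with `?`.
fn redact(sql: &str) -> String {
    let chars: Vec<char> = sql.chars().collect();
    let mut out = String::with_capacity(sql.len());
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        // Digits that follow a letter, digit, or underscore belong to a name like `col2`.
        let prev_is_ident = i > 0 && (chars[i - 1].is_alphanumeric() || chars[i - 1] == '_');
        let next_is_digit = i + 1 < chars.len() && chars[i + 1].is_ascii_digit();
        if c == '\'' || c == '"' {
            // Skip to the closing quote, honoring backslash escapes and doubled quotes.
            i += 1;
            while i < chars.len() {
                if chars[i] == '\\' {
                    i += 2; // escaped character stays inside the literal
                } else if chars[i] == c {
                    if i + 1 < chars.len() && chars[i + 1] == c {
                        i += 2; // doubled quote stays inside the literal
                    } else {
                        i += 1; // closing quote
                        break;
                    }
                } else {
                    i += 1;
                }
            }
            out.push('?');
        } else if !prev_is_ident && (c.is_ascii_digit() || (c == '.' && next_is_digit)) {
            // Consume the whole number, including any decimal point.
            while i < chars.len() && (chars[i].is_ascii_digit() || chars[i] == '.') {
                i += 1;
            }
            out.push('?');
        } else {
            out.push(c);
            i += 1;
        }
    }
    out
}

fn main() {
    let captured = "SELECT * FROM users WHERE name = 'O''Brien' AND age > 42";
    println!("{}", redact(captured));
}
```

The statement keeps its shape, which is what a performance analysis needs, while the values themselves never leave the machine.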

Now that we have our text in a condition suitable for transmission back to AgilData, we have to deal with all that being an http client entails.  Combining hyper and multipart crates provides convenient APIs for this task.  While the resulting code isn’t quite as terse as the regex Rust macro example above, be sure to check out Spyglass’ source file to see how posting a multipart form was used to upload the data capture file.

Incorporating these crates, along with the Rust standard library and language features, yielded the results we were looking for.  In far less time than what we estimated we would need if using C alone, even with the wealth of decades-worth of C libraries at our disposal, Spyglass was born.

What else can Spyglass see? It’s up to you.

We are open to suggestions.  For the systems programmer, Spyglass code provides a good starting base for any form of case-specific protocol traffic capture at a level below TCP/IP.  The Finite State Machine (FSM) may be modified to match the specifics of your situation and will definitely give you a head-start of a few days to a few weeks, especially if you are new to the Rust ecosystem.  Please let us know how you’ve adapted it, or even better, send some pull requests our way with improvements!  And if you’re ever in Michigan, stop by a Rust Detroit Meetup or follow us on Twitter.


Learn more about using the free Gibbs Scalability Advisor with Spyglass to scale your MySQL.

Spyglass repository on Github.

Read about our new product, AgilData Scalable Cluster for MySQL, available today.

Understand AgilData Scalable Cluster for MySQL pricing.

Download the AgilData Scalable Cluster for MySQL Whitepaper for more information.
