Windows Subsystem for Linux Architectural Overview

We recently announced Bash on Ubuntu on Windows, which enables native Linux ELF64 binaries to run on Windows via the Windows Subsystem for Linux (WSL). This subsystem was created by the Microsoft Windows Kernel team and has generated a lot of excitement. One of the most frequent questions we get asked is how this approach differs from a traditional virtual machine. In this first of a series of blog posts we will provide an overview of WSL that answers that and other questions. In future posts we will dive deep into the component areas introduced here.

History of Windows Subsystems

Since its inception, Microsoft Windows NT was designed to allow environment subsystems like Win32 to present a programmatic interface to applications without being tied to implementation details inside the kernel. This allowed the NT kernel to support POSIX, OS/2 and Win32 subsystems at its initial release.

Early subsystems were implemented as user mode modules that issued appropriate NT system calls based on the API they presented to applications for that subsystem. Each application consisted of a PE/COFF executable, a set of libraries and services to implement the subsystem API, and NTDLL to perform the NT system calls. When a user mode application was launched, the loader invoked the right subsystem to satisfy the application dependencies based on the executable header.

Later versions of subsystems replaced the POSIX layer to provide the Subsystem for Unix-based Applications (SUA). This was composed of user mode components to satisfy:

  1. Process and signal management
  2. Terminal management
  3. System service requests and inter process communication

The primary role of SUA was to encourage applications to get ported to Windows without significant rewrites. This was achieved by implementing the POSIX user mode APIs using NT constructs. Given that these components were constructed in user mode, it was difficult to have semantic and performance parity for kernel mode system calls like fork(). Because this model required programs to be recompiled, it demanded ongoing feature porting and was a maintenance burden.

Over time these initial subsystems were retired.

Since the Windows NT Kernel was architected to allow new subsystem environments, we were able to use the initial investments made in this area and broaden them to develop the Windows Subsystem for Linux.

Windows Subsystem for Linux

WSL is a collection of components that enables native Linux ELF64 binaries to run on Windows. It contains both user mode and kernel mode components. It primarily comprises:

  1. User mode session manager service that handles the Linux instance life cycle
  2. Pico provider drivers (lxss.sys, lxcore.sys) that emulate a Linux kernel by translating Linux syscalls
  3. Pico processes that host the unmodified user mode Linux (e.g. /bin/bash)

It is the space between the user mode Linux binaries and the Windows kernel components where the magic happens. By placing unmodified Linux binaries in Pico processes we enable Linux system calls to be directed into the Windows kernel. The lxss.sys and lxcore.sys drivers translate the Linux system calls into NT APIs and emulate the Linux kernel.

LXSS diagram

Figure 1: WSL Components

LXSS Manager Service

The LXSS Manager Service is a broker to the Linux subsystem driver and is the way Bash.exe invokes Linux binaries. The service is also used for synchronization around install and uninstall, allowing only one process to do those operations at a time and blocking Linux binaries from being launched while the operation is pending.

All Linux processes launched by a particular user go into a Linux instance. That instance is a data structure that keeps track of all LX processes, threads, and runtime state. The first time an NT process requests launching a Linux binary an instance is created.

Once the last NT client closes, the Linux instance is terminated. This includes any processes that were launched inside of the instance including daemons (e.g. the git credential cache).
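The instance lifetime described above behaves like reference counting on NT clients. The following is a minimal sketch of that idea only; the class and method names (InstanceManager, attachClient, detachClient) are hypothetical and do not correspond to the real LXSS Manager Service interface.

```javascript
// Hypothetical model of per-user Linux instance lifetime; not the real LXSS Manager API.
class InstanceManager {
  constructor() {
    this.instances = new Map(); // user name -> { clients: Set, processes: Array }
  }

  // Called when an NT process asks to launch a Linux binary.
  attachClient(user, clientId, binary) {
    let instance = this.instances.get(user);
    if (!instance) {
      // The first request from this user creates the instance.
      instance = { clients: new Set(), processes: [] };
      this.instances.set(user, instance);
    }
    instance.clients.add(clientId);
    instance.processes.push(binary);
    return instance;
  }

  // Called when an NT client closes.
  detachClient(user, clientId) {
    const instance = this.instances.get(user);
    if (!instance) return;
    instance.clients.delete(clientId);
    if (instance.clients.size === 0) {
      // Last client gone: everything in the instance ends, daemons included.
      instance.processes.length = 0;
      this.instances.delete(user);
    }
  }
}

const manager = new InstanceManager();
manager.attachClient('alice', 1, '/bin/bash');
manager.attachClient('alice', 2, '/usr/bin/git');
manager.detachClient('alice', 1);
console.log(manager.instances.has('alice')); // true: client 2 is still open
manager.detachClient('alice', 2);
console.log(manager.instances.has('alice')); // false: the instance was terminated
```

This is why a background daemon started from Bash does not outlive the last open Bash window in this model.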

Pico Process

As part of Project Drawbridge, the Windows kernel introduced the concept of Pico processes and Pico drivers. Pico processes are OS processes without the trappings of OS services associated with subsystems, such as a Win32 Process Environment Block (PEB). Furthermore, for a Pico process, system calls and user mode exceptions are dispatched to a paired driver.

Pico processes and drivers provide the foundation for the Windows Subsystem for Linux.  The subsystem is able to run native unmodified Linux code by loading a binary executable into the process’s address space and emulating the underlying Linux kernel.

System Calls

WSL executes unmodified Linux ELF64 binaries by virtualizing a Linux kernel interface on top of the Windows NT kernel. One of the kernel interfaces that it exposes is the system call (syscall) interface. A syscall is a service provided by the kernel that can be called from user mode. Both the Linux kernel and the Windows NT kernel expose several hundred syscalls to user mode, but they have different semantics and are generally not directly compatible. For example, the Linux kernel includes things like fork, open, and kill while the Windows NT kernel has the comparable NtCreateProcess, NtOpenFile, and NtTerminateProcess.

The Windows Subsystem for Linux includes kernel mode drivers (lxss.sys and lxcore.sys) that are responsible for handling Linux system call requests in coordination with the Windows NT kernel. The drivers do not contain code from the Linux kernel but are instead a clean room implementation of Linux-compatible kernel interfaces. On native Linux, when a syscall is made from a user mode executable it is handled by the Linux kernel. On WSL, when a syscall is made from the same executable the Windows NT kernel forwards the request to lxcore.sys.  Where possible, lxcore.sys translates the Linux syscall to the equivalent Windows NT call which in turn does the heavy lifting.  Where there is no reasonable mapping the Windows kernel mode driver must service the request directly.

As an example, the Linux fork() syscall has no direct equivalent call documented for Windows. When a fork system call is made to the Windows Subsystem for Linux, lxcore.sys does some of the initial work to prepare for copying the process. It then calls internal Windows NT kernel APIs to create the process with the correct semantics, and completes copying additional data for the new process.
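The split between translated syscalls and syscalls serviced directly can be pictured as a dispatch table. This is a toy model under loud assumptions: the real drivers are kernel mode code, and every name here other than the syscall and NT call names quoted in the text (ntKernel, syscallTable, dispatchSyscall) is hypothetical.

```javascript
// Toy model of syscall dispatch in the spirit of lxcore.sys; not actual driver code.
const ntKernel = {
  // Stand-ins for the NT services named in the text.
  NtOpenFile: (path) => ({ status: 0, handle: `handle:${path}` }),
  NtTerminateProcess: (pid) => ({ status: 0, terminated: pid }),
};

const syscallTable = new Map([
  // Syscalls with a reasonable NT counterpart are translated and forwarded.
  ['open', (path) => ntKernel.NtOpenFile(path)],
  ['kill', (pid) => ntKernel.NtTerminateProcess(pid)],
  // fork has no documented equivalent, so the driver does part of the work itself.
  // Pid assignment is deliberately simplified for illustration.
  ['fork', (parent) => ({
    status: 0,
    child: { ...parent, pid: parent.pid + 1, ppid: parent.pid },
  })],
]);

function dispatchSyscall(name, ...args) {
  const handler = syscallTable.get(name);
  if (!handler) {
    return { status: -38 }; // -ENOSYS, Linux's "function not implemented"
  }
  return handler(...args);
}

console.log(dispatchSyscall('open', '/etc/passwd').handle); // handle:/etc/passwd
console.log(dispatchSyscall('fork', { pid: 100, name: 'bash' }).child.ppid); // 100
```

The table lookup is the design point: the Linux binary never learns which kind of handler served its request.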

File system

File system support in WSL was designed to meet two goals.

  1. Provide an environment that supports the full fidelity of Linux file systems
  2. Allow interoperability with drives and files in Windows

The Windows Subsystem for Linux provides virtual file system support similar to the real Linux kernel. Two file systems are used to provide access to files on the user’s system: VolFs and DriveFs.

VolFs

VolFs is a file system that provides full support for Linux file system features, including:

  • Linux permissions that can be modified through operations such as chmod and chown
  • Symbolic links to other files
  • File names with characters that are not normally legal in Windows file names
  • Case sensitivity

Directories containing the Linux system, application files (/etc, /bin, /usr, etc.), and the user’s Linux home folder all use VolFs.

Interoperability between Windows applications and files in VolFs is not supported.

DriveFs

DriveFs is the file system used for interoperability with Windows. It requires all file names to be legal Windows file names, uses Windows security, and does not support all the features of Linux file systems. Files are case sensitive and users cannot create files whose names differ only by case.

All fixed Windows volumes are mounted under /mnt/c, /mnt/d, etc., using DriveFs. This is where users can access all Windows files. This allows users to edit files with their favorite Windows editors such as Visual Studio Code, and manipulate them with open source tools in Bash using WSL at the same time.
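The mount convention above implies a simple mapping between a path seen from Bash and the Windows path of the same file. A minimal sketch, assuming single-letter drives mounted at /mnt/<letter>; the helper name toWindowsPath is hypothetical and this is not how DriveFs itself is implemented.

```javascript
// Sketch only: maps a WSL path under /mnt/<drive> to the corresponding Windows path.
// Returns null for paths outside /mnt/<letter>, which live in VolFs and have no
// supported Windows-side equivalent.
function toWindowsPath(linuxPath) {
  const match = /^\/mnt\/([a-z])(\/.*)?$/.exec(linuxPath);
  if (!match) return null;
  const drive = match[1].toUpperCase();
  const rest = (match[2] || '/').replace(/\//g, '\\');
  return `${drive}:${rest}`;
}

console.log(toWindowsPath('/mnt/c/Users/me/notes.txt')); // C:\Users\me\notes.txt
console.log(toWindowsPath('/home/me/.bashrc'));          // null
```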

In future blog posts we will provide additional information on the inner workings of these component areas. The next post will cover more details on the Pico Process which is a foundational building block of WSL.

Deepu Thomas and Seth Juarez discuss the underlying architecture that enables the Windows Subsystem for Linux.


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/5GKZ2NEkV-s/

Original article

TIL target=_blank is insecure

What problems does it solve?

You’re currently viewing index.html.

Imagine the following is user-generated content on your website:

Click me!!1 (same-origin)

Clicking the above link opens malicious.html in a new tab (using target=_blank). By itself, that’s not very exciting.

However, the malicious.html document in this new tab has a window.opener which points to the window of the HTML document you’re viewing right now, i.e. index.html.

This means that once the user clicks the link, malicious.html has full control over this document’s window object!

Note that this also works when index.html and malicious.html are on different origins: window.opener.location can be set across origins! (Things like window.opener.document are subject to the same-origin policy, though.) Here’s an example with a cross-origin link:

Click me!!1 (cross-origin)

In this proof of concept, malicious.html replaces the tab containing index.html with index.html#hax, which displays a hidden message. This is a relatively harmless example, but instead it could’ve redirected to a phishing page, designed to look like the real index.html, asking for login credentials. The user likely wouldn’t notice this, because the focus is on the malicious page in the new window while the redirect happens in the background. This attack could be made even more subtle by adding a delay before redirecting to the phishing page in the background (see tab nabbing).

TL;DR If window.opener is set, a page can trigger a navigation in the opener regardless of security origin.

Recommendations

To prevent pages from abusing window.opener, use rel=noopener. This ensures window.opener is null in Chrome 49 and Opera 36.

Click me!!1 (now with rel=noopener)

For older browsers, you could use rel=noreferrer which also disables the Referer HTTP header, or the following JavaScript work-around which potentially triggers the popup blocker:

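// Open a blank window first, sever the opener link, then navigate.
function openWithoutOpener(url) {
  var otherWindow = window.open();
  otherWindow.opener = null;
  otherWindow.location = url;
}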

Don’t use target=_blank (or any other target that opens a new navigation context), especially for links in user-generated content, unless you have a good reason to.
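For pages that already contain such links, the recommendation can be applied in bulk. The sketch below is one possible approach, not a method from the original post: the function name hardenLinks is hypothetical, and it works on any objects exposing target and rel string properties, which includes DOM anchor elements.

```javascript
// Targets that reuse an existing browsing context and therefore need no hardening.
const SAME_CONTEXT_TARGETS = new Set(['', '_self', '_parent', '_top']);

// Adds "noopener" and "noreferrer" to the rel value of every link that may open
// a new browsing context, preserving rel tokens that are already present.
function hardenLinks(links) {
  for (const link of links) {
    if (SAME_CONTEXT_TARGETS.has(link.target || '')) continue;
    const tokens = new Set((link.rel || '').split(/\s+/).filter(Boolean));
    tokens.add('noopener');
    tokens.add('noreferrer');
    link.rel = Array.from(tokens).join(' ');
  }
  return links;
}

// Plain objects stand in for anchor elements so the example runs outside a browser.
const links = [
  { target: '_blank', rel: '' },
  { target: '', rel: 'nofollow' },
];
hardenLinks(links);
console.log(links[0].rel); // noopener noreferrer
console.log(links[1].rel); // nofollow
```

In a browser the same function could be called as hardenLinks(document.querySelectorAll('a[target]')). Keep in mind that noreferrer also suppresses the Referer header, as noted above.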

Bug tickets to follow

Questions? Feedback? Tweet @mathias.


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/1UAqs7dzr2w/


When to Rewrite from Scratch – Autopsy of a Failed Software

21 Apr 2016

It was winter of 2012. I was working as a software developer in a small team at a start-up. We had just released the first version of our software to a real corporate customer. The development finished right on schedule. When we launched, I was over the moon and very proud. It was extremely satisfying to watch the system process a couple of million unique users a day and send out tens of millions of SMS messages. By summer, the company had real revenue. I got promoted to software manager. We hired new guys. The company was poised for growth. Life was great. And then we made a huge blunder and decided to rewrite the software. From scratch.

Why We Felt That Rewrite from Scratch Was Needed?

We had written the original system with a gun to our heads. We had to race to the finish line. We weren’t having long design discussions or review meetings – we didn’t have time for such things. We would finish up a feature, get it tested quickly, and move on to the next. We had a shared office and I remember software developers at other companies getting into lengthy design and architecture debates and arguing for weeks over design patterns.

Despite the agile-on-steroids design, the original system wasn’t badly written and was generally well structured. There was some spaghetti code that carried over from the company’s previous proof of concept attempts that we left untouched because it was working and we had no time. But instead of thinking about incremental improvements, we convinced ourselves that we needed to rewrite from scratch because:

  • the old code was bad and hard to maintain.
  • the “monolith java architecture” was inadequate for our future need of supporting a very large operator with 60 million mobile users and multi-site deployments.
  • I wanted to try out new, shiny technologies like Apache Cassandra, Virtualization, Binary Protocols, Service Oriented Architecture, etc.

We convinced the entire organization and the board and sadly, we got our wish.

The Rewrite Journey

The development officially began in spring of 2012 and we set the end of January 2013 as the release date. Because the vision was so grand, we needed even more people. I hired consultants and a couple of remote developers in India. However, we didn’t fully anticipate the need to maintain the original system in parallel with new development, and we underestimated customer demands. Remember I said in the beginning we had a real customer? The customer was one of the biggest mobile operators in South America, and once our system had adoption from its users, they started making demands for changes and new features. So we had to continue updating the original system, albeit half-heartedly because we were digging its grave. We dodged new feature requests from the customer as much as we could because we were going to throw the old system away anyway. This contributed to delays and we missed our January deadline. In fact, we missed it by 8 whole months!

But let’s skip to the end. When the project was finally finished, it looked great and met all the requirements. Load tests showed that it could easily support over 100 million users. The configuration was centralized and it had a beautiful UI tool to look at charts and graphs. It was time to go and kill the old system and replace it with the new one… until the customer said “no” to the upgrade. It turned out that the original system had gained wide adoption and their users had started relying on it. They wanted absolutely no risks. Long story short, after months of back and forth, we got nowhere. The project was officially doomed.

Lessons Learnt

  • You should almost never, ever rewrite from scratch. We rewrote for all the wrong reasons. While parts of code were bad, we could have easily fixed them with refactoring if we had taken time to read and understand the source code that was written by other people. We had genuine concerns about the scalability and performance of the architecture to support more sophisticated business logic, but we could have introduced these changes incrementally.
  • Systems rewritten from scratch offer no new value to the user. To the engineering team, new technology and buzzwords may sound cool but they are meaningless to customers if they don’t offer new features that the customers need.
  • We missed real opportunities while we were focused on the rewrite. We had a very basic ‘Web Tool’ that the customer used to look at charts and reports. As they became more involved, they started asking for additional features such as real-time charts, access-levels, etc. Because we weren’t interested in the old code and had no time anyway, we either rejected new requests or did a bad job. As a result, the customer stopped using the tool and insisted on reports by email. Another lost opportunity was the chance to build a robust analytics platform that was badly needed.
  • I underestimated the effort of maintaining the old system while the new one is in development. We estimated 3-5 requests a month and got 3 times as many.
  • We thought our code was harder to read and maintain since we didn’t use the proper design patterns and practices that other developers spent days discussing. It turned out that most professional code I have seen in larger organizations is twice as bad as what we had. So we were dead wrong about that.

When Is Rewrite the Answer?

Joel Spolsky makes strong arguments against rewrites and suggests that one should never do them. I’m not so sure about that. Sometimes incremental improvements and refactoring are very difficult and the only way to understand the code is to rewrite it. Plus software developers love to write code and create new things – it’s boring to read someone else’s code and try to understand their ‘mental abstractions’. But good programmers are also good maintainers.

If you want to rewrite, do it for the right reasons and plan properly for the following:

  • The old code will still need to be maintained, in some cases, long after you release the new version. Maintaining two versions of code will require huge efforts and you need to ask yourself if you have enough time and resources to justify that based on the size of the project.
  • Think about losing other opportunities and prioritize.
  • Rewriting a big system is more risky than smaller ones. Ask yourself if you can incrementally rewrite. We switched to a new database, became a ‘Service Oriented Architecture’ and changed our protocols to binary, all at the same time. We could have introduced each of these changes incrementally.
  • Consider the developers’ bias. When developers want to learn a new technology or language, they want to write some code in it. While I’m not against it and it’s a sign of a good environment and culture, you should take this into consideration and weigh it against risks and opportunities.

Michael Meadows made excellent observations on when “BIG” rewrite becomes necessary:

Technical

  • The coupling of components is so high that changes to a single component cannot be isolated from other components. A redesign of a single component results in a cascade of changes not only to adjacent components, but indirectly to all components.
  • The technology stack is so complicated that future state design necessitates multiple infrastructure changes. This would be necessary in a complete rewrite as well, but if it’s required in an incremental redesign, then you lose that advantage.
  • Redesigning a component results in a complete rewrite of that component anyway, because the existing design is so fubar that there’s nothing worth saving. Again, you lose the advantage if this is the case.

Political

  • The sponsors cannot be made to understand that an incremental redesign requires a long-term commitment to the project. Inevitably, most organizations lose the appetite for the continuing budget drain that an incremental redesign creates. This loss of appetite is inevitable for a rewrite as well, but the sponsors will be more inclined to continue, because they don’t want to be split between a partially complete new system and a partially obsolete old system.
  • The users of the system are too attached with their “current screens.” If this is the case, you won’t have the license to improve a vital part of the system (the front-end). A redesign lets you circumvent this problem, since they’re starting with something new. They’ll still insist on getting “the same screens,” but you have a little more ammunition to push back.
    Keep in mind that the total cost of redesigning incrementally is always higher than doing a complete rewrite, but the impact to the organization is usually smaller. In my opinion, if you can justify a rewrite, and you have superstar developers, then do it.

Abandoning working projects is dangerous and we wasted an enormous amount of money and time duplicating working functionality we already had, rejected new features, irritated the customer and delayed ourselves by years. If you are embarking on a rewrite journey, all the power to you, but make sure you do it for the right reasons, understand the risks and plan for it.

This article was written by Umer Mansoor.



Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/ShQUvXYGG9g/


Microsoft Announces Windows 10 Build 14328 With Windows Ink, New UI

An anonymous reader writes: Windows Ink is one of the many new features rolling out to beta testers as part of Windows 10 Build 14328. The build includes the new Windows Ink Workspace, providing access to new and improved sticky notes, a sketchpad, and a new screen sketch feature. There’s also a new digital ruler you can use to create shapes and draw objects freely. The UI of the Start menu and Start Screen have also been tweaked. The most used apps list and all apps UI have been merged into a single view, creating a less cluttered Start menu. Microsoft also moved power, settings, and file explorer shortcuts so they’re always visible. You can now bring back the fullscreen all apps list in the Start Screen, and you can toggle between the all apps view and your regular pinned apps. If you want things to feel less like a desktop PC, you can auto-hide the taskbar in tablet mode. Microsoft has detailed all of the new features found in Build 14328 in their blog post.




Original URL: http://rss.slashdot.org/~r/Slashdot/slashdot/~3/w7656UW_OME/microsoft-announces-windows-10-build-14328-with-windows-ink-new-ui


12-inch MacBook’s three flaws that Apple could’ve fixed but didn’t

Earlier this week, Apple finally updated its svelte laptop that launched 13 months ago. I am awe-struck by the company’s design audacity, not for brash innovation but for bumbling compromises that make me wonder who needs this thing. The 12-inch MacBook offers much with respect to thinness, lightness, and typing experience (the keyboard is clever tech). But baffling is the decision to keep the crappy 480p webcam. This is not the late 1990s; these days 720p is the least a pricey computer should come with, and is it too much to ask for 1080p or 4K when modern smartphones can shoot just that? This shortcoming, and two…


Original URL: http://feeds.betanews.com/~r/bn/~3/3JCuHCa4-sI/


Continuous Deployment with Helm, Deis Workflow, and Wercker

21 Apr 2016
in Deis Workflow, Helm, Wercker, Continuous Deployment

Deis Workflow is currently in beta. But what is it like to work with? Well, I created an example repository on GitHub to demo some functionality.

Using this example, we’ll build a simple, multi-tier web application using Helm, Deis Workflow, and Wercker for continuous deployment.

When we finish, we’ll have:

  • A backend Redis cluster (for storage)
  • A web frontend (installed as a Deis Workflow app) that interacts with Redis via JavaScript
  • Wercker for continuous deployment of your Docker image to Deis Workflow

Prerequisites

Before we continue, you’ll need a few things set up.

Firstly, you need a running Kubernetes cluster that is accessible remotely so Wercker can deploy new versions of your Docker image.

Next, you’ll need Helm installed. Helm is the Kubernetes package manager developed by the Deis folks.

You can install Helm by running:

curl -s https://get.helm.sh | bash

Then you’ll need to install Deis Workflow. Consult the Deis docs for that!

We’re also going to use Docker Hub for hosting our Docker repository. So you’ll need an account there.

And finally, head over to Wercker and set up an account.

App Setup

Install Redis With Helm

As a quick refresher, a chart in Helm lingo is a packaged unit of Kubernetes software. For the purposes of this demo, I wrote a chart that sets up Redis for you.

Point Helm at the demo repository by running:

$ helm up
$ helm repo add demo-charts https://github.com/deis/demo-charts
$ helm up

Now install the chart I wrote for this demo that sets up Redis:

$ helm fetch demo-charts/redis-guestbook
$ helm install redis-guestbook

And done!

Create the Deis App

To work with Deis Workflow, we need to create a Deis app.

You can do that by running:

$ deis create guestbook --no-remote

We’re creating the app with --no-remote because Deis only needs a Git remote when we’re using it for building Docker images. But we’re using Wercker for that.

Once that’s done, the only thing we need to do is specify some environment variables so that our app knows how to contact the Redis cluster.

Run this command:

$ deis config:set GET_HOSTS_FROM=env REDIS_MASTER_SERVICE_HOST=redis-master.default REDIS_SLAVE_SERVICE_HOST=redis-slave.default -a guestbook

Now that’s set up, we can set up the code.

Set Up the Code

This bit’s easy.

Fork my demo repo to your GitHub account.

Then, clone your fork locally:

$ git clone https://github.com/USERNAME/example-guestbook-wercker.git

Now, we can set up continuous deployment with Wercker so changes made to your fork will result in automatic deployments to your Deis Workflow.

Set Up Continuous Deployment

Log into your Docker Hub account and create a new repository. You can do that via the Create menu, then Create Repository, then call it something like example-guestbook-wercker, or whatever you want.

Log in to your Wercker account and select Create, then Application. Connect Wercker to your GitHub account and select your repository. When configuring access, select the default option: check out the code without using an SSH key.

Once the app is created, go to Settings, then Environment Variables, and create the following key-value pairs:

  • DOCKER_USERNAME: Your Docker Hub or other hosted docker registry account name
  • DOCKER_PASSWORD: Your Docker registry account password (you’ll probably want to set this protected, which hides it from the UI)
  • DOCKER_REPO: Your Docker Hub URL e.g. USERNAME/example-guestbook-wercker
  • DEIS_CONTROLLER: Your Deis controller URL (see your ~/.deis/client.json file)
  • DEIS_TOKEN: Your Deis user token (see your ~/.deis/client.json file)
  • TAG: A tag for tagging your Docker image, e.g. latest

These environment variables will then be passed into the wercker.yml file before it is evaluated by Wercker.
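A build that starts with one of these values missing tends to fail late and confusingly, so a check up front can help. This is a hedged sketch of such a check, not part of the demo repository: the variable names come from the list above, while the helper missingVars is hypothetical.

```javascript
// Variable names are taken from the list above; the helper itself is hypothetical.
const REQUIRED_VARS = [
  'DOCKER_USERNAME',
  'DOCKER_PASSWORD',
  'DOCKER_REPO',
  'DEIS_CONTROLLER',
  'DEIS_TOKEN',
  'TAG',
];

// Returns the names of required variables that are unset or blank in `env`.
function missingVars(env, required = REQUIRED_VARS) {
  return required.filter((name) => {
    const value = env[name];
    return typeof value !== 'string' || value.trim() === '';
  });
}

// Example with placeholder values; a real check would pass process.env instead.
const sampleEnv = {
  DOCKER_USERNAME: 'USERNAME',
  DOCKER_PASSWORD: 'placeholder',
  DOCKER_REPO: 'USERNAME/example-guestbook-wercker',
  TAG: 'latest',
};
console.log(missingVars(sampleEnv)); // [ 'DEIS_CONTROLLER', 'DEIS_TOKEN' ]
```

Run against process.env before the pipeline starts, an empty result means all six values are present.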

Test It

Make some changes to your code. Then, push to GitHub.

Wercker will see this, and do the following:

  • Build and tag your Docker image
  • Push to your Docker registry
  • Deploy your Docker image to Deis Workflow

There we have it.

Continuous deployment with Helm, Deis Workflow, and Wercker.

Stay tuned for more posts like this.


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/vL5je9G8Nyw/


The Looting of ShapeShift

Bitcoin, as any system of man, exhibits together both the highest ideals of utopia, and the lowest residual trash of society.

[Note: some names & sensitive details have been changed]

Erik Voorhees

This is the story of how ShapeShift, a leading blockchain asset exchange platform, was betrayed. Not once, not twice, but three times in less than a month.

In total, nearly two-hundred thousand dollars in cryptocurrency was stolen by thieves within and without, not to mention the significant resources expended in its wake. Nevertheless, no customer funds were ever lost or at risk, a milestone for an industry pocked with past tragedy, and ShapeShift itself has adapted and rebuilt, humbled by the experience learned, and ever more resolute in its mission of safe, frictionless asset exchange.

In the spirit of Bitcoin’s openness, we wanted to share this story with the community; may you be informed, entertained, reflective, and ever-diligent in your own affairs.

The Backstory

Since its inception in the Spring of 2014, ShapeShift has been an evolving creature. What began as a quick experimental way to swap between Bitcoin and Litecoin grew into an advanced engine for the effortless exchange of all major blockchain assets, each one into the other, with no user friction. No user accounts. No signup process. It is the Google Translate of cryptocurrency.

And we’ve always been playing catch-up. Trying to build at the speed of this industry, not only along the vertical of Bitcoin proper, but along the breadth of all crypto, is a challenge.

Last Fall, we realized the “minimum viable product” server architecture established originally for ShapeShift was insufficient. We needed a professional to join the small team, and craft a scalable, and secure, server apparatus upon which our technology could grow.

We hired such a person, and patted ourselves on the back for our proactive decision. On paper, he looked great; the reference we called confirmed his prior role and responsibility. He’d even been into Bitcoin since 2011/2012 and had built miners in his room. Awesome. We’ll call this new employee Bob… indeed his real name starts with a B.

Over the next months, Bob built and managed ShapeShift’s infrastructure. He did okay, nothing special, but we were content to have a professional taking care of devops at least well enough to enable our engineers to build upon the architecture.

In the first quarter of this year, as the market discovered what we already knew – that our world will be one of many blockchain assets each needing liquidity with the other – exchange volumes surged at ShapeShift. Ethereum was on the rise, specifically. Our infrastructure was not ready for the pace of growth. It was like riding a bicycle upon which jet engines suddenly appear at full thrust.

Unfortunately, Bob did little to be helpful. He puttered around aimlessly while the team worked long hours to keep the ship together.

Scratch that, actually, Bob was not aimless.

He was preparing to steal from us.

The Genesis Betrayal

On the morning of March 14th, in the midst of one of our heaviest volume weeks ever, I get a call from our Head of Operations, Greg. “Erik, our hot wallet is missing 315 Bitcoin.” Why did we have so much in a hot wallet, you ask? Well, with volumes surging, our hot wallet would be drained through normal business in an hour at that level, which then required constant manual rebalancing. Are there ways to automate and reduce that risk? Absolutely… but hindsight of one’s development priorities is always 20/20.

So 315 Bitcoin was gone.

To those who have experienced such incidents, the feeling of sickness is profound. It’s a deep, dismal state, that doesn’t stop at the edge of financial loss, but permeates down to one’s core. When systems are breached, systems that one has engineered and cared for deeply, obsessively, that violation of what one considers safe and secure is very, very uncomfortable. And then there’s the loss itself. 315 Bitcoin… roughly $130,000. That’s college tuition, part of a house, food for ten years… a couple months of payroll. It’s a lot of money for a pre-profit startup.

I rushed to the office, hoping there was some mistake. The only comforting thought was that the loss was only our own money. With no customer accounts, neither customer funds nor personal information were at risk from the hack. That was by design from the beginning of ShapeShift; one of our tenets. But even if nobody nearby is harmed, a punch in the face still hurts like hell.

Greg, our two lead engineers, and I pored over logs and servers, trying frantically to figure out what had happened. The 315 BTC went to an unfamiliar Bitcoin address, and was sitting there.

Indeed, it sits there still: https://blockchain.info/address/1LchKFYxkugq3EPMoJJp5cvUyTyPMu1qBR

Despite our note to all employees to come into the office urgently, Bob, our head IT guy, the one responsible for security and infrastructure, arrives at 11:30am.

We ask Bob to join our discussion. We reveal the hack to him. We ask him if he had logged in at all that morning, to which he responded no (on several occasions). On the news of the theft, he seems neither particularly shocked nor outraged, yet it was his security that failed us. Immediately, he starts pointing to red herring explanations, “It must be one of the exchanges that got hacked, that happens all the time.” Umm, our exchange accounts are fine, Bob.

“Well, look at the IP address, it happened somewhere off west Africa.” Umm, IP addresses on block explorers indicate only the first node that noticed a transaction, and are generally meaningless in the context of Bitcoin, Bob. (What kind of Bitcoiner doesn’t know that?)

Very quickly, we realize he is pretty much useless. Here we have our “server guy” and he has zero intelligent comments about a hack against his own infrastructure.

While poring over logs we noticed, however, a couple of SSH keys (belonging to Bob) that had logged into the breached server that morning an hour before the rogue transaction, and then logged off two minutes after. Not nefarious, necessarily, for indeed Bob’s keys would be expected to log in periodically, though the timing was strange (6am-ish). We also discovered the breach occurred over the VPN, meaning someone in the office, or someone with access to our VPN, committed the theft.

We ask everyone with server access to provide the fingerprints of their SSH keys so we can start comparing them to logs. Everyone does so, but another strange thing: the fingerprint of the key handed in by Bob doesn’t appear in any logs. It appears brand new. Strange that the key of the server admin would never have been seen on any server…
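For the technically curious, that comparison can be sketched in a few lines of Python. This is only an illustration, not the tooling we used: the log line format is an assumption (OpenSSH-style "Accepted publickey" entries carrying a colon-separated fingerprint), and the names, IP addresses, and fingerprints below are all made up.

```python
import re

# Assumed log format: OpenSSH-style auth entries ending in "ssh2: <type> <fingerprint>".
# Real logs may look different depending on the sshd version and log level.
LOGIN_RE = re.compile(
    r"Accepted publickey for (\S+) from (\S+) port \d+ ssh2: \S+ "
    r"((?:[0-9a-f]{2}:){15}[0-9a-f]{2})"
)

def fingerprints_in_logs(log_lines):
    """Map each fingerprint seen in the logs to the (user, source_ip) pairs that used it."""
    seen = {}
    for line in log_lines:
        match = LOGIN_RE.search(line)
        if match:
            user, source_ip, fingerprint = match.groups()
            seen.setdefault(fingerprint, []).append((user, source_ip))
    return seen

def compare(submitted, log_lines):
    """submitted maps person -> fingerprint they handed in.

    Returns (people whose key never appears in the logs,
             fingerprints in the logs that nobody claimed).
    """
    seen = fingerprints_in_logs(log_lines)
    never_seen = sorted(p for p, fp in submitted.items() if fp.lower() not in seen)
    claimed = {fp.lower() for fp in submitted.values()}
    unclaimed = sorted(fp for fp in seen if fp not in claimed)
    return never_seen, unclaimed

# Hypothetical sample data, for illustration only.
SAMPLE_LOGS = [
    "Mar 14 06:02:11 coins sshd[4242]: Accepted publickey for admin from 10.8.0.5 "
    "port 50122 ssh2: RSA de:ad:be:ef:00:11:22:33:44:55:66:77:88:99:aa:bb",
    "Mar 14 09:15:40 coins sshd[4310]: Accepted publickey for admin from 10.8.0.9 "
    "port 50388 ssh2: RSA 11:22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff:00",
]
SAMPLE_SUBMITTED = {
    "alice": "11:22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff:00",
    "bob": "01:23:45:67:89:ab:cd:ef:01:23:45:67:89:ab:cd:ef",
}

if __name__ == "__main__":
    never_seen, unclaimed = compare(SAMPLE_SUBMITTED, SAMPLE_LOGS)
    print("Submitted keys never seen in logs:", never_seen)
    print("Logged keys nobody claimed:", unclaimed)
```

Both lists matter: a submitted key that never logged in anywhere is odd for an admin, and a key in the logs that nobody claims is the one to chase.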

Soon after, Bob decides it’s time for his lunch break, and we don’t see him for an hour, during the worst incident in ShapeShift’s history. We frankly didn’t care that much, he wasn’t helpful and suspicions were starting to creep in. He tells all of us that he’s leaving his laptop open to download some logs, and makes sure we see that the laptop is left open. He’s being a little weird.

Upon his return an hour later, he is sitting down with other engineers still investigating what occurred. I’m in the other room on a call. When I finish my call, I come check on the progress. Bob appears to receive a call “from his mother who needs to go to the hospital.” He packs up his stuff, grabs his dog who was at the office, and heads out. We’re all half relieved for his departure and half in awe… did our server admin really just leave for the second time during our investigation, which he should be leading?

He says, “I’ll be back within an hour.” This was at about 3pm, March 14.

We never saw him again

Shortly after he leaves, one of our engineers pulls Greg and me aside, and says, “While you were on your call, we were all sitting around the table, and we saw in the logs that Bob deleted two SSH keys while he was sitting there with us, then he grep’d several times for them [a server command to find specific text], and then he left. Those two keys matched the two keys we saw in the log this morning which accessed the Bitcoin server just prior to the hack.”

He just deleted his keys from the server?? Well fuck. Guns don’t get any smokier than that.

We all immediately move to the assumption that Bob stole the funds. He is out of the building, and so we start locking everything down. All keys are changed in haste (well, almost all).

We work for a few more hours, no word from Bob. No calls, no texts, nothing. By the end of the day, it had been 3-4 hours since he left to “take his mother to the hospital.” We decide to call him, without letting on our suspicions just yet.

“Hey Bob, where are you?”

“Oh hey, I just decided to go home.”

“You’re at home?”

“Yeah, just here, working on some stuff.”

WTF?

That call is innocuous, but we recorded it. We also recorded the next one 30 mins later, in which we confront him with some of the evidence.

“So Bob, it looks like you deleted your SSH keys, and gave us a new key that had never accessed any servers.”

“Yeah, well I deleted them because I didn’t think they were important.”

Yes, he actually said that. Our server admin, in the midst of an investigation into a $130,000 theft, deletes his two keys, and only these two keys, without telling anyone, and then admits on our call that he did it because “they weren’t important.”

It just so happens those two keys were the exact ones logged into the Bitcoin server that morning, and which logged off two minutes after the theft transaction. Not important indeed!

He gives no explanation of his behavior or actions that day, but dances around questions and implies, subtly at first, and then more explicitly, that we’re being racist.

“Umm Bob, we’re targeting you because your keys were on the server, and you deleted them and left, during an active investigation.”

It goes on like that for 45 mins. He says other ridiculous stuff, all recorded.

We uncover further evidence, and there is a sense of relief after knowing exactly what happened and who was responsible. We spend the rest of the evening documenting everything, and preparing to file civil and criminal charges against Bob.

I give him a final chance that evening for redemption. In a message to all employees, so as not to force him to implicate himself by responding, I write:

This is your chance to walk away, learn a lesson, and let this be closed. We will not pursue legal action if 315 Bitcoin are found in this address by 10am. No further questions will be asked, and we can part ways amicably. Send 315 BTC here: 35JBgzjyCUPswjRP9iqrUTkkX76QwrKkB9 -Erik

I get a response message from Bob at 4:36am, “I didn’t delete any keys and I regularly log into servers to check them out.”

Right, except that we have him already on record saying he did delete the keys and hadn’t logged on that morning. His ineptitude at lying appears outmatched only by his incompetence in server administration.

He goes on, with charming adolescent flair…

“Of course blaming me is the racist thing to do… you were basically looking for an excuse to satisfy your racism. I have no criminal history unlike you with the SEC.”

The next morning, our general counsel writes a formal letter (via email and post) to Bob, outlining some of the evidence that we knew, and demanding the stolen property be returned. It also notified Bob that his employment was terminated (I think that was fair, considering). In response, Bob emails back to the lawyer, addressing none of the evidence whatsoever, “Your clients are racist so make sure you know who you’re dealing with.”

It’s like he was wearing his internet troll hat in real life. Did he not even understand the seriousness of the situation? Well… the absurdity was just getting started.

Over the next days, we file the formal civil complaint. The address Bob had given us was a PO box, though we had his legal name, his bank info, and his social serfdom number. We hired a private investigator. We found his apartment within a couple days. Several attempts at service failed, though the investigator heard a dog barking behind the door. One of his cars was found; he drives two unmarked retired police cruisers.

I have investors to whom I owe a level of protocol diligence, so, we also made arrangements for a criminal case, and herein the theft constitutes a Class 3 Felony, with 4-12 years in prison. Honestly, I don’t care whether he is punished. I care whether we are made whole, and whether he realizes his error and changes his life to become a better person. No sign yet, of that.

We learn some more things. Bob has prior police records in Florida, where he’s from. Incidentally, the records indicate he’s white, after all.

With civil and criminal cases proceeding against him, and with further discovery that Bob fled to Florida (leaving his dog to be temporarily cared for by his neighbor… who is now wondering where he is and hasn’t heard from him in weeks), we thought the case was basically closed. We’d get him somewhere, sooner or later. And, hopefully, we’d get our stolen property returned, or the fiat equivalent.

Rovion

We’d worked to build a new server infrastructure in Bob’s wake, assuming his work in our system to be largely compromised. We set up a new cloud architecture with a company we’ll call CloudCo.

It’s now the week of April 4th, and we were about ready to go live with this new cloud infrastructure. Then all hell breaks loose. Again.

On Thursday April 7th, around midday, we notice a bunch of Ethereum had left the hot wallet on the new infrastructure at CloudCo. The NEW infrastructure. The infrastructure that was not even public yet. At first, we believed our code had done something weird, perhaps sweeping funds to a development server address or similar. Then we noticed a bunch of Bitcoin was also missing. And then Litecoin also.

Thief’s Bitcoin address: 14Kt9i5MdQCKvjX6HS2hEevVgbPhK13SKD

Thief’s Ethereum address: 0xC26B321d50910f2f990EF92A8Effd8EC38aDE8f5

Thief’s Litecoin address: LL9jqgXVqxUbWbWVaJocBcF9Vm8uS3NaTd

And very quickly reality hits you, and that’s what flashback feels like. The horrible sinking feeling sets in immediately, once again. What the fuck happened?

Keys that were not even on publicly known servers had been compromised, somehow. We shut the system down, including our live production site, while we investigated. We didn’t lose as much as the hack a month prior, because we’d been keeping wallets somewhat conservative, but it was still quite a bit. We couldn’t believe it. How could brand new keys, generated with brand new infrastructure, be compromised?

After several hours of fruitless investigation, we decide that one of the most likely explanations is that the cloud company itself was compromised. This has happened before in Bitcoinland. We thought CloudCo was reputable, but who knows? Clouds are very convenient and scalable, but on some level you’re trusting that company with your infrastructure. We decided we had to keep the site down for at least 24 hours, and bust our asses to prepare, yet again, an entirely new infrastructure on an entirely new set of servers.

What was nearly as bad as the money lost was not knowing how it happened. Logs were not done as well as they should have been, so they proved fruitless. Indeed, they had been wiped.

Despite that, we watched the blockchains for the hacked funds. We tracked some to an exchange account. We got profile information of the depositor.

Name: Rovion Vavilov

Email: rovion.vavilov@riseup.net

Address: Chayanova St. 15, Moscow

DOB: Feb 2, 1980

Phone: +7 9625148445

That profile information was probably fake, but I emailed him that night.

From: Erik Voorhees erik@shapeshift.io

To: rovion.vavilov@riseup.net

Subject: ShapeShift Hack…

“Nice job on the hack. How did you do it? -Erik”

Pro Tip: Black hats like to be recognized for their skill, regardless of how immoral their deeds may be. Talk to them calmly, as adults. They may reveal information, or help in some way. It’s weird, but it happens. In any case, I didn’t expect anything to come of my email.

The rest of that night, and into the next day (Friday, the 8th), the team worked feverishly to rebuild everything on new infrastructure, once again, in a wholly clean environment on a wholly separate host.

Now to many, ShapeShift appears to be a simple web service. It’s taken a lot of work by our engineers to keep up that appearance. Behind the scenes, the platform is complex. Over 1,400 direct asset trading pairs, integrations with half a dozen exchange APIs requiring real-time price information on all offered cryptocurrencies, low-latency service APIs to several dozen partners, the monitoring and calculation of constantly changing exchange rates and order book depth in some of the most volatile markets on Earth, and incorporation of what can only be described as alpha-level software in various states of disarray (coin daemons…bleh).

And in Bitcoinland, indeed, there is no guidebook.

Admittedly, as a non-engineer myself, I can only occasionally glimpse the magnificence of what we’re building. I wish I could take credit. To our team reading this, you have engineered an amazing machine and should be very proud of it.

And now here is where the story deepens

Around mid-day on Friday, the hacker responds to my email (remember I had asked him how he did it…)

From: rovion.vavilov@fastmail.com (noted new domain)

To: Erik Voorhees erik@shapeshift.io

Subject: ShapeShift Hack…

“One word: Bob”

That was the entirety of that first email, but we were stunned. For a moment, we thought, “Is Bob the hacker?” Quickly, that notion gave way to the more likely answer: that Bob sold or gave away our information to a hacker, who then exploited it.

Bob betrayed us. He betrayed his privileged position, profiting directly from the destruction of those who trusted him. He stole, lied, ran away, and then after being afforded a period of time long enough to reflect upon his actions, decided to betray us again for a few more scraps in his pathetic bowl. Hackers gonna hack, but it takes a certain variety of bastard to ascend to a trusted position, work face to face with a team, receive a salary and confidence from that team, and then screw them all for barely enough money to buy a Tesla. Oh yeah, and then abandon a dog to starve alone, likely soon to be put down by animal services.

Watch out for these people in your lives. If you suspect them, sever ties quickly.

Anyway, after herculean efforts, we had everything ready by Friday night, 24 hrs later. We launched the site on yet a new provider, who we’ll call HostCo. Despite a couple glitchy bugs, the system was running. We had told the public about the hack and decided to release more details once we studied the compromised environment in more detail later.

Exchange orders started up immediately. We breathed a sigh of relief. I fell asleep around 1am and slept peacefully, exhausted from the ordeal and very proud of the team.

Then it was Saturday 9am, and I start emerging from slumber. My phone rings. It was Greg.

“We were hacked again. Bitcoin and Ethereum taken from the HostCo hot wallets.”

I’m silent on the phone. I’m thinking only, “Is this the fucking apocalypse?!?”

It didn’t seem possible. The hack two days prior didn’t seem possible, and this now was just immensely confusing and depressing. I tell Greg to take the site down again and I’ll call him back in 30 minutes. How the hell are we going to explain this to the community, to our customers… to our investors? How do we even explain it to ourselves?

I get out of bed, not panicked, but just feeling utterly defeated. I take the worst shower of my life. Anger surrounds me… we knew Bob was involved from the hacker’s email, and we knew Bob committed a Class 3 felony against us, which the authorities knew about three weeks ago, and our private investigator had provided all the information needed for an immediate conviction. And now this happens.

As I gather my thoughts, I decide it’s time to call in some professional resources.

Michael Perklin, Head of Security and Investigative Services at Ledger Labs, and chairman of the Steering Committee for the Board of CCSS, is first on my list. He’s in Toronto, and agrees to fly out to meet us that evening. He was on his way to the hospital; he had a toe broken in an event he’d prefer not to discuss. He changes course and heads to the airport. What a champion.

I also chat further with heads of several leading exchanges. None of them like thieves, and all are eager to help. Despite its hectic pace and diversity of opinions and interests, this industry comes together when it needs to.

1500 ETH recovered, and exchanges are hunting for more. The thief is probably upset by this… it sucks to be stolen from, after all.

Fireside Chats with the Thief

In parallel to all that, I hear again from the thief via email. I had responded to his “One word: Bob” message by asking if he would provide more info. He mentions that for a price, he may.

“hi” he says.

I arrange to pay him 2 BTC for information.

“I need to know what your relation to bob is” I ask. I tried to avoid pre-empting details.

He replies, “I got information that Bob “hacked” you while I was trying to hack you too. I had some access before Bob hacked you but not enough to get the coins myself.”

“What do you know about Bob hacking us?” I ask

“Inside job. 315 BTC.” he replies. “I talked to Bob after he took the coins, asked him about how I could hack it too. He gave me more information about the infrastructure and some keys.”

I ask, “Why would he give you information and what did he give you?”

Rovion responds, “Because I offered BTC. IP addresses, server roles, users, a working SSH key. Does not work anymore.”

We chat further, and he reveals Bob’s email that he communicated with: m0money@gmail.com.

While I had not seen that email before, it seemed familiar. I thought for a while, and then realized that Bob often substituted 0’s for o’s, including on one of the two keys which he had deleted from the server (the specific key was named something which, if displayed, would give away Bob’s real name). That, and the fact that one of Bob’s common password variations was “m0m0ney.” Our security guy used l33tspeak for his passwords. Real secure.

As clear as it had been that Bob had stolen our funds a few weeks prior, it was now clear that this hacker, Rovion, was giving us information related to Bob that only Bob or those with whom he had actually interacted would know.

Another thought, could this hacker have actually framed Bob from the beginning? Sure, perhaps, but every action of Bob’s back on March 14th points away from that explanation, specifically Bob deleting his own keys right under our nose and then leaving the office, never to return. Other evidence not listed here further counters that theory.

Back to the chat with Rovion… I ask which “working SSH key” he had obtained. “None of your business,” he responds, “but he told me he got it from a coworker’s open laptop.”

Wow. If true, that means Bob, while working at ShapeShift, accessed a coworker’s computer and copied a key (or more?), at some point before he stole the funds. Did he premeditate the whole thing, I wonder?

I try to get more information, but Rovion is unforthcoming. His last message…

“Your millions will save you, Erik Voorhees. Goodbye, I will be on email.”

By the early evening, our forensic investigator, Michael Perklin, had arrived. I picked him up from the airport. We had decided to hold off on poking around in our servers until he was there. While the hacker gave a vague sense of how he came upon secret information, we didn’t really know the specifics of the breach. Keys had been changed after Bob’s departure, and while we found one key we hadn’t remembered to change, it only had access to a server that could not have stolen the funds on the preceding Thursday. And again, it wouldn’t at all explain how the Saturday morning theft occurred. Both CloudCo and HostCo had funds stolen off them, despite them being built as entirely new environments with wholly new keys.

Michael asked me to convey to him the whole story of the past month. He proceeded through his investigative protocol, which included the assumption that nobody at the company was trustworthy. It was hard to argue that the team was trustworthy, given the fact that this all started with a rogue employee. It was a depressing feeling.

Many interesting details could be added here about how such forensic work is done, but space is limited and it’s probably unwise to reveal every such method. After a while, we dove into the logs themselves, attacking the Saturday logs first. They were deleted, most of them. How were they deleted? We weren’t sure.

We know now how to prevent that… indeed, the experience we’ve received throughout this incident has been immensely valuable. Though it sounds cliché, if your startup is involved in securing information or servers whatsoever, do yourself a favor and bring in 3rd party professional help very early. We hadn’t needed it at first, because we were small. But growth creeps up on you, and before you know it you are securing significant assets with sub-standard methods.

While most of the logs were gone, we in fact recovered a great portion of them off the “empty” disk space itself using forensic techniques. This was just lucky. Perhaps the Ghost of Satoshi was looking out for us (could have used his help a week ago, of course!).

From the recovered data, we discovered the malware, if that’s the right term. There was a program, written in Go, installed on a crucial server which communicated with coins. This program had its dates changed to appear consistent with the setup of the server, and its filename made to look innocuous. But it was the direct tool by which funds were stolen.

udevd-bridge it was called

We were glad to find it (and yes, the same thing appeared in both server environments, CloudCo and HostCo). However, it still didn’t explain how it was put there. We had a lot of information, but not the whole story.

And we wouldn’t have the whole story for a couple more days. But then the stars aligned.

Out of the blue, the hacker, Rovion, emails me again on Wednesday, April 13th.

From: Rovion Vavilov rovion.vavilov@fastmail.com

To: Erik Voorhees erik@shapeshift.io

Subject: Re: ShapeShift

“Would you be interested in buying the ETH that I currently hold back at a highly discounted rate in exchange for BTC? I’d be willing to trade in small quantities since you have no reason to trust me.”

Yes, it appears the hacker has gotten annoyed that his Ethereum kept getting frozen at exchanges. So he comes back to the store he robbed from, and asks us if we’ll trade for a more liquid asset. We’d be essentially buying back our own Ethereum, and paying him Bitcoin.

Obviously worth it, if we can obtain more information. Since neither of us trusts the other, we establish a protocol:

1) We pay 2 BTC to get the conversation started

2) Rovion gives us half the relevant information

3) We exchange, in increments of 250, 2000 ETH for BTC at 0.02 BTC/ETH rate

4) Rovion gives us second half of the relevant information

5) We exchange, in the same increments, the remaining 2500 ETH for BTC at same rate

6) We cease communication (this last one was Rovion’s suggestion)
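To make the numbers in that protocol concrete, here is a small Python sketch of the trade schedule. The rate, increment, and batch sizes come from the steps above; treating the initial 2 BTC as separate from the ETH trades is my assumption, and the function names are purely illustrative.

```python
from decimal import Decimal

RATE = Decimal("0.02")       # BTC paid per ETH (steps 3 and 5)
INCREMENT = 250              # ETH handed over per individual trade
BATCHES = [2000, 2500]       # ETH in step 3 and step 5
UPFRONT_BTC = Decimal("2")   # step 1; assumed to be on top of the trades

def schedule(batch_eth, increment=INCREMENT, rate=RATE):
    """Split one batch into (eth, btc) trades of at most `increment` ETH each."""
    trades = []
    remaining = batch_eth
    while remaining > 0:
        eth = min(increment, remaining)
        trades.append((eth, eth * rate))
        remaining -= eth
    return trades

def summarize():
    """Return (total ETH, number of trades, total BTC paid, max BTC at risk per trade)."""
    trades = [trade for batch in BATCHES for trade in schedule(batch)]
    total_eth = sum(eth for eth, _ in trades)
    total_btc = sum(btc for _, btc in trades) + UPFRONT_BTC
    max_exposure = max(btc for _, btc in trades)
    return total_eth, len(trades), total_btc, max_exposure

if __name__ == "__main__":
    total_eth, n_trades, total_btc, max_exposure = summarize()
    print(f"{total_eth} ETH over {n_trades} trades for {total_btc} BTC, "
          f"at most {max_exposure} BTC at risk per trade")
```

The point of the small increments is the last number: whichever side moves first in a given round, nobody has more than one increment on the line at a time.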

He asks us to send the BTC to his already known BTC address: 14Kt9i5MdQCKvjX6HS2hEevVgbPhK13SKD

After the initial 2 BTC payment, Rovion begins with a description of the April 7th hack:

“We contacted Bob. He gave us the ShapeShift core source code, core server IP address, an SSH key, and [redacted]. I logged in to the core server with the SSH key provided, installed a backdoor and took the coins since the core server had SSH access to the coins server.”

“What’s the fingerprint of the SSH key mentioned above?” I ask

“9c:3f:4b:ad:d6:43:ec:9a:55:de:b9:0b:d8:f5:0a:cb”

We see that it’s Greg’s key, newly created for the CloudCo environment. It was not even in existence until more than a week after Bob had stolen the funds in March and disappeared. How on Earth did this hacker get a new key, post Bob?
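For anyone wondering what that string of hex pairs is: it has the shape of the legacy MD5 fingerprint that OpenSSH tools used to print, which is the MD5 hash of the decoded public key blob, written as sixteen colon-separated bytes. A minimal sketch of how such a fingerprint is derived from a one-line public key follows. The demo key here is not a real key, just placeholder bytes, and I am assuming the usual "type base64 comment" layout.

```python
import base64
import hashlib

def md5_fingerprint(public_key_line):
    """Legacy colon-separated MD5 fingerprint of a one-line public key.

    Expects the usual "<type> <base64 blob> [comment]" layout.
    """
    blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.md5(blob).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))

# Placeholder bytes standing in for a key blob; NOT a usable SSH key.
DEMO_BLOB = b"hypothetical key blob"
DEMO_LINE = "ssh-rsa " + base64.b64encode(DEMO_BLOB).decode() + " demo@example"

if __name__ == "__main__":
    print(md5_fingerprint(DEMO_LINE))
```

Because the fingerprint depends only on the public key, anyone holding the key pair, or a server log that recorded the login, can produce the same string, which is what let us match his answer against our own records.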

I also ask about the “[redacted]” mentioned but Rovion says that is part of the second batch of information. We proceed with the incremental exchange of the second batch of funds.

Then Rovion says,

“[redacted] was access to an RDP installed on a coworker’s machine by Bob. That’s how I hacked you the second time.”

Wow, now it’s starting to come together, each revelation peeling back a layer of Bob’s treachery. Bob had installed an RDP (remote desktop protocol – basically a screen viewer or controller) on Greg’s computer. And perhaps on others, we must assume.

Then Rovion shares via pastebin an email from Bob (the info he purchased):

“hi,

i received your 50 bitcoin. gh source and ssh priv key as attachments”

core ip: XX.XX.XX.XX

router for forwarding: XX.XX.XX.XX:XXXX

admin:[redacted password]

rdp internal ip: XX.XX.XX.XX

acadmin:pass

thanks for your business.

[2 attachments listed]

(specific IP’s redacted by us)

And there it is. Bob sold information on the production servers, access to ShapeShift’s internal network, part of ShapeShift’s source code, and access to an RDP client he had installed on a co-worker’s computer, to Rovion, for 50 Bitcoin. The IP and internal router info checked out.

This explained almost everything. With access to Greg’s computer (and perhaps others), via RDP, the new server environments could be witnessed and the new SSH keys could be used. It wasn’t the cloud service provider’s fault, it was our own.

We had changed almost everything, but hadn’t scrapped our personal computers used while Bob had been part of the team. Would that have been the paranoid thing to do? Yes. Would it have been the right thing to do?

Clearly.

And one of the last things Rovion said before we ended the discussion,

“Even though I said cease communication, can you still send me an email when Bob gets sued/whatever it is you’re going to do? I feel it’s really shitty to steal from your own employer.”

Cleaning Up a Mess

We imagine this information will assist in demonstrating criminal intent on the part of Bob. This was not a spur-of-the-moment taking, but an orchestrated treachery. I’ve lost count of the number of felonies involved at this point.

We also know that while the story from Rovion checks out, it may well not be the full story. We have to assume other details are relevant to the case, and to our infrastructure. This is why ShapeShift has been offline for longer than any of us would have liked. We are being very careful, and very paranoid.

Nonetheless, I have been immensely proud of my team. Working in a startup, in the Bitcoin industry, is stressful enough, and then to deal with a series of layered betrayals like this and all the damage (financially, technically, psychologically) it causes… that is hard. You guys have done an amazing job and I am immensely encouraged seeing the team’s cohesion and fortitude.

It didn’t help that we had just brought on four new employees in the very week of the two incidents (nearly doubling our development staff). They were thrown into the fray without mercy, and they’ve been incredible.

#ShapeShiftUserNotAffected

To survive in Bitcoin, one has to be an optimist. While the betrayal and loss and cleanup effort have been horribly taxing, there are some silver linings.

First, no person or organization is perfect. We learned some of our own vulnerabilities, and our own mistakes. We are correcting them, and improving upon them wherever possible. Such improvement doesn’t come cheap, but the ShapeShift of today is made better than the ShapeShift of yesterday. The steel is tempered, the machine refined. Though no single organization can ultimately achieve it, we try to approach anti-fragility, and exemplify it as an ideal in our work.

Second, no customers lost money throughout multiple hacks orchestrated even by an insider. Through decentralization, through code, through innovation, through structure… consumer protection by design is one of this industry’s most important contributions to society – something that a century of legacy banking has failed to achieve, as noted by Satoshi’s infamous line in the Genesis Block.

ShapeShift will always work to develop upon this platform of consumer protection. Many others in this community are doing the same along different avenues. Thank you for the tools you are building, and the work you have done. And indeed, there is still much to do.

To our customers, I would like to personally apologize for our downtime. While we can assure you your funds are not at risk, I know many rely on our service, and it has been unavailable. Redundancy, even in the face of disaster, will be one of our primary development goals going forward.

Further, thank you sincerely to those in the community who reached out and offered all manner of support, and to our investors who were immensely kind and understanding.

And finally, as with all intense episodes one endures, we must appreciate the room and opportunity for growth, for experience, and for one of life’s most precious luxuries, reflection.

Never a dull day in Bitcoinland

-Erik Voorhees

CEO ShapeShift.io

And to Bob… Note that your real name and identifying information were not divulged. Consider that a final, tenuous courtesy.

Images courtesy of ShapeShift.


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/oSzuk_fkHE4/
