SQLite with a Fine-Toothed Comb

One of the things I’ve been doing at Trust-in-Soft is looking for defects in open-source software. The program I’ve spent the most time with is SQLite: an extremely widely deployed lightweight database. At ~113 KSLOC of pointer-intensive code, SQLite is too big for easy static verification. On the other hand, it has an incredibly impressive test suite that has already been used to drive dynamic tools such as Valgrind, ASan, and UBSan. I’ve been using these tests with tis-interpreter.

[Figure: the various categories of UBs (not to scale); this post is mainly about the blue-shaded area.]

For honesty’s sake I should add that using tis-interpreter isn’t painless (it wasn’t designed as a from-scratch C interpreter, but rather is an adaptation of a sound formal methods tool). It is slower even than Valgrind. It has trouble with separately compiled libraries and with code that interacts with the system, such as mmap() calls. As I tried to show in the figure above, when run on code that is clean with respect to ASan and UBSan, it tends to find fiddly problems that a lot of people aren’t going to want to hear about. In any case, one of the reasons that I’m pushing SQLite through tis-interpreter is to help us find and fix the pain points.

Next let’s look at some bugs. I’ve been reporting issues as I go, and a number of things that I’ve reported have already been fixed in SQLite. I’ll also discuss how these bugs relate to the idea of a Friendly C dialect.

Values of Dangling Pointers

SQLite likes to use — but not dereference — pointers to heap blocks that have been freed. It did this at quite a few locations. For example, at a time when zHdr is dangling:

if( (zHdr>=zEndHdr && (zHdr>zEndHdr
  || offset64!=pC->payloadSize))
 || (offset64 > pC->payloadSize)
){
  goto abort_due_to_error;
}

These uses are undefined behavior, but are they compiled harmfully by current C compilers? I have no evidence that they are, but in other situations the compiler will take advantage of the fact that a pointer is dangling; see this post and also Winner #2 here. You play with dangling pointers at your own risk. Valgrind and ASan make no attempt to catch these uses, as far as I know.

Using the value of a dangling pointer is a nettlesome UB, causing inconvenience while — as far as I can tell — giving the optimizer almost no added power in realistic situations. Eliminating it is a no-brainer for Friendly C.
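One way to stay clear of this UB is to finish every comparison that involves the pointer before the block is freed. The following is a minimal sketch under that assumption; the function and parameter names are hypothetical and are not taken from SQLite.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not SQLite code: reports whether a cursor reached
 * the end of a heap buffer, then releases the buffer.
 * Assumes buf came from malloc() and used <= size. */
int overran_then_release(char *buf, size_t used, size_t size)
{
    char *cur = buf + used;
    char *end = buf + size;

    /* Do every pointer comparison while the block is still live. */
    int overran = (cur >= end);

    free(buf);
    /* From here on buf, cur, and end are dangling. Even evaluating
     * `cur >= end` again would be undefined behavior, so only the
     * saved int is used. */
    return overran;
}
```

The design choice is simply to carry an int (or an offset) across the free() rather than the pointer itself.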

Uses of Uninitialized Storage

I found several reads of uninitialized storage. This is somewhere between unspecified and undefined behavior.

One idiom was something like this:

int dummy;
some sort of loop {
  // we don't care about function()'s return value
  // (but its other callers might)
  dummy += function();
}
// dummy is not used again

Here the intent is to avoid a compiler warning about an ignored return value. Of course a better alternative is to initialize dummy; the compiler can still optimize away the unwanted bookkeeping if function() is inlined or otherwise specialized.
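A minimal sketch of the two obvious repairs follows. The body of function() is a made-up stand-in, and whether a cast to void actually silences an unused-result warning depends on the compiler and on how the function is annotated, so treat the second variant as an assumption to check against your toolchain.

```c
/* Hypothetical stand-in for a function whose result some callers need. */
static int calls = 0;
int function(void) { return ++calls; }

/* Variant 1: keep the accumulator, but initialize it, so the loop never
 * reads an indeterminate value. */
int run_loop(int n)
{
    int dummy = 0;
    for (int i = 0; i < n; i++) {
        dummy += function();
    }
    return dummy;
}

/* Variant 2: discard the result explicitly instead of accumulating it. */
void run_loop_discarding(int n)
{
    for (int i = 0; i < n; i++) {
        (void)function();
    }
}
```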

At least one uninitialized read that we found was potentially harmful, though we couldn’t make it behave unpredictably. Also, it had not been found by Valgrind. Both of these facts — the predictability and the lack of an alarm — might be explained by the compiler reusing a stack slot that had previously contained a different local variable. Of course we must not count on such coincidences working out well.

A Friendly C could ignore reads of uninitialized storage based on the idea that tool support for detecting this class of error is good enough. This is the solution I would advocate. A heavier-handed alternative would be compiler-enforced zeroing of heap blocks and automatic variables.

Out-of-Bounds Pointers

In C you are not allowed to compute — much less use or dereference — a pointer that isn’t inside an object or one element past its end.

SQLite’s vdbe struct has a member called aMem that uses 1-based array indexing. To avoid wasting an element, this array is initialized like this:

p->aMem = allocSpace(...);
p->aMem--;

I’ve elided a bunch of code; the full version is in sqlite3VdbeMakeReady() in this file. The real situation is more complicated since allocSpace() isn’t just a wrapper for malloc(): UB only happens when aMem points to the beginning of a block returned by malloc(). This could be fixed by avoiding the 1-based addressing, by allocating a zero element that would never be used, or by reordering the struct fields.
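The second of those fixes, allocating a zero element that is never used, might look roughly like the sketch below. Cell and alloc_one_based() are hypothetical names used for illustration; this is not how SQLite's allocSpace() works.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the element type of a 1-based array. */
typedef struct { int value; } Cell;

/* Returns a zeroed array usable with indices 1..n. Element 0 is allocated
 * but never used, so the base pointer always stays inside the block and
 * no out-of-bounds pointer is ever computed. Returns NULL on failure. */
Cell *alloc_one_based(size_t n)
{
    if (n == SIZE_MAX) {
        return NULL;            /* n + 1 would wrap around */
    }
    return calloc(n + 1, sizeof(Cell));
}
```

A caller would index a[1] through a[n] and release the array with free(a). The cost is one wasted element per array.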

Other out-of-bounds pointers in SQLite were computed when pointers into character arrays went past the end. This particular class of UB is commonly seen even in hardened C code. It is probably only a problem when either the pointer goes far out of bounds or else when the object in question is allocated near the end of the address space. In these cases, the OOB pointer can wrap, causing a bounds check to fail to trigger, potentially causing a security problem. The full situation involves an undesirable UB-based optimization and is pretty interesting. A good solution, as the LWN article suggests, is to move the bounds checks into the integer domain. The Friendly C solution would be to legitimize creation of, and comparison between, OOB pointers. Alternately, we could beef up the sanitizers to complain about these things.
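Moving a bounds check into the integer domain can be sketched as follows. The unsafe form is shown only in a comment; fits() is a hypothetical helper and assumes that cur and end point into the same object with cur <= end.

```c
#include <stddef.h>

/* Unsafe pattern, for contrast:
 *     if (cur + len > end) { reject }
 * Here `cur + len` may already be an out-of-bounds pointer, and a compiler
 * is allowed to assume that this never happens. */

/* Returns 1 if len bytes starting at cur fit before end, else 0.
 * Only a valid pointer difference and integers are compared. */
int fits(const char *cur, const char *end, size_t len)
{
    return len <= (size_t)(end - cur);
}
```

Because the comparison is between two size_t values, a huge len cannot wrap a pointer and slip past the check.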

Illegal Arguments

It is UB to call memset(), memcpy(), and other library functions with invalid or null pointer arguments. GCC actively exploits this UB to optimize code. SQLite had a place where it called memset() with an invalid pointer and another calling memcpy() with a null pointer. In both cases the length argument was zero, so the calls were otherwise harmless. A Friendly C dialect would only require each pointer to refer to as much storage as the length argument implies is necessary, including none at all.
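Until something like that is guaranteed, a defensive wrapper can keep zero-length calls away from the library. This is a minimal sketch; copy_bytes() is a hypothetical name, not an SQLite or libc function.

```c
#include <string.h>

/* Hypothetical wrapper: skips the library call when n is zero, so a null
 * or otherwise invalid pointer paired with a zero length never reaches
 * memcpy(). For n > 0 the usual memcpy() requirements still apply. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0) {
        return dst;
    }
    return memcpy(dst, src, n);
}
```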

Comparisons of Pointers to Unrelated Objects

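When the relational operators >, >=, <, and <= are used to compare two pointers, the pointers must point into the same object (or one element past its end); otherwise the behavior is undefined. SQLite now routes such comparisons through a macro that first casts the pointers to uintptr_t:

# define SQLITE_WITHIN(P,S,E) \
    ((uintptr_t)(P)>=(uintptr_t)(S) && \
     (uintptr_t)(P)<(uintptr_t)(E))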

Uses of this macro can be found in btree.c.

With respect to pointer comparisons, tis-interpreter’s (and Frama-C’s) intentions are stronger than just detecting undefined behavior: we want to ensure that execution is deterministic. Comparisons between unrelated objects destroy determinism because the allocator makes no guarantees about their relative locations. On the other hand, if the pair of comparisons in a SQLITE_WITHIN call is treated atomically, and if S and E point into the same object, there is no determinism violation. We added a “within” builtin to Frama-C that can be used without violating determinism and also a bare pointer comparison that — if used — breaks the guarantee that Frama-C’s results hold for all possible allocation orders. Sound analysis of programs that depend on the relative locations of different allocated objects is a research problem and something like a model checker would be required.

In Friendly C, comparisons of pointers to unrelated objects would act as if they had already been cast to uintptr_t. I don’t think this ties the hands of the optimizer at all.
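In today's C, the same effect has to be requested explicitly by casting before comparing. The sketch below writes a range test as a function; ptr_within() is a hypothetical name, and because the pointer-to-integer mapping is implementation-defined, the sketch assumes the usual flat address space.

```c
#include <stdint.h>

/* Returns 1 if p lies in the half-open range [start, end), else 0.
 * The comparison is done on integers, so it is not undefined even when
 * p points into an unrelated object. The result in that case still
 * depends on where the allocator placed the objects. */
int ptr_within(const void *p, const void *start, const void *end)
{
    uintptr_t ip = (uintptr_t)p;
    return ip >= (uintptr_t)start && ip < (uintptr_t)end;
}
```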


SQLite is a carefully engineered and thoroughly tested piece of software. Even so, it contains undefined behaviors because, until recently, no good checker for these behaviors existed. If anything is going to save us from UB hell, it’s tools combined with developers who care to listen to them. Richard Hipp is a programming hero who always responds quickly and has been receptive to the fiddly kinds of problems I’m talking about here.

What to Do Now

Although the situation is much, much better than it was five years ago, C and C++ developers will not be on solid ground until every undefined behavior falls into one of these two categories:

  • Erroneous UBs for which reliable and ubiquitous (but perhaps optional and inefficient) detectors exist. These can work either at compile time or run time.
  • Benign behaviors for which compiler developers have provided a documented semantics.

Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/yklh2pP29qI/1292

Fast Search Using PostgreSQL Trigram Indexes

Mar 18, 2016

GitLab 8.6 will ship with improved search performance for PostgreSQL thanks to
the use of trigram indexes. In this article we’ll look at how these indexes work
and how they can be used to speed up queries using LIKE conditions.
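For readers who have not met trigrams before, here is a rough sketch of how one word can be split into them. The padding (two leading spaces, one trailing space, lowercased) follows the convention documented for PostgreSQL's pg_trgm extension; the function itself is hypothetical and is not GitLab or PostgreSQL code.

```c
#include <ctype.h>
#include <string.h>

/* Writes the trigrams of one word into out[][4] (NUL-terminated) and
 * returns how many were written. Returns 0 for empty or overlong words. */
size_t word_trigrams(const char *word, char out[][4], size_t max_out)
{
    char padded[64];
    size_t len = strlen(word);
    if (len == 0 || len + 4 > sizeof padded) {
        return 0;
    }

    /* Build "  word " in lowercase. */
    padded[0] = ' ';
    padded[1] = ' ';
    for (size_t i = 0; i < len; i++) {
        padded[2 + i] = (char)tolower((unsigned char)word[i]);
    }
    padded[2 + len] = ' ';
    padded[3 + len] = '\0';

    /* Slide a three-character window over the padded word. */
    size_t count = 0;
    for (size_t i = 0; i <= len && count < max_out; i++) {
        memcpy(out[count], padded + i, 3);
        out[count][3] = '\0';
        count++;
    }
    return count;
}
```

For the word "Cat" this yields the four trigrams "  c", " ca", "cat", and "at ". An index over such fragments lets the database narrow down candidate rows for a LIKE pattern without scanning every row.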

Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/Sutjp-acEA0/

$99 Mineserver: The Devil Is in the Details

You may recall my three sons ran a successful Kickstarter campaign last fall for their $99 Mineserver, a multiuser Minecraft server the size of a pack (not a carton) of cigarettes. On the eve of their product finally shipping, here’s an update with some lessons for any complex technical project.

At the time we shot the Kickstarter video my kids already had in hand a functional prototype. Everything seen in the video was real and the boys felt that only producing custom cases really stood in the way of shipping. How wrong they were!

First we needed a mobile app to administer the server so we hired an experienced mobile developer through guru.com. His credentials were great but maybe it should have been a tip-off when, right after we made the first payment, the developer moved from Europe to India. The United States Postal Service guaranteed us it would take no more than five days for our development hardware to reach Mumbai. It took three weeks. And our USPS refund for busting the delivery guarantee has yet to appear.

We were naive. The original development estimate was exceeded in the first week and we were up to more than 8X by the time we pulled the plug. Still the developer kept trying to charge us, eventually sending the project to arbitration, which we won.

Our saving grace was we found a commercial app we could license for less than a dollar per unit. Why hadn’t we done this earlier? Well it wasn’t available for our ARM platform and didn’t work with our preferred Minecraft server, called Cuberite (formerly MC Server).

Enter, stage right, the lawyers of Minecraft developer Mojang in Sweden.

Mojang is a peculiar outfit. They are nominally owned by Microsoft yet Redmond is only very slowly starting to exert control. I’m not sure the boys and girls in Sweden even know that. Minecraft server software is free, Mojang makes its money (lots of money) from client licenses, so they and Microsoft ought to want third-party hardware like ours to be successful. But no, that’s not the case.

I’ve known people at Microsoft since the company was two years old so it was easy to reach out for support. Maybe we could do some co-marketing?

Nope, that’s against Mojang rules, Redmond told us, their eyes rolling at the same time.

We urged them to consider a hardware certification program. We’d gladly pay a small royalty to be deemed Ready-for-Minecraft.

Nope, Mojang refuses to have anything to do with hardware developers. Oh, and both our logo and font were in violation of Mojang copyrights, so change those right away, please.

Swedes are very polite but firm, much like Volvos.

Minecraft, which is written in Java, is nominally Open Source, but there are some peculiar restrictions on distributing the code. The server software can’t be distributed pre-compiled. For that matter it also can’t be distributed even as source code if the delivery vehicle is a piece of operational hardware like our Mineserver. Our box would have to ship empty then download the source and compile it for our ARM platform before the first use, making everything a lot more difficult.

Now let’s be clear, this particular restriction technically only applies to one version of the Minecraft server, usually called Vanilla — the multiuser server distributed for free directly by Mojang. There are other Minecraft servers that, in theory, we ought to be able to ship with our little boxes except Mojang has all the developers so freaked that nobody does it. Besides, Vanilla is the official Minecraft server and some people won’t accept anything else.

But our experience shows Vanilla Minecraft isn’t very good at all. In fact it is our least favorite server, primarily because it supports only a single core on our four-core and eight-core boxes. As such Vanilla supports the lowest number of concurrent Minecraft players. A better server like Spigot can support 2-3 times as many users as Vanilla.

The best Minecraft server of all in our opinion is Cuberite, which is also the only one written in C++ instead of Java. Cuberite extracts far more performance from our hardware than any other server, which is why we chose to make it our de facto installation. We’ll also support Vanilla, Spigot and Tekkit Lite (you can switch between them), but Cuberite will be the first server to compile on the machine.

The only problem with Cuberite is that the off-the-shelf admin application we discovered doesn’t support it. Or didn’t. The very cooperative admin developer in the UK is extending his product to support Cuberite. This should be done soon and waiting for Cuberite is a major reason why we haven’t shipped. We’re hoping to have it in a few more days.

But waiting for Cuberite wasn’t our only problem. We had to develop a dynamic DNS system, WiFi support, and make sure the units were totally reliable.

Oh, and our laser cutter burst into flames.

Understand that for a Mineserver or Mineserver Pro, the sysadmin also typically goes by another title — Mom. Our administration tool allows her to control the server from any Internet-connected computer including Android and iOS mobile phones. She can bump or ban players from the frozen food aisle, monitor in-game text chat, reboot the server — anything. It’s a very powerful and easy-to-use tool.

While we were waiting for Cuberite support we added something else for Mom to worry about, a Mumble server. Mumble is open source voice chat with very low latency. We were able to add Mumble to Mineserver because the CPU load is very low, with all encoding and decoding done in the client and the server acting mainly as a VoIP switch. If she wants to, Mama can listen to the Mumble feed and step in if little Johnny drops an F-bomb.

Every Mineserver has its own individual name chosen by the customer. This server name, rather than an IP address, is how whitelisted players find the game. When we consulted dynamic DNS experts prior to the Kickstarter campaign, this sounded easy to do with a combination of A records and SRV records. But it’s not so easy because Mom doesn’t want to have to do port forwarding, so that meant adding other techniques like UPnP, which is tough to do if it’s not turned on in your router. We eventually developed what the boys believe is a 95 percent solution. In 95 percent of cases it should work right out of the box, with the remaining five percent falling on the slim shoulders of some Cringely kid.

Every Mineserver is assembled by a specific child who is also responsible for product support. His e-mail address is right on the case and if something doesn’t work he can ssh, telnet, or VNC into the box to fix it.

Somewhere in this mix of challenges we lost our primary Linux consultant. We still don’t know what happened to him, he just stopped responding to e-mails. The next consultant really didn’t have enough time for us, but finally we found a guy with the help of our admin developer who has been doing a great job. He helped us switch our Linux server distribution with several positive results and helped come up with the custom distro we use today.

But still there were problems, specifically WiFi.

WiFi was something we’d rather not do at all, but it has become the new Ethernet (even Ethernet inventor Bob Metcalfe pretends WiFi is Ethernet, which it isn’t but we still love Bob). Many home networks are entirely WiFi. We feel the best way to use a Mineserver even in WiFi-only homes is by plugging the included CAT6 cable into a router or access point port and using the router’s WiFi capability. But some customers don’t want to plug anything into anything, so we’ve included native WiFi support in some Mineservers and all Mineserver Pros. That sounds easier to do than it actually was.

Mineservers are headless, so how do you set an SSID or password the first time? Good question, but one we finally solved. Mineservers can be configured and administered entirely without wires if needed. In most situations the customer will plug their Mineserver into power and it will just work. If it doesn’t, then an 11-year-old will fix it. That first power-up will involve downloading and compiling the selected server software, which can be changed at any time. It’s a process that takes 5-10 minutes and then you are up and running.

What we hope is our final technical problem has been particularly vexing. We now have three Mineservers and three Mineserver Pros running at the sonic.net data center here in Santa Rosa. All six servers plus a power strip and a gig-Ethernet switch fit on a one foot square piece of plywood. The truly great folks at Sonic gave us half a rack and we fill perhaps one percent of that, meaning you could probably put 1200 Mineservers in a full rack — enough to support up to 60,000 players. But operating in this highly-secure facility with its ultra-clean power and unlimited bandwidth we began to notice during testing that sometimes the servers would just disappear from the net. One minute the IP would be there and the next minute it would be gone.

We’re still waiting for Cuberite support of course, but even if we had that today we still couldn’t ship a product that disappears from the net. We’ve tried swapping out boards but the problem still occurs. Maybe it was the gig-Ethernet switch, so we got a new one, then a bigger one, then an even bigger managed switch. We changed cables. We started fiddling with the software. Each Mineserver board has a serial port, so we converted an old Mac Mini to Linux and added a powered USB hub and six UART-to-USB adapters. Now our consultant in Texas can use six virtual serial terminals to monitor the test Mineservers 24/7 without having to rely on their Ethernet connections. Everything is being logged so the next time one goes down we’ll know exactly what’s happening.

We’re also in touch with other users of the same board like Lockheed Martin and Lawrence Livermore Lab where they have a cluster of 160. But that’s nothing compared to three kids in Santa Rosa who are right now burning-in 500 boards.

It’s the final bug, we’re approaching it with planning, gusto, and plenty of Captain Crunch, and fully expect to solve this last issue and start shipping next week when the kids are off school for Spring Break.

Mineservers as a business so far aren’t quite as good as the boys had hoped. The Kickstarter units are losing an average of $15 each (so far). But $7500 in the hole is not much cost to start a technology business. And with their marketing strategy (called “F-ing brilliant” by a VC friend) the boys are hoping to sell 100+ post-Kickstarter units per month to eventually pay for college.

Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/2URL5HPK8Z8/

Here’s a look inside Dell’s strategy for Linux PCs

In a world of PCs dominated by Windows and Macs, Dell’s line of “Project Sputnik” laptops with Ubuntu Linux have secured a cult following.

The latest Project Sputnik laptop is the XPS 13 Developer Edition, which shipped last week. With its sleek design, the XPS 13 brings a new, sexy look to otherwise dull Linux laptop designs.

The XPS 13 DE is also significant because it brings new technologies from Mac OS and Windows to Linux laptops. The XPS 13 DE models have 4K screens, Intel Skylake chips and the Thunderbolt 3 interconnect, which are new to Linux laptops.

The Linux laptop is a cousin of the XPS 13 with Windows 10, which was announced earlier this year. The Linux version has Ubuntu 14.04, but it couldn’t be launched at the same time because the Linux drivers weren’t ready. Support for Skylake chips in Ubuntu was announced on Feb. 18, which also held back the laptop’s release.

Original URL: http://www.computerworld.com/article/3046033/computer-hardware/heres-a-look-inside-dells-strategy-for-linux-pcs.html#tk.rss_all

CoreOS takes its Clair container security tool out of beta

CoreOS announced the first preview of Clair, a tool that scans Docker containers for security vulnerabilities, last November. Today, with the launch of Clair 1.0, it is ready to take the beta label off the service. Given that developers often rely on pre-packaged containers, or regularly recycle the same ones, ensuring that the software included in them is safe to run is…

Original URL: http://feedproxy.google.com/~r/Techcrunch/~3/UMGVeokOee0/
