Extracting Text from Our Collection of PACER Documents

We’re getting ready to launch a brand new search engine for PACER content. When it launches, one of the big features it will have is full-text search for the millions of documents that people have submitted using our RECAP system. To our knowledge, this will be the first free system for searching PACER content in this way, allowing you to look up documents by any word they might contain.

The big problem with this goal? We have about a million PDFs that consist only of images. Some of these are actually quite beautiful:

A beautiful handwritten motion. It goes on like this for 46 pages.

But others are hideous:

An 84 page log from 1957. It’s come a long ways just to appear on this blog today.

But no matter how a document looks, we want to extract the text so that we can make it searchable. This is done using a system called Optical Character Recognition (OCR),

Original URL: https://free.law/2016/09/26/extracting-text-from-our-collection-of-pacer-documents/

Original article

Google Cloud Platform sets a course for new horizons

Embracing the multi-cloud world
Not only do applications running on GCP benefit from state-of-the-art infrastructure, but they also run on the latest and greatest compute platforms. Kubernetes, the open source container management system that we developed and open-sourced, reached version 1.4 earlier this week, and we’re actively updating Google Container Engine (GKE) to this new version. GKE customers will be the first to benefit from the latest Kubernetes features, including the ability to monitor cluster add-ons, one-click cluster spin-up, improved security, integration with Cluster Federation and support for the new Google Container-VM image (GCI). Kubernetes 1.4 improves Cluster Federation to support straightforward deployment across multiple clusters and multiple clouds. In our support of this feature, GKE customers will be able to build applications that can easily span multiple clouds, whether they are on-prem, on a different public cloud vendor, or a hybrid of both. We want GCP to be the best place to

Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/3RVh4k5i_Ko/Google-Cloud-Platform-sets-a-course-for-new-horizons.html

Google’s Cloud Machine Learning service is now in public beta

Google announced a number of updates to its cloud computing services at a small event in San Francisco this morning. These updates touch Google’s machine learning services, as well as its database and analytics services, and include an update to how it supports its users. The company’s focus today, though, was clearly on machine learning. Google launched the private alpha of…

Original URL: http://feedproxy.google.com/~r/Techcrunch/~3/z6B7gbgKCNY/

Raspberry Pi Foundation Unveils New LXDE-Based Desktop For Raspbian Called PIXEL

Raspberry Pi Foundation’s Simon Long has unveiled a new desktop environment for the Debian-based Raspbian GNU/Linux operating system for Raspberry Pi devices. From a Softpedia report (submitted by an anonymous reader): Until today, Raspbian shipped with the well-known and lightweight LXDE desktop environment, which looks pretty much the same as on any other Linux-based distribution out there that is built around LXDE (Lightweight X11 Desktop Environment). But Simon Long, a UX engineer working for Raspberry Pi Foundation, was hired to make it better, transform it into something that’s more appealing to users. So after two years of work, he managed to create a whole new desktop environment for Raspbian, the flagship operating system for Raspberry Pi single-board computers developed and distributed by Raspberry Pi Foundation. Called PIXEL, the new Raspbian desktop offers a more eye-candy design with the panel on top (not on the bottom like on a default LXDE setup),

Original URL: http://rss.slashdot.org/~r/Slashdot/slashdot/~3/KgnQgJG-XSQ/raspberry-pi-foundation-unveils-new-lxde-based-desktop-for-raspbian-called-pixel

TensorFlow for R


TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.
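The data flow idea described above can be illustrated with a toy evaluator. This is a minimal sketch in plain Python, not the TensorFlow API: the `Node` class and the `constant`, `add`, `matmul`, and `run` helpers are hypothetical names invented for this illustration, and nested lists stand in for tensors.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Tensor = List[List[float]]  # a 2-D nested list stands in for a general tensor


@dataclass
class Node:
    """One operation in the graph; `inputs` are the incoming edges."""
    name: str
    op: Callable[..., Tensor]
    inputs: Sequence["Node"] = field(default_factory=tuple)


def constant(name, value):
    # A source node: it has no inputs and simply emits its value.
    return Node(name, lambda: value)


def add(a, b):
    # Element-wise addition of two equally shaped 2-D arrays.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    # Plain matrix multiplication: rows of `a` against columns of `b`.
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def run(node, cache=None):
    """Evaluate `node` by first evaluating the nodes it depends on."""
    cache = {} if cache is None else cache
    if node.name not in cache:
        args = [run(dep, cache) for dep in node.inputs]
        cache[node.name] = node.op(*args)
    return cache[node.name]


# Build the graph first, then execute it: result = x * w + b
x = constant("x", [[1.0, 2.0]])
w = constant("w", [[3.0], [4.0]])
b = constant("b", [[0.5]])
product = Node("product", matmul, (x, w))
result = Node("result", add, (product, b))
output = run(result)
```

Nothing is computed while the graph is being built; values only flow along the edges when `run` is called, which is the separation between constructing and executing a graph that the paragraph above describes.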

The TensorFlow API is composed of a set of Python modules that enable constructing and executing TensorFlow graphs. The tensorflow package provides access to the complete TensorFlow API from within R.


Install the main TensorFlow distribution:

Install the tensorflow R package:


The tensorflow package will be built against the default version of Python found in the PATH. If you want to build against a specific version of Python, you can define the TENSORFLOW_PYTHON_VERSION environment variable before installing. For example:

Verify that your installation is working

Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/6D6Vcv_e-T0/tensorflow

The New Raspberry Pi OS Is Here, and It Looks Great

The Raspberry Pi’s main operating system, Raspbian, just got a brand new look from the Raspberry Pi Foundation. Dubbed PIXEL, it’s a skin for Raspbian that modernizes the interface, adds some new programs, and makes it much more pleasant to use. Let’s take a closer look at your Pi’s new appearance.

Original URL: http://feeds.gawker.com/~r/lifehacker/full/~3/tPGacGl9OfY/the-new-raspberry-pi-os-is-here-and-it-looks-great-1787194540

Tensorflow Ruby API

TensorFlow is an extraordinary open source software library for numerical computation using data flow graphs. It was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organisation for the purpose of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

TensorFlow comes with an easy to use Python interface and a C++ interface to build and execute your computational graphs. However, TensorFlow is available only in Python, and due to the strong interest from the Ruby community, I took an interest in porting it. I started working on a Ruby API with support from Somatic.io and the SciRuby foundation and came across some cool things that I would like to share with you. I am a student at the Indian Institute of Technology, Kharagpur. I am extremely fascinated with open

Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/GLoQXZTuKUE/introducing-tensorflow-ruby-api-e77a477ff16e

A Commodore 64 Is Still Being Used to Run an Auto Shop in Poland

Hell yeah.

We need to learn a lesson about needless consumerism from this auto repair shop in Gdansk, Poland. Because it still uses a Commodore 64 to run its operations. Yes, the same Commodore 64 released 34 years ago that clocked in at 1 MHz and had 64 kilobytes of RAM. It came out in 1982, was discontinued in 1994, but it’s still used to run a freaking company in 2016. That’s awesome.

To be sure, small businesses around the world often use technology that’s a bit more outdated than what the rest of us use in our daily lives but damn, flexing a Commodore 64 for work in a time when babies are given smartphones before pacifiers is pretty damn bad ass.

Here’s what Commodore USA’s Facebook page wrote regarding the computer: This C64C used by a small auto repair shop for balancing driveshafts has been working non-stop for over 25 years! And

Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/udsKMHkpidQ/this-old-ass-commodore-64-is-still-being-used-to-run-an-1787196319

PostgreSQL 9.6 Released

Posted on 2016-09-29
PostgreSQL 9.6, the latest version of the world’s leading open source database, was released today by the PostgreSQL Global Development Group. This release will allow users to both scale up and scale out high performance database workloads. New features include parallel query, synchronous replication improvements, phrase search, and improvements to performance and usability, as well as many more features.
Scale Up with Parallel Query
Version 9.6 adds support for parallelizing some query operations, enabling utilization of several or all of the cores on a server to return query results faster. This release includes parallel sequential (table) scan, aggregation, and joins. Depending on details and available cores, parallelism can speed up big data queries by as much as 32 times.
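The general shape of a parallel scan plus aggregation can be sketched as follows. This is only an illustration of the split, partial-aggregate, and combine pattern, not PostgreSQL’s implementation: `partial_aggregate` and `parallel_average` are hypothetical names, a Python list stands in for a table, and Python threads are used to show the structure rather than to demonstrate a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(rows):
    """Worker step: scan one chunk and return a partial (count, sum)."""
    count = 0
    total = 0
    for value in rows:
        count += 1
        total += value
    return count, total


def parallel_average(values, workers=4):
    """Split the scan across workers, then combine the partial results."""
    if not values:
        return None
    chunk = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    # Each worker scans its own chunk independently of the others.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_aggregate, chunks))
    # Final step: merge the per-worker partial aggregates into one answer.
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return total / count


table = list(range(1, 1001))  # stand-in for a column of 1,000 rows
average = parallel_average(table, workers=4)
```

An average is computed here as a (count, sum) pair because those partial results can be merged exactly, whereas averaging the per-chunk averages would be wrong whenever the chunks differ in size.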
“I migrated our entire genomics data platform – all 25 billion legacy MySQL rows of it – to a single Postgres database, leveraging the row compression abilities of the JSONB datatype,

Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/Lg1Bwi4EvnU/
