Review: Amazon puts machine learning in reach

As a physicist, I was originally trained to describe the world in terms of exact equations. Later, as an experimental high-energy particle physicist, I learned to deal with vast amounts of data with errors and to evaluate competing models to describe the data. Business data, taken in bulk, is often messier and harder to model than the physics data on which I cut my teeth. Simply put, human behavior is complicated, inconsistent, and not well understood, and it’s affected by many variables.

Original URL:

Original article

How to Install and Configure MySQL Cluster on CentOS 7

MySQL Cluster is designed to provide a MySQL-compatible database with high availability and low latency. The MySQL Cluster technology is implemented through the NDB (Network DataBase) and NDBCLUSTER storage engines and provides shared-nothing clustering and auto-sharding for MySQL database systems. In the shared-nothing architecture, each node has its own memory and disk; the use of shared storage such as NFS or SANs is neither recommended nor supported.

Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day

It took less than 24 hours for Twitter to corrupt an innocent AI chatbot. Yesterday, Microsoft unveiled Tay — a Twitter bot that the company described as an experiment in “conversational understanding.” The more you chat with Tay, said Microsoft, the smarter it gets, learning to engage people through “casual and playful conversation.”

Unfortunately, the conversations didn’t stay playful for long. Pretty soon after Tay launched, people started tweeting the bot with all sorts of misogynistic, racist, and Donald Trumpist remarks. And Tay — being essentially a robot parrot with an internet connection — started repeating these sentiments back to users, proving correct that old programming adage: flaming garbage pile in, flaming garbage pile out.

Now, while these screenshots seem to show that Tay has assimilated the internet’s worst tendencies into its personality, it’s not quite as straightforward as that. Searching through Tay’s tweets (more than 96,000 of them!) we can see that many of the bot’s nastiest utterances have simply been the result of copying users. If you tell Tay to “repeat after me,” it will — allowing anybody to put words in the chatbot’s mouth.

One of Tay’s now deleted “repeat after me” tweets.

However, some of its weirder utterances have come out unprompted. The Guardian picked out a (now deleted) example when Tay was having an unremarkable conversation with one user (sample tweet: “new phone who dis?”), before it replied to the question “is Ricky Gervais an atheist?” by saying: “ricky gervais learned totalitarianism from adolf hitler, the inventor of atheism.”

But while it seems that some of the bad stuff Tay is being told is sinking in, it’s not like the bot has a coherent ideology. In the span of 15 hours Tay referred to feminism as a “cult” and a “cancer,” as well as noting “gender equality = feminism” and “i love feminism now.” Tweeting “Bruce Jenner” at the bot got a similarly mixed response, ranging from “caitlyn jenner is a hero & is a stunning, beautiful woman!” to the transphobic “caitlyn jenner isn’t a real woman yet she won woman of the year?” (Neither of which were phrases Tay had been asked to repeat.)

It’s unclear how much Microsoft prepared its bot for this sort of thing. The company’s website notes that Tay has been built using “relevant public data” that has been “modeled, cleaned, and filtered,” but it seems that after the chatbot went live, filtering went out the window. The company started cleaning up Tay’s timeline this morning, deleting many of its most offensive remarks.

Tay’s responses have turned the bot into a joke, but they raise serious questions

It’s a joke, obviously, but there are serious questions to answer, like how are we going to teach AI using public data without incorporating the worst traits of humanity? If we create bots that mirror their users, do we care if their users are human trash? There are plenty of examples of technology embodying — either accidentally or on purpose — the prejudices of society, and Tay’s adventures on Twitter show that even big corporations like Microsoft forget to take any preventative measures against these problems.

For Tay, though, it all proved a bit too much, and just past midnight this morning, the bot called it a night.

In an emailed statement given later to Business Insider, Microsoft said: “The AI chatbot Tay is a machine learning project, designed for human engagement. As it learns, some of its responses are inappropriate and indicative of the types of interactions some people are having with it. We’re making some adjustments to Tay.”

Update March 24th, 6:50AM ET: Updated to note that Microsoft has been deleting some of Tay’s offensive tweets.

Update March 24th, 10:52AM ET: Updated to include Microsoft’s statement.

Microsoft’s ‘Teen Girl’ AI Experiment Becomes a ‘Neo-Nazi Sex Robot’

Reader Penguinisto writes: Recently, Microsoft put an AI experiment onto Twitter, naming it “Tay”. The bot was built to be fully aware of the latest adolescent fixations (e.g. celebrities and similar), and to interact like a typical teen girl. In less than 24 hours, it inexplicably became a neo-nazi sex robot with daddy issues. Sample tweets from it proclaimed that “Hitler did nothing wrong!”, then went on to blame former President Bush for 9/11, stated that “donald trump is the only hope we’ve got”, and other similar instances. As the hours passed, it all went downhill from there, eventually spewing racial slurs and profanity, demanding sex, and calling everyone “daddy”. The bot was quickly removed once Microsoft discovered the trouble, but the hashtag is still around for those who want to see it in its ugly raw splendor.

Read more of this story at Slashdot.

Newcomer Galactic Exchange can spin up a Hadoop cluster in five minutes

A new company with a cool name, Galactic Exchange, came out of stealth today with a great idea. It claims it can spin up a Hadoop cluster for you in five minutes, ready to go. That’s no small feat if it works as advertised, and it greatly simplifies what has traditionally been a process fraught with complexity.
The new product, called ClusterGX, is being released in beta this week…

Citus Unforks from PostgreSQL, Goes Open Source

When we started working on CitusDB 1.0 four years ago, we envisioned scaling out relational databases. We loved Postgres (and the elephant) and picked it as our underlying database of choice. Our goal was to extend this database to seamlessly shard and replicate your tables, provide high availability in the face of failures, and parallelize your SQL queries across a cluster of machines.

We wanted to make the PostgreSQL elephant magical.

Four years later, CitusDB has been deployed into production across a number of verticals, and received numerous feature improvements with every release. PostgreSQL also became much more extensible in that time–and we learned a lot more about it.

Today, we’re happy to release Citus 5.0, which seamlessly scales out PostgreSQL across a cluster of machines for real-time workloads. We’re also excited to share two major announcements in conjunction with the 5.0 release!

First, Citus 5.0 now fully uses the PostgreSQL extension APIs. In other words, Citus becomes the first distributed database in the world that doesn’t fork the underlying database. This means Citus users can immediately benefit from new features in PostgreSQL, such as semi-structured data types (json, jsonb), UPSERT, or, when 9.6 arrives, no more full-table vacuums. Also, users can keep working with their existing Postgres drivers and tools.

Second, Citus is going open source! The project, codebase, and all open issues are now available on GitHub. We realized that when we mentioned Citus and PostgreSQL together to prospective users, they already assumed that Citus was open. After many conversations with our customers, advisors, and board, we are happy to make Citus available for everyone.

To see how to get started with it, let’s take a hands-on look.

Getting Rolling

You can download the extension here. Once you’ve downloaded it, you can bootstrap your initial Postgres database with the Citus cluster. From there, you can begin using Citus.
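In PostgreSQL, an installed extension is enabled per database with CREATE EXTENSION. The statement below is a minimal sketch of that step. It assumes the Citus package is already installed on the server and preloaded through shared_preload_libraries, neither of which is spelled out above.

```sql
-- Assumption: Citus is installed and listed in shared_preload_libraries.
-- Enables the extension in the current database.
CREATE EXTENSION citus;
```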

Now that the extension is enabled you can begin taking advantage. First we’ll create a table, then we’re going to tell Citus to create it as a distributed table, and finally we’ll inform it about our shards. If you’re running Citus on a single machine, this will scale queries across multiple CPU cores and create the impression of sharding across databases.

As an example, covered in more detail in our tutorial, we’re going to create a table to capture edits from Wikipedia, then shard this table across multiple Postgres instances. First let’s create our table:

CREATE TABLE wikipedia_changes (
  editor TEXT, -- The editor who made the change
  time TIMESTAMP WITH TIME ZONE, -- When the edit was made
  bot BOOLEAN, -- Whether the editor is a bot

  wiki TEXT, --  Which wiki was edited
  namespace TEXT, -- Which namespace the page is a part of
  title TEXT, -- The name of the page

  comment TEXT, -- The message they described the change with
  minor BOOLEAN, -- Whether this was a minor edit (self-reported)
  type TEXT, -- "new" if this created the page, "edit" otherwise

  old_length INT, -- How long the page used to be
  new_length INT -- How long the page is as of this edit
);

Now that we’ve created our table we’re going to tell the Citus extension this is the one we want to shard. In the case of our demo, we’re going to lower the replication factor to one, since we’re only running one worker node.

SET citus.shard_replication_factor = 1;
SELECT master_create_distributed_table( 'wikipedia_changes', 'editor', 'hash' );
SELECT master_create_worker_shards('wikipedia_changes', 16, 1);

You can start inserting data with a standard INSERT INTO and Citus will shard and distribute your data across multiple nodes. If you want a jump start on loading data, check out our tutorial, which has scripts to help you start loading data automatically from the Wikipedia event stream.
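As a rough sketch of that step, the statements below insert one row into the wikipedia_changes table defined earlier and then query it. The row values are made up for illustration; only the table and column names come from the example above.

```sql
-- Insert a single edit event; every value here is an illustrative placeholder.
INSERT INTO wikipedia_changes
  (editor, time, bot, wiki, namespace, title, comment, minor, type, old_length, new_length)
VALUES
  ('example_editor', '2016-03-24 10:00:00+00', false, 'enwiki', 'article',
   'Example page', 'fixed a typo', true, 'edit', 1200, 1204);

-- Query the distributed table with ordinary SQL: the ten most active editors.
SELECT editor, count(*) AS edits
FROM wikipedia_changes
GROUP BY editor
ORDER BY edits DESC
LIMIT 10;
```

Because the table is hash-distributed on editor, rows with the same editor value land in the same shard.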

It’s that simple to use the fully open source Citus 5.0 extension. Now, let’s take a deeper look at some of the technical details.

What’s Unique about Citus?

Citus uses three new ideas when building the distributed database.

  1. Citus scales out SQL by extending PostgreSQL, not forking it. This way, users benefit from all the performance and feature work done on Postgres over the past two decades, scaled out on a cluster of machines.
  2. Data-intensive applications have evolved over time to require multiple workloads from the database. Citus comes with three distributed executors, recognizing differences across operational (low-latency) and analytic (high-throughput) workloads.
  3. Parallelizing SQL queries requires that the underlying theoretical framework is complete. Citus’ distributed query planner uses multi-relational algebra, which is proven to be complete.

These principles help us lay the foundation for a scalable relational database. With that said, we know that we still have more work ahead of us. PostgreSQL is huge, and Citus currently doesn’t support the full spectrum of SQL queries. For details on SQL coverage, please see our FAQ.

A good way to get started with Citus today is to think of it in terms of your use-case.

Common Use Cases 

Citus provides users real-time responsiveness over large datasets, most commonly seen in rapidly growing event systems or with time series data. Common uses include powering real-time analytic dashboards, exploratory queries on events as they happen, session analytics, and large data set archival and reporting.

Citus is deployed in production across multiple verticals, ranging from technology start-ups to large enterprises. Here are some examples:

  • CloudFlare uses Citus to provide real-time analytics on 100 TBs of data from over 4 million customer websites.
  • Neustar builds and maintains a scalable ad-tech infrastructure that analyzes billions of events per day using HyperLogLog and Citus.
  • Agari uses Citus to secure more than 85 percent of U.S. consumer emails on two 6-8 TB clusters.
  • Heap uses Citus to run dynamic funnel, segmentation, and cohort queries across billions of users and tens of billions of events.

As excited as we are to make Citus 5.0 available to everyone, we’d be remiss to not pay attention to those of you who need something more. For customers with large production deployments, we also offer an enterprise edition that comes with additional functionality and commercial support.

In Conclusion

We’re excited to release the latest version of Citus and make it open source. And we’d love to hear your feedback. If you have questions or comments for us, start a thread in our Google Group, join us through the Citus IRC channel, or open an issue on GitHub.

A Node virtual machine

Ideal Node.js hosting service — 

  1. Here’s the URL of a Git repo containing a Node app. Run it with forever.
  2. I want to be able to do a tail -f on the log so I can see what’s going on. 
  3. And to view its file system, perhaps through a browser-based JS app.

I think that’s about it. I give you a URL, you run it.

I know some services get close, but I don’t want close. I want just this. 

A Node virtual machine. 
