Gone in Six Characters: Short URLs Considered Harmful for Cloud Services

[This is a guest post by Vitaly Shmatikov, professor at Cornell Tech and once upon a time my adviser at the University of Texas at Austin. — Arvind Narayanan.]

TL;DR: short URLs produced by bit.ly, goo.gl, and similar services are so short that they can be scanned by brute force.  Our scan discovered a large number of Microsoft OneDrive accounts with private documents.  Many of these accounts are unlocked and allow anyone to inject malware that will be automatically downloaded to users’ devices.  We also discovered many driving directions that reveal sensitive information for identifiable individuals, including their visits to specialized medical facilities, prisons, and adult establishments.

URL shorteners such as bit.ly and goo.gl perform a straightforward task: they turn long URLs into short ones, consisting of a domain name followed by a 5-, 6-, or 7-character token.  This simple convenience feature turns out to have an unintended consequence.  The tokens are so short that the entire set of URLs can be scanned by brute force.  The actual, long URLs are thus effectively public and can be discovered by anyone with a little patience and a few machines at her disposal.
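
For a rough sense of scale, here is a back-of-the-envelope calculation, assuming the 62-character alphanumeric alphabet (a–z, A–Z, 0–9) that these tokens appear to use:

// Size of the token space for bit.ly-style short URLs:
Math.pow(62, 6);   // ≈ 5.68e10 possible 6-character tokens
Math.pow(62, 7);   // ≈ 3.52e12 possible 7-character tokens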

Today, we are releasing our study, 18 months in the making, of what URL shortening means for the security and privacy of cloud services.  We did not perform a comprehensive scan of all short URLs (as our analysis shows, such a scan would have been within the capabilities of a more powerful adversary), but we sampled enough to discover interesting information and draw important conclusions.  Our study focused on two cloud services that directly integrate URL shortening: Microsoft OneDrive cloud storage (formerly known as SkyDrive) and Google Maps.  In both cases, whenever a user wants to share a link to a document, folder, or map with another user, the service offers to generate a short URL – which, as we show, unintentionally makes the original URL public.

OneDrive.

OneDrive generates short URLs for documents and folders using the 1drv.ms domain.  This is a “branded short domain” operated by Bitly and uses the same tokens as bit.ly.  Therefore, any scan of bit.ly short URLs automatically discovers 1drv.ms URLs.  In our sample scan of 100,000,000 bit.ly URLs with randomly chosen 6-character tokens, 42% resolved to actual URLs.  Of those, 19,524 URLs led to OneDrive/SkyDrive files and folders, most of them live.  But this is just the beginning.

OneDrive URLs have a predictable structure.  From the URL to a single shared document (“seed”), one can construct the root URL and automatically traverse the account, discovering all files and folders shared under the same capability as the seed document or without a capability.  For example, suppose you obtain a short URL such as http://1drv.ms/1xNOWV7 which resolves to https://onedrive.live.com/?cid=48…48&id=48…48!115&ithint=folder,xlsx&authkey=!A..q4.  First, parse the URL and extract the cid and authkey parameters.  Then construct the root URL for the account as https://onedrive.live.com/?cid=48…48&authkey=!A...q4.  From the root URL, it is easy to automatically discover URLs of other shared files and folders in the account (note: the following traversal methodology no longer works as of March 2016).  To find individual files, parse the HTML code of the page and look for <a> elements with href attributes containing &app=, &v=, /download.aspx?, or /survey?.  To find other folders, look for links that start with https://onedrive.live.com/ and contain the account’s cid.
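
A minimal sketch of that traversal heuristic, written here in Node.js for illustration (hypothetical helper functions, and, as noted above, this no longer works against OneDrive as of March 2016):

// Sketch only: rebuild the account root URL from a resolved OneDrive link
// and pick out the hrefs that the heuristics above treat as shared files/folders.
const url = require('url');

function accountRoot(resolvedUrl) {
  const q = url.parse(resolvedUrl, true).query;   // { cid, id, authkey, ... }
  return 'https://onedrive.live.com/?cid=' + q.cid +
         (q.authkey ? '&authkey=' + q.authkey : '');
}

function extractSharedLinks(html, cid) {
  const hrefs = html.match(/href="[^"]+"/g) || [];
  const files = hrefs.filter(h => /&app=|&v=|\/download\.aspx\?|\/survey\?/.test(h));
  const folders = hrefs.filter(h =>
    h.indexOf('https://onedrive.live.com/') !== -1 && h.indexOf(cid) !== -1);
  return { files: files, folders: folders };
}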

The traversal-augmented scan yielded URLs to 227,276 publicly accessible OneDrive documents, including tens of thousands of PDF and Word files, spreadsheets, media files, and executable binaries.  A similar scan of 100,000,000 random 7-character bit.ly tokens yielded URLs to 1,105,146 publicly accessible OneDrive documents.  We did not download their contents, but just from the metadata it is obvious that many of them contain private or sensitive information.

Around 7% of the OneDrive folders discovered in this fashion allow writing.  This means that anyone who randomly scans bit.ly URLs will find thousands of unlocked OneDrive folders and can modify existing files in them or upload arbitrary content, potentially including malware.  Microsoft’s virus scanning for OneDrive accounts is trivial to evade (for example, it fails to discover even the test EICAR virus if the attacker goes to the trouble of compressing it).  Furthermore, OneDrive “synchronizes” account contents across the user’s OneDrive clients.  Therefore, the injected malware will be automatically downloaded to all of the user’s machines and devices running OneDrive.

Google Maps.

Before September 2015, short goo.gl/maps URLs used 5-character tokens.  Our sample random scan of these URLs yielded 23,965,718 live links, of which 10% were for maps with driving directions.  These include directions to and from many sensitive locations: clinics for specific diseases (including cancer and mental diseases), addiction treatment centers, abortion providers, correctional and juvenile detention facilities, payday and car-title lenders, gentlemen’s clubs, etc.  The endpoints of driving directions often contain enough information (e.g., addresses of single-family residences) to uniquely identify the individuals who requested the directions. For instance, when analyzing one such endpoint, we uncovered the address, full name, and age of a young woman who shared directions to a Planned Parenthood facility. Conversely, by starting from a residential address and mapping all addresses appearing as the endpoints of the directions to and from the initial address, one can create a map of who visited whom.

Fine-grained data associated with individual residential addresses can be used to infer interesting information about the residents. We conjecture that one of the most frequently occurring residential addresses in our sample is the residence of a geocaching enthusiast. He or she shared directions to hundreds of locations around Austin, Texas, as shown in the picture, many of them specified as GPS coordinates. We have been able to find some of these coordinates in a geocaching database.

It is also worth mentioning that there is a rich literature on inferring information about individuals from location data. For example, Crandall et al. inferred social ties between people based on their co-occurrence in a geographic location, Isaacman et al. inferred important places in people’s lives from location traces, and de Montjoye et al. observed that 95% of individuals can be uniquely identified given only 4 points in a high-resolution location dataset.

What happened when we told them.

We made several attempts to report the security and privacy risks of short OneDrive URLs to Microsoft’s Security Response Center (MSRC).  After an email exchange that lasted over two months, “Brian” informed us on August 1, 2015, that the ability to share documents via short URLs “appears by design” and “does not currently warrant an MSRC case.”  As of March 2016, the URL shortening option is no longer available in the OneDrive interface, and the account traversal methodology described above no longer works.  After we contacted MSRC again, they denied that these changes have anything to do with our previous report and reiterated that the issues we discovered do not qualify as a security vulnerability.

As of this writing, all previously generated short OneDrive URLs remain vulnerable to scanning and malware injection.

We reported the privacy risks of short Google Maps URLs to the Google Security Team.  They responded immediately.  All newly generated goo.gl/maps URLs have 11- or 12-character tokens, and Google deployed defenses to limit the scanning of the existing URLs.

How cloud services should use URL shorteners.

Use longer tokens in short URLs.  Warn users that shortening a URL may expose the content behind the original URL to unintended third parties.  Use your own resolver and tokens, not bit.ly.  Detect and limit scanning, and consider techniques such as CAPTCHAs to separate human users from automated scanners.  Finally, design better APIs so that leakage of a single URL does not compromise every shared URL in the account.
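
As a minimal sketch of the first recommendation, assuming a Node.js-based shortening service: generate tokens with enough randomness that brute-force enumeration is hopeless, e.g. 128 bits instead of 5–7 characters.

const crypto = require('crypto');

function newToken() {
  // 16 random bytes = 128 bits, rendered as 32 URL-safe hex characters;
  // the resulting token space (2^128) is far beyond any brute-force scan.
  return crypto.randomBytes(16).toString('hex');
}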


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/tSi6hs1px_0/

Original article

Show HN: Net-Commander – Automation and IoT IDE

This is a next-generation tool, optimized for managing, programming, and automating machines, services, and apps, or just loose snippets of code!

It can be used for IoT, automation, or just as a better interface for your build tasks on multiple SSH servers. It acts as a controller running in the background, but also lets you create interfaces with a visual designer.

Features

How does it work?

Screenshots


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/kVBqVx749Qg/

Original article

Show HN: Metachat, a news feed for unread Slack messages

Connect Metachat to all of your Slack accounts

We use advanced artificial intelligence to read your messages for you

Read Metachat’s summary feed, and get on with your day!

Metachat sorts your unread messages into an orderly feed

Related messages are clustered automatically — never wade through dozens of silly jokes again!

It also searches across all of your teams

Ever try to find a message or a link, only to forget which account you should even start with?

Metachat’s meta-search is for you.

Ready to experience messaging bliss?

Just create an account and connect your Slack teams.

Get started now


Metachat is not created by, affiliated with, or supported by Slack Technologies, Inc.

We’re also not affiliated with the informal chat community for Metafilter users.



Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/joMwb_sjmvc/

Original article

How I debug Node.js

15 Apr 2016

Node.js is a server-side runtime for JavaScript apps. Being a server-side runtime, it doesn’t have a GUI. Therefore its ability to provide an easy-to-use debugging interface is limited.

Why I don’t like Popular Options

There are several ways of debugging Node.js; here are a few of the reasons why I don’t like the most popular ones.

Builtin CLI debugger

  1. It doesn’t have a GUI, of course
  2. It’s slow and even hangs at times
  3. You have to remember its commands
  4. Debugging complex problems is nearly impossible, if not outright impossible

node-inspector

  1. It shows ES6 Maps and Sets as null
  2. It shows objects as empty randomly
  3. It is generally slow
  4. It’s very unstable

IDE Debuggers

  1. The IDEs are costly
  2. Each has its own UI
  3. They are hard to set up
  4. They lack advanced features

Electron to the rescue

Electron is an open-source project by GitHub; it is basically Chromium + Node.js. It has the best of both worlds: Node’s require, globals, and all the other APIs, along with the Chromium Dev Tools.

I have written a small wrapper around Electron to allow for quick Node.js debugging. It’s called DeNode, short for Debug Node.
You can install it using npm
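
Presumably via a global install, something like:

npm install -g denode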

It registers itself as the denode bin; it accepts an optional file path of the Node module to execute on boot.

denode
denode ./index
denode `which browserify`
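
Under the hood, a wrapper like this can be little more than an Electron main process that opens the Dev Tools and loads a page that require()s the target module. A minimal sketch of the idea (not DeNode’s actual code; it assumes a 2016-era Electron where renderer processes have Node integration enabled by default):

// main.js — run with: electron main.js ./path/to/module
const { app, BrowserWindow } = require('electron');

app.on('ready', () => {
  const win = new BrowserWindow({ width: 1200, height: 800 });
  win.webContents.openDevTools();   // full Chromium Dev Tools
  win.loadURL('file://' + __dirname + '/index.html');
  // index.html would contain a short script that require()s the module
  // under inspection, so it runs with the Dev Tools attached.
});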

What’s awesome about this?

  1. You can click and expand on deep nested objects
  2. You can profile your apps for memory leaks and CPU time
  3. You can set breakpoints on the fly
  4. You can update running code from dev tools
  5. Basically, all the awesomeness of Chromium Dev Tools

What’s the side effect?

  1. You can’t run it over a network or inside a VM; theoretically you could do X forwarding, but it would be too slow and painful

That’s what I use to debug my Node.js apps; let me know what you think of it in the comments below.


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/_evRR6WWuLs/how-i-debug-node-js.html

Original article

Defeating the npm worm

The threat

npm’s default behavior makes it possible to write a worm that propagates to anyone who runs npm install on a package containing an infected dependency (even a deeply nested one).

More on this topic

This threat uses a combination of elements:

  1. npm install installs a package and all of its dependencies, recursively, by default
  2. npm install --save records dependency versions in such a way that, by default, any later patch release of the same package is accepted
  3. npm install, by default, runs all lifecycle scripts of all dependencies, and these can be arbitrary bash commands (a harmless example follows this list)
  4. npm login, by default, saves the authentication credentials in $HOME/.npmrc.
  5. By default, published packages go live without any review
  6. Arbitrary npm lifecycle scripts have, by default, full authority over the installing user’s computer. This includes scanning the person’s hard drive, finding all their Node projects, modifying them to add worm-propagation code as a lifecycle script, and then running npm version patch ; npm publish.
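
To make point 3 concrete: a lifecycle script is just a shell command declared in package.json that npm install runs automatically. A harmless, hypothetical example:

{
  "name": "innocent-looking-package",
  "version": "1.0.0",
  "scripts": {
    "postinstall": "echo 'this could be any shell command, run on install'"
  }
}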

Through the repeated, annoying use of “by default”, the reader understands that if a worm is sent to npm, it will propagate, because most people don’t change defaults. Point 6 gave an example of what the worm can do to propagate, but of course once you have arbitrary access to the machine, you can just encrypt the disk and demand a bitcoin ransom to give the data back.

This is serious.

NPM response and defense against a worm

npm’s response has been considered weak by some. It is. It does not change any default settings, so the threat is not really addressed.

Users should opt-in for security

npm recommends an opt-in defense: running npm install --ignore-scripts, which disables lifecycle scripts. It is suggested elsewhere to use npm shrinkwrap to lock down dependencies, or to log out systematically after having published. These are all opt-ins.
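
For reference, the opt-in defenses mentioned above boil down to a handful of commands (the npm config line is my addition; it just makes --ignore-scripts the default via your .npmrc):

npm install --ignore-scripts        # skip lifecycle scripts for this install
npm config set ignore-scripts true  # make that the default for every install
npm shrinkwrap                      # pin the exact dependency tree
npm logout                          # drop the token that npm login saved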

Crowdsourced package inspection

npm cannot guarantee that packages available on the registry are safe. If you see malicious code on the registry, report it to support@npmjs.com and it will be taken down.

So, after the fact, once users’ machines have been infected by a worm, npm takes the malicious package down? Too late, but thanks, lol!

As soon as a package is infected, people installing it will have their machines infected. npm cannot accept being used to infect other people’s machines. Even if npm somewhat limits the propagation, malicious packages will still be sent to npm.

Maybe the next person dissatisfied with a name-dispute resolution who “owns” another heavily-depended-upon module will send malicious patch updates instead of just unpublishing.

As packages become more valuable, their authors become more likely targets of various attacks. Maybe someone will pay an author for the ownership of a heavily-depended-upon module in order to distribute malware. This has happened with Chrome extensions; there is no reason it won’t happen with npm, especially as long as there is no trust model for authors.

Defense against a “quick” worm

npm monitors publish frequency. A spreading worm would set off alarms within npm, and if a spreading worm were identified we could halt all publishing while infected packages were identified and taken down.

What about a patient worm? Its publication frequency is exactly the same as the normal frequency, and its discretion makes it hard to detect on users’ machines as well as hard to tell which packages on npm are infected. You can start playing the virus-signature game, but attack is always a step ahead of defense in that game, and it would be a massive amount of resources spent on this problem alone. Blaaah…

Security by default

Software should be secure by default, not an opt-in.

People who are coming to npm today and tomorrow have missed the blogposts and tweets. They won’t opt in; they’ll be infected.

People who reinstall node/npm will forget to opt in. They’ll be infected.

I am sorry, but the current insecure-by-default state of npm is irresponsible. Some default needs to be changed.

But which default should be changed?

Let’s review the list above:

  1. Sort of the very point of npm install, let’s keep this default.
  2. Accepting patch versions: Hey, super useful when a module has a security patch! Removing this default pretty much means trading one threat for another; a choice no one should ever have to make. This default remains.
  3. It’s been suggested that removing lifecycle scripts would help security. Sure it does, but then you have to run the lifecycle scripts manually because they’re useful. Oh! And by the way, you’re infected by a worm if you don’t review all of them before you run them! Not sure the security would be improved that much. This default remains.
  4. Removing this default means logging in every time. Arguably a Denial-Of-Service attack against the user (credit for the joke). There is no reason to give up usability for the sake of security.
  5. This one can be debated. Complicated topic. I’m on the side of keeping things as they are today. It’s like the web: anyone can publish, no authorization required. And in any case, it’s not economically tenable to not pay for npm and still expect them to review packages manually.
  6. well… last element in the list, so I guess that’s the default I should address :-p

People I don’t necessarily trust write scripts, post them to npm as lifecycle scripts, and those scripts run on my machine. Why on Earth would these scripts have access to my entire filesystem, by default? This is an absurd amount of authority to give to random scripts downloaded from npm, whose operators themselves tell us they “cannot guarantee that packages available on the registry are safe”.

This is a classic violation of POLA.

Quite often, we have enough context to know that things look really alarming from the outside are really not that big a deal or rather are no bigger deal than is already there and in an unfixable way in the CLI; the package script vulnerability is a good example of that. That’s just a cost of doing business with user-contributed content.

Forrest Norvell during a recent npm CLI team meeting

[Image: Matrix Morpheus meme: “What if I told you it’s possible to do secure user-contributed content”]

Red pill coming your way.

Aside on capability security

Like the joke above, I’m only parroting the words of others here.

(these talks are long, but they’re worth your time, I promise. I have others if you’re interested)

The folks in these videos have good metaphors for the state of software security. One of my favorite quotes comes from Marc Stiegler:

Buffy Summers! In Season 3, her mother makes the criticism that every security person needs to pay attention to! Joyce says to Buffy: “but what’s your plan? you go out every day, you kill a bunch of bad guys and then the next day there is more bad guys. You can’t win this way. You need a plan.”
And finally, Buffy in the last episode of the last season comes up with a plan: she changes the fundamental physics of her Universe to permanently favor the defender.

What could be lazier than forcing the other guy to play by your rules?

Why are we running commands that, by default, have the authority to publish npm packages on our behalf? We’re playing the attacker game. Any npm command and lifecycle script should only have the authority to do its job and no more.

But how?

Secure-by-default lifecycle scripts

The first step is defining the appropriate amount of authority that lifecycle scripts should have.
What are legitimate usages of lifecycle scripts? We can start with the following list:

  • build things (like compiling CoffeeScript, or compiling some C++ to make a native Node module) and put the output somewhere in the project directory

(yes there is a single item, let’s have a discussion on what that list should be)

So the lifecycle script needs read-write authority over the project directory. Cool! Let’s give it access only to this specific directory and no other files!
…wait! Why does it have write authority over package.json? I’ve never heard of a build script that needs to modify package.json, so let’s only give read-only authority over this file and read-write over the rest.

Proof-of-concept of a how

I have a proof of concept of this in the containednpm repo. It uses Docker because that was easy for me to write. Smarter people with more time on their hands will find more subtle solutions. The only point I’m trying to make is that it’s possible, not that my quick implementation should be used or even serve as a reference.
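
The core containment idea can be expressed as roughly the following Docker invocation (a sketch of the principle, not containednpm’s actual command): mount the project read-write, overlay package.json read-only, and give the lifecycle scripts nothing else from the host.

docker run --rm \
  -v "$PWD":/project \
  -v "$PWD/package.json":/project/package.json:ro \
  -w /project \
  node:4 npm install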

In the end, if you run npm install https://github.com/DavidBruant/harmless-worm/tarball/master --save, this is what happens:

  • npm downloads the dependency
  • it is saved under node_modules
  • the postinstall script runs and modifies package.json in a scary way
  • npm modifies package.json to add worm in the dependencies field

But when you run ./bin/containednpm install https://github.com/DavidBruant/harmless-worm/tarball/master --save, what happens is:

  • (same)
  • (same)
  • the postinstall script fails to edit package.json because it only has access to a read-only version
    It would also fail to read your $HOME, because it runs in a Docker container and nobody gave this container access to your $HOME
  • (same)

I have to mention that there is zero copying happening. The package.json that the contained lifecycle scripts see is the actual one. The creation of the node_modules directory happens directly in the right directory, no temporary directory, etc. None of this is magic. Docker does the heavy lifting and I’m just standing on its shoulders.

What happens if the lifecycle script encrypts the filesystem and wants to ransom it for a bitcoin? It succeeds… inside the Docker container, which contains little valuable data (only the project you’re working on; hopefully it’s under version control, so you may not care too much about losing the copy on this computer)… a container that is discarded at the end of the ./bin/containednpm command.
In any case, all your other projects, your browser history, your emails and your $HOME are safe, without you having to pay ransom for them. My pleasure.

That’s the way we can change the rules of the game permanently in favor of the defender.
It’s possible to be a lot more secure when “doing business with user-contributed content”.

Aside

When I started this work and this blogpost, I was planning on only talking about the technical bits: how I cleverly used a combination of Docker, Docker-in-Docker, dynamic Docker volume swaps, dirty $PATH hacks based on the fragile assumption that lifecycle scripts are started with sh -c, etc.

But none of that matters. This is just a defense POC. It’s possible, I did it, let’s have the other more important discussion instead.

I’m happy to make another blog post to explain the technical details (they should be straightforward to anyone familiar with docker and unix command line) or maybe get in a call or answer emails.

Beyond a POC

Hey great work on the POC! I’m going to install your POC so my npm and node are safe by default!

(fictional quote)

My pleasure! This still needs more polishing, but it’s good that there is another opt-in defense against the worm and associated threats!

Regardless of the level of polishing, this defense will remain an opt-in, and we need to change the defaults. This work needs to be merged into the official npm client to be useful at scale. Security should not be an opt-in.

Plan of action

Security should not be an opt-in, so npm needs to be on board; otherwise, this is just yet another opt-in and does not really solve the problem.

npm needs to be on-board

  1. Get some form of acknowledgment from the npm folks that the current default is insecure
  2. Get some form of acknowledgment that npm should be secure by default
  3. Get some form of acknowledgment that what is being discussed in this blogpost (containing lifecycle scripts) is an option that is worth exploring

Exploration and path to becoming default

Assuming there is interest in the exploration:

  1. Make an initial list of legitimate authority that lifecycle scripts should have by default
  2. Figure out the UX and the appropriate tools for implementation (currently, bin/containednpm uses Docker and requires sudo privileges, which is absurd. There are probably other containment tools that don’t require this, and requiring everyone who wants node/npm to install Docker is a ridiculous requirement anyway)
  3. Add that to the npm cli under a --safe flag (opt-in)
  4. Try lots of modules and note the differences between npm install and npm install --safe. Each difference likely means that a lifecycle script implicitly asks for authority that it hasn’t been granted. Decide then whether:
    • the lifecycle script uses legitimate authority => expand the default authority to cover the new use case
    • the lifecycle script uses authority that it shouldn’t need => contact the author for more info to figure out whether it’s legitimate, maybe send a PR to reduce the authority required, or maybe it’s a malicious package that should just be removed from the registry
  5. After a good year (or more? or less?) of the previous step, npm install becomes npm install --unsafe and npm install --safe becomes the default behavior of npm install.

Bim! Secure default! And no major disruption for anyone. --unsafe provides a fallback if you want to get wormed or ransomed, but at least you’re opting in to being attacked! That’s much better news for everyone else.

I have to note that there is a cost related to having to maintain the list of default authority over time. I doubt it will be too much, but I know it cannot be zero.

Additionally, this default behavior for the CLI would act as a negative incentive for anyone who’d want to publish malicious packages. If they know it won’t work for the majority of people by default, they’ll probably try to attack something else.

Things can be better; npm, let’s talk!

ack

Thanks Romain for an early review and for convincing me to pursue the direction of a blogpost that would be less technical and more about the threat and context.
Thanks Thomas for the help trying to make containednpm work on Mac.


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/chpRiz0OFQ8/defeating-the-npm-worm.html

Original article

jQuery Lock Plugin – prevent some users from changing content in your page

I am locked too!

What is jQuery Lock Plugin?

With this jQuery plugin you will be able to prevent some users from changing content in your page, using the Chrome Developer Tools for example.

Basic Usage

<script src="//code.jquery.com/jquery-2.1.3.min.js"></script>
<script src="//cnova.github.io/jquery-lock/release/jquery.lock.min.js"></script>
<script>
$( document ).ready(function() {
    $("h1,p").lock();
});
</script>
<h1> I am locked =) </h1>
<p> I am locked too! </p>

Options

You can customize the behavior of the plugin using these options:

  • alertMessage (String)

Send a custom alert for every change on the locked elements.

Sample

$("h1,p").lock({
    alertMessage: "You can`t change this."
});
  • customHandler(element, updatedHtml, savedHtml) (Function)

Change the behavior of the plugin, instead of just replacing every change with the savedHtml.

Sample

$("h1,p").lock({
    customHandler: function(element, updatedHtml, savedHtml) {
        //Block the change  
        $(element).html(savedHtml);
        //Do something else if the updatedHtml
        console.log('Change blocked %s', updatedHtml);
    }
});


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/xyt9YYdhq2Q/

Original article

Microsoft Edge web browser gets plugin-free Skype, and that is great news for Linux users

As a Linux user, I have stopped using Skype recently. What was once a great experience on Ubuntu, Fedora, and other such operating systems has seemingly been abandoned by Microsoft. Skype on Linux is barely usable nowadays, as the client has not seen an update in quite a while. This is rather tragic, as it is otherwise a great service on other platforms, such as Android, iOS, and of course, Windows. Users of Windows 10 who use the Edge web browser are getting a cool update this month, as Microsoft is rolling out plugin-free Skype support. While that is cool, the really… [Continue Reading]


Original URL: http://feeds.betanews.com/~r/bn/~3/U3XbdWeG5Uw/

Original article

Modernizing AbiWord code

When you work on an 18-year-old code base like AbiWord, you encounter stuff from another age. That is the way it is in the lifecycle of software, where the requirements and the tooling evolve.

Nonetheless, when AbiWord started in 1998, it was meant as a cross-platform code base written in C++ that had to compile on both Windows and Linux. C++ compilers were not as standards-compliant as they are today, so a lot of things were excluded: no templates, and therefore no standard C++ library (it was called the STL at the time). Over the years, things evolved: Mac support was added, gcc 4 got released (with much better C++ support), and in 2003 we started using templates for the containers (not necessarily in that order, BTW). Still no standard library; that came later. I just flipped the switch to make C++11 mandatory, more on that later.

As I was looking for some bugs, I found that with all that hodge-podge there wasn’t any coding standard, and this caused some serious ownership problems where we’d be using freed memory. Worse, this led to file corruption, where we’d write garbage memory into files that are supposed to be valid XML. This is bad.

The core of the problem is the way we pass attributes / properties around. They are passed as a NULL-terminated array of pointers to strings. Even indices are keys, odd indices are string values. While keys are always considered static, values are not always. Sometimes they are taken out of a std::string or one of the custom string containers from the code base (more on that one later), sometimes they are just strdup()’ed and free()’d later (uh oh, memory leaks).

Maybe this is a good time to do a cleanup and modernize the code base, and make sure we have safer code, rather than trying to figure out all the corner cases one by one. And shall I add that there are virtually no tests in AbiWord? So it is gonna be epic.

As I’m writing this I have 8 patches, a couple of them very big, amounting to the following stats (from git):

134 files changed, 5673 insertions(+), 7730 deletions(-)

These numbers just show how broad the changes are, and it seems to work. The bugs I was seeing with valgrind are gone, no more access to freed memory. That’s a good start.

Some of the 2000+ deleted lines are redundant code that could have been refactored (there are still a few places I marked for that), but a lot have to do with what I’m fixing. Also, some changes are purely whitespace / indentation, usually around an actual change where it was relevant.

Now, instead of passing around const char ** pointers, we pass around a const PP_PropertyVector &, which is, currently, a typedef to std::vector<std::string>. To make things nice, the main storage for these properties is now also a std::map (possibly I will switch it to an unordered map), so that assignments are transparent to the std::string implementation. Before that it was one of the custom containers.

Patterns like this:

 // Old style: build a NULL-terminated array of key/value C strings.
 const char *props[] = { NULL, NULL, NULL, NULL };
 int i = 0;
 std::string value = object->getValue();
 props[i++] = "key";
 char *s = strdup(value.c_str());
 props[i++] = s;
 thing->setProperties(props);
 free(s);

Turns to

 PP_PropertyVector props = {
   "key", object->getValue()
 };
 thing->setProperties(props);

Shorter, more readable, less error-prone. This uses a C++11 initializer list, which explains some of the line removal.

Use C++ 11!

Something I can’t recommend enough if you have a C++ code base is to switch to C++ 11. Amongst the new features, let me list the few that I find important:

  • auto for automatic type deduction. It makes life easier when typing and also when changing code. I almost always use it when declaring an iterator from a container.
  • unique_ptr and shared_ptr. Smart pointers inherited from Boost, but without the need for Boost. unique_ptr replaces the dreaded auto_ptr, which is now deprecated.
  • unordered_map and unordered_set: hash-based map and set in the standard library.
  • lambda functions. No need to explain; this was one of the big missing features of C++ in the age of JavaScript popularity.
  • move semantics: transferring the ownership of an object. Not easy to use in C++, but clearly beneficial where you previously always ended up copying. It is a key part of what makes unique_ptr usable in a container, where auto_ptr wasn’t. Move semantics are the default behaviour of Rust, while C++ copies.
  • initializer lists allow constructing an object by passing a list of initial values. I use this one a lot in this patch set for property vectors.

Don’t implement your own containers.

Don’t implement your own vector, map, set, associative container, string, or list. Use the standard C++ library instead. It is portable, it works, and it likely does a better job than your own. I have another set of patches to properly remove UT_Vector, UT_String, etc. from the AbiWord codebase. Some have been removed progressively, but it is still ongoing.

Also write tests.

This is something that is missing in AbiWord and that I have tried to tackle a few times.

One more thing.

I could have mechanised these code changes to some extent, but then I wouldn’t have reviewed all that code, in which I found issues that I then addressed. Eyeball Mark II is still good for that.

The patch (in progress)


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/8HJhqM4Bf68/

Original article
