Extract, convert and process your PDFs with PDF MultiTool

“PDF MultiTool” sounds like it’s going to be yet another PDF toolkit: the usual set of basic functions, none of which work very well, and which exists at all only because the developer hopes you’ll install its assorted toolbars.

But no: this really is different. There’s no adware here, no toolbars or similar irritations, and the feature list includes plenty of interesting extras which you’ll rarely find elsewhere.

PDF MultiTool opens as a simple viewer. Open a PDF and you can page through it, zoom in or out, rotate pages, search for text and more.

A left-hand pane organises the program’s more interesting functions into a tree, with three main sections.

“Data Extraction” has tools to extract text, embedded images, attachments and XFA form data from the selected area, the current page or the entire document.

That’s just the start. PDF MultiTool can also extract text from image files (via OCR) as well as from regular documents, optionally trimming spaces, aligning text columns to a header or unwrapping lines, before exporting the results as TXT, CSV, XML, XLS or XFDF, either saved to a file or copied to the clipboard.
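
PDF MultiTool is a point-and-click app, but to give a feel for what this kind of extraction involves, here is a minimal sketch using the open-source pypdf library rather than anything from PDF MultiTool itself; the file names are placeholders.

```python
# Minimal text extraction sketch with pypdf (not PDF MultiTool's own code).
from pypdf import PdfReader

reader = PdfReader("input.pdf")  # placeholder file name

# Pull the text from every page, like the "entire document" option.
pages = [page.extract_text() or "" for page in reader.pages]

# Roughly equivalent to the optional trim-spaces step.
pages = [text.strip() for text in pages]

# Save the result as TXT, one page per block; writing CSV or XML,
# or copying to the clipboard, would be further steps.
with open("output.txt", "w", encoding="utf-8") as out:
    out.write("\n\n".join(pages))
```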

It’s a similar story in the “Conversion” section, where PDF MultiTool converts documents to images or HTML. It supports some less common formats (GIF, multipage TIFF and EMF, alongside the usual JPG, BMP and PNG), and gives detailed control over compression methods, DPI, and even CSS usage (there’s an option to convert HTML controls to plain text).
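
Again purely as illustration, here is a rough sketch of the same kind of PDF-to-image conversion using the PyMuPDF library (our choice of tool, nothing to do with PDF MultiTool's internals); the file names and DPI value are placeholders.

```python
# Render each PDF page to a PNG at a chosen resolution, using PyMuPDF.
import fitz  # PyMuPDF's import name

doc = fitz.open("input.pdf")  # placeholder file name
for number, page in enumerate(doc, start=1):
    pixmap = page.get_pixmap(dpi=150)  # DPI control, as the review describes
    pixmap.save(f"page_{number}.png")
doc.close()
```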

The “Utilities” section is more familiar, with tools to display document metadata, rotate, split or merge files. But even here there are a few extras, including the ability to make a PDF document searchable via OCR. This didn’t work well for us, but results will vary according to your source material, and it’s good to have the option available.
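
For anyone who would rather script these utilities than click through them, here is a minimal sketch of merging, splitting and OCR using the pypdf and OCRmyPDF libraries; these are assumptions of ours, not PDF MultiTool's machinery, and the file names are placeholders. OCRmyPDF also needs the Tesseract engine installed.

```python
# Merge, split and add a searchable OCR layer with open-source libraries.
from pypdf import PdfReader, PdfWriter
import ocrmypdf

# Merge two documents into one.
merged = PdfWriter()
merged.append("part1.pdf")
merged.append("part2.pdf")
with open("merged.pdf", "wb") as out:
    merged.write(out)

# Split off the first page into its own file.
reader = PdfReader("merged.pdf")
first = PdfWriter()
first.add_page(reader.pages[0])
with open("page1.pdf", "wb") as out:
    first.write(out)

# Add a hidden text layer so a scanned PDF becomes searchable.
ocrmypdf.ocr("scan.pdf", "scan_searchable.pdf")
```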

We noticed one or two issues, most obviously with a tool to automatically detect PDF tables, which was almost entirely useless on our test documents.

PDF MultiTool mostly worked well, though — particularly when extracting text — and there are more than enough interesting and unusual extras here to justify the download.


Original URL: http://feeds.betanews.com/~r/bn/~3/np8dveKeGho/


Something doesn’t smell right about the rush to “deprecate” HTTP

Google, Mozilla and others want to force all non-HTTPS sites to become HTTPS.

And while the name HTTPS sounds a lot like HTTP, it’s actually a lot more complex and fraught with problems. If what they want to do ever happens, much of the independent web will disappear.

  1. First, the problem, as I understand it, is that some ISPs are gathering data about the content flowing through their routers, inserting ads and cookies, and otherwise modifying that content in transit. I agree, of course, that this is a bad thing.

  2. Going to HTTPS does not get rid of all the possible ways of snooping on and modifying content in the middle. A toolbar, for example, hooking in after the content is decrypted, could change the content. Google tried to do this at one point in the evolution of the web. And even if you couldn’t do it with a toolbar, Google and Mozilla both own popular browsers; they could modify content any time they want. HTTPS won’t protect us against their snooping and interference. Why are we supposed to trust them more than an ISP? I don’t actually trust them that much, btw.

  3. Key point: if you care about whether ISPs modify your content, you can move to HTTPS on your own (see the sketch after this list). You don’t need Mozilla or Google to force you to do it.

  4. It also depends on how much you care. Sure, in a perfect world I’d want to stop all of it on all my content, but in that perfect world I’d also have infinite time to do the work of converting all my websites to HTTPS. I don’t have infinite time, and neither do you. I try to pick my battles more carefully. You can waste a lot of time doing something because it seems like the right thing to do, and end up accomplishing nothing for all the work.

  5. What I care about is that sufficiently motivated people will be able to find my archive in the future. I don’t think the odds are actually very good, for a lot of reasons. This is just the newest.

  6. Given that a vast amount of content likely won’t move, Google and Mozilla are contemplating far more vandalism to the web than any of the ISPs they’re trying to short-circuit.

  7. Aren’t there other, less draconian methods to try first? How about making this illegal where it isn’t already, and working with governments to enforce those laws? How about developing competition that doesn’t do it, so everyone has a choice? That’s the way Google is changing how ISPs work in the US; why not elsewhere? How about developing a kind of encryption that doesn’t require websites to do anything? I don’t know if it’s possible, but I haven’t heard any discussion of it.

  8. Couldn’t you use a VPN to tunnel through the nasty ISPs?

  9. This is why we need to overthrow the tech industry as a governing body. It’s run by people who shoot first and ask questions later. It’s an awful way to be having this discussion: after the decision is made, without any recourse. This is the best argument for taking this power away from the plutocrats in tech.
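
As a footnote to point 3, here is a bare-bones sketch of serving a site over HTTPS with nothing but Python's standard library. It assumes you have already obtained a certificate and key from a certificate authority; cert.pem and key.pem are placeholder file names, and a real deployment would normally sit behind a proper web server instead.

```python
# Serve the current directory over HTTPS using only the standard library.
import http.server
import ssl

context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
context.load_cert_chain(certfile="cert.pem", keyfile="key.pem")  # placeholders

# Port 8443 avoids needing elevated privileges; production HTTPS uses 443.
httpd = http.server.HTTPServer(("0.0.0.0", 8443),
                               http.server.SimpleHTTPRequestHandler)
httpd.socket = context.wrap_socket(httpd.socket, server_side=True)
httpd.serve_forever()
```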


Original URL: http://scripting.com/2015/05/17/somethingDoesntSmellRightAboutTheRushToDeprecateHttp.html

