Deskew CLI Tool v1.30 Released

New version of Deskew command line tool is ready. You can find general info about Deskew here Deskew Tools or check out README.

Main improvement for this version is better quality of rotated images. They're less blurry with default filtering, especially when rotated by less than one degree. You can now also select other filters, the choices are: nearest (no filtering, very fast), bilinear - default, bicubic, and Lanczos (overal best quality, pretty slow).
Deskew Rotation Quality Comparison

Change list for v1.30

  • fix #15: Better image quality after rotation - better default and also selectable nearest|linear|cubic|lanczos filtering
  • fix #5: Detect skew angle only (no rotation done) - optionally only skew detection
  • fix #17: Optional auto-crop after rotation
  • fix #3: Command line option to set output compression - now for TIFF and JPEG
  • fix #12: Bad behavior when an output is given and no deskewing is needed
  • libtiff in macOS is now picked up also when binaries are put directly in the directory with deskew
  • text output is flushed after every write (Linux/Unix): it used to be flushed only when writing to device but not file/pipe.

Downloads

  Deskew v1.30
» 4.3 MiB - 20,412 hits - June 19, 2019
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

GitHub Release

GUI Frontend for Deskew

I've created a simple GUI frontend for Deskew. Now it's easier to process many files without writing shell scripts. It needs the command line tool which is called for the each input file. You can set the basic and most of the advanced options for deskewing in the GUI.

Prebuilt executables for Windows and Linux are available in the download - you just place them to the same folder as the command line tool. Version for macOS is a bit more convenient - it's a self-contained app bundle with CLI tool already inside and all placed in DMG image. You can also set the explicit path to the command line tool in the program itself.

The GUI is written in Lazarus so it may not be a best native-looking application out there but it saved me some time - there wouldn't be any GUI if it would be a big time sink.

Download

  DeskewGui v0.90
» 4.1 MiB - 6,148 hits - March 18, 2019
GUI frontend for Deskew command line tool. Prebuilt binaries for Windows, macOS, and Linux. Windows and Linux versions need Deskew command line tool binaries.

Remember that for Windows and Linux you also need Deskew command line tool if you don't have it already:

  Deskew v1.30
» 4.3 MiB - 20,412 hits - June 19, 2019
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

Screenshots

Basic options and files to deskew in Deskew GUI in Windows

Advanced options in Deskew GUI in macOS

Deskewing in progress in Deskew GUI in Windows

Output of the command line tool in Deskew GUI in Linux

Bug Reports And Source Code

GUI is in the same repository as the command line tool, you can find the links here Deskew Tools.

Deskew Tool v1.25 Released

New version of Deskew command line tool is ready. You can find general info about Deskew here Deskew Tools.

Change List for Deskew 1.25

  • fixed issue #6: Preserve DPI measurement system (TIFF)
  • fixed issue #4: Output image not saved in requested format (when deskewing is skipped)
  • dynamic loading of libtiff library - adds TIFF support in macOS when libtiff is installed
  • fixed issue #8: Cannot compile in Free Pascal 3.0+ (Windows) - Fails to link precompiled LibTiff library
  • fixed issue #7: Windows FPC build fails with Access violation exception when loading certain TIFFs (especially those saved by Windows Photo Viewer etc.)
  • Linux ARM build is now also included in the release

Download

  Deskew v1.30
» 4.3 MiB - 20,412 hits - June 19, 2019
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

Deskew Tool v1.20 Released

New version of Deskew command line tool is ready. You can find general info about Deskew here Deskew Tools.

Change List for Deskew 1.20

  • much faster rotation, especially when background color is set (>2x faster, 2x less memory)
  • can skip deskewing step if detected skew angle is lower than parameter (possible speedup when processing large batches)
  • new option for timing of individual steps
  • fix: crash when last row of page is classified as text
  • misc: default back color is now opaque black, new forced output format "rgb24",
    background color can define also alpha channel, nicer formatting of text output

Download

  Deskew v1.30
» 4.3 MiB - 20,412 hits - June 19, 2019
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

Deskew Tool Version 1.10

New version of Deskew command line tool is ready. You can find general info about Deskew here Deskew Tools.

Change List for Deskew 1.10

  • TIFF support now also for Win64 and 32/64bit Linux platforms
  • forced output formats
  • fix: output file names were always lowercase
  • fix: preserves resolution metadata (e.g. 300dpi) of input when writing output

Continue reading

Deskew Tool Updated

There is a new version of Deskew command line tool introduced in post Deskewing Scanned Documents. Looks like quite a few people found it useful 🙂

What's new in the latest version?

  • Background color can be defined (empty space around the original page after the rotation is filled with this color)
  • "Area of interest" rectangle to force skew detection only into selected part of the page (useful when  e.g. noisy page borders or images confuse skew detection when processing the entire page)
  • 64 bit and Mac OSX support
  • PSD and TIFF file format support (TIFF only in Win32 for now, sorry)
  • Display of skew detection stats and program parameters

Download

  Deskew v1.30
» 4.3 MiB - 20,412 hits - June 19, 2019
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

Source Code Repository

Public Mercurial source repository of Deskew is now hosted at BitBucket: https://bitbucket.org/galfar/app-deskew.

Deskewing Scanned Documents

Check out updates and new versions of Deskew tool.

Some time ago I wrote a simple command line tool for deskewing scanned documents called Deskew. Technically, it's a rotation since angles are preserved and skew transformation doesn't do that. However, deskewing is commonly used term in this context.

Deskewing some smart paper

My approach is fairly common for this problem - rotation angle is first determined using Hough transform and then the image is rotated accordingly. Classical Hough transform is able identify lines in the image and it was later extended to allow detection of any arbitrary shapes.

Lines of text can be thought of as horizontal lines in the image. In a skewed scanned document all the lines will be rotated by some small angle. We can start with the equation of the line y = k · x + q. Since we're interested in the angle, we can rewrite it as y = (sin(α) / cos(α)) · x + q. Finally, we can rearrange it as y · cos(α) − x · sin(α) = d. Now every point [x, y] in the image can have infinite number of lines going through it, where each is defined by two parameters: angle α and distance from the origin d.

We want to consider lines only for certain points of input image. Ideally, that would be the base lines on which the "text is sitting". Simple way of determining these points is to check for black pixels which have white pixels just below them. Now for each of the classified points, we determine parameters α and d for all the lines that go through them. To get some finite number of lines, we calculate d for angles α from a certain range (I use angle step of 0.1 degrees). We want to find a line that intersects as many classified points as possible – an accumulator is used to store "votes" for each calculated line. For each point that is believed to be on the text base line, we add one vote for each line that intersects it. At the end, we find the top lines that have the most votes. Ideally, these are the base lines of all lines of text in the document. Finally, we get the rotation angle by averaging angle α of the top lines and rotate the whole image accordingly.

Important part is that one: "check for black pixels which have white pixels just below". What's black and white is determined by comparing value of the current pixel against some given threshold. For images where background is plain white and the text is black it's easy just to use 0.5 as the threshold. But when the background/foreground distinction is not so sharp calculating the threshold adaptively based on the current image can be very useful. Deskew supports both adaptive threshold calculation as well as specifying constant threshold as command line parameter.

Deskewing some math exercise

Implementation is written in Object Pascal and uses Imaging library for reading and writing various image file formats. There are precompiled binaries for a few platforms, others be built from sources using Free Pascal compiler. Archive also contains few test images.

  Deskew v1.30
» 4.3 MiB - 20,412 hits - June 19, 2019
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.