Deskew Tool Updated

There is a new version of the Deskew command line tool introduced in the post Deskewing Scanned Documents. Looks like quite a few people found it useful 🙂

What's new in the latest version?

  • Background color can be defined (empty space around the original page after the rotation is filled with this color)
  • "Area of interest" rectangle to restrict skew detection to a selected part of the page (useful when, e.g., noisy page borders or images confuse skew detection when processing the entire page)
  • 64-bit and Mac OS X support
  • PSD and TIFF file format support (TIFF only in Win32 for now, sorry)
  • Display of skew detection stats and program parameters

Download

  Deskew v1.30
» 4.3 MiB - 20,829 hits - June 19, 2019
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

Source Code Repository

The public Mercurial source repository of Deskew is now hosted at BitBucket: https://bitbucket.org/galfar/app-deskew.

Photo Sorting Tool

Let's say you have just spent a few weeks in some exotic country with a bunch of friends. Everyone had a digital camera and took thousands of photos. Now you have several directories with hordes of oddly named photos and you just want to see all of them in the order they were taken. Throwing them all in one folder and sorting the files by date might help, but each camera's internal clock can be set to a different time. File system dates can also be lost on their way to you (FTP upload etc.).

Being a programmer, rather than searching for some program on the Internet, I wrote my own quick and dirty command line tool for this task (mostly hard-coded paths etc.) in mid 2010. This year, I added a GUI (where all the settings can be adjusted), some more date & time functions, and basically made the whole thing usable. And finally, a beta release of PhotoMixer is available.

Deskewing Scanned Documents

Check out updates and new versions of the Deskew tool.

Some time ago I wrote a simple command line tool for deskewing scanned documents called Deskew. Technically, it's a rotation, since a rotation preserves angles and a skew transformation doesn't. However, deskewing is the commonly used term in this context.

Deskewing some smart paper

My approach is fairly common for this problem - the rotation angle is first determined using the Hough transform and then the image is rotated accordingly. The classical Hough transform is able to identify lines in the image, and it was later extended to allow detection of arbitrary shapes.

Lines of text can be thought of as horizontal lines in the image. In a skewed scanned document, all the lines will be rotated by some small angle. We can start with the equation of a line, y = k · x + q. Since we're interested in the angle, we can rewrite it as y = (sin(α) / cos(α)) · x + q. Multiplying both sides by cos(α) and substituting d = q · cos(α), we can rearrange it as y · cos(α) − x · sin(α) = d. Every point [x, y] in the image can have an infinite number of lines going through it, each defined by two parameters: the angle α and the distance from the origin d.
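As a quick numeric check of that parametrization, here is a small Python sketch (not part of Deskew, which is written in Object Pascal). The helper name `line_distance` and the sample values are made up for illustration. It shows that points lying on one skewed line all give the same d at the true angle, while a wrong angle spreads the d values apart.

```python
import math

def line_distance(x, y, alpha_deg):
    """Return d for the line through (x, y) at angle alpha: d = y*cos(a) - x*sin(a)."""
    a = math.radians(alpha_deg)
    return y * math.cos(a) - x * math.sin(a)

# Three points on the line y = k*x + q, skewed by 2 degrees, with q = 10.
alpha = 2.0
k = math.tan(math.radians(alpha))
q = 10.0
points = [(x, k * x + q) for x in (0.0, 50.0, 100.0)]

# At the true angle every point yields the same d, equal to q*cos(alpha).
ds_true = [line_distance(x, y, alpha) for x, y in points]

# At a wrong angle (0 degrees here) the d values drift apart.
ds_wrong = [line_distance(x, y, 0.0) for x, y in points]
```

This is the property the voting scheme relies on: only at (or near) the right angle do many points agree on a single (α, d) pair.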

We want to consider lines only for certain points of the input image. Ideally, these would be the points of the base lines on which the "text is sitting". A simple way of determining these points is to check for black pixels which have white pixels just below them. Now, for each of the classified points, we determine the parameters α and d for all the lines that go through it. To get a finite number of lines, we calculate d for angles α from a certain range (I use an angle step of 0.1 degrees). We want to find a line that intersects as many classified points as possible, so an accumulator is used to store "votes" for each calculated line. For each point that is believed to be on a text base line, we add one vote for each line that intersects it. At the end, we find the top lines that have the most votes. Ideally, these are the base lines of all lines of text in the document. Finally, we get the rotation angle by averaging the angle α of the top lines and rotate the whole image accordingly.
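The voting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Deskew's actual Object Pascal code: the image is a list of rows with 1 for black and 0 for white, the angle range of ±5 degrees and the number of top lines are arbitrary choices, and the names `baseline_points`, `estimate_skew`, and `synthetic_page` are hypothetical. The final rotation of the image is left out.

```python
import math
from collections import Counter

def baseline_points(img):
    """Yield (x, y) of black pixels (1) that have a white pixel (0) just below."""
    height, width = len(img), len(img[0])
    for y in range(height - 1):
        for x in range(width):
            if img[y][x] == 1 and img[y + 1][x] == 0:
                yield x, y

def estimate_skew(img, max_angle=5.0, step=0.1, top=4):
    """Estimate the skew angle in degrees by Hough-style voting."""
    count = int(round(2 * max_angle / step)) + 1
    angles = [-max_angle + i * step for i in range(count)]
    trig = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in angles]
    votes = Counter()  # accumulator: (angle index, rounded d) -> number of votes
    for x, y in baseline_points(img):
        for i, (c, s) in enumerate(trig):
            d = int(round(y * c - x * s))
            votes[(i, d)] += 1
    best = votes.most_common(top)
    if not best:
        return 0.0  # no candidate points found
    # Average the angles of the lines with the most votes.
    return sum(angles[i] for (i, _), _ in best) / len(best)

def synthetic_page(width=200, height=100, angle=3.0, baselines=(20, 40, 60, 80)):
    """Build a test image with one-pixel 'text base lines' skewed by `angle` degrees."""
    k = math.tan(math.radians(angle))
    img = [[0] * width for _ in range(height)]
    for q in baselines:
        for x in range(width):
            y = int(round(k * x + q))
            if 0 <= y < height:
                img[y][x] = 1
    return img

estimated = estimate_skew(synthetic_page(angle=3.0))
```

With image coordinates where y grows downwards, a positive angle here means the lines descend to the right. The accumulator is a plain dictionary, which is fine for a sketch; a real implementation would more likely use a preallocated 2D array.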

The important part is this one: "check for black pixels which have white pixels just below". What's black and what's white is determined by comparing the value of the current pixel against a given threshold. For images where the background is plain white and the text is black, it's easy to just use 0.5 as the threshold. But when the background/foreground distinction is not so sharp, calculating the threshold adaptively based on the current image can be very useful. Deskew supports both adaptive threshold calculation and specifying a constant threshold as a command line parameter.
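To illustrate why a fixed 0.5 threshold can fail, here is a Python sketch that compares it with an image-dependent threshold. The adaptive part uses Otsu's method as a stand-in; this is an assumption made for the example, and the method Deskew actually uses may well be different. The function names and the sample pixel values are hypothetical.

```python
def otsu_threshold(pixels):
    """Return a threshold in [0, 1] for 8-bit gray values using Otsu's method."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalized)
        if var > best_var:
            best_var, best_t = var, t
    return best_t / 255.0

def is_black(pixel, threshold):
    """Classify an 8-bit gray value as black if it is at or below the threshold."""
    return pixel / 255.0 <= threshold

# A dark scan: text at gray level 20, background at gray level 100.
dark_scan = [20] * 300 + [100] * 700
adaptive = otsu_threshold(dark_scan)

# A fixed 0.5 threshold treats the background as black too.
background_black_fixed = is_black(100, 0.5)
# The adaptive threshold separates text from background.
text_black_adaptive = is_black(20, adaptive)
background_black_adaptive = is_black(100, adaptive)
```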

Deskewing some math exercise

The implementation is written in Object Pascal and uses the Imaging library for reading and writing various image file formats. There are precompiled binaries for a few platforms; others can be built from sources using the Free Pascal compiler. The archive also contains a few test images.


PasJpeg2000 Update in Progress

I'm currently rewriting much of the Jpeg 2000 for Pascal library. There is a new IO class responsible for decoding and encoding of Jpeg 2000 files instead of only a VCL TBitmap descendant. It's cross platform and has only Delphi/FPC RTL dependencies. VCL and LCL TGraphic classes will be built using this IO class, but it can be used independently as well (the Imaging library will use it too).

New features of Jpeg 2000 for Pascal will be CMYK colorspace support and also support for indexed/palettized images (yes, it's possible to have an image using a palette in Jpeg 2000). These features, as well as proper alpha channel definitions, are patched into the OpenJpeg library. Its team is not very active in incorporating larger patches into their code, so patches will probably always be an additional step for people who want to recompile OpenJpeg themselves for use with PasJpeg2000.

You can follow the progress of the new version in SVN repository here: branch v120.

Imaging in C++ Builder

I tried compiling Imaging in C++ Builder (it uses the Delphi compiler to generate an .obj file which the C++ linker can link, and also generates a C++ header for the Pascal unit) a few years ago. It didn't work - there was an internal compiler error, I think right in the ImagingTypes unit.

A few days ago I tried C++ Builder 2010 and was pleasantly surprised. It worked! I tried just the library core for now (ImagingTypes, Imaging, ImagingFormats, Pascal only file handlers, etc.) and it works without problems. I'm not sure which C++ Builder version is required for a successful compile though. Versions 6 and 2006 stopped with an internal error, 2010 worked, and there are two other versions in between.

Anyway, I'll try to check out most of the library before the 2010 trial expires for me. Hmm, I'm wondering how many people use C++ Builder for C++ development - I've never done anything serious in it, I basically just use it to get object files usable by Delphi - so I have no idea if it's ok.

PS: More C++ Builder related news - a patch for the OpenJpeg library to get it compiling in BCB is posted here.

PasJpeg2000 News

The JPEG 2000 for Pascal project is based on the OpenJpeg library. For a very long time there was a bug that caused the alpha channel (the fourth and any subsequent image channels) to be saved with all samples having the value 0.5 (128 for 8-bit channels). This buggy behavior also depended on compiler settings - the optimization level in the case of GCC. You could use at most O1 on Windows and Linux, and only O0 on Mac OS X. The bug was also present when compiling with C++ Builder (to get object files usable in Delphi), but only when the irreversible DWT transformation was enabled in OpenJpeg during encoding (it wasn't before, but the working versions of both PasJpeg2000 and Imaging now use it when the user selects lossy compression, to get smaller files).
You can read more about it in this news group post. Basically it was all fixed by changing a condition in one if statement to prevent accessing the fourth element of a three element array.

So what can you expect in the next version of PasJpeg2000 library?
Higher GCC optimization levels should make it a lot faster when using Free Pascal (particularly on Mac OS X, where O0 was used). The irreversible DWT transformation produces smaller lossy files than the current PasJpeg2000 version, and with the optional MCT (multicomponent transform - basically RGB to YCbCr) you get even smaller ones. There's now also a patch that enables OpenJpeg to get palettes from JP2 files, so indexed JPEG 2000 images could be supported too. And finally, there are some bug fixes (wrong reconstruction of subsampled files, ...).
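For readers unfamiliar with the multicomponent transform, here is a small Python sketch of the RGB to YCbCr conversion in floating point. The coefficients are the commonly cited ones for the irreversible color transform in JPEG 2000; the function names are made up for this example, and OpenJpeg's own implementation (data layout, level shifting, fixed-point arithmetic) is not reproduced here.

```python
def rgb_to_ycbcr(r, g, b):
    """Forward irreversible color transform (floating point)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.16875 * r - 0.33126 * g + 0.5 * b
    cr = 0.5 * r - 0.41869 * g - 0.08131 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse transform; exact only up to rounding of the coefficients."""
    r = y + 1.402 * cr
    g = y - 0.34413 * cb - 0.71414 * cr
    b = y + 1.772 * cb
    return r, g, b

# A gray pixel has (almost) zero chroma, which is why the transform helps compression.
gray = rgb_to_ycbcr(128, 128, 128)

# A round trip returns the original color within a small error.
restored = ycbcr_to_rgb(*rgb_to_ycbcr(200, 100, 50))
```

The transform decorrelates the color channels, so most of the image energy ends up in Y and the two chroma channels compress better.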

Block Compression, DXTC, And Imaging

Imaging has supported DXT image/texture compression since one of the earliest releases. The quality of compressed images isn't very high though (at least the compression isn't too slow). For future Imaging versions I plan to ditch the current compression code and add a new one. To be precise, two new ones - a fast and lower quality mode (still probably better than Imaging's current compression), and a slow and higher quality mode. The fast one will be based on Real-Time DXT Compression by id Software. I haven't decided on the high quality one yet, but it will probably be something like the cluster fit algorithm from the Squish library.

Mainly for testing purposes during implementation of these new methods, I want to create an extension for Imaging that compares two images (the original and one reconstructed from the compressed original) and measures PSNR and some other quality metrics.
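As a rough idea of what such a comparison computes, here is a Python sketch of MSE and PSNR for two images given as flat sequences of 8-bit sample values. It is not the planned Imaging extension; the function names and the flat-list representation are assumptions made for the example.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equally sized sequences of samples."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of samples")
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)

def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak * peak / error)

original = [10, 20, 30]
reconstructed = [12, 20, 27]
error = mse(original, reconstructed)      # (4 + 0 + 9) / 3
quality = psnr(original, reconstructed)
```

A lower MSE and a higher PSNR both mean the reconstructed image is closer to the original.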

I'm also thinking of implementing a DXT5 based format using the YCoCg colorspace, and PVRTC (the texture compression currently used in the iPhone).
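For reference, the RGB to YCoCg conversion itself is simple. The Python sketch below shows only the color transform and its inverse in floating point, with hypothetical function names. How the channels would be packed into a DXT5 block is a separate matter and is not covered here.

```python
def rgb_to_ycocg(r, g, b):
    """Convert RGB to YCoCg (luma, orange chroma, green chroma)."""
    y = (r + 2 * g + b) / 4
    co = (r - b) / 2
    cg = (-r + 2 * g - b) / 4
    return y, co, cg

def ycocg_to_rgb(y, co, cg):
    """Inverse of rgb_to_ycocg."""
    r = y + co - cg
    g = y + cg
    b = y - co - cg
    return r, g, b

ycocg = rgb_to_ycocg(200, 100, 50)
rgb = ycocg_to_rgb(*ycocg)
```

Unlike YCbCr, the transform needs only additions and divisions by powers of two, which makes it cheap to undo in a pixel shader.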

Few links if you're interested:

Here's a quick comparison of DXT compressors. The value in brackets is the MSE (mean squared error) - a lower number means the compressed image is more similar to the original.

DXT compressors comparison

First JPEG 2000 for Pascal Release

I finally released the first version of the JPEG 2000 for Pascal library. It’s based on the translated header and precompiled OpenJpeg library that were part of Imaging for a long time and are now released separately.

There’s a header translation working with both Delphi and FPC, and the original C library precompiled to object files (Delphi) and static libraries (FPC for Win32, Linux x86/x64, and Mac OS X).

I’ve written a simple TBitmap descendant that loads and saves JPEG 2000 images. It’s only for Delphi now. I can’t do an LCL version with just a few IFDEFs, because there are a lot more differences between VCL and LCL TBitmaps. I plan to write a separate LCL version for one of the upcoming releases (using the TFPCustomImageWriter and TFPCustomImageReader classes).

You can get the JPEG 2000 for Pascal library at its project page.

APNG Update

Implementing APNG loading and animation for Imaging wasn't very hard. However, there are not that many test images to be found on the Internet, and most of the available ones are very simple. They usually don't use disposal methods and are basically just a collection of independent images. That's a big difference compared to GIF files - some of those were quite difficult to animate right. I even got most of my nontrivial APNG test files by converting some more complex animated GIF files. Tools for creating APNG images are not yet as sophisticated as some GIF animation tools - there's a GIF to APNG converter, APNG Anime Maker, and some web based tools that assemble a simple APNG file from a bunch of uploaded single PNG frames.

Now I'm gonna extend the PNG saver to allow saving multiple images to simple APNG files, much like MNG saving works now. There's really no good unified way to pass additional information to image file savers in the current library design - that's one of the TODOs for the new Imaging architecture.