Deskew Tool v1.20 Released

New version of Deskew command line tool is ready. You can find general info about Deskew here Deskew Tools.

Change List for Deskew 1.20

  • much faster rotation, especially when background color is set (>2x faster, 2x less memory)
  • can skip deskewing step if detected skew angle is lower than parameter (possible speedup when processing large batches)
  • new option for timing of individual steps
  • fix: crash when last row of page is classified as text
  • misc: default back color is now opaque black, new forced output format “rgb24”,
    background color can define also alpha channel, nicer formatting of text output

Download

  Deskew 1.20
» 4.1 MiB - 5,962 hits - January 5, 2011 (last update November 1, 2016)
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

Deskew Tool Version 1.10

New version of Deskew command line tool is ready. You can find general info about Deskew here Deskew Tools.

Change List for Deskew 1.10

  • TIFF support now also for Win64 and 32/64bit Linux platforms
  • forced output formats
  • fix: output file names were always lowercase
  • fix: preserves resolution metadata (e.g. 300dpi) of input when writing output

Continue reading

Deskew Tool Updated

There is a new version of Deskew command line tool introduced in post Deskewing Scanned Documents. Looks like quite a few people found it useful 🙂

What’s new in the latest version?

  • Background color can be defined (empty space around the original page after the rotation is filled with this color)
  • “Area of interest” rectangle to force skew detection only into selected part of the page (useful when  e.g. noisy page borders or images confuse skew detection when processing the entire page)
  • 64 bit and Mac OSX support
  • PSD and TIFF file format support (TIFF only in Win32 for now, sorry)
  • Display of skew detection stats and program parameters

Download

  Deskew 1.20
» 4.1 MiB - 5,962 hits - January 5, 2011 (last update November 1, 2016)
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

Source Code Repository

Public Mercurial source repository of Deskew is now hosted at BitBucket: https://bitbucket.org/galfar/app-deskew.

Photo Sorting Tool

Let’s say you have just spent few weeks in some exotic country with a bunch of friends. Everyone had a digital camera and made thousands of photos. Now you have several directories with hordes of oddly named photos and you just want see all of them in the order they were taken. Throwing them all in one folder and setting file ordering by date might help but everyone’s camera can have a different internal time. File system dates can also be lost on their way to you (FTP upload etc.).

Being a programmer, rather that searching for some program on the Internet, I wrote my own quick and dirty command line tool for this task (mostly hard-coded paths etc.) in mid 2010. This year, I added GUI (where all the settings can be adjusted), some more date & time functions, and basically made the whole thing usable. And finally, beta release of PhotoMixer is available.
Continue reading

16bit half float in Pascal/Delphi

Floating point numbers with 16 bits of precision are used mostly in computer graphics. They are also called half precision floating point numbers (as having half the bits of single precision 32bit floats). There’s one sign bit, five bit exponent, and ten bits for mantissa. Half floats are not really meant to be used for arithmetic computations due to the limited precision (and no support in common CPUs/FPUs).

Half floats first appeared in early 2000s as samples in images and textures. Floats provide higher dynamic range than what is available with regular 8bit or 16bit integer samples. On the other hand, commonly used single and double precision floats have much higher memory cost per pixel. Half floats have more reasonable memory requirements and their precision is adequate for many usages in imaging.

16bit float formats have been supported by ATI and NVidia GPUs for many years. I’m not sure about other IHVs but at least Direct3D 10 capable GPUs should all support it.

Read on if your interested how to convert between half and single precision floats (Object Pascal code).

Continue reading

Deskewing Scanned Documents

Check out updates and new versions of Deskew tool.

Some time ago I wrote a simple command line tool for deskewing scanned documents called Deskew. Technically, it’s a rotation since angles are preserved and skew transformation doesn’t do that. However, deskewing is commonly used term in this context.

Deskewing some smart paper

My approach is fairly common for this problem – rotation angle is first determined using Hough transform and then the image is rotated accordingly. Classical Hough transform is able identify lines in the image and it was later extended to allow detection of any arbitrary shapes.

Lines of text can be thought of as horizontal lines in the image. In a skewed scanned document all the lines will be rotated by some small angle. We can start with the equation of the line y = k · x + q. Since we’re interested in the angle, we can rewrite it as y = (sin(α) / cos(α)) · x + q. Finally, we can rearrange it as y · cos(α) − x · sin(α) = d. Now every point [x, y] in the image can have infinite number of lines going through it, where each is defined by two parameters: angle α and distance from the origin d.

We want to consider lines only for certain points of input image. Ideally, that would be the base lines on which the “text is sitting”. Simple way of determining these points is to check for black pixels which have white pixels just below them. Now for each of the classified points, we determine parameters α and d for all the lines that go through them. To get some finite number of lines, we calculate d for angles α from a certain range (I use angle step of 0.1 degrees). We want to find a line that intersects as many classified points as possible – an accumulator is used to store “votes” for each calculated line. For each point that is believed to be on the text base line, we add one vote for each line that intersects it. At the end, we find the top lines that have the most votes. Ideally, these are the base lines of all lines of text in the document. Finally, we get the rotation angle by averaging angle α of the top lines and rotate the whole image accordingly.

Important part is that one: “check for black pixels which have white pixels just below”. What’s black and white is determined by comparing value of the current pixel against some given threshold. For images where background is plain white and the text is black it’s easy just to use 0.5 as the threshold. But when the background/foreground distinction is not so sharp calculating the threshold adaptively based on the current image can be very useful. Deskew supports both adaptive threshold calculation as well as specifying constant threshold as command line parameter.

Deskewing some math exercise

Implementation is written in Object Pascal and uses Imaging library for reading and writing various image file formats. There are precompiled binaries for a few platforms, others be built from sources using Free Pascal compiler. Archive also contains few test images.

  Deskew 1.20
» 4.1 MiB - 5,962 hits - January 5, 2011 (last update November 1, 2016)
Command line tool for deskewing scanned documents. Binaries for several platforms, test images, and Object Pascal source code included.

Ugly Images of Disabled Menu Items in Delphi

Ever used 32bit images stored in TImageList in your Delphi application? Toolbars and some other VCL controls have DisabledImages property which is automatically used to get images for disabled toolbar buttons. But what about menu components? They don’t have this property and drawing of disabled images is handled by TImageList with original enabled images (TMainMenu.Images property). And the results are really abysmal. How can this be fixed?

One way is to override DoDraw method of TImageList and change the code that draws disabled images. You can do regular RGB to grayscale conversion here or let  Windows draw it for you in grayscale with nearly no work on your part. You can do this by using ImageList_DrawIndirect with ILS_SATURATE parameter. Note that this works only on Windows XP and newer and for 32bit images only. For older targets or color depths doing your own RGB->grayscale conversion is an option (good idea would probably be to cache converted grayscale images somewhere so they won’t need to be converted on every draw call).

Here’s the code of DoDraw method using  ILS_SATURATE:

type
// Descendant of regular TImageList
TSIImageList = class(TImageList)
protected
  procedure DoDraw(Index: Integer; Canvas: TCanvas; X, Y: Integer;
    Style: Cardinal; Enabled: Boolean = True); override;
end;

procedure TSIImageList.DoDraw(Index: Integer; Canvas: TCanvas; X, Y: Integer;
  Style: Cardinal; Enabled: Boolean);
var
  Options: TImageListDrawParams;

  function GetRGBColor(Value: TColor): Cardinal;
  begin
    Result := ColorToRGB(Value);
    case Result of
      clNone: Result := CLR_NONE;
      clDefault: Result := CLR_DEFAULT;
    end;
  end;

begin
  if Enabled or (ColorDepth <> cd32Bit) then
    inherited
  else if HandleAllocated then
  begin
    FillChar(Options, SizeOf(Options), 0);
    Options.cbSize := SizeOf(Options);
    Options.himl := Self.Handle;
    Options.i := Index;
    Options.hdcDst := Canvas.Handle;
    Options.x := X;
    Options.y := Y;
    Options.cx := 0;
    Options.cy := 0;
    Options.xBitmap := 0;
    Options.yBitmap := 0;
    Options.rgbBk := GetRGBColor(BkColor);
    Options.rgbFg := GetRGBColor(BlendColor);
    Options.fStyle := Style;
    Options.fState := ILS_SATURATE; // Grayscale for 32bit images

    ImageList_DrawIndirect(@Options);
  end;
end;

Important note: For ILS_SATURATE to work correctly source image files must be 32bit with proper alpha channel data, setting color depth of TImageList to 32bit is not enough! If you don’t see any images drawn this is probably the cause: 8/24bit image is loaded from file and then inserted into 32bit TImageList. As there is no alpha channel data in source image it is drawn as fully transparent so you don’t see anything.

http://msdn.microsoft.com/en-us/library/bb761537(VS.85).aspx

Shift Right: Delphi vs C

Few weeks ago I converted a little function from C language to Delphi. I kept getting completely wrong results all the time even though I was sure the C to Pascal conversion was right (it was really just few lines). After some desperate time, I just tried replacing SHR operator by normal DIV (as A SHR 1 = A DIV 2 and so on). To my surprise, I immediately got the right results. Can Delphi’s (I didn’t test it in Free Pascal) SHR operator behave differently than C’s >> operator?

It does in fact. SHR treats its first operand as unsigned value even though it is a variable of signed type whereas >> takes the sign bit into account. In the function I converted the operand for right shift was often negative and Delphi’s SHR just ignored the value of the sign bit.

A Bit of Code

int a, b1, b2;
a = -512;
b1 = a >> 1;
b2 = a / 2;

After running this C code both b1 and b2 have a value of -256.

var A, B1, B2: Integer;
A := -512;
B1 := A shr 1;
B2 := A div 2;

This Delphi code however yields different result: B2 is -256 as expected but B1 has a value of 2147483392.

A Bit of Assembler

Assembler output of C code:

Unit1.cpp.22: b1 = a >> 1;
mov eax,[ebp-$0c]
sar eax,1
mov [ebp-$10],eax
Unit1.cpp.23: b2 = a / 2;
mov edx,[ebp-$0c]
sar edx,1
jns $00401bb9
adc edx,$00
mov [ebp-$14],edx

Assembler output of Delphi code:

Unit1.pas.371: B1 := A shr 1;
mov eax,[ebp-$0c]
shr eax,1
mov [ebp-$1c],eax
Unit1.pas.373: B2 := A div 2;
mov eax,[ebp-$0c]
sar eax,1
jns $00565315
adc eax,$00
mov [ebp-$20],eax

As you can see, asm output of C and Delphi divisions is identical. What differs is asm for shift right operator. Delphi uses shr instruction whereas C uses sar instruction. The difference: shr does logical shift and sar does arithmetic one.

The SHR instruction clears the most significant bit (see Figure 6-7 in the Intel Architecture Software Developer’s Manual, Volume 1); the SAR instruction sets or clears the most significant bit to correspond to the sign (most significant bit) of the original value in the destination operand.

Quoted from: http://faydoc.tripod.com/cpu/shr.htm

DirectX 11: GPUs and headers

Few days ago I was wondering how long it will take to see DirectX 11 headers translated to Delphi/Pascal. I knew that translations of new DX components Direct2D and DirectWrite are part of Delphi 2010, but what about Direct3D and DirectCompute?

I checked out Clootie graphics pages but no luck there. Unfortunately, it’s not so active anymore. After that I thought DX11 for Pascal wouldn’t appear in like several months at least. Surprise, I found it about 5 minutes later. John Bladen published his DX11 translation on DirectX for Delphi blog. Thanks John!

Now all that remains is buying a DX11 GPU. New Radeons should be released sometime during this week and I’m thinking of getting 5850 model in a few months. Rumours are that NVidia’s DX11 cards won’t be out before 2010, so probably no big price drops for 5800 Radeons are to be expected until then.

My pile of old graphic cards will get bigger once again. I don’t buy new CPU/motherboard/RAM too often (now it’s Core2 from 2007 and before that AthlonXP in 2002) and I usually manage to sell the old ones or put them on server duty somewhere. However, since GPU performance as well as requirements on it rise much more faster than that of CPU, I ended up with like 5 cards in last 8 years. I managed to sell Radeon 9000 and GeForce 6600 but I still have Riva TNT2 and Radeon X1950 in my closet somewhere (and S3 Trio!). Hopefully, I’ll find a buyer for my 1GB 4850 soon.

Lately, I’ve had bad luck buying some new stuff just before major price drop. Hopefully, this won’t happen with 5850. Not that I want it to remain expensive – just want to buy it after the drop, not  a little while before it.