Occasionally I’ve needed to scan pages of text (with diagrams, written and drawn on lined paper) into a set of images. A short while ago, to accomplish this while at the same time getting a little more experience in Mathematica, I decided to write a script to do it in the language. It takes in and transforms images like so:

Initial Image

Initial Image

Processed Image

Processed Image

I’m still not at present the most experienced in Mathematica, hence the slightly verbose code style incoming…

Also note that the script is built specifically for my situation – the fountain pen with black ink I like to use, the lighting conditions the photos are taken in, the image resolution of my phone camera, etc. If you’d like to use it, you will almost certainly need to change some (many) parameters.

Here it is:

(significant output shown on the right or below.)

(* Import/Export directories and general fluff *)

imagepath = StringJoin[NotebookDirectory[], "in/*.jpg"];
exportpath = StringJoin[NotebookDirectory[], "out/"];
imagesfullres = Import[#] & /@ FileNames[imagepath];
imagesmedium = Import[#, ImageSize -> 300] & /@ FileNames[imagepath];

dimmedium = ImageDimensions /@ imagesmedium
dimfullres = ImageDimensions /@ imagesfullres

filenames = FileNames[StringJoin[NotebookDirectory[], "in/*.jpg"]];
Nimages = Length@dimmedium;

Finds & shows large lines in the image, using a ridge filter + blurring. Computation done on the smaller image and then scaled to save time :-

lines = ImageLines[#, 0.25] & /@ 
  ImageAdjust /@ RidgeFilter /@ (Blur[#, 5] & /@ imagesmedium)
Apply[HighlightImage, #] & /@ Transpose[{imagesmedium, lines}]
corners = ImageCorners[#, 3, .01, 5] & /@ (Graphics[#, ImageSize -> 300] & /@ lines)

clusters = FindClusters /@ corners;
sortfunc[l_] := l[[2]]^2 + l[[1]]^2;
points = SortBy[#, sortfunc] & /@ ((Mean /@ #) & /@ clusters)

scaledpoints = Times[Flatten[Take[dimfullres, 1]/Take[dimmedium, 1]], #] & /@ # & /@ points;
scaledtransforms = Last /@ (FindGeometricTransform[{{0, 0}, {3024, 0}, {0, 4032}, {3024, 4032}}, #] & /@ scaledpoints)

The above calculates a series of Transformation Functions, which relate the final to the intial points ->

croppedfullres = 
 ImageAdjust /@ (Apply[ImagePerspectiveTransformation, #] & /@ 
    Transpose[{imagesfullres, scaledtransforms, 
      Table[DataRange -> Full, Nimages]}])

var = TotalVariationFilter /@ croppedfullres;
plain = Sharpen /@ (RemoveBackground[#, {"Foreground", {Black, 
        0.5}}] & /@ var)

The critical cells. In the first we apply our perspective transformations, creating images with only the region that we’re interested in. In the second (which takes the most time – TotalVariationFilter is particularly expensive at this scale and on images of this size), we get rid of the lines on the page as well as other noise.

lines = Graphics[
   Table[{Dashing[{.01, .29}], Brown, Opacity[0.25], 
     Line[{{0, y}, {3024, y}}]}, {y, 0, 4032, 100}], 
   Background -> RGBColor[1., 0.95, 0.88]];
linesneg = ColorNegate@Rasterize[lines, RasterSize -> 3024];
nobg = Style[#, Background -> Transparent] & /@ plain;
explines = 
  Sharpen /@ 
   ColorNegate /@ (ImageAdd[
        ColorNegate@
         Rasterize[#, Background -> White, RasterSize -> 3024], 
        linesneg] & /@ nobg);

This generates a soft beige background with squareish dotted lines, blends it with the images that include only the foregound, and sharpens it. We now have our finalised images – all we need to do now is export them:

filenamesout = 
  StringJoin[
     StringTake[StringJoin[StringSplit[#, "/"][[-1]]], {1, -5}], 
     "-exp-lines.png"] & /@ filenames;
filenamesoutfull = StringJoin[exportpath, #] & /@ filenamesout;
Apply[Export, #] & /@ 
 Transpose[{filenamesoutfull, explines, 
   Table[Background -> Black, Nimages], 
   Table[RasterSize -> 3024, Nimages], 
   Table[ImageSize -> 3024, Nimages]}]

I’ve found that there are very few artefacts (and very little noise) when doing the conversion this way – YMMV.