Enhancing Readability of Text Image, Best Way?

NS
Posted By
Nehmo Sergheyev
Jul 29, 2005
Views
1939
Replies
17
Status
Closed
What’s the best way to enhance the image of text to an image where the text is more recognizable?

Sometimes when you scan a page of text, the text isn’t clean. What’s the best graphic method to improve the readability of a scanned page of text?


|||||||||||||||| Nehmo Sergheyev ||||||||||||||||

HL
Harry Limey
Jul 29, 2005
"Nehmo" wrote in message
What’s the best way to enhance the image of text to an image where the text is more recognizable?

Sometimes when you scan a page of text, the text isn’t clean. What’s the best graphic method to improve the readability of a scanned page of text?

The best way would be to use OCR software and import actual text into whatever layout you are using.
K
Kingdom
Jul 29, 2005
"Nehmo" wrote in news:1122662601.132100.238030 @o13g2000cwo.googlegroups.com:

What’s the best way to enhance the image of text to an image where the text is more recognizable?

Sometimes when you scan a page of text, the text isn’t clean. What’s the best graphic method to improve the readability of a scanned page of text?

Scan at a very high resolution, i.e., 600 dpi or more.


f=Ma well, nearly…
H
Hecate
Jul 29, 2005
On 29 Jul 2005 11:43:21 -0700, "Nehmo" wrote:

What’s the best way to enhance the image of text to an image where the text is more recognizable?

Sometimes when you scan a page of text, the text isn’t clean. What’s the best graphic method to improve the readability of a scanned page of text?

Use OCR software.



Hecate – The Real One

Fashion: Buying things you don’t need, with money
you don’t have, to impress people you don’t like…
NS
Nehmo Sergheyev
Jul 29, 2005
OCR apps require a lot of human input to get a good output, particularly if the text is difficult to read. I’m talking about enhancing the readability of an _image_ of text.



|||||||||||||||| Nehmo Sergheyev ||||||||||||||||
FN
Flo Nelson
Jul 30, 2005
"Nehmo" wrote in message
What’s the best way to enhance the image of text to an image where the text is more recognizable?

Sometimes when you scan a page of text, the text isn’t clean. What’s the best graphic method to improve the readability of a scanned page of text?

If you’re scanning black or dark text on a light background, I’ve found plain old contrast works the best – other solutions for other combinations.

Flo
J
JoeB
Jul 30, 2005
"Nehmo" wrote in
news::

OCR apps require a lot of human input to get a good output, particularly if the text is difficult to read. I’m talking about enhancing the readability of an _image_ of text.

If you’ve followed the other suggestions to make sure you have done a high quality scan and you’re still having problems making it look good, then I suggest you follow Fred’s suggestion and post the image (or a portion of it) so that we can see what you mean by dirty.

I’m assuming from your comment about OCR software that the original was already difficult to read, which would make it tougher for OCR software to decipher and would also make a good scan hard to get in the first place (i.e., the dirtiness is a result of a bad original, not the scanning process).

Regards,

JoeB
BM
Brooks Moses
Jul 30, 2005

[What a strange list of newsgroups you’ve posted this in! It’s nearly off-topic in all but the paint-shop-pro and photoshop groups, so I’ve set followups to there.]

Nehmo wrote:
What’s the best way to enhance the image of text to an image where the text is more recognizable?

Sometimes when you scan a page of text, the text isn’t clean. What’s the best graphic method to improve the readability of a scanned page of text?

The best way is, obviously, to recreate the image using an image editor and retype the text manually or use OCR and hand-edit.

Anything less is a compromise between quality and ease of application, and that’s going to be a matter of personal choice.

Meanwhile, here’s a recipe that I’ve used, given poor scans to start with:

* enlarge the image to 600dpi or so, using software that interpolates the pixels to do the enlargement. (Better yet, scan at 600dpi to start with.)

* use a "threshold" function to convert the image to black and white, manually adjusting the threshold for the best result.

* manually remove the worst of the black-dot noise, if any.

* reduce the image back down to the desired resolution, again using some software that interpolates values rather than simply doing sampling.
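
For anyone who wants to see the steps above spelled out, here is a minimal sketch in plain Python. It assumes the scan is already loaded as a grayscale grid (a list of rows, values 0-255); the function names, the 2x scale factor and the threshold level of 128 are illustrative assumptions, not settings from any particular image editor. The manual noise clean-up step is left as a comment because it is done by eye.

```python
def resize_bilinear(img, new_w, new_h):
    """Resize a grayscale grid with bilinear interpolation."""
    old_h, old_w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        # Map the output pixel centre back into source coordinates.
        sy = min(max((y + 0.5) * old_h / new_h - 0.5, 0.0), old_h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = min(max((x + 0.5) * old_w / new_w - 0.5, 0.0), old_w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def threshold(img, level):
    """Pixels darker than `level` become black (0), the rest white (255)."""
    return [[0 if v < level else 255 for v in row] for row in img]


def downscale_box(img, factor):
    """Shrink by an integer factor, averaging each factor-by-factor block."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + j][x * factor + i]
                 for j in range(factor) for i in range(factor)) / factor ** 2
             for x in range(w)]
            for y in range(h)]


def clean_scan(img, scale=2, level=128):
    """Enlarge with interpolation, threshold, then reduce with averaging."""
    big = resize_bilinear(img, len(img[0]) * scale, len(img) * scale)
    bw = threshold(big, level)
    # Manual removal of the worst black-dot noise would happen here.
    return downscale_box(bw, scale)
```

Calling `clean_scan(page)` on a grid with a faint grey stroke should return the stroke as solid black on solid white. Because the final reduction averages blocks instead of sampling single pixels, stroke edges that fall partway through a block come out as intermediate greys rather than jagged steps.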

Also, if the scan is uneven so that a full-image threshold doesn’t work, you might find the bitmap-processing software that comes with the "potrace" (po-trace, not pot-race!) software useful; it’s designed specifically for that purpose. (For that matter, the potrace documentation talks a little bit about this sort of thing.)
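
To illustrate why an uneven scan defeats a single full-image threshold, here is a rough sketch of a local-mean ("adaptive") threshold in plain Python. This is not the potrace tooling and makes no claim about how that software works internally; it is a generic technique swapped in for illustration. The function name, `radius` and `offset` are assumptions.

```python
def adaptive_threshold(img, radius=2, offset=10):
    """Binarise each pixel against the mean of its own neighbourhood.

    A pixel becomes black (0) when it is more than `offset` darker than
    the mean of the surrounding (2*radius+1)-square window, else white (255).
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = count = 0
            # The window is clipped at the image borders.
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            row.append(0 if img[y][x] < total / count - offset else 255)
        out.append(row)
    return out


# A strip whose background fades from 200 down to 110, with two text
# pixels (130 and 50). The left text pixel is lighter than the right-hand
# background, so no single global level can separate text from paper.
strip = [[200, 190, 130, 170, 160, 150, 140, 50, 120, 110]]
cleaned = adaptive_threshold(strip, radius=2, offset=15)
```

The trade-off is that sudden jumps in background brightness can still be mistaken for text, so the window size and offset usually need tuning per image.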

– Brooks


The "bmoses-nospam" address is valid; no unmunging needed.
C
Chuck
Jul 30, 2005
One thing I do, if the page I’m scanning is thin, is put a sheet of white paper behind it. This is especially important if it’s a book, as it helps to eliminate background noise from the opposite page.

Then, after it is scanned, if opposite page information has bled through or the image appears somewhat grey or dirty, I merely adjust the contrast and brightness to get a cleaner image . . . from time to time it may be necessary to sharpen or clarify the image some.
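
As a rough illustration of that kind of contrast and brightness adjustment, here is a small sketch in plain Python on a grayscale grid (list of rows, values 0-255). The formula, the function name and the example settings are assumptions; real editors may compute their sliders differently.

```python
def adjust_brightness_contrast(img, brightness=0, contrast=1.0):
    """Stretch values around mid-grey (128), shift them, and clamp to 0-255."""
    def fix(v):
        return max(0, min(255, round((v - 128) * contrast + 128 + brightness)))
    return [[fix(v) for v in row] for row in img]


# Text at 30, grey bleed-through at 200, paper at 225.
page = [[30, 200, 225]]
cleaned = adjust_brightness_contrast(page, brightness=30, contrast=2.0)
```

With these example settings the bleed-through and the paper both clip to white while the text clips to black.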

Hope this helps.

If you are using OCR, then just scan directly to the OCR program. Most OCR programs say that 100-150 dpi is okay, but I always scan at 300 dpi. OCR bases what it reads only on the clarity of the text. It doesn’t really know what it’s reading; it just compares the image you scanned to what it’s been programmed to recognize. So the image should be as clean as possible, unless you want to do a lot of proofing.


Chuck

"Nehmo" wrote in message
OCR apps require a lot of human input to get a good output, particularly if the text is difficult to read. I’m talking about enhancing the readability of an _image_ of text.



|||||||||||||||| Nehmo Sergheyev ||||||||||||||||
BM
Brooks Moses
Jul 30, 2005
Chuck wrote:
One thing I do, if the page I’m scanning is thin, is put a sheet of white paper behind it. This is especially important if it’s a book, as it helps to eliminate background noise from the opposite page.

Actually, if it’s really bad, black works better, since the text on the other side of the page is black — and thus it "disappears" against the black background.

– Brooks


The "bmoses-nospam" address is valid; no unmunging needed.
C
Chuck
Jul 30, 2005
"Brooks Moses" wrote in message
Chuck wrote:
One thing I do, if the page I’m scanning is thin, is put a sheet of white paper behind it. This is especially important if it’s a book, as it helps to eliminate background noise from the opposite page.

Actually, if it’s really bad, black works better, since the text on the other side of the page is black — and thus it "disappears" against the black background.

thanks for that tip . . .


Chuck
NS
Nehmo Sergheyev
Aug 1, 2005
I’m asking about improving an image of text, that is, _after_ you have the image, perhaps already in JPEG format. I’ll provide an example: people often make photostats of small-font law books on a law library’s copy machine. These photostats are later scanned and put on a drive as JPEGs. These images do not OCR well, and if you used an OCR app and turned the JPEG into, let’s say, a .doc, you’d have to recreate the formatting as well.

So is there a usual method of improving the readability of an image of text (dark text on a light background)?

Contrast adjustments are a logical effort, but I, personally, haven’t been able to improve readability much with those.

I’m not talking about one particular example; I’m asking in general. The common factor would be that the resolution of the image of the text has deteriorated: the edges of the text are blurred to varying extents.


|||||||||||||||| Nehmo Sergheyev ||||||||||||||||
FH
Fred Hiltz
Aug 1, 2005
wrote:
[snip]
So is there a usual method of improving the readability of an image of text (dark text on a light background)?

Contrast adjustments are a logical effort, but I, personally, haven’t been able to improve readability much with those.
I’m not talking about one particular example; I’m asking in general. The common factor would be that the resolution of the image of the text has deteriorated: the edges of the text are blurred to varying extents.

There is no "usual method." Thinking back on projects where I have cleaned up text, I have almost always used a histogram adjustment, frequently used unsharp mask, frequently used the salt and pepper and despeckle filters, occasionally used color separations, occasionally used edge preserving smooth, and occasionally used high pass filtering.
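
Two of the filters named above, despeckle and unsharp mask, can be sketched in plain Python on a grayscale grid (list of rows, values 0-255). This is only an illustration under assumptions: the despeckle here is a 3x3 median, and the unsharp mask uses a 3x3 box blur where real editors typically use a Gaussian blur with adjustable radius. All function names are hypothetical.

```python
def _window(img, y, x):
    """Values in the 3x3 neighbourhood of (y, x), replicating edge pixels."""
    h, w = len(img), len(img[0])
    return [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def median_filter(img):
    """3x3 median: removes isolated salt-and-pepper pixels."""
    return [[sorted(_window(img, y, x))[4] for x in range(len(img[0]))]
            for y in range(len(img))]


def box_blur(img):
    """3x3 mean blur, used as the 'unsharp' copy."""
    return [[sum(_window(img, y, x)) / 9 for x in range(len(img[0]))]
            for y in range(len(img))]


def unsharp_mask(img, amount=1.0):
    """Sharpen: original + amount * (original - blurred), clamped to 0-255."""
    blurred = box_blur(img)
    return [[max(0, min(255, round(v + amount * (v - b))))
             for v, b in zip(row, blur_row)]
            for row, blur_row in zip(img, blurred)]
```

The median step is usually run first so that the sharpening does not exaggerate the specks. Note that a median this small can also erase genuine one-pixel details such as thin serifs, which is one reason the choice depends on the particular image.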

Others will have used other techniques, I am sure. It depends a lot on the particular image and somewhat on your own tastes. That is why we asked for samples.

Fred Hiltz, fhiltz at yahoo dot com
C-Tech volunteer
E
Eric
Aug 2, 2005
wrote:

I’m asking about improving an image of text, that is, _after_ you have the image, perhaps already in JPEG format. I’ll provide an example: people often make photostats of small-font law books on a law library’s copy machine. These photostats are later scanned and put on a drive as JPEGs. These images do not OCR well, and if you used an OCR app and turned the JPEG into, let’s say, a .doc, you’d have to recreate the formatting as well.

So is there a usual method of improving the readability of an image of text (dark text on a light background)?

Contrast adjustments are a logical effort, but I, personally, haven’t been able to improve readability much with those.

I’m not talking about one particular example; I’m asking in general. The common factor would be that the resolution of the image of the text has deteriorated: the edges of the text are blurred to varying extents.

You have a whole sack of poo.

JPEG is a crap format, especially for text. It is a lossy compression method, and the fine detail it throws away is exactly what keeps the edges of text sharp.

The best method is to rescan them properly.

Eric
M
Marvin
Aug 2, 2005
Harry Limey wrote:
"Nehmo" wrote in message

What’s the best way to enhance the image of text to an image where the text is more recognizable?

Sometimes when you scan a page of text, the text isn’t clean. What’s the best graphic method to improve the readability of a scanned page of text?

The best way would be to use OCR software and import actual text into whatever layout you are using.
OCR works well when the scan is good, so your suggestion won’t solve his problem. With PSP 8, I find that unsharp mask helps a scanned page of text to work better with my OCR program – Read Iris Pro. Like other OCR programs, Read Iris Pro wants scans made at 300 ppi.
MJ
Milind Joshi
Aug 4, 2005
Hi,
Well, there are many ways to improve image quality, and some very good suggestions have been given here.

Some techniques are background noise filtering, others are removal of "grid" items, or pixels that cannot possibly be text, and applying filters to smooth the text.
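
One simple way to drop "pixels that cannot possibly be text" is to delete black blobs that are too small to be part of a character. The sketch below, in plain Python, is a guess at that idea and not any vendor’s actual method; it assumes an already thresholded grid (0 = black, 255 = white), and the function name and `min_size` are illustrative.

```python
from collections import deque


def remove_small_components(bw, min_size=3):
    """Erase 8-connected black blobs with fewer than `min_size` pixels."""
    h, w = len(bw), len(bw[0])
    out = [row[:] for row in bw]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if bw[y][x] != 0 or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected blob.
            blob, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if bw[ny][nx] == 0 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(blob) < min_size:
                for by, bx in blob:
                    out[by][bx] = 255
    return out
```

Setting `min_size` too high will also remove periods and the dots on i’s, so it has to be chosen with the scan resolution in mind.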

Unless one sees the image though, one cannot really suggest the best method.

Could you send me your image to info AT ideatechnosoft DOT com? You could also send us your current results. Your data and image will remain confidential.

Best Regards,
Milind Joshi
IDEA TECHNOSOFT INC.
http://www.ideatechnosoft.com
J
jrzyguy
Sep 4, 2005
Try using "levels" rather than contrast. With levels, play with the sliders on either end and in the middle. It’s much better than contrast: the contrast tool adjusts the whole image at once, while with levels (or curves) you can darken your darks and lighten your lights. Trust me, it works; it just takes a bit of getting used to.

I never ever ever ever ever ever use the brightness and contrast tool. That was lesson #1 at NYU.
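
For readers unfamiliar with what the three levels sliders do, here is a minimal sketch in plain Python on a grayscale grid (list of rows, values 0-255). The parameter names `black`, `white` and `gamma` stand in for the two end sliders and the middle slider; the formula is the common textbook one and is an assumption about, not a copy of, any particular editor.

```python
def levels(img, black=0, white=255, gamma=1.0):
    """Map `black` to 0 and `white` to 255, with `gamma` bending the midtones.

    Values at or below `black` clip to 0 and values at or above `white`
    clip to 255. gamma > 1 lightens the midtones, gamma < 1 darkens them.
    """
    def fix(v):
        t = min(1.0, max(0.0, (v - black) / (white - black)))
        return round(255 * t ** (1.0 / gamma))
    return [[fix(v) for v in row] for row in img]


# Greyish text at 40-90 on dull paper at 220: pull the end sliders inward.
cleaned = levels([[40, 90, 220]], black=50, white=210)
```

Pulling the black slider up and the white slider down sends dark text to solid black and dingy paper to solid white independently, which a single contrast slider cannot do.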

"Milind Joshi" wrote in message
Hi,
Well, there are many ways to improve image quality, and some very good suggestions have been given here.

Some techniques are background noise filtering, others are removal of "grid" items, or pixels that cannot possibly be text, and applying filters to smooth the text.

Unless one sees the image though, one cannot really suggest the best method.

Could you send me your image to info AT ideatechnosoft DOT com? You could also send us your current results. Your data and image will remain confidential.

Best Regards,
Milind Joshi
IDEA TECHNOSOFT INC.
http://www.ideatechnosoft.com
C
carlos
Sep 5, 2005
I had some success with Photoshop filters on 300 dpi TIFF scans. It’s time-consuming, though, because it has to be done page by page.
