On Tue, 24 Feb 2004 15:25:47 +0100, "Bart van der Wolf" wrote:
I have a scanner without multiscanning, so I have to do it manually. The problem is how do I eliminate extreme values when averaging?
Let’s say 4 scans result in following values for a point: 123, 125, 127 and 250. Obviously, 250 is a fluke and this spike should be eliminated before averaging out the first three values.
Statistically, 250 could be correct and the other three could be wrong…;-) I’d just use them as they are.
Actually, statistically, 250 is way out, but it may be right *realistically*! ;o)
Question: How do I do this in Photoshop 6? Layering the 4 images with opacity of 50%, 33% & 25% would include the extreme value.
Perhaps you can give a little less weight to the one you mistrust.
That’s the problem, I can’t examine every pixel manually.
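As an aside, the opacity trick described above (100%, 50%, 33%, 25% from bottom to top) really does compute the plain mean of all four scans, spike included. A quick sketch, using the made-up pixel values from earlier in the thread:

```python
# Blending four layers bottom-to-top with opacities 100%, 50%, 33.3%, 25%
# weighs each scan equally, i.e. it computes the plain mean -- including
# the 250 spike. The pixel values are the example from this thread.
scans = [123, 125, 127, 250]

result = float(scans[0])            # bottom layer at 100% opacity
for i, value in enumerate(scans[1:], start=2):
    opacity = 1.0 / i               # 1/2, 1/3, 1/4 for the upper layers
    result = (1 - opacity) * result + opacity * value

mean = sum(scans) / len(scans)
print(result, mean)                 # both 156.25
```

So the layered result (156.25) is pulled well above the three "good" values (123-127), which is exactly the problem with including the fluke.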
Statisticians often remove the highest and lowest values before averaging, precisely to avoid accidental outliers. They also use other rude words such as "standard deviation" and so on… 😉
Of course, given enough samples even simple averaging would "tame" those extreme values, but a more intelligent approach may eliminate the need for superfluous scanning, which both saves time and is easier on the scanner.
Layering requires perfect registration of the individual images, so spend some time on getting that right. You could also consider using a program like Registax (http://aberrator.astronomy.net/registax/) to do the work for you.
I’ll have a look, although I really prefer to do it myself, if for no other reason than as a learning exercise.
If you have a data set with a large number of "outliers", you could use the median value instead of the mean ("average") value. That is, sort the data in the set from low to high and pick the middle one.
E.g. 126, 127, 5, 189, 128, 176, 46
Sort: 5, 46, 126, 127, 128, 176, 189
Median: 127
Mean: 113.9
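Applied per pixel across a stack of registered scans, the median can be computed in one step. A minimal sketch with numpy, using the thread's four example values as a hypothetical 1x1-pixel region (real scans would be full image arrays loaded from file):

```python
import numpy as np

# Four hypothetical scans of the same (1x1) region, stacked along axis 0.
# In practice these would be full, perfectly registered image arrays.
scans = np.array([
    [[123]], [[125]], [[127]], [[250]],
], dtype=float)

median_img = np.median(scans, axis=0)   # per-pixel median -> 126.0
mean_img = scans.mean(axis=0)           # per-pixel mean   -> 156.25
```

With an even number of scans the median is the average of the two middle values (here (125 + 127) / 2 = 126), which stays close to the three consistent readings while the mean is dragged up by the spike.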
A more refined method would be to use a trimmed mean. In this case you also sort the dataset, discard a number or percentage of the top and bottom values, and calculate the mean of the intermediate values.
E.g. the interquartile mean:
http://en.wikipedia.org/wiki/Interquartile_mean

The problem with all of these median-based methods is that you need to sort the values in the dataset, which is a fairly slow operation, especially done pixel by pixel while combining data from multiple image files.
And the result would probably only be a marginal improvement. But if you have spare time to write some code…
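For anyone with that spare time, the interquartile mean mentioned above is only a few lines. A sketch, assuming eight samples per pixel (the numbers extend the sorted example from earlier in the thread and are otherwise made up):

```python
import numpy as np

# Interquartile mean: sort the samples, drop the top and bottom quarter,
# and average what remains. Eight hypothetical samples for one pixel.
samples = np.array([5, 46, 126, 127, 128, 176, 189, 250], dtype=float)

trim = len(samples) // 4                 # drop 2 low and 2 high values
trimmed = np.sort(samples)[trim:-trim]   # 126, 127, 128, 176
iq_mean = trimmed.mean()                 # 139.25
```

For a whole stack of registered images the same sort-and-slice works per pixel: `np.sort(stack, axis=0)[trim:-trim].mean(axis=0)`, so no explicit per-pixel loop is needed.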