**On the Statistical Analysis of Dirty Pictures**

Julian Besag

**Image Processing**

• Required in a very wide range of practical problems
• Computer vision
• Computed tomography
• Agriculture
• Many more…
• Picture acquisition techniques are noisy

**Problem Statement**

• Given a noisy picture
• And two sources of information (assumptions):
  • A multivariate record for each pixel
  • Pixels close together tend to be alike
• Reconstruct the true scene

**Notation**

• S – 2D region, partitioned into pixels numbered 1…n
• x = (x1, x2, …, xn) – a coloring of S
• x* (realization of X) – the true coloring of S
• y = (y1, y2, …, yn) (realization of Y) – the observed records, one per pixel

**Assumption #1**

• Given a scene x, the random variables Y1, Y2, …, Yn are conditionally independent, and each Yi has the same known conditional density function f(yi|xi), dependent only on xi.
• Probability of correct acquisition

**Assumption #2**

• The true coloring x* is a realization of a locally dependent Markov random field with specified distribution {p(x)}

**Locally Dependent M.r.f.s**

• In general, the conditional distribution of pixel i depends on all other pixels, S\i
• We are only concerned with local dependencies

**Previous Methodology**

• Maximum Probability Estimation
• Classification by Maximum Marginal Probabilities

**Maximum Probability Estimation**

• Choose the estimate x that has the maximum probability given the record vector y.
• In the Bayesian framework, x is the MAP estimate
• In decision theory – a 0–1 loss function

**Maximum Probability Estimation**

• Iterate over each pixel
• Choose the color xi at pixel i according to its conditional probability
• Slowly decreasing the temperature T guarantees convergence (simulated annealing)

**Classification by Maximum Marginal Probabilities**

• Maximize the expected proportion of correctly classified pixels
• Note that P(xi | y) depends on all records
• Another proposal: use only a small neighborhood of records for the maximization
• Still computationally hard, because P is not available in closed form

**Problems**

• Large-scale effects
• Favors scenes of a single color
• Computationally expensive

**Estimation by Iterated Conditional Modes**

• The previously discussed methods have enormous computational demands and undesirable large-scale properties.
• We want a faster method with good large-scale properties.

**Iterated Conditional Modes**

• At each pixel i, choose the color xi that maximizes its conditional probability given the record y and the current colors of the neighboring pixels
• When applied to each pixel in turn, this procedure defines a single cycle of an iterative algorithm for estimating x*

**Examples of ICM**

• Each example involves:
  • c unordered colors
  • A neighborhood of the 8 surrounding pixels
  • A known scene x*
• At each pixel i, a record yi is generated from a Gaussian distribution with mean determined by the true color xi and variance κ.

**Extremes of β**

• β = 0 gives the maximum likelihood classifier, with which ICM is initialized
• β = ∞: xi is determined by a majority vote of its neighbors, with the records yi used only to break ties.

**Example 1**

• 6 cycles of ICM were applied, with β = 1.5

**Example 2**

• Hand-drawn to display a wide array of features
• Records yi were generated by superimposing independent Gaussian noise, √κ = 0.6
• 8 cycles, with β increasing from 0.5 to 1.5 over the first 6

**Models for the true scene**

• Most of the material here is speculative, a topic for future research
• There are many kinds of images possessing special structures in the true scene.
• What we have seen so far in the examples are discrete unordered colors.

**Examples of special types of images**

• Unordered colors
  • These are generally codes for some other attribute, such as crop identities
• Excluded adjacencies
  • It may be known that certain colors cannot appear on neighboring pixels in the true scene.

**More special cases…**

• Grey-level scenes
  • Colors may have a natural ordering, such as intensity. The author did not have the computing equipment to process, display, and experiment with 256 grey levels.
• Continuous intensities
  • {p(x)} is a Gaussian M.r.f. with zero mean

**More special cases…**

• Special features, such as thin lines
  • The author had some success reproducing hedges and roads in radar images.
• Pixel overlap

**Parameter Estimation**

• This may be computationally expensive
• It is often unnecessary
• We may need to estimate θ in l(y|x; θ)
  • Learn how records result from true scenes.
• And we may need to estimate Φ in p(x; Φ)
  • Learn the probabilities of true scenes.

**Parameter Estimation, cont.**

• Estimation from training data
• Estimation during ICM

**Example of Parameter Estimation**

• Records produced with Gaussian noise, κ = 0.36
• With the correct value of κ and gradually increasing β: 1.2% error
• Estimating β = 1.83 and κ = 0.366: 1.2% error
• With κ known but β = 1.8 estimated: 1.1% error

**Block reconstruction**

• Suppose the Bs form 2×2 blocks of four pixels, with overlap between blocks
• At each stage, the block in question must be assigned one of c⁴ colorings, based on its 4 records and 26 direct and diagonal adjacencies.

**Block reconstruction example**

• Univariate Gaussian records with κ = 0.9105
• Basic ICM with β = 1.5 gives a 9% error rate
• ICM with the estimated β = ∞ gives 5.7% error

**Conclusion**

• We began by adopting a strict probabilistic formulation with regard to the true scene and the generated records.
• We then abandoned these in favor of ICM, on grounds of computation and to avoid unwelcome large-scale effects.
• There is a vast number of problems in image processing and pattern recognition to which statisticians might usefully contribute.
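Putting the pieces above together — the Gaussian record model of Assumption #1, the 8-neighbor prior with interaction parameter β, and the per-pixel mode update that defines one ICM cycle — a minimal sketch in Python might look as follows. The binary 32×32 scene, κ = 0.36, β = 1.5, and 6 cycles are illustrative assumptions echoing the slides, not the paper's actual data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical binary true scene (c = 2 unordered colors): a square on a background.
true_scene = np.zeros((32, 32), dtype=int)
true_scene[8:24, 8:24] = 1

# Assumption #1: each record y_i is Gaussian with mean x_i and variance kappa.
kappa = 0.36
y = true_scene + rng.normal(0.0, np.sqrt(kappa), true_scene.shape)

def icm(y, c=2, beta=1.5, kappa=0.36, n_cycles=6):
    """Iterated conditional modes with an 8-neighbor pairwise prior."""
    # beta = 0 gives the maximum-likelihood classifier, which initializes ICM.
    x = np.rint(y).clip(0, c - 1).astype(int)
    n_rows, n_cols = y.shape
    for _ in range(n_cycles):
        for i in range(n_rows):
            for j in range(n_cols):
                best_k, best_score = x[i, j], -np.inf
                for k in range(c):
                    # Log-likelihood of the record at pixel (i, j) under color k.
                    score = -(y[i, j] - k) ** 2 / (2.0 * kappa)
                    # Prior term: beta * (number of 8-neighbors currently colored k).
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            if di == 0 and dj == 0:
                                continue
                            ni, nj = i + di, j + dj
                            if 0 <= ni < n_rows and 0 <= nj < n_cols:
                                score += beta * (x[ni, nj] == k)
                    if score > best_score:
                        best_k, best_score = k, score
                # Update in place: later pixels in this cycle see the new color.
                x[i, j] = best_k
    return x

restored = icm(y)
error_rate = np.mean(restored != true_scene)
```

Each update maximizes the conditional probability of xi given the record and the current neighbor colors, so it never decreases P(x | y); the smoothing strength is controlled entirely by β, matching the β = 0 and β = ∞ extremes described above.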