Announcements • Quiz on Tuesday, March 10. • Material covered (Union not Intersection) • All lectures before today (March 3). • Forsyth and Ponce Readings: • Chapters 1.1, 4, 5.1, 5.2, 5.3, 7, 8, 9.1, 9.2, 9.3, 6.5.2 • Extra reading: http://persci.mit.edu/people/adelson/publications/gazzan.dir/gazzan.htm
What you should know for quiz • This list is not exhaustive. • Meaning of basic terms. For example: Perspective projection, scaled orthographic projection, horizon, vanishing point, Lambertian reflectance, BRDF, point light source, convolution (1d and 2d, discrete and continuous), high-pass filter, low-pass filter, high frequency signal, hysteresis, gradient, non-maximum suppression, Gaussian, the scale of a filter, texture synthesis, lightness constancy.
What you should know for quiz • How to work through simple examples by hand for all algorithms covered. Examples (not comprehensive) • Compute the perspective/scaled-orthographic projection of an object. • Convolve a kernel with an image in 1D • Compute the gradient of a function in 2D. • Predict the effect of hysteresis. • Reproduce the effects of non-maximum suppression. • Compute and compare the histograms of two textures using Chi-Squared test. • Compute the SSD between two point sets. • Predict appearance of a Lambertian object, given lighting. • Compute the results of a lightness constancy algorithm. • Predict the results of applying a specific filter to a specific image. • …
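As practice for the by-hand items above, here is a minimal sketch of three of them: 1D convolution, a chi-squared comparison of two histograms, and the SSD between two point sets. The function names are illustrative, not from the course. The chi-squared form used is one common variant (some texts add a factor of 1/2), and the SSD assumes the two point sets are already in correspondence by index.

```python
def convolve_1d(signal, kernel):
    """Full discrete 1D convolution: out[n] = sum_k signal[k] * kernel[n - k]."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def chi_squared(h1, h2):
    """Chi-squared distance between two histograms with the same bins.

    Assumed form: sum over bins of (a - b)^2 / (a + b); empty bins are skipped.
    """
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return total


def ssd(points_a, points_b):
    """Sum of squared distances between corresponding 2D points (matched by index)."""
    return sum((xa - xb) ** 2 + (ya - yb) ** 2
               for (xa, ya), (xb, yb) in zip(points_a, points_b))


smoothed = convolve_1d([1, 2, 3], [1, 1])          # [1, 3, 5, 3]
texture_gap = chi_squared([1, 3], [3, 1])          # 2.0
point_gap = ssd([(0, 0), (1, 1)], [(1, 0), (1, 3)])  # 5
```

Working the same inputs on paper and comparing against these values is a quick way to check a hand computation.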
What you should know for quiz • Recall basic properties of operations described in class and in book. • Examples: convolution is associative, the image of a line under perspective projection is a line, …. • Prove some properties using this knowledge.
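A numerical check is not a proof, but it can build intuition before proving a property. The sketch below checks associativity and commutativity of discrete 1D convolution on small integer sequences; the sequences `f`, `g`, `h` are arbitrary examples chosen for illustration.

```python
def convolve_1d(f, g):
    """Full discrete 1D convolution of two finite sequences."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


f = [1, 2, 3]
g = [0, 1, -1]
h = [2, 1]

# Associativity: (f * g) * h equals f * (g * h).
left = convolve_1d(convolve_1d(f, g), h)
right = convolve_1d(f, convolve_1d(g, h))

# Commutativity: f * g equals g * f.
fg = convolve_1d(f, g)
gf = convolve_1d(g, f)
```

With integer inputs the comparison is exact, so `left == right` and `fg == gf` hold without any floating-point tolerance.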
Quiz: Fourier Transform • You may be asked intuitive questions about the Fourier transform. Example: which is higher frequency, cos a or cos 2a? • You won’t be asked mathematical details not covered in class slides.
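The cos a versus cos 2a question can be checked numerically. This is a small sketch, assuming one period of each signal sampled at N = 64 points and a direct (slow) discrete Fourier transform. The helper name `dominant_frequency` is made up for this example.

```python
import cmath
import math


def dominant_frequency(samples):
    """Return the index k (cycles per window) of the largest DFT coefficient.

    Only k = 1 .. n/2 is searched, since a real signal's spectrum is symmetric.
    """
    n = len(samples)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k


N = 64
angles = [2 * math.pi * t / N for t in range(N)]
low = dominant_frequency([math.cos(a) for a in angles])       # 1 cycle
high = dominant_frequency([math.cos(2 * a) for a in angles])  # 2 cycles
```

cos 2a completes two cycles in the interval where cos a completes one, so it is the higher-frequency signal.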
Perceptual Grouping • Forsyth and Ponce: 14.2, 15. • In coming classes, 16, then rest of 14. • Extra Reading: Laws of Organization in Perceptual Forms, Max Wertheimer (1923). http://psy.edu/~classics/Wertheimer/Forms/forms.htm
Perceptual Grouping • Up to now we’ve focused on local properties of images. • Perceptual grouping is about putting parts together into a whole: • Finding regions with a uniform property • Linking edges into object boundaries • Surfaces and objects are critical. Also, simpler “objects” such as lines.
Human perceptual grouping • This has been a significant inspiration to computer vision. • Why? • Perceptual grouping seems to rely partly on the nature of objects in the world. • This is hard to quantify; we hypothesize that human vision encodes the necessary knowledge.
Gestalt Principles of Grouping: some history • Behaviorists were the dominant psychological theorists in the early 20th century. • To make psych scientific, they wanted to view it as rules describing the relation between stimulus and response, described as atomic elements. • No role for “mind”. • An influential early behaviorist was Pavlov.
Gestalt movement claimed atomic stimulus and response don’t exist. • The mind perceives world as objects, as wholes, not as atomic primitives. • Can’t understand psych without understanding how we perceive the world.
I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have "327"? No. I have sky, house, and trees. It is impossible to achieve "327" as such. And yet even though such droll calculation were possible and implied, say, for the house 120, the trees 90, the sky 117 -- I should at least have this arrangement and division of the total, and not, say, 127 and 100 and 100; or 150 and 177. Max Wertheimer, 1923
I. A row of dots is presented upon a homogeneous ground. The alternate intervals are 3 mm. and 12 mm. Normally this row will be seen as ab/cd, not as a/bc/de. As a matter of fact it is for most people impossible to see the whole series simultaneously in the latter grouping. Max Wertheimer
Gestalt Movement • Perceptual organization was a big issue. • How we perceive the world in terms of things/objects, not pixels. • This was part of a broader attack on behaviorism. • Gestalt viewed the mind as constructing representations of the world; no learning/behavior could be understood without understanding this.
Issues in Perceptual Organization • What is the role of an edge in an image? To what object (if any) does it belong?
If you know what is in the next image, silently raise your hand. Don’t call out.
Issues in Perceptual Organization • What factors determine which parts of an image are combined in the same object?
Higher level Knowledge If you know what is in the next image, silently raise your hand. Don’t call out.
Other Factors • Common fate (i.e., common motion). • Good continuation in time. • Parallelism • Collinearity
Computer Vision Again • Divide perceptual organization (P.O.) approaches into two groups. • Parametric: We have a description of what we want, with parameters. Examples: lines, circles, constant intensity, constant intensity + Gaussian noise. • Non-parametric: We have constraints the group should satisfy, or optimality criteria. Example: Find the closed curve that is smoothest and that also best follows strong image gradients.
The Meta-Algorithm • Define what it means for a group to be good. • Usually this involves simplifications • Search for the best group. • Usually this is intractable, so short-cuts are needed.
Parametric Grouping: Grouping Points into Lines • Basic Facts about Lines • A line has normal direction (a,b). • (x,y) is on the line if (x,y).(a,b) = c, i.e. ax + by = c • If (a,b) is a unit vector, the signed distance from (x,y) to the line is (a,b).(x,y) - c = ax + by - c
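A short sketch of the point-to-line distance for a line written as ax + by = c. The function name is illustrative; it divides by the length of (a, b) so the caller does not have to pass a unit normal.

```python
import math


def signed_distance(a, b, c, x, y):
    """Signed distance from point (x, y) to the line a*x + b*y = c.

    The sign says which side of the line the point is on; (a, b) need not be
    unit length because the result is divided by its norm.
    """
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("(a, b) must not be the zero vector")
    return (a * x + b * y - c) / norm


on_line = signed_distance(3, 4, 10, 2, 1)     # 0.0: (2, 1) satisfies 3x + 4y = 10
one_side = signed_distance(3, 4, 10, 5, 5)    # 5.0
other_side = signed_distance(3, 4, 10, 0, 0)  # -2.0
```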
This is difficult because of: • Extraneous data: Clutter • Missing data • Noise
RANSAC: Random Sample Consensus • Generate a bunch of reasonable hypotheses. • Test to see which is the best.
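RANSAC for Lines • Generate Lines using Pairs of Points • How many samples? • Suppose p is the fraction of points from the line. • n points are needed to define a hypothesis (2 for lines). • k samples are chosen. • A single sample is correct (all n points from the line) with probability p^n, so the probability that at least one of the k samples is correct is: 1 - (1 - p^n)^k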
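The sample-count question can be worked as arithmetic. The sketch below assumes independent samples, so that one sample is all-inlier with probability p^n and at least one of k samples succeeds with probability 1 - (1 - p^n)^k. The function names and the 0.99 target are illustrative choices.

```python
import math


def prob_at_least_one_good(p, n, k):
    """Probability that at least one of k samples consists only of inliers."""
    return 1.0 - (1.0 - p ** n) ** k


def samples_needed(p, n, target):
    """Smallest k with prob_at_least_one_good(p, n, k) >= target (0 < target < 1)."""
    good = p ** n
    if good >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - good))


# Half the points lie on the line, and a line hypothesis needs 2 points.
single = prob_at_least_one_good(0.5, 2, 1)   # 0.25
k_99 = samples_needed(0.5, 2, 0.99)          # 17
```

Note how quickly the count grows as p drops: the per-sample success rate is p^n, so heavy clutter or a model needing more points both push k up.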
RANSAC for Lines: Continued • Decide how good a line is: • Count the number of points within ε of the line. • Parameter ε measures the amount of noise expected. • Other possibilities. For example, for these points, also look at how far they are. • Pick the best line.
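Putting the two RANSAC slides together, here is a minimal sketch for line fitting: sample pairs of points, build the line through each pair, count the points within a tolerance, and keep the best. The names `line_through`, `ransac_line`, and the parameter `eps` (the noise tolerance) are chosen for this example, and the scoring is the simple inlier count, not any of the refinements mentioned above.

```python
import math
import random


def line_through(p, q):
    """Return (a, b, c) with unit normal (a, b) such that a*x + b*y = c passes
    through the two distinct points p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y1 - y2, x2 - x1          # perpendicular to the direction q - p
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    return a, b, a * x1 + b * y1


def ransac_line(points, k, eps, rng):
    """Try k random pairs; keep the line with the most points within eps."""
    best_line, best_inliers = None, []
    for _ in range(k):
        p, q = rng.sample(points, 2)  # assumes all points are distinct
        a, b, c = line_through(p, q)
        inliers = [(x, y) for x, y in points if abs(a * x + b * y - c) <= eps]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers


# Ten points on y = x plus three clutter points.
points = [(i, i) for i in range(10)] + [(0, 5), (7, 1), (3, 9)]
line, inliers = ransac_line(points, k=100, eps=0.1, rng=random.Random(0))
```

Passing in a seeded `random.Random` keeps the run repeatable; with this data any pair drawn from the ten collinear points recovers the line y = x and all ten inliers.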
The Hough Transform for Lines • A line is the set of points (x, y) such that x cos θ + y sin θ = d • Different choices of θ, d>0 give different lines • For any (x, y) there is a one parameter family of lines through this point. Just let (x,y) be constants and θ, d be unknowns. • Each point gets to vote for each line in the family; if there is a line that has lots of votes, that should be the line passing through the points
Mechanics of the Hough transform • Construct an array representing θ, d • For each point, render the curve (θ, d) into this array, adding one at each cell • Difficulties • how big should the cells be? (too big, and we cannot distinguish between quite different lines; too small, and noise causes lines to be missed) • How many lines? • count the peaks in the Hough array • Who belongs to which line? • tag the votes • Can modify voting, peak finding to reflect noise. • Big problem if noise in Hough space different from noise in image space.
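The voting mechanics can be sketched in a few lines. This example assumes the normal-form parametrization x cos θ + y sin θ = d, with θ sampled over [0, π) and d allowed to be negative so that every line is still covered. The cell sizes (`n_theta`, `d_step`) and function names are illustrative, and peak finding is reduced to taking the single largest cell.

```python
import math


def hough_lines(points, n_theta=180, d_step=1.0):
    """Build the accumulator: each point votes once per theta cell."""
    d_max = max(math.hypot(x, y) for x, y in points)
    offset = int(math.ceil(d_max / d_step))   # shifts negative d to valid indices
    acc = [[0] * (2 * offset + 1) for _ in range(n_theta)]
    for x, y in points:
        for ti in range(n_theta):
            theta = math.pi * ti / n_theta
            d = x * math.cos(theta) + y * math.sin(theta)
            acc[ti][int(round(d / d_step)) + offset] += 1
    return acc, offset


def strongest_line(acc, offset, d_step=1.0):
    """Return (votes, theta, d) for the cell with the most votes."""
    best = (0, 0, 0)
    for ti, row in enumerate(acc):
        for di, votes in enumerate(row):
            if votes > best[0]:
                best = (votes, ti, di)
    votes, ti, di = best
    return votes, math.pi * ti / len(acc), (di - offset) * d_step


# Ten points on the vertical line x = 3, i.e. theta = 0 and d = 3.
points = [(3, 10 * i) for i in range(10)]
acc, offset = hough_lines(points)
votes, theta, d = strongest_line(acc, offset)
```

The cell-size difficulty shows up directly here: with points spaced closely along the line, neighbouring θ cells would collect the same number of votes and the peak would no longer be unique.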
Some pros and cons • Complexity of RANSAC: n*n*n if every pair of the n points is tried (n*n hypotheses, each scored against n points). • Complexity of Hough: n*d (each of the n points votes in d cells). • Error behavior: both can have problems, RANSAC perhaps easier to understand. • Clutter: RANSAC very robust, Hough falls apart at some point. • There are endless variations that improve some of Hough’s problems.