Fast Binary Shape Classification using
Bounded Nearest Neighbors
• Image (or image flow)
– High dimension – raw data
– Redundancy
• Description
– Reduced dimension – compressed data
– mathematical-conceptual description
• Query
– Low dimension
– understandable data – semantic description
Image description – dimension reduction
[54 84 -12 32 -12 0 4]
[YES]
position of an object: [12 45]
Requirements
• Meaningful and compressed
• Invariance to translation, rotation, and scale
• Invariant to minor changes and noise
• Adequacy for comparison
• Possibility of reconstruction
Shape description
• Basic
– Meaningful and understandable information
– Comparability depends on the task; basic features alone are generally not enough
– Features: perimeter, area, eccentricity, elongation, extent, orientation, solidity, rectangularity
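A few of the basic features listed above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function name `basic_features`, the list-of-rows mask format, the 4-connected perimeter count, and the moment-based eccentricity formula are all assumptions made for the example.

```python
import math

def basic_features(mask):
    """Area, perimeter, extent and eccentricity of a binary mask (list of rows).

    Hypothetical helper: the feature definitions are common textbook choices,
    not taken from the slides.
    """
    rows, cols = len(mask), len(mask[0])
    pixels = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    if not pixels:
        raise ValueError("mask contains no shape pixels")
    area = len(pixels)

    def is_set(r, c):
        return 0 <= r < rows and 0 <= c < cols and bool(mask[r][c])

    # Perimeter: number of pixel sides that touch background or the image border.
    perimeter = sum(
        not is_set(r + dr, c + dc)
        for r, c in pixels
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )

    # Extent: shape area divided by the area of its axis-aligned bounding box.
    rs = [r for r, _ in pixels]
    cs = [c for _, c in pixels]
    extent = area / ((max(rs) - min(rs) + 1) * (max(cs) - min(cs) + 1))

    # Eccentricity from the eigenvalues of the second-order central moments.
    mr, mc = sum(rs) / area, sum(cs) / area
    mu_rr = sum((r - mr) ** 2 for r in rs) / area
    mu_cc = sum((c - mc) ** 2 for c in cs) / area
    mu_rc = sum((r - mr) * (c - mc) for r, c in pixels) / area
    spread = math.hypot((mu_rr - mu_cc) / 2, mu_rc)
    major = (mu_rr + mu_cc) / 2 + spread
    minor = (mu_rr + mu_cc) / 2 - spread
    eccentricity = math.sqrt(max(0.0, 1 - minor / major)) if major > 0 else 0.0

    return {
        "area": area,
        "perimeter": perimeter,
        "extent": extent,
        "eccentricity": eccentricity,
    }
```

Each value is a single scalar, which is why such features are cheap to compare but rarely sufficient on their own.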
Shape descriptions
• Contour-based
– Compressed representation
– Ineffective comparison (sizes of complex feature vectors differ)
– Representation of more complex shapes and holes is difficult
– Examples: curve chains, polygonal approximations, central distance functions
• Region-based
– Meaningful and complex representation
– Highly invariant features
– Examples: standard moments, Zernike moments, 2D Fourier descriptor
• Edge detection in four directions
• Thresholding based on local differences
• Maximum edge flags selection
• Projection to four directions
• Normalization and smoothing
PPED as shape descriptor
• Edge detection in four directions
• Thresholding to constant value
• Maximum edge flags selection
• Projection to four directions
• Normalization and smoothing
PPED as shape descriptor
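The five PPED steps can be illustrated with a simplified sketch. Everything concrete here is an assumption made for the example: the 3×3 Sobel-style kernels stand in for whatever edge filters the original uses, the constant `threshold` and the `bins` count are arbitrary, and merging positions into a few bins plays the role of the smoothing step.

```python
# Hypothetical 3x3 kernels, one per edge direction (not the original filters).
KERNELS = (
    ((-1, -2, -1), (0, 0, 0), (1, 2, 1)),   # horizontal edges
    ((0, 1, 2), (-1, 0, 1), (-2, -1, 0)),   # edges along the main diagonal
    ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1)),   # vertical edges
    ((-2, -1, 0), (-1, 0, 1), (0, 1, 2)),   # edges along the anti-diagonal
)

def pped_sketch(img, threshold=2.0, bins=4):
    """Return a 4 * bins vector of projected edge-flag histograms."""
    rows, cols = len(img), len(img[0])
    # Number of distinct projection positions for each direction.
    lengths = (rows, rows + cols - 1, cols, rows + cols - 1)
    hists = [[0.0] * bins for _ in KERNELS]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Step 1: edge response in four directions.
            responses = [
                abs(sum(k[i][j] * img[r - 1 + i][c - 1 + j]
                        for i in range(3) for j in range(3)))
                for k in KERNELS
            ]
            # Step 3: keep only the strongest direction (first one wins ties).
            best = max(range(4), key=responses.__getitem__)
            # Step 2: constant-value threshold.
            if responses[best] < threshold:
                continue
            # Step 4: project the flag along its own edge direction.
            positions = (r, r - c + cols - 1, c, r + c)
            hists[best][positions[best] * bins // lengths[best]] += 1.0
    # Step 5: normalize so the whole vector sums to 1 (if any flag was set).
    total = sum(map(sum, hists))
    flat = [v for h in hists for v in h]
    return [v / total for v in flat] if total else flat
```

For an image whose top half is 0 and bottom half is 1, only the horizontal histogram receives flags, concentrated in the bins around the step.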
• Rotation invariance
– Calculate a characteristic direction, e.g. simple shape orientation
– Re-rotate the shape
• Translation and scale invariance
– Crop the image to the shape and center it
– Scale to a 64×64 square
PPED as shape descriptor
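The translation and scale normalization can be sketched as crop, center, and resample. This is an illustrative version only: `normalize_shape` is a hypothetical name, nearest-neighbor resampling is an assumed choice, and the rotation step (re-rotating by the characteristic direction) is left out to keep the example short.

```python
def normalize_shape(mask, size=64):
    """Crop a binary mask to its shape, center it in a square, scale to size x size."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no shape pixels")
    r0, r1 = min(r for r, _ in pixels), max(r for r, _ in pixels)
    c0, c1 = min(c for _, c in pixels), max(c for _, c in pixels)
    height, width = r1 - r0 + 1, c1 - c0 + 1

    # Translation invariance: work in a square around the bounding box,
    # with the box centered along its shorter dimension.
    side = max(height, width)
    off_r = (side - height) // 2
    off_c = (side - width) // 2

    # Scale invariance: nearest-neighbor resampling of that square.
    out = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            sr = y * side // size - off_r + r0
            sc = x * side // size - off_c + c0
            if r0 <= sr <= r1 and c0 <= sc <= c1 and mask[sr][sc]:
                out[y][x] = 1
    return out
```

Two copies of the same shape at different positions and integer scales map to the same output grid, which is the property the descriptor relies on.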
Reinforce the description
• Add basic features
– Eccentricity
– Area ratio
• Add features orthogonal to PPED
– Orthogonal to the edge information
– Four moments of each of the four projected histograms
PPED as shape descriptor
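The moment features can be illustrated as follows. The slides do not say which four moments are used; this sketch assumes mean, variance, skewness, and kurtosis of the bin positions weighted by the histogram, and the function names are hypothetical.

```python
def histogram_moments(hist):
    """Mean, variance, skewness and kurtosis of a 1-D histogram (assumed choice)."""
    total = sum(hist)
    if total <= 0:
        return (0.0, 0.0, 0.0, 0.0)
    weights = [h / total for h in hist]
    mean = sum(i * w for i, w in enumerate(weights))

    def central(k):
        # k-th central moment of the bin index under the histogram weights.
        return sum(w * (i - mean) ** k for i, w in enumerate(weights))

    var = central(2)
    if var == 0:
        return (mean, 0.0, 0.0, 0.0)
    skew = central(3) / var ** 1.5
    kurt = central(4) / var ** 2
    return (mean, var, skew, kurt)

def moment_features(histograms):
    """Concatenate the four moments of every projection histogram."""
    return [m for h in histograms for m in histogram_moments(h)]
```

With four projection histograms this yields 16 scalar values that summarize the distribution shape rather than the individual bins.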
• I. Filtering by eccentricity, area, and the moments
– Filter values are tuned by genetic programming
– Only a few instances are selected for part II
• II. Bounded nearest-neighbor classification against labeled instances
– The class of the input is taken to be the label of the closest stored shape
Pipeline: filtering (features 1-10), then NN on PPED distance
Shape classification by 4M-PPED
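The two-stage flow can be sketched as a filter followed by a nearest-neighbor search. The details are assumptions for illustration: the dictionary layout, the per-feature `tolerances` (which the slides say are tuned rather than hand-set), and the Manhattan distance on the PPED vectors. The acceptance boundary of part II is left out here.

```python
def manhattan(a, b):
    """Sum of absolute coordinate differences (assumed distance measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def classify_two_stage(query, stored, tolerances):
    """Return the label of the closest surviving stored shape, or None.

    query:  dict with 'features' (scalars) and 'pped' (vector)
    stored: list of dicts with 'features', 'pped' and 'label'
    """
    # Part I: keep only stored shapes whose scalar features are all close.
    candidates = [
        s for s in stored
        if all(abs(q - f) <= t
               for q, f, t in zip(query["features"], s["features"], tolerances))
    ]
    if not candidates:
        return None
    # Part II: nearest neighbor on PPED distance among the survivors.
    best = min(candidates, key=lambda s: manhattan(query["pped"], s["pped"]))
    return best["label"]
```

Because part I compares only a handful of scalars, most stored shapes are discarded before any full PPED distance is computed.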
• Plain nearest-neighbor classification always returns a class
– It cannot reject non-class (zero-class) elements
• Filter the input
– Basic features, simple hash
– Fast and effective, but not sufficient on its own
• Acceptance region (boundary)
– Constant threshold value
– Adaptive threshold value
Bounded nearest neighborhood
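A minimal sketch of the acceptance region, assuming each stored instance carries its own radius and that only the single closest instance is checked. The tuple layout, the Manhattan distance, and the use of `None` for a rejected (zero-class) input are illustrative choices. A constant threshold is the special case where every radius is the same.

```python
def manhattan(a, b):
    """Sum of absolute coordinate differences (assumed distance measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def bounded_nn(query, stored):
    """Classify query against stored (vector, label, radius) tuples.

    Returns the label of the closest stored vector if the query falls inside
    that vector's acceptance radius, otherwise None (rejected as zero-class).
    """
    if not stored:
        return None
    dist, label, radius = min(
        ((manhattan(query, vec), lab, rad) for vec, lab, rad in stored),
        key=lambda item: item[0],
    )
    return label if dist <= radius else None
```

Unlike plain nearest-neighbor search, an input far from every stored shape yields no class at all.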
• Idea: to identify an object as a class member, we need to know which objects are not part of the class
– Collect zero-class elements
• For every labeled training instance, set the boundary radius to
– half of the distance to the closest zero-class element, if there is any, or
– the distance to the closest other-class element, if there is any, or
– the distance to the furthest same-class element
• Disadvantages
– Smaller cover rate
– Bigger look-up table
– Increased evaluation time
Adaptive boundary computation
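The radius rule can be written down directly as a three-way fallback. This sketch follows the rule literally as stated; the function name, the data layout, the Manhattan distance, and the radius of 0 for an instance with no neighbors at all are assumptions.

```python
def manhattan(a, b):
    """Sum of absolute coordinate differences (assumed distance measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def boundary_radii(train, zero_class):
    """Compute one acceptance radius per labeled training instance.

    train:      list of (vector, label)
    zero_class: list of vectors known to belong to no class
    """
    radii = []
    for i, (vec, label) in enumerate(train):
        to_zero = [manhattan(vec, z) for z in zero_class]
        to_other = [manhattan(vec, v) for v, lab in train if lab != label]
        to_same = [
            manhattan(vec, v)
            for j, (v, lab) in enumerate(train)
            if lab == label and j != i
        ]
        if to_zero:
            # Half of the distance to the closest zero-class element.
            radii.append(min(to_zero) / 2)
        elif to_other:
            # Distance to the closest element of another class.
            radii.append(min(to_other))
        elif to_same:
            # Distance to the furthest element of the same class.
            radii.append(max(to_same))
        else:
            # Assumption: a lone instance accepts only exact matches.
            radii.append(0.0)
    return radii
```

For one-dimensional training points 0 and 1 (class a) and 10 (class b) with a zero-class element at 4, the radii are 2.0, 1.5, and 3.0.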
• Faster evaluation
– No need to compare to all stored shape descriptors
• Better accuracy
– Using orthogonal shape information
• Higher cover
– By omitting elements that are visually different, the NN boundary can be larger
• Filter-tuning by genetic computation
– Maximize accuracy and cover on a parameter training set
Pre-filtering
Bionic Eyeglass Banknote Recognition task
• Patterns for training and test collected from live tests
• Filter weights tuned by genetic algorithm
Results
TS1               Global accuracy   Cover     Precision   Avg. look-up time (ms)
Without filters   84.40%            45%       98.30%      57.2
With filters      88.50%            58.30%    99.40%      6.4

TS2               Global accuracy   Cover     Precision   Avg. look-up time (ms)
Without filters   88.50%            46%       100%        58.0
With filters      91.50%            60.10%    100%        6.4