Candidate Elimination Algorithm: Problem-1
Problem 1:
Origin  Manufacturer  Color  Decade  Type     Example Type
Japan   Honda         Blue   1980    Economy  Positive
Japan   Toyota        Green  1970    Sports   Negative
Japan   Toyota        Blue   1990    Economy  Positive
USA     Chrysler      Red    1980    Economy  Negative
Japan   Honda         White  1980    Economy  Positive
Solution:
After all five examples have been processed, the boundary sets G and S hold the most general and the most
specific hypotheses consistent with the data.
The concept to be learned, "Japanese Economy Car", lies within the version space bounded by them.
G = { (Japan, ?, ?, ?, Economy) }
S = { (Japan, ?, ?, ?, Economy) }
G and S are singleton sets and S = G, so the version space has converged to a single hypothesis.
There is no more data, so the algorithm stops.
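The result above can be reproduced with a short program. The following is a minimal sketch, not a definitive implementation: it assumes purely conjunctive hypotheses (so S can be held as a single tuple), takes each attribute's domain from the values that appear in the table, and uses hypothetical helper names (`covers`, `more_general_or_equal`, `candidate_elimination`).

```python
ANY = "?"

def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(a == ANY or a == v for a, v in zip(h, x))

def more_general_or_equal(h1, h2):
    """True if h1 covers every instance that h2 covers."""
    return all(a == ANY or a == b for a, b in zip(h1, h2))

def candidate_elimination(examples, domains):
    n = len(domains)
    G = [tuple([ANY] * n)]   # most general boundary
    S = None                 # None stands for the empty, most specific hypothesis
    for x, positive in examples:
        if positive:
            # Drop members of G that fail to cover the positive example.
            G = [g for g in G if covers(g, x)]
            # Minimally generalize S so that it covers x.
            S = tuple(x) if S is None else tuple(
                s if s == v else ANY for s, v in zip(S, x))
        else:
            if S is not None and covers(S, x):
                raise ValueError("version space collapsed")
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)  # already excludes the negative example
                    continue
                # Minimal specializations: fix one "?" to a value other than x's.
                for i in range(n):
                    if g[i] != ANY:
                        continue
                    for v in domains[i]:
                        h = g[:i] + (v,) + g[i + 1:]
                        if v != x[i] and (S is None or more_general_or_equal(h, S)):
                            new_G.append(h)
            new_G = list(dict.fromkeys(new_G))  # remove duplicates, keep order
            # Keep only the maximally general members.
            G = [g for g in new_G
                 if not any(o != g and more_general_or_equal(o, g) for o in new_G)]
    return S, G

EXAMPLES = [
    (("Japan", "Honda", "Blue", "1980", "Economy"), True),
    (("Japan", "Toyota", "Green", "1970", "Sports"), False),
    (("Japan", "Toyota", "Blue", "1990", "Economy"), True),
    (("USA", "Chrysler", "Red", "1980", "Economy"), False),
    (("Japan", "Honda", "White", "1980", "Economy"), True),
]
# Assumption: each domain is just the set of values seen in the table.
DOMAINS = [sorted({x[i] for x, _ in EXAMPLES}) for i in range(5)]

S, G = candidate_elimination(EXAMPLES, DOMAINS)
print("S =", S)
print("G =", G)
```

With the five examples of Problem 1, both boundaries end at (Japan, ?, ?, ?, Economy), matching the solution above.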
Candidate Elimination Algorithm: Problem-2
Problem 2: Learning the concept of "Japanese Economy Car" (continued)
Origin  Manufacturer  Color  Decade  Type     Example Type
Japan   Honda         Blue   1980    Economy  Positive
Japan   Toyota        Green  1970    Sports   Negative
Japan   Toyota        Blue   1990    Economy  Positive
USA     Chrysler      Red    1980    Economy  Negative
Japan   Honda         White  1980    Economy  Positive
Japan   Toyota        Green  1980    Economy  Positive
Japan   Honda         Red    1990    Economy  Negative
Solution:
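The walkthrough below uses a three-attribute hypothesis space (size, color, shape), where ? matches any
value and 0 matches no value.
For your hypothesis space (H), you start with your sets of maximally general (G) and maximally specific
(S) hypotheses: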
G0 = {<?, ?, ?>}
S0 = {<0, 0, 0>}
When you are presented with a negative example, you need to remove from S any hypothesis
inconsistent with the current observation and replace any inconsistent hypothesis in G with its minimal
specializations that are consistent with the observation but still more general than some member of S.
So for your first (negative) example, (big, red, circle), the minimal specializations would make the
new hypothesis space
G1 = {<small, ? , ?>, <?, blue, ?>, <?, ?, triangle>}
S1 = S0 = {<0, 0, 0>}
Note that S did not change.
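The specialization step for a negative example can be sketched as follows. This is an illustrative sketch under stated assumptions: each attribute is assumed to have exactly the two values that appear in the examples (big/small, red/blue, circle/triangle), and `min_specializations` is a hypothetical helper name. Because S is still <0, 0, 0> at this point, every specialization is trivially more general than S, so that check is omitted here.

```python
ANY = "?"
# Assumed attribute domains, inferred from the values used in the examples.
DOMAINS = [("big", "small"), ("red", "blue"), ("circle", "triangle")]

def covers(h, x):
    """True if hypothesis h matches instance x."""
    return all(a == ANY or a == v for a, v in zip(h, x))

def min_specializations(g, x, domains):
    """Fix one '?' in g to any value that differs from x at that position."""
    out = []
    for i, a in enumerate(g):
        if a == ANY:
            out += [g[:i] + (v,) + g[i + 1:] for v in domains[i] if v != x[i]]
    return out

G0 = [(ANY, ANY, ANY)]
negative = ("big", "red", "circle")

G1 = []
for g in G0:
    # Only hypotheses that wrongly match the negative example are specialized.
    G1 += min_specializations(g, negative, DOMAINS) if covers(g, negative) else [g]

print(G1)
```

Each resulting hypothesis differs from <?, ?, ?> in exactly one position, which is what makes the specializations minimal.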
For your next example, (small, red, triangle), which is also negative, you will need to further
specialize G. Note that the second hypothesis in G1 does not match the new observation so only the first
and third hypotheses in G1 need to be specialized. That would yield
G2 = {<small, blue, ?>, <small, ?, circle>, <?, blue, ?>, <big, ?, triangle>, <?, blue, triangle>}
However, since the first and last hypotheses in G2 above are specializations of the middle hypothesis
(<?, blue, ?>), we drop those two, giving
G2 = {<small, ?, circle>, <?, blue, ?>, <big, ?, triangle>}
S2 = S1 = S0 = {<0, 0, 0>}
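The pruning of G2 relies on a "more general than" test between hypotheses. A minimal sketch of that test, with hypothetical helper names `more_general_or_equal` and `prune`, might look like this:

```python
ANY = "?"

def more_general_or_equal(h1, h2):
    """True if h1 matches every instance that h2 matches."""
    return all(a == ANY or a == b for a, b in zip(h1, h2))

def prune(hypotheses):
    """Keep only hypotheses that no other hypothesis in the list generalizes."""
    return [h for h in hypotheses
            if not any(o != h and more_general_or_equal(o, h) for o in hypotheses)]

candidates = [
    ("small", "blue", ANY),
    ("small", ANY, "circle"),
    (ANY, "blue", ANY),
    ("big", ANY, "triangle"),
    (ANY, "blue", "triangle"),
]
G2 = prune(candidates)
print(G2)
```

Here <small, blue, ?> and <?, blue, triangle> are dropped because <?, blue, ?> matches everything they match.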
For the positive (small, red, circle) observation, you must generalize S and remove anything in G
that is inconsistent, which gives
G3 = {<small, ?, circle>}
S3 = {<small, red, circle>}
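The update for a positive example has two parts: filter G, then minimally generalize S. The sketch below assumes a single conjunctive hypothesis in S and represents <0, 0, 0> as `None`; the helper name `generalize` is hypothetical.

```python
ANY = "?"

def covers(h, x):
    """True if hypothesis h matches instance x."""
    return all(a == ANY or a == v for a, v in zip(h, x))

def generalize(s, x):
    """Smallest generalization of s that matches x (None stands for <0, 0, 0>)."""
    if s is None:
        return tuple(x)
    # Keep values that agree with x; relax disagreeing positions to '?'.
    return tuple(a if a == v else ANY for a, v in zip(s, x))

G2 = [("small", ANY, "circle"), (ANY, "blue", ANY), ("big", ANY, "triangle")]
S2 = None
positive = ("small", "red", "circle")

G3 = [g for g in G2 if covers(g, positive)]  # remove hypotheses that miss it
S3 = generalize(S2, positive)

# Applying the same function to a second positive example relaxes only
# the attribute on which the two examples disagree (color).
S_next = generalize(S3, ("small", "blue", "circle"))

print(G3, S3, S_next)
```

The first positive example simply becomes S; later positive examples replace each disagreeing attribute with ?.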
(big, blue, circle) is the next negative example. Since the only hypothesis in G, <small, ?, circle>, does
not match it, G already classifies it as negative and there is nothing to do, so
G4 = G3 = {<small, ?, circle>}
S4 = S3 = {<small, red, circle>}
Lastly, you have the positive example of (small, blue, circle), which requires you to generalize S to
make it consistent with the example, giving
G5 = {<small, ?, circle>}
S5 = {<small, ?, circle>}
Since G and S are equal, you have learned the concept of "small circles".