Machine Learning Approaches to Classification Learning
Inductive Learning
Decision Tree Induction
Rule Induction
Instance-based Learning
Neural Networks
Genetic Algorithms
Inductive Learning
Categories:
Decision Tree Induction
“divide-and-conquer” inductive systems.
Knowledge is represented as decision trees.
Rule Induction
“separate-and-conquer” inductive systems.
Knowledge is represented as IF-THEN rules.
Advantages:
Fast compared with other techniques.
Simple, and the models they generate are easy to understand.
Achieve accuracy similar to, and sometimes better than, other classification techniques.
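As an illustration of the "divide-and-conquer" idea, here is a minimal sketch of decision tree induction. It assumes categorical attributes and an entropy-based choice of split attribute (in the style of ID3); the toy weather data and the names `build_tree`, `best_attribute` and `classify` are hypothetical and chosen only for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_attribute(rows, labels, attributes):
    """Pick the attribute whose split leaves the lowest weighted entropy."""
    def remainder(attr):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(attributes, key=remainder)


def build_tree(rows, labels, attributes):
    """Divide the data on one attribute, then conquer each subset recursively."""
    if len(set(labels)) == 1:          # pure node: return the class as a leaf
        return labels[0]
    if not attributes:                 # nothing left to split on: majority class
        return Counter(labels).most_common(1)[0][0]
    attr = best_attribute(rows, labels, attributes)
    remaining = [a for a in attributes if a != attr]
    branches = {}
    for value in set(row[attr] for row in rows):
        subset = [(r, l) for r, l in zip(rows, labels) if r[attr] == value]
        branches[value] = build_tree([r for r, _ in subset],
                                     [l for _, l in subset],
                                     remaining)
    return (attr, branches)


def classify(tree, row):
    """Follow branches until a leaf (a class label) is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree


# Hypothetical toy data, for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "play", "stay"]
tree = build_tree(rows, labels, ["outlook", "windy"])
predictions = [classify(tree, row) for row in rows]
```

Each path from the root of the resulting tree to a leaf can also be read as an IF-THEN rule, which is one reason these models are easy to understand. This sketch does not handle attribute values unseen during training.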
Instance-based Learning (1)
Main characteristics:
Simply store the training instances for future use.
Defer generalising beyond these instances until a new instance must be classified.
Classify a test instance by finding the nearest stored instance according to some similarity function and assigning that instance's class to the test instance.
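The characteristics above can be sketched as a one-nearest-neighbour classifier. This is a minimal illustration that assumes numeric attributes and uses Euclidean distance as the similarity function; the training data and the name `nearest_neighbour` are hypothetical.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest_neighbour(training, query):
    """Return the class of the stored instance closest to the query."""
    _features, label = min(training, key=lambda inst: dist(inst[0], query))
    return label


# "Training" is just storage: no model is built in advance.
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 4.5), "B"),
]

# All the work is deferred until a new instance must be classified.
label_near_a = nearest_neighbour(training, (1.2, 1.4))
label_near_b = nearest_neighbour(training, (5.5, 5.0))
```

Because every query scans all stored instances, the sketch also shows where the storage and classification costs of this approach come from.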
Instance-based Learning (2)
Advantages:
Simplicity.
The ability to model complex target concepts.
Information present in the training instances is never lost.
Disadvantages:
Greater storage requirements.
Higher computational costs when classifying new instances.
Higher sensitivity to noise and irrelevant attributes.
Neural Networks (1)
Main characteristics:
Computational models loosely inspired by the brain.
Function approximation tools which learn the relationship between independent variables and dependent variables.
Make no assumptions about the statistical distribution or properties of the data, and therefore tend to be more useful in practical situations.
An inherently non-linear approach, which makes them accurate when modelling complex data patterns.
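To illustrate the non-linearity mentioned above, here is a minimal sketch of a tiny feed-forward network that computes XOR, a function that no purely linear model can represent. The weights are set by hand for clarity rather than learned from data, and the ReLU activation and the name `forward` are assumptions made for this example.

```python
def relu(x):
    """Rectified linear unit: the non-linear step in each hidden neuron."""
    return max(0.0, x)


def forward(x1, x2):
    """Two hidden neurons followed by one linear output neuron."""
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)   # fires when at least one input is on
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)   # fires only when both inputs are on
    return 1.0 * h1 - 2.0 * h2             # subtract the "both on" case


# Evaluate the network on all four binary input pairs.
outputs = {(a, b): forward(a, b) for a in (0, 1) for b in (0, 1)}
```

In practice the weights would be found by a training procedure such as gradient descent on a set of examples; removing the `relu` calls collapses the network into a single linear function, which cannot fit XOR.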
Neural Networks (2)
Advantages:
Wide applicability.
Often higher accuracy than simpler techniques.
Robust to noise in the training data.
Well-suited to complex problems.
Disadvantages:
Difficulty in understanding the models they produce.
Computationally expensive to train.
Genetic Algorithms (1)
Main characteristics:
Optimisation algorithms based on the principles of natural evolution.
Potential solutions represented as strings (chromosomes), whose individual elements are called genes.
Best string is found using genetic operators:
Selection
Crossover
Mutation
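The three operators above can be sketched on a toy problem. This minimal example assumes bit strings, a fitness equal to the number of 1 bits, tournament selection, single-point crossover, bit-flip mutation and one elite survivor per generation; all of these choices, and every function name, are assumptions for illustration rather than part of any particular system.

```python
import random


def fitness(bits):
    """Toy objective: count the 1 bits in the string."""
    return sum(bits)


def select(population, rng, k=3):
    """Selection: the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a, b, rng):
    """Crossover: join the head of one parent to the tail of the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rng, rate=0.05):
    """Mutation: flip each bit independently with a small probability."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(length=20, pop_size=30, generations=40, seed=0):
    """Return the best string of the first and of the last generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    initial_best = max(population, key=fitness)
    for _ in range(generations):
        elite = max(population, key=fitness)   # keep the current best unchanged
        children = [
            mutate(crossover(select(population, rng),
                             select(population, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [elite] + children
    return initial_best, max(population, key=fitness)


initial_best, final_best = evolve()
```

Keeping the elite individual means the best fitness never decreases between generations, while mutation keeps introducing variation that a purely greedy search would not explore.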
Genetic Algorithms (2)
Advantages:
A potentially greater ability to avoid local minima than is possible with the simple greedy search employed by most learning techniques.