A decision tree builds classification or regression models in the form of a tree structure. It breaks a dataset down into smaller and smaller subsets while an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. The topmost decision node, which corresponds to the best predictor, is called the root node.
The core algorithm for building decision trees is called ID3. It employs a top-down, greedy search through the space of possible branches, with no backtracking, and uses entropy and information gain to construct the tree.
Entropy: The ID3 algorithm uses entropy to measure the homogeneity of a sample. If the sample is completely homogeneous the entropy is zero; if the sample is equally divided between the classes the entropy is one.
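As a concrete illustration, entropy over a list of class labels can be sketched in a few lines of plain Python (the function name and label encoding here are our own choices, not part of ID3 itself):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A completely homogeneous sample has entropy 0;
# a sample split 50/50 between two classes has entropy 1.
print(entropy(["yes", "yes", "yes", "yes"]))  # 0.0
print(entropy(["yes", "yes", "no", "no"]))    # 1.0
```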
Information Gain: Information gain is the decrease in entropy after a dataset is split on an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).
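The same idea can be sketched in code: information gain is the parent's entropy minus the weighted entropy of the subsets produced by the split. This is a self-contained sketch (the helper names and the toy attribute values are ours):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Decrease in entropy after splitting on an attribute.

    `values` holds the attribute value of each example,
    `labels` the corresponding class labels.
    """
    n = len(labels)
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    weighted = sum(len(sub) / n * entropy(sub) for sub in subsets.values())
    return entropy(labels) - weighted

# An attribute that separates the classes perfectly recovers the full
# parent entropy (1 bit here); an uninformative attribute gains nothing.
print(information_gain(["sunny", "sunny", "rain", "rain"],
                       ["no", "no", "yes", "yes"]))       # 1.0
print(information_gain(["sunny", "rain", "sunny", "rain"],
                       ["no", "no", "yes", "yes"]))       # 0.0
```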
The algorithm converts the dataset into a series of query statements and then draws the tree.
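One way to picture these "query statements" is to represent a fitted tree as nested lookups: each internal node tests one attribute, each leaf stores a class. The attribute names below come from the classic play-tennis example and are purely illustrative:

```python
# A fitted tree as nested "query statements": each decision node
# tests one attribute, each string is a leaf holding a class label.
tree = {
    "outlook": {
        "sunny": {"humidity": {"high": "no", "normal": "yes"}},
        "overcast": "yes",  # leaf node
        "rain": {"wind": {"strong": "no", "weak": "yes"}},
    }
}

def classify(node, example):
    """Follow the queries from the root down to a leaf."""
    while isinstance(node, dict):
        attribute = next(iter(node))          # attribute tested at this node
        node = node[attribute][example[attribute]]
    return node

print(classify(tree, {"outlook": "rain", "wind": "weak"}))  # yes
```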
- It does not require any domain knowledge.
- It is easy for humans to interpret.
- The learning and classification steps of a decision tree are simple and fast.
- Decision trees can handle both categorical and numerical data.
- It performs well on large datasets.
- Greedy algorithms such as ID3 cannot guarantee that the tree they return is globally optimal.
- Decision-tree learners can create over-complex trees that do not generalize well beyond the training data (overfitting).
- Information gain in decision trees is biased in favor of attributes with more levels (more distinct values).
- Decision trees have trouble when the dataset contains many missing values.
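The greedy, top-down construction described above can be combined into a minimal end-to-end ID3 sketch. This is an illustrative implementation, not a production one: examples are plain dicts, the gain computation is recomputed at each node, and there is no handling of numeric attributes or missing values.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, labels, attr):
    """Information gain of splitting `rows` on attribute `attr`."""
    n = len(labels)
    subsets = {}
    for row, y in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(y)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())

def id3(rows, labels, attributes):
    # Leaf: all labels identical, or no attributes left (majority vote).
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: pick the highest-gain attribute, split, recurse.
    attr = max(attributes, key=lambda a: gain(rows, labels, a))
    remaining = [a for a in attributes if a != attr]
    branches = {}
    for row, y in zip(rows, labels):
        sub_rows, sub_labels = branches.setdefault(row[attr], ([], []))
        sub_rows.append(row)
        sub_labels.append(y)
    return {attr: {value: id3(sub_rows, sub_labels, remaining)
                   for value, (sub_rows, sub_labels) in branches.items()}}

# Tiny toy dataset: "outlook" separates the classes perfectly.
rows = [{"outlook": "sunny"}, {"outlook": "sunny"},
        {"outlook": "rain"}, {"outlook": "rain"}]
labels = ["no", "no", "yes", "yes"]
print(id3(rows, labels, ["outlook"]))
# {'outlook': {'sunny': 'no', 'rain': 'yes'}}
```

Note that the search is greedy with no backtracking, which is exactly why the resulting tree is not guaranteed to be globally optimal.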