Decision Trees

  • Popular representation for interpretable classifiers; even among humans!
  • Example: I’ve just arrived at a restaurant. Should I stay (wait for a table) or go elsewhere?

One may choose to use the following set of rules to make their decision:

[Decision tree for the restaurant example; source: ai.berkeley.edu]

Consider this dataset:

  • 4x4 checkerboard dataset with alternating classes
  • Red points \(\implies y=0\)
  • Blue points \(\implies y=1\)
  • We want to model the rule (rationale) that generated the dataset

Fit a decision tree?

The following decision tree is achieved:

The rationale is perfectly captured by a decision tree of depth 7
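This can be reproduced with a minimal scikit-learn sketch. The dataset construction below is a hypothetical reconstruction of the 4x4 checkerboard (the original data is not given), so the exact learned depth may differ from the notes:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical reconstruction of the 4x4 checkerboard:
# class = parity of the grid cell a point falls in.
rng = np.random.default_rng(0)
X = rng.uniform(0, 4, size=(2000, 2))
y = ((np.floor(X[:, 0]) + np.floor(X[:, 1])) % 2).astype(int)

tree = DecisionTreeClassifier()        # depth left unconstrained
tree.fit(X, y)
print("train accuracy:", tree.score(X, y))
print("learned depth :", tree.get_depth())   # the notes report that depth 7 suffices
```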

A twist

(literally)

Consider this dataset:

  • The previous dataset, rotated by \(45^{\circ}\) about the origin
  • How will our decision tree algorithm perform on this dataset?

  • A much more complicated decision tree, with 18 questions, is now required
  • Note that the depth is not constrained
  • Number of "staircase" corners in the boundary \(\propto\) density of the class
  • Unconstrained depth \(\implies\) overfitting
  • On the original (axis-aligned) dataset, depth 7 was sufficient to capture the structure
  • But performance is not invariant to rotation (see the sketch after this list)
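As an illustration of the rotation experiment, the sketch below rotates the same hypothetical checkerboard by \(45^{\circ}\) about the origin and refits an unconstrained tree; the exact depth and leaf count will vary, but the tree becomes markedly deeper and more complex:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Same hypothetical checkerboard as in the previous sketch.
rng = np.random.default_rng(0)
X = rng.uniform(0, 4, size=(2000, 2))
y = ((np.floor(X[:, 0]) + np.floor(X[:, 1])) % 2).astype(int)

# Rotate every point by 45 degrees about the origin.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X_rot = X @ R.T

tree = DecisionTreeClassifier().fit(X_rot, y)   # depth still unconstrained
print("learned depth   :", tree.get_depth())    # far deeper than on the axis-aligned data
print("number of leaves:", tree.get_n_leaves())
```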

Why?

Decision trees make the following assumption about the structure of the data:

The class can be determined by asking a series of binary (yes/no) questions, each about a single feature.

Inductive Bias: the set of assumptions that makes an algorithm learn (prefer) one pattern over another

  • The inductive bias of decision trees is that the decision boundary is built from axis-parallel splits: each node tests a single feature against a threshold, e.g. \(x_j \le t\) (see the sketch after this list)
  • This makes them sensitive to rotations of the feature space
  • Is there an algorithm that is invariant to rotation?
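To make the axis-parallel bias concrete, the sketch below fits a tree to a tiny hypothetical XOR-style dataset and prints its learned rules; every internal node is a test on a single feature against a threshold:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Tiny hypothetical 2-D dataset, just to inspect the learned splits.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])   # XOR-style alternating labels

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
# Each printed rule tests a single feature against a threshold,
# i.e. an axis-parallel split; the boundary is a union of axis-aligned boxes.
print(export_text(tree, feature_names=["x1", "x2"]))
```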

K-Nearest Neighbors

  • The lazy human's approach: don't learn anything up front
  • Predict on input \(\mathbf{x}\) as follows:
    • Find others (training points) in a similar situation to \(\mathbf{x}\)
    • Choose the top \(K\) people w.r.t. similarity
    • Have them vote on the prediction

In our dataset, we define similarity to be inversely proportional to the distance between datapoints; i.e.,

The closer the datapoints, the more similar they are
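A minimal from-scratch sketch of this procedure, assuming Euclidean distance (so that similarity is its inverse); `knn_predict` is a hypothetical helper, not a library function:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=5):
    """Predict the label of a single query point x by majority vote
    among its k nearest (most similar) training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # smaller distance = more similar
    nearest = np.argsort(dists)[:k]               # indices of the top-k most similar points
    votes = y_train[nearest].astype(int)
    return np.bincount(votes).argmax()            # majority vote

# Example usage with made-up data:
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [3.0, 3.0], [3.2, 2.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([2.8, 3.1]), k=3))   # -> 1
```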

The following decision boundaries are achieved by the KNN algorithm:

  • Rotation has no impact on the decision boundary of KNN! (verified in the sketch below)
  • What is the inductive bias of KNN?
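Because a rotation preserves Euclidean distances, every query point keeps the same set of nearest neighbours (and therefore the same vote) after rotation. A quick check of this, again on the hypothetical checkerboard data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.uniform(0, 4, size=(2000, 2))
y = ((np.floor(X[:, 0]) + np.floor(X[:, 1])) % 2).astype(int)

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
knn_rot = KNeighborsClassifier(n_neighbors=5).fit(X @ R.T, y)

X_test = rng.uniform(0, 4, size=(500, 2))
agreement = np.mean(knn.predict(X_test) == knn_rot.predict(X_test @ R.T))
print("fraction of identical predictions:", agreement)   # expected 1.0 (up to ties)
```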

Consider the following dataset:

  • The dataset is linearly separable
  • Blue points \(\implies\) Class 1, and Red points \(\implies\) Class 2
  • Note that the range (and therefore scaling) is vastly different for Features 1 & 2

KNN results in the following decision boundary:

  • Distance is not invariant to the scaling of features
  • Feature 2 is effectively neglected during prediction: its contribution to the distance is dwarfed by the much larger range of Feature 1
  • Apply a StandardScaler? (see the pipeline sketch below)
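The sketch below shows the StandardScaler idea on a hypothetical stand-in dataset (Feature 1 with a much larger range than Feature 2, and the class determined by Feature 2, purely to make the effect visible); the exact scores are illustrative only:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in: Feature 1 spans a huge range, Feature 2 a tiny one,
# and the class depends only on Feature 2.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(0, 1000, 600),   # Feature 1: large range
                     rng.uniform(0, 1, 600)])     # Feature 2: small range
y = (X[:, 1] > 0.5).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

raw = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
scaled = make_pipeline(StandardScaler(),
                       KNeighborsClassifier(n_neighbors=5)).fit(X_tr, y_tr)

print("raw KNN   :", raw.score(X_te, y_te))     # distances dominated by Feature 1
print("scaled KNN:", scaled.score(X_te, y_te))  # Feature 2 recovered after scaling
```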

  • KNN has satisfactory performance, but we raise some questions:
    • Is a StandardScaler transformation advisable?
    • The dataset is given to be linearly separable; can we do better?