
Delhi Technological University



Machine Learning Lab



EXPERIMENT : 3

AIM : To implement the Decision Tree ID3 (Iterative Dichotomiser 3) algorithm.



THEORY : A decision tree builds classification or regression models in the form of a tree structure. It breaks a dataset down into smaller and smaller subsets while an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. A decision node (e.g., Outlook) has two or more branches (e.g., Sunny, Overcast and Rainy); a leaf node (e.g., Play) represents a classification or decision.

Entropy is a measure of the amount of uncertainty in the dataset. For a binary class with values a and b it is defined as:

Entropy(S) = - p(a)*log2(p(a)) - p(b)*log2(p(b))

Information gain IG(A) is the difference in entropy from before to after the dataset S is split on an attribute A:

IG(S, A) = Entropy(S) - sum over each value v of A of (|Sv|/|S|) * Entropy(Sv)

In other words, it measures how much of the uncertainty in S is removed by splitting S on attribute A. At each node, ID3 selects the attribute with the highest information gain.
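As a quick check of the entropy formula, the short sketch below computes the entropy of an assumed example set with 9 positive and 5 negative examples (the class distribution of the classic Play Tennis data, used here purely as an illustration):

import numpy as np

# Assumed example: 9 positive ("Yes") and 5 negative ("No") examples.
p_yes, p_no = 9 / 14, 5 / 14

entropy = -p_yes * np.log2(p_yes) - p_no * np.log2(p_no)
print(round(entropy, 3))   # approximately 0.940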



ALGORITHM

ID3 (Examples, Target_Attribute, Attributes)
    Create a root node Root for the tree
    If all examples are positive, return the single-node tree Root with label = +
    If all examples are negative, return the single-node tree Root with label = -
    If the list of predicting attributes is empty, return the single-node tree Root
        with label = most common value of the target attribute in the examples
    Otherwise begin
        A ← the attribute that best classifies the examples
        Decision tree attribute for Root = A
        For each possible value vi of A:
            Add a new tree branch below Root, corresponding to the test A = vi
            Let Examples(vi) be the subset of examples that have the value vi for A
            If Examples(vi) is empty:
                Below this new branch add a leaf node with label = most common
                target value in the examples
            Else:
                Below this new branch add the subtree
                ID3 (Examples(vi), Target_Attribute, Attributes – {A})
    End
    Return Root



CODE

import numpy as np
import pandas as pd

eps = np.finfo(float).eps   # tiny constant to avoid taking log2 of zero


def find_entropy(df):
    # The last column is treated as the target variable, to keep the code generic.
    Class = df.columns[-1]
    entropy = 0
    values = df[Class].unique()
    for value in values:
        fraction = df[Class].value_counts()[value] / len(df[Class])
        entropy += -fraction * np.log2(fraction)
    return entropy



def find_entropy_attribute(df, attribute):
    # The last column is treated as the target variable, to keep the code generic.
    Class = df.columns[-1]
    target_variables = df[Class].unique()   # e.g. 'Yes' and 'No'
    variables = df[attribute].unique()      # distinct values of the attribute, e.g. 'Hot', 'Mild', 'Cool'
    entropy2 = 0
    for variable in variables:
        den = len(df[df[attribute] == variable])            # size of the subset with this value
        entropy = 0
        for target_variable in target_variables:
            num = len(df[(df[attribute] == variable) & (df[Class] == target_variable)])
            fraction = num / den
            entropy += -fraction * np.log2(fraction + eps)  # eps guards against log2(0)
        fraction2 = den / len(df)
        entropy2 += fraction2 * entropy                     # weight by subset size
    return abs(entropy2)
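To make the weighted attribute entropy concrete, here is a rough worked example using the class counts of the classic Play Tennis data for the Outlook attribute (assumed for illustration only, since the report's own dataset is not reproduced here: Sunny 2 yes / 3 no, Overcast 4 yes / 0 no, Rainy 3 yes / 2 no, 14 rows in total):

import numpy as np

def H(pos, neg):
    # Binary entropy of a subset with `pos` positive and `neg` negative examples.
    total = pos + neg
    probs = [p / total for p in (pos, neg) if p > 0]
    return -sum(p * np.log2(p) for p in probs)

weighted = (5/14) * H(2, 3) + (4/14) * H(4, 0) + (5/14) * H(3, 2)
gain = H(9, 5) - weighted
print(round(weighted, 3), round(gain, 3))   # roughly 0.694 and 0.247

This is the quantity find_entropy_attribute returns for Outlook, and find_entropy(df) minus it is the corresponding information gain.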



def find_winner(df):
    # Return the attribute with the highest information gain.
    IG = []
    for key in df.columns[:-1]:
        IG.append(find_entropy(df) - find_entropy_attribute(df, key))
    return df.columns[:-1][np.argmax(IG)]



def get_subtable(df, node, value):
    # Rows of df where attribute `node` takes the given value, with a fresh index.
    return df[df[node] == value].reset_index(drop=True)



def buildTree(df, tree=None):
    # The last column is treated as the target variable, to keep the code generic.
    Class = df.columns[-1]

    # Get the attribute with maximum information gain.
    node = find_winner(df)

    # Get the distinct values of that attribute, e.g. if Salary is the node then
    # Low, Med and High are its values.
    attValue = np.unique(df[node])

    # Create an empty dictionary to hold the tree.
    if tree is None:
        tree = {}
        tree[node] = {}

    # Construct the tree by calling this function recursively on each subset,
    # stopping as soon as a subset is pure.
    for value in attValue:
        subtable = get_subtable(df, node, value)
        clValue, counts = np.unique(subtable[Class], return_counts=True)

        if len(counts) == 1:                         # checking purity of the subset
            tree[node][value] = clValue[0]
        else:
            tree[node][value] = buildTree(subtable)  # calling the function recursively

    return tree
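A minimal driver along the following lines could be used to build and inspect the tree, assuming the functions defined above. The weather DataFrame is an assumed example dataset (the report's own data is not reproduced here), and predict is a hypothetical helper added only to show how the nested-dictionary tree would be consumed:

import pprint
import pandas as pd

# Assumed example dataset (classic Play Tennis / weather data).
data = {
    'Outlook':     ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
                    'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy'],
    'Temperature': ['Hot', 'Hot', 'Hot', 'Mild', 'Cool', 'Cool', 'Cool',
                    'Mild', 'Cool', 'Mild', 'Mild', 'Mild', 'Hot', 'Mild'],
    'Humidity':    ['High', 'High', 'High', 'High', 'Normal', 'Normal', 'Normal',
                    'High', 'Normal', 'Normal', 'Normal', 'High', 'Normal', 'High'],
    'Windy':       ['False', 'True', 'False', 'False', 'False', 'True', 'True',
                    'False', 'False', 'False', 'True', 'True', 'False', 'True'],
    'Play':        ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes',
                    'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No'],
}
df = pd.DataFrame(data)

tree = buildTree(df)
pprint.pprint(tree)   # nested dict, e.g. {'Outlook': {'Overcast': 'Yes', ...}}

def predict(instance, tree):
    # Hypothetical helper: walk the nested dictionary until a leaf label is reached.
    node = next(iter(tree))                 # attribute tested at this node
    subtree = tree[node][instance[node]]    # branch matching the instance's value
    return predict(instance, subtree) if isinstance(subtree, dict) else subtree

print(predict({'Outlook': 'Sunny', 'Temperature': 'Mild',
               'Humidity': 'Normal', 'Windy': 'False'}, tree))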



OUTPUT



DISCUSSION
A decision tree is a map of the possible outcomes of a series of related choices. It allows an individual or organization to weigh possible actions against one another based on their costs, probabilities, and benefits. Decision trees can be used either to drive informal discussion or to map out an algorithm that predicts the best choice mathematically.

Using decision trees in machine learning has several advantages:
• The cost of using the tree to predict data is low, growing only logarithmically with the number of data points used to train it.
• Works for both categorical and numerical data.
• Can model problems with multiple outputs.

But they also have a few disadvantages:
• When dealing with categorical data with multiple levels, the information gain is biased in favor of the attributes with the most levels.
• Calculations can become complex when dealing with uncertainty and many linked outcomes.
• Conjunctions between nodes are limited to AND, whereas decision graphs allow for nodes linked by OR.



FINDINGS AND LEARNINGS
Advantages of ID3:
• Understandable prediction rules are created from the training data.
• Builds the tree quickly.
• Builds a short tree.
• Only needs to test enough attributes until all data is classified.



Disadvantages of ID3:
• Data may be over-fitted or over-classified if a small sample is tested.
• Only one attribute at a time is tested for making a decision.
• Has difficulty classifying continuous attributes.