Can decision trees be used for multi class classification?

  •   August 7, 2021 7:04 PM IST
  • You can directly call scikit-learn's decision tree to achieve this task.
    For each split node, you need to choose the best attribute and the best split point (for a continuous attribute), then compute the impurity of the left and right child nodes respectively. The child impurity can be computed based on, for example, the Gini index.
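    To make the impurity computation concrete, here is a minimal sketch (not code from scikit-learn itself; the function names `gini` and `split_impurity` are made up for illustration) of how the Gini index scores a candidate split point on a continuous attribute:

```python
import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum over classes of p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_impurity(x, y, threshold):
    """Weighted Gini impurity of the two children produced by the test x < threshold."""
    left, right = y[x < threshold], y[x >= threshold]
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0, 0, 1, 1])
print(gini(y))                    # 0.5 (two balanced classes)
print(split_impurity(x, y, 2.5))  # 0.0 (this threshold separates the classes perfectly)
```

    The best split point is the threshold that minimizes this weighted child impurity; works the same way regardless of how many classes appear in `y`.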
      August 21, 2021 6:46 PM IST
  • Of course it can. The binarity of the tree lies only in the questions it asks. Each node of the tree represents a question with a binary answer (unless it is a terminal node, which represents a "class"), such as:
    Is the numeric variable x₁ ≥ α₁ ?
    Is the categorical variable x₂ = male ?
    The tree then splits the observations according to the answer. There can be many terminal nodes leading to different "classes" or even many nodes leading to the same class.
    Note: I put the word "class" in "" so that I don't have to rephrase it for regression trees.
    Note2: Look up random forests, for example, to overcome some of the limitations of single trees.
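    A quick sketch (my own toy example, with made-up feature names) showing that a fitted tree really does ask exactly these binary questions, even with more than two classes — `export_text` prints the question at each node:

```python
from sklearn.tree import DecisionTreeClassifier, export_text
import numpy as np

# column 0: a numeric variable x1; column 1: the categorical "is male?" encoded as 0/1
X = np.array([[1.0, 0], [2.0, 0], [3.0, 1], [4.0, 1]])
y = np.array([0, 0, 1, 2])  # three classes

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(export_text(clf, feature_names=["x1", "x2_is_male"]))
```

    Each internal line of the printed tree is a threshold question ("x1 <= …?"), and each leaf is one of the three classes; several leaves may map to the same class.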
      August 7, 2021 10:50 PM IST
  • In short, yes, you can use decision trees for this problem.

    However, there are many other ways to predict the result of multiclass problems.

    If you want to use decision trees, one way of doing it is to assign a unique integer to each of your classes: all examples of class one are assigned the value y=1, all examples of class two the value y=2, and so on. After this you can train a decision tree classifier.

    Here is a quick implementation in Python, which I got by modifying the example in this hackernoon post (https://hackernoon.com/a-brief-look-at-sklearn-tree-decisiontreeclassifier-c2ee262eab9a).

    You can see that we have classes 0, 1, 2 and 3 in the data, and the algorithm trains to predict these perfectly (note that the model is overfitting here, but that is a side note).

    from sklearn import tree
    from sklearn.model_selection import train_test_split
    import numpy as np

    features = np.array([
        [29, 23, 72],
        [31, 25, 77],
        [31, 27, 82],
        [29, 29, 89],
        [31, 31, 72],
        [29, 33, 77],
    ] * 10)

    # y as a flat array of shape (n_samples,), which is what sklearn expects
    labels = np.array([0, 1, 2, 3, 2, 0] * 10)

    X_train, X_test, y_train, y_test = train_test_split(
        features,
        labels,
        test_size=0.3,
        random_state=42,
    )

    clf = tree.DecisionTreeClassifier()
    clf.fit(X=X_train, y=y_train)
    clf.feature_importances_  # [ 1., 0., 0.]
    clf.score(X=X_test, y=y_test)  # 1.0
    clf.predict(X_test)  # array([0, 0, 0, 3, 1, 0, 3, 0, 0, 3, 2, 2, 1, 3, 2, 0, 2, 0])

      August 9, 2021 1:23 PM IST