DecisionTree

class numpy_ml.trees.DecisionTree(classifier=True, max_depth=None, n_feats=None, criterion='entropy', seed=None)

A decision tree model for regression and classification problems.

Parameters:
  • classifier (bool) – Whether to treat target values as categorical (classifier = True) or continuous (classifier = False). Default is True.
  • max_depth (int or None) – The depth at which to stop growing the tree. If None, grow the tree until all leaves are pure. Default is None.
  • n_feats (int or None) – The number of features to sample on each split. If None, use all features on each split. Default is None.
  • criterion ({'mse', 'entropy', 'gini'}) – The error criterion to use when calculating splits. When classifier is False, valid entries are {‘mse’}. When classifier is True, valid entries are {‘entropy’, ‘gini’}. Default is ‘entropy’.
  • seed (int or None) – Seed for the random number generator. Default is None.
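The three criterion values name standard impurity measures. The sketch below shows the textbook definitions in plain Python so the options can be compared on small inputs. The function names are illustrative, the base-2 logarithm is an assumption, and none of this is taken from the numpy_ml source.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits, an assumed base) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity: one minus the sum of squared class frequencies."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def mse(targets):
    """Mean squared deviation of continuous targets from their mean."""
    mean = sum(targets) / len(targets)
    return sum((t - mean) ** 2 for t in targets) / len(targets)


balanced = [0, 0, 1, 1]  # two classes, evenly mixed
pure = [1, 1, 1, 1]      # a single class

print(entropy(balanced), gini(balanced))
print(entropy(pure), gini(pure))
print(mse([1.0, 3.0]))
```

Both classification measures are zero for a pure set of labels and largest for an evenly mixed one, which is why either can drive the choice of split.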
fit(X, Y)

Fit a binary decision tree to a dataset.

Parameters:
  • X (ndarray of shape (N, M)) – The training data of N examples, each with M features
  • Y (ndarray of shape (N,)) – An array of integer class labels for each example in X if self.classifier = True, otherwise an array of continuous target values for each example in X.
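As a rough illustration of what fitting a binary tree involves at a single node, the sketch below samples a subset of feature indices (the role n_feats and seed appear to play) and searches for the threshold with the largest entropy reduction. best_split is a hypothetical helper written for this example. The midpoint thresholds and the "<=" convention are assumptions, not a description of numpy_ml's internals.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(X, Y, n_feats=None, seed=None):
    """Return (gain, feature_index, threshold) for the best binary split.

    X is a list of rows and Y a list of integer labels. Hypothetical
    helper for illustration only.
    """
    rng = random.Random(seed)
    n_total = len(X[0])
    k = n_total if n_feats is None else min(n_feats, n_total)
    feats = rng.sample(range(n_total), k)  # feature indices considered here

    parent = entropy(Y)
    best = (0.0, None, None)
    for j in feats:
        values = sorted({row[j] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thresh = (lo + hi) / 2
            left = [y for row, y in zip(X, Y) if row[j] <= thresh]
            right = [y for row, y in zip(X, Y) if row[j] > thresh]
            # Size-weighted impurity of the two children.
            child = (len(left) * entropy(left) + len(right) * entropy(right)) / len(Y)
            gain = parent - child
            if gain > best[0]:
                best = (gain, j, thresh)
    return best


X = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 2.0]]
Y = [0, 0, 1, 1]
gain, feat, thresh = best_split(X, Y)
print(gain, feat, thresh)
```

Applying the same search recursively to the left and right subsets, until a node is pure or max_depth is reached, yields a full tree.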
predict(X)

Use the trained decision tree to classify or predict the examples in X.

Parameters: X (ndarray of shape (N, M)) – The N examples to generate predictions for, each with M features
Returns: preds (ndarray of shape (N,)) – The integer class labels predicted for each example in X if self.classifier = True, otherwise the predicted target values.
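Prediction with a fitted binary tree amounts to walking each example from the root to a leaf. The sketch below does this over a hand-built tree stored as nested dicts. The dict layout, the key names, and the "<=" test are assumptions made for this example and do not reflect how numpy_ml stores its nodes.

```python
def traverse(node, x):
    """Follow threshold tests from the root until a leaf is reached."""
    while "value" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["value"]


# Hypothetical tree: split on feature 0 at 2.5, then on feature 1 at 1.0.
tree = {
    "feature": 0, "threshold": 2.5,
    "left": {"value": 0},
    "right": {
        "feature": 1, "threshold": 1.0,
        "left": {"value": 1},
        "right": {"value": 2},
    },
}

X = [[1.0, 0.0], [3.0, 0.5], [3.0, 2.0]]
preds = [traverse(tree, x) for x in X]
print(preds)  # [0, 1, 2]
```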
predict_class_probs(X)

Use the trained decision tree to return the class probabilities for the examples in X.

Parameters: X (ndarray of shape (N, M)) – The N examples to compute class probabilities for, each with M features
Returns: preds (ndarray of shape (N, n_classes)) – The class probabilities predicted for each example in X.
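A common way for a decision tree to produce class probabilities is to report the class frequencies of the training labels that ended up in the leaf an example reaches. The sketch below assumes that approach. leaf_class_probs and predicted_label are hypothetical helpers, not numpy_ml functions.

```python
from collections import Counter


def leaf_class_probs(labels, n_classes):
    """Empirical class distribution of the training labels in one leaf."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in range(n_classes)]


def predicted_label(probs):
    """Index of the most probable class (ties go to the lowest index)."""
    return max(range(len(probs)), key=probs.__getitem__)


probs = leaf_class_probs([0, 0, 1, 2], n_classes=3)
print(probs)                   # [0.5, 0.25, 0.25]
print(predicted_label(probs))  # 0
```

Under this assumption, each row of the returned (N, n_classes) array sums to 1, and taking the most probable class per row gives a hard label for that example.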