GradientBoostedDecisionTree¶

class numpy_ml.trees.GradientBoostedDecisionTree(n_iter, max_depth=None, classifier=True, learning_rate=1, loss='crossentropy', step_size='constant')[source]

A gradient boosted ensemble of decision trees.

Notes

Gradient boosted machines (GBMs) fit an ensemble of m weak learners such that:

$f_m(X) = b(X) + \eta w_1 g_1 + \ldots + \eta w_m g_m$

where $b$ is a fixed initial estimate for the targets, $\eta$ is a learning rate parameter, and $w_{\cdot}$ and $g_{\cdot}$ denote the weights and learner predictions for subsequent fits.

We fit each w and g iteratively using a greedy strategy so that at each iteration i,

$w_i, g_i = \arg \min_{w_i, g_i} L(Y, f_{i-1}(X) + w_i g_i)$

On each iteration we fit a new weak learner to predict the negative gradient of the loss with respect to the previous prediction, $f_{i-1}(X)$. We then use the element-wise product of the predictions of this weak learner, $g_i$, with a weight, $w_i$, to compute the amount to adjust the predictions of our model at the previous iteration, $f_{i-1}(X)$:

$f_i(X) := f_{i-1}(X) + w_i g_i$
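The update rule above can be sketched with plain NumPy for the MSE loss, where the negative gradient is simply the residual $Y - f_{i-1}(X)$. This is an illustrative sketch, not the library's implementation: the `fit_stump` helper (a depth-1 regression tree) and the constant weights $w_i = 1$ (the "constant" step-size option) are assumptions made for brevity.

```python
import numpy as np

def fit_stump(X, r):
    """Fit a depth-1 regression tree (stump) to residuals r on a 1-D feature X."""
    best = None
    for t in np.unique(X):
        left, right = r[X <= t], r[X > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(X <= t, left.mean(), right.mean())
        sse = ((r - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda Xq: np.where(Xq <= t, lv, rv)

def gbm_fit(X, Y, n_iter=10, lr=0.5):
    """Gradient boosting for MSE loss: each learner fits the negative gradient."""
    b = Y.mean()                              # fixed initial estimate b(X)
    f = np.full_like(Y, b, dtype=float)       # f_0(X) = b(X)
    learners = []
    for _ in range(n_iter):
        resid = Y - f                         # negative gradient of MSE w.r.t. f
        g = fit_stump(X, resid)               # g_i predicts the residuals
        learners.append(g)
        f = f + lr * g(X)                     # f_i = f_{i-1} + eta * w_i * g_i, w_i = 1
    return b, learners

def gbm_predict(b, learners, X, lr=0.5):
    f = np.full(X.shape[0], b, dtype=float)
    for g in learners:
        f = f + lr * g(X)
    return f
```

With the crossentropy loss used by the classifier, the residual above is replaced by the negative gradient of the crossentropy with respect to the current predictions, but the boosting loop is identical.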
Parameters:

- n_iter (int) – The number of iterations / weak estimators to use when fitting each dimension / class of Y.
- max_depth (int) – The maximum depth of each decision tree weak estimator. Default is None.
- classifier (bool) – Whether Y contains class labels or real-valued targets. Default is True.
- learning_rate (float) – Value in [0, 1] controlling the amount each weak estimator contributes to the overall model prediction. Sometimes known as the shrinkage parameter in the GBM literature. Default is 1.
- loss ({'crossentropy', 'mse'}) – The loss to optimize for the GBM. Default is 'crossentropy'.
- step_size ({'constant', 'adaptive'}) – How to choose the weight for each weak learner. If 'constant', use a fixed weight of 1 for each learner. If 'adaptive', use a step size computed via line-search on the current iteration's loss. Default is 'constant'.
fit(X, Y)[source]

Fit the gradient boosted decision trees on a dataset.

Parameters:

- X (ndarray of shape (N, M)) – The training data of N examples, each with M features.
- Y (ndarray of shape (N,)) – An array of integer class labels for each example in X if self.classifier = True, otherwise the set of target values for each example in X.
predict(X)[source]

Use the trained model to classify or predict the examples in X.

Parameters:

- X (ndarray of shape (N, M)) – The data of N examples to generate predictions for, each with M features.

Returns:

- preds (ndarray of shape (N,)) – The integer class labels predicted for each example in X if self.classifier = True, otherwise the predicted target values.