GradientBoostedDecisionTree

class numpy_ml.trees.GradientBoostedDecisionTree(n_iter, max_depth=None, classifier=True, learning_rate=1, loss='crossentropy', step_size='constant')[source]

A gradient boosted ensemble of decision trees.
Notes
Gradient boosted machines (GBMs) fit an ensemble of \(m\) weak learners such that:

\[f_m(X) = b(X) + \eta w_1 g_1 + \ldots + \eta w_m g_m\]

where \(b\) is a fixed initial estimate for the targets, \(\eta\) is a learning rate parameter, and \(w_{\cdot}\) and \(g_{\cdot}\) denote the weights and learner predictions for subsequent fits.
We fit each \(w_i\) and \(g_i\) iteratively using a greedy strategy, so that at each iteration \(i\),

\[w_i, g_i = \arg \min_{w_i, g_i} L(Y, f_{i-1}(X) + w_i g_i)\]

On each iteration we fit a new weak learner to predict the negative gradient of the loss with respect to the previous prediction, \(f_{i-1}(X)\). We then use the element-wise product of this weak learner's predictions, \(g_i\), with a weight, \(w_i\), to compute the amount by which to adjust the predictions of the model from the previous iteration:
\[f_i(X) := f_{i-1}(X) + \eta w_i g_i\]

Parameters:

- n_iter (int) – The number of iterations / weak estimators to use when fitting each dimension / class of Y.
- max_depth (int) – The maximum depth of each decision tree weak estimator. Default is None.
- classifier (bool) – Whether Y contains class labels or real-valued targets. Default is True.
- learning_rate (float) – Value in [0, 1] controlling the amount each weak estimator contributes to the overall model prediction. Sometimes known as the shrinkage parameter in the GBM literature. Default is 1.
- loss ({'crossentropy', 'mse'}) – The loss to optimize for the GBM. Default is ‘crossentropy’.
- step_size ({"constant", "adaptive"}) – How to choose the weight for each weak learner. If “constant”, use a fixed weight of 1 for each learner. If “adaptive”, use a step size computed via line-search on the current iteration’s loss. Default is ‘constant’.
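The fitting procedure above can be sketched in plain NumPy for the 'mse' loss with the "constant" step size (each \(w_i = 1\)). This is an illustration of the boosting update, not the numpy_ml implementation: the `fit_stump` weak learner is a hypothetical stand-in for a depth-1 decision tree.

```python
import numpy as np

def fit_stump(X, r):
    """Hypothetical depth-1 regression stump (not part of the numpy_ml API),
    fit by least squares to the pseudo-residuals r."""
    best_err, best = np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            mask = X[:, j] <= t
            left, right = r[mask], r[~mask]
            if left.size == 0 or right.size == 0:
                continue
            err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if err < best_err:
                best_err = err
                best = (j, t, left.mean(), right.mean())
    j, t, lv, rv = best
    return lambda Z: np.where(Z[:, j] <= t, lv, rv)

def gbm_fit(X, y, n_iter=10, learning_rate=0.5):
    """Greedy boosting under MSE loss: the negative gradient of
    0.5 * (y - f)^2 with respect to f is just the residual y - f."""
    base = y.mean()                       # b(X): fixed initial estimate
    f = np.full(y.shape[0], base)
    learners = []
    for _ in range(n_iter):
        g = fit_stump(X, y - f)           # fit g_i to the negative gradient
        f = f + learning_rate * g(X)      # f_i = f_{i-1} + eta * w_i * g_i, with w_i = 1 ("constant")
        learners.append(g)
    return base, learners

def gbm_predict(X, base, learners, learning_rate=0.5):
    f = np.full(X.shape[0], base)
    for g in learners:
        f = f + learning_rate * g(X)
    return f
```

With the "adaptive" option, each \(w_i\) would instead be chosen by a line search minimizing the current iteration's loss rather than being fixed at 1.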
predict(X)[source]

Use the trained model to classify or predict the examples in X.

Parameters:

- X (ndarray of shape (N, M)) – The training data of N examples, each with M features.

Returns:

- preds (ndarray of shape (N,)) – The integer class labels predicted for each example in X if self.classifier = True, otherwise the predicted target values.
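Since the ensemble fits one boosted model per class of Y when classifier = True, a common way the documented (N,) integer labels arise is by taking the argmax over per-class scores. A minimal sketch of that output convention (the raw scores below are made up for illustration; this does not show numpy_ml internals):

```python
import numpy as np

# Hypothetical raw per-class ensemble scores for N = 2 examples, K = 3 classes.
raw_scores = np.array([[2.0, -1.0, 0.5],
                       [-0.3, 1.2, 0.1]])

# Integer class labels of shape (N,), matching predict's documented output.
preds = raw_scores.argmax(axis=1)
```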