Questions tagged [gbm]

The R package gbm implements Generalized Boosted Regression Models.

This package implements extensions to Freund and Schapire’s AdaBoost algorithm and Friedman’s gradient boosting machine.

It includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMART).

Who's using gbm?

The gbm package is used in examples in Software for Data Analysis by John Chambers.

gbm is also used in The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman.

Richard A. Berk also uses gbm in his book, Statistical Learning from a Regression Perspective.

Source: gradientboostedmodels

328 questions
22
votes
2 answers

How to get different Variable Importance for each class in a binary h2o GBM in R?

I'm trying to explore the use of a GBM with h2o for a classification issue to replace a logistic regression (GLM). The non-linearity and interactions in my data make me think a GBM is more suitable. I've ran a baseline GBM (see below) and compared…
wake_wake
  • 1,176
  • 2
  • 15
  • 42
18
votes
3 answers

GBM R function: get variable importance separately for each class

I am using the gbm function in R (gbm package) to fit stochastic gradient boosting models for multiclass classification. I am simply trying to obtain the importance of each predictor separately for each class, like in this picture from the Hastie…
Antoine
  • 1,385
  • 4
  • 19
  • 44
16
votes
1 answer

gbm::interact.gbm vs. dismo::gbm.interactions

Background: The reference manual for the gbm package states the interact.gbm function computes Friedman's H-statistic to assess the strength of variable interactions. The H-statistic is on the scale of [0-1]. The reference manual for the dismo…
GNG
  • 229
  • 1
  • 10
14
votes
2 answers

How to use XGBoost algorithm for regression in R?

I was trying the XGBoost technique for prediction. As my dependent variable is continuous, I was doing regression using XGBoost, but most of the references available on various portals are for classification. Though I know by using objective…
Amarjeet
  • 857
  • 1
  • 9
  • 14
12
votes
2 answers

GBM Rule Generation - Coding Advice

I use the R package GBM as probably my first choice for predictive modeling. There are so many great things about this algorithm, but the one "bad" is that I can't easily use model code to score new data outside of R. I want to write code that can be…
B_Miner
  • 1,635
  • 1
  • 23
  • 54
12
votes
2 answers

subscript out of bounds in gbm function

I am having a strange problem. I have successfully ran this code on my laptop, but when I try to run it on another machine first I get this warning Distribution not specified, assuming bernoulli ..., which I expect but then I get this error: Error…
Herman Toothrot
  • 1,186
  • 2
  • 20
  • 42
12
votes
2 answers

using caret package to find optimal parameters of GBM

I'm using the R GBM package for boosting to do regression on some biological data of dimensions 10,000 X 932 and I want to know what are the best parameters settings for GBM package especially (n.trees, shrinkage, interaction.depth and…
DOSMarter
  • 1,337
  • 4
  • 17
  • 28
11
votes
2 answers

R: Plot trees from h2o.randomForest() and h2o.gbm()

Looking for an efficient way to plot trees in rstudio, H2O's Flow or in local html page from h2o's RF and GBM models similar to the one in the image in link below. Specifically, how do you plot trees for the objects, (fitted models) rf1 and gbm2…
Webby
  • 317
  • 1
  • 3
  • 10
11
votes
1 answer

In gbm multinomial dist, how to use predict to get categorical output?

My response is a categorical variable (some alphabets), so I used distribution='multinomial' when making the model, and now I want to predict the response and obtain the output in terms of these alphabets, instead of matrix of probabilities. However…
shavendy
  • 163
  • 1
  • 1
  • 8
10
votes
1 answer

R: implementing my own gradient boosting algorithm

I am trying to write my own gradient boosting algorithm. I understand there are existing packages like gbm and xgboost, but I wanted to understand how the algorithm works by writing my own. I am using the iris data set, and my outcome is…
Adrian
  • 6,591
  • 19
  • 56
  • 105
10
votes
1 answer

"valid deviance" is nan for GBM model, What does this means and how to get rid of this?

I am using Gradient boosting for classification. Though the result is improving but I am getting NaN in validdeviance. Model = gbm.fit( x= x_Train , y = y_Train , distribution = "bernoulli", n.trees = GBM_NTREES , shrinkage =…
Amarjeet
  • 857
  • 1
  • 9
  • 14
9
votes
1 answer

Inconsistent predictions from predict.gbm()

UPDATE: I have tried running the code on https://rdrr.io/snippets/ and it works fine. Therefore, I suspect a problem with my R installation, but it is extremely worrying that this can happen without errors or warnings. What are the best steps to…
Robert Long
  • 2,785
  • 4
  • 18
  • 38
9
votes
1 answer

Caret train method complains Something is wrong; all the RMSE metric values are missing

On numerous occasions I've been getting this error when trying to fit a gbm or rpart model. Finally I was able to reproduce it consistently using publicly available data. I have noticed that this error happens when using CV (or repeated cv). When I…
Fred R.
  • 435
  • 2
  • 6
  • 13
8
votes
1 answer

Python - Scikit find variable importance for categorical variables

I'm trying to use scikit learn in python to do a couple different classifier problems (RF, GBM, etc). In addition to building models and making predictions, I'd like to see variable importance. I know there is a way to get the…
screechOwl
  • 23,958
  • 54
  • 146
  • 246
8
votes
2 answers

GBM multinomial distribution, how to use predict() to get predicted class?

I am using the multinomial distribution from the gbm package in R. When I use the predict function, I get a series of values: 5.086328 -4.738346 -8.492738 -5.980720 -4.351102 -4.738044 -3.220387 -4.732654 but I want to get the probability of each…
Jim Johnson
  • 83
  • 1
  • 4