support vector machine train caret error kernlab class probability calculations failed; returning NAs

Tags : ml data python datascience

Rishi Pandya

131 2

I have some data where the Y variable is a factor with levels Good and Bad. I am building a support vector machine using the 'train' function from the 'caret' package. With 'train' I was able to settle the values of the tuning parameters and obtain the final model, and I can predict the class for the test data. But when I try to predict probabilities for the test data, I get the warning below. For example, the model tells me that the first data point in the test data has y = 'Good', but I want to know the probability of it being 'Good'. As I understand it, the model can estimate a probability for each outcome, so if Y has two outcomes I expect two probabilities per observation, with the more probable outcome reported as the predicted class.

**Warning message: In probFunction(method, modelFit, ppUnk) : kernlab class probability calculations failed; returning NAs**

sample code as below

```
library(caret)

trainset <- data.frame(
  class = factor(c("Good", "Bad", "Good", "Good", "Bad", "Good",
                   "Good", "Good", "Good", "Bad", "Bad", "Bad")),
  age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))

testset <- data.frame(
  class = factor(c("Good", "Bad", "Good")),
  age = c(64, 23, 50))

library(kernlab)
set.seed(231)

### finding optimal value of a tuning parameter
sigDist <- sigest(class ~ ., data = trainset, frac = 1)

### creating a grid of two tuning parameters; .sigma comes from the earlier line,
### and we are trying to find the best value of .C
svmTuneGrid <- data.frame(.sigma = sigDist[1], .C = 2^(-2:7))

set.seed(1056)
svmFit <- train(class ~ .,
                data = trainset,
                method = "svmRadial",
                preProc = c("center", "scale"),
                tuneGrid = svmTuneGrid,
                trControl = trainControl(method = "repeatedcv", repeats = 5))

### svmFit finds the optimal values of the tuning parameters and builds the model
### using the best parameters

### to predict class of test data
predictedClasses <- predict(svmFit, testset)
str(predictedClasses)

### predict probabilities, but I get the warning here
predictedProbs <- predict(svmFit, newdata = testset, type = "prob")
head(predictedProbs)
```

New question: according to the output below, there are 9 support vectors. How can I identify which 9 of the 12 training data points they are?

```
svmFit$finalModel

Support Vector Machine object of class "ksvm"

SV type: C-svc (classification)
 parameter : cost C = 1

Gaussian Radial Basis kernel function.
 Hyperparameter : sigma = 0.72640759446315

Number of Support Vectors : 9

Objective Function Value : -5.6994
Training error : 0.083333
```

August 31, 2021 3:45 PM IST

0
Tarun Reddy

84

If I type in 'svmFit$finalModel', I get:
```
Support Vector Machine object of class "ksvm"

SV type: C-svc (classification)
 parameter : cost C = 1

Gaussian Radial Basis kernel function.
 Hyperparameter : sigma = 0.72640759446315

Number of Support Vectors : 9

Objective Function Value : -5.6994
Training error : 0.083333
```

September 12, 2021 1:02 AM IST

0
Maryam Bains

317
The problem is your y variable. When you ask for class probabilities, train and/or predict put them into a data frame with one column per class. If the factor levels are not valid variable names, they are automatically changed (e.g. "0" becomes "X0"). See also this post.

If you change this line in your code it should work:
```
a[,1] = factor(a[,1], labels = c("no", "yes"))
```
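The renaming described above matches what base R's make.names() does, so it can be used as a quick check on any set of factor levels. The sketch below is only an illustration; the vectors `numeric_levels` and `named_levels` are hypothetical names, not part of the original code.

```r
# Hypothetical level sets, used only to illustrate the renaming rule.
numeric_levels <- c("0", "1")
named_levels   <- c("Good", "Bad")

# make.names() returns a syntactically valid version of each string.
fixed_numeric <- make.names(numeric_levels)
fixed_named   <- make.names(named_levels)

print(fixed_numeric)
print(fixed_named)

# TRUE when the levels are already valid names and need no relabelling.
levels_ok <- identical(fixed_named, named_levels)
print(levels_ok)
```

If `levels_ok` is TRUE for your own factor, relabelling the levels is probably not what is missing.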
January 11, 2022 3:47 PM IST

0
Vaibhav Mali

259
In the trainControl call, you have to request class probabilities by setting classProbs = TRUE.
```
svmFit <- train(class ~ .,
    data = trainset,
    method = "svmRadial",
    preProc = c("center", "scale"),
    tuneGrid = svmTuneGrid,
    trControl = trainControl(method = "repeatedcv", repeats = 5,
                             classProbs = TRUE))

predictedClasses <- predict(svmFit, testset )
predictedProbs <- predict(svmFit, newdata = testset , type = "prob")
```
giving the probabilities of being in the Bad or Good class in the test dataset as:
```
print(predictedProbs)
    Bad      Good
1 0.2302979 0.7697021
2 0.7135050 0.2864950
3 0.2230889 0.7769111
```
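As a sanity check on a table like the one above, each row should sum to 1, and the more probable column can be read off per row. This is a minimal sketch in base R that uses the three rows printed above, typed in by hand; the name `probs` is illustrative and stands in for predictedProbs.

```r
# Probabilities copied from the printed output above.
probs <- data.frame(
  Bad  = c(0.2302979, 0.7135050, 0.2230889),
  Good = c(0.7697021, 0.2864950, 0.7769111))

# Each row should add up to 1 (within rounding).
row_totals <- rowSums(probs)
print(row_totals)

# Column with the largest probability in each row.
top_class <- colnames(probs)[max.col(as.matrix(probs), ties.method = "first")]
print(top_class)
```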
EDIT
To answer your new question, alphaindex(svmFit$finalModel) gives the positions of the support vectors in your original data set, and coef(svmFit$finalModel) gives the corresponding coefficients.
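A rough sketch of how those indices can be mapped back to rows of the training data. To stay self-contained it fits a ksvm model directly with kernlab instead of going through caret, so `fit` is only a stand-in for svmFit$finalModel, and the sigma and C values are illustrative assumptions, not the tuned ones. The set of support vectors may therefore differ from the 9 reported in the question.

```r
library(kernlab)

trainset <- data.frame(
  class = factor(c("Good", "Bad", "Good", "Good", "Bad", "Good",
                   "Good", "Good", "Good", "Bad", "Bad", "Bad")),
  age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))

set.seed(231)
# Stand-in for svmFit$finalModel; sigma and C are assumed values.
fit <- ksvm(class ~ age, data = trainset, kernel = "rbfdot",
            kpar = list(sigma = 0.7), C = 1)

# alphaindex() holds the row positions of the support vectors;
# unlist() flattens it in case it is returned as a list.
sv_rows <- sort(unique(unlist(alphaindex(fit))))

# Look up the corresponding training points.
sv_points <- trainset[sv_rows, ]
print(sv_rows)
print(sv_points)
```

The same alphaindex() call should apply to svmFit$finalModel, since that object is also of class "ksvm".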
September 1, 2021 1:18 PM IST

0
