| author | Gertjan van den Burg <gertjanvandenburg@gmail.com> | 2018-03-27 12:31:28 +0100 |
|---|---|---|
| committer | Gertjan van den Burg <gertjanvandenburg@gmail.com> | 2018-03-27 12:31:28 +0100 |
| commit | 004941896bac692d354c41a3334d20ee1d4627f7 (patch) | |
| tree | 2b11e42d8524843409e2bf8deb4ceb74c8b69347 /man | |
| parent | updates to GenSVM C library (diff) | |
GenSVM R package
Diffstat (limited to 'man')
| Mode | File | Lines |
|---|---|---|
| -rw-r--r-- | man/coef.gensvm.Rd | 42 |
| -rw-r--r-- | man/coef.gensvm.grid.Rd | 37 |
| -rw-r--r-- | man/gensvm-package.Rd | 111 |
| -rw-r--r-- | man/gensvm.Rd | 136 |
| -rw-r--r-- | man/gensvm.accuracy.Rd | 34 |
| -rw-r--r-- | man/gensvm.generate.cv.idx.Rd | 13 |
| -rw-r--r-- | man/gensvm.grid.Rd | 161 |
| -rw-r--r-- | man/gensvm.load.full.grid.Rd | 35 |
| -rw-r--r-- | man/gensvm.load.small.grid.Rd | 35 |
| -rw-r--r-- | man/gensvm.load.tiny.grid.Rd | 33 |
| -rw-r--r-- | man/gensvm.maxabs.scale.Rd | 64 |
| -rw-r--r-- | man/gensvm.rank.score.Rd | 23 |
| -rw-r--r-- | man/gensvm.refit.Rd | 57 |
| -rw-r--r-- | man/gensvm.train.test.split.Rd | 62 |
| -rw-r--r-- | man/plot.gensvm.Rd | 68 |
| -rw-r--r-- | man/plot.gensvm.grid.Rd | 41 |
| -rw-r--r-- | man/predict.gensvm.Rd | 50 |
| -rw-r--r-- | man/predict.gensvm.grid.Rd | 49 |
| -rw-r--r-- | man/print.gensvm.Rd | 43 |
| -rw-r--r-- | man/print.gensvm.grid.Rd | 38 |
20 files changed, 1132 insertions, 0 deletions
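The pages below document the package's public API. As a quick orientation, here is a minimal sketch of the workflow these man pages describe, assembled from their own examples; it assumes the gensvm package built from this tree is installed and loadable:

```r
library(gensvm)

x <- iris[, -5]
y <- iris[, 5]

# scale each column to [-1, 1] and split into training and test samples
x.scaled <- gensvm.maxabs.scale(x)
split <- gensvm.train.test.split(x.scaled, y, train.size = 0.8, random.state = 42)

# fit a linear GenSVM model with the default parameters and inspect it
fit <- gensvm(split$x.train, split$y.train)
print(fit)
V <- coef(fit)  # (n_features + 1) x (n_classes - 1) coefficient matrix

# predict the test sample and compute the classification accuracy
y.pred <- predict(fit, split$x.test)
gensvm.accuracy(split$y.test, y.pred)

# alternatively, run a cross-validated grid search over the default grid
grid <- gensvm.grid(split$x.train, split$y.train)
gensvm.accuracy(split$y.test, predict(grid, split$x.test))
```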
diff --git a/man/coef.gensvm.Rd b/man/coef.gensvm.Rd new file mode 100644 index 0000000..73d7a9a --- /dev/null +++ b/man/coef.gensvm.Rd @@ -0,0 +1,42 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/coef.gensvm.R +\name{coef.gensvm} +\alias{coef.gensvm} +\title{Get the coefficients of the fitted GenSVM model} +\usage{ +\method{coef}{gensvm}(object, ...) +} +\arguments{ +\item{object}{a \code{gensvm} object} + +\item{\dots}{further arguments are ignored} +} +\value{ +The coefficients of the GenSVM model. This is a matrix of size +\eqn{(n_{features} + 1) x (n_{classes} - 1)}. This matrix is used to project +the input data to a low dimensional space using the equation: \eqn{XW + t} +where \eqn{X} is the input matrix, \eqn{t} is the first row of the matrix +returned by this function, and \eqn{W} is the \eqn{n_{features} x +(n_{classes} - 1)} matrix formed by the remaining rows. +} +\description{ +Returns the model coefficients of the GenSVM object +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +fit <- gensvm(x, y) +V <- coef(fit) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/coef.gensvm.grid.Rd b/man/coef.gensvm.grid.Rd new file mode 100644 index 0000000..b8f8a40 --- /dev/null +++ b/man/coef.gensvm.grid.Rd @@ -0,0 +1,37 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/coef.gensvm.grid.R +\name{coef.gensvm.grid} +\alias{coef.gensvm.grid} +\title{Get the parameter grid from a GenSVM Grid object} +\usage{ +\method{coef}{gensvm.grid}(object, ...) +} +\arguments{ +\item{object}{a \code{gensvm.grid} object} + +\item{\dots}{further arguments are ignored} +} +\value{ +The parameter grid of the GenSVMGrid object as a data frame. +} +\description{ +Returns the parameter grid of a \code{gensvm.grid} object. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +grid <- gensvm.grid(x, y) +pg <- coef(grid) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/gensvm-package.Rd b/man/gensvm-package.Rd new file mode 100644 index 0000000..56e28ac --- /dev/null +++ b/man/gensvm-package.Rd @@ -0,0 +1,111 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm-package.R +\docType{package} +\name{gensvm-package} +\alias{gensvm-package} +\alias{gensvm.package} +\title{GenSVM: A Generalized Multiclass Support Vector Machine} +\description{ +The GenSVM classifier is a generalized multiclass support vector machine +(SVM). This classifier aims to find decision boundaries that separate the +classes with as wide a margin as possible. In GenSVM, the loss functions +that measures how misclassifications are counted is very flexible. This +allows the user to tune the classifier to the dataset at hand and +potentially obtain higher classification accuracy. 
Moreover, this +flexibility means that GenSVM has a number of alternative multiclass SVMs as +special cases. One of the other advantages of GenSVM is that it is trained +in the primal space, allowing the use of warm starts during optimization. +This means that for common tasks such as cross validation or repeated model +fitting, GenSVM can be trained very quickly. +} +\details{ +This package provides functions for training the GenSVM model either as a +separate model or through a cross-validated parameter grid search. In both +cases the GenSVM C library is used for speed. Auxiliary functions for +evaluating and using the model are also provided. +} +\section{GenSVM functions}{ + +The main GenSVM functions are: +\describe{ +\item{\code{\link{gensvm}}}{Fit a GenSVM model for specific model +parameters.} +\item{\code{\link{gensvm.grid}}}{Run a cross-validated grid search for +GenSVM.} +} + +For the GenSVM and GenSVMGrid models the following two functions are +available. When applied to a GenSVMGrid object, the function is applied to +the best GenSVM model. +\describe{ +\item{\code{\link{plot}}}{Plot the low-dimensional \emph{simplex} space +where the decision boundaries are fixed (for problems with 3 classes).} +\item{\code{\link{predict}}}{Predict the class labels of new data using the +GenSVM model.} +} + +Moreover, for the GenSVM and GenSVMGrid models a \code{coef} function is +defined: +\describe{ +\item{\code{\link{coef.gensvm}}}{Get the coefficients of the fitted GenSVM +model.} +\item{\code{\link{coef.gensvm.grid}}}{Get the parameter grid of the GenSVM +grid search.} +} + +The following utility functions are also included: +\describe{ +\item{\code{\link{gensvm.accuracy}}}{Compute the accuracy score between true +and predicted class labels} +\item{\code{\link{gensvm.maxabs.scale}}}{Scale each column of the dataset by +its maximum absolute value, preserving sparsity and mapping the data to [-1, +1]} +\item{\code{\link{gensvm.train.test.split}}}{Split a dataset into a training +and testing sample} +\item{\code{\link{gensvm.refit}}}{Refit a fitted GenSVM model with slightly +different parameters or on a different dataset} +} +} + +\section{Kernels in GenSVM}{ + + +GenSVM can be used for both linear and nonlinear multiclass support vector +machine classification. In general, linear classification will be faster but +depending on the dataset higher classification performance can be achieved +using a nonlinear kernel. + +The following nonlinear kernels are implemented in the GenSVM package: +\describe{ + \item{RBF}{The Radial Basis Function kernel is a well-known kernel function + based on the Euclidean distance between objects. It is defined as + \deqn{ + k(x_i, x_j) = exp( -\gamma || x_i - x_j ||^2 ) + } + } + \item{Polynomial}{A polynomial kernel can also be used in GenSVM. This + kernel function is implemented very generally and therefore takes three + parameters (\code{coef}, \code{gamma}, and \code{degree}). It is defined + as: + \deqn{ + k(x_i, x_j) = ( \gamma x_i' x_j + coef)^{degree} + } + } + \item{Sigmoid}{The sigmoid kernel is the final kernel implemented in + GenSVM. This kernel has two parameters and is implemented as follows: + \deqn{ + k(x_i, x_j) = \tanh( \gamma x_i' x_j + coef) + } + } + } +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). 
\emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/gensvm.Rd b/man/gensvm.Rd new file mode 100644 index 0000000..1db0558 --- /dev/null +++ b/man/gensvm.Rd @@ -0,0 +1,136 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.R +\name{gensvm} +\alias{gensvm} +\title{Fit the GenSVM model} +\usage{ +gensvm(X, y, p = 1, lambda = 1e-08, kappa = 0, epsilon = 1e-06, + weights = "unit", kernel = "linear", gamma = "auto", coef = 1, + degree = 2, kernel.eigen.cutoff = 1e-08, verbose = FALSE, + random.seed = NULL, max.iter = 1e+08, seed.V = NULL) +} +\arguments{ +\item{X}{data matrix with the predictors} + +\item{y}{class labels} + +\item{p}{parameter for the L_p norm of the loss function (1.0 <= p <= 2.0)} + +\item{lambda}{regularization parameter for the loss function (lambda > 0)} + +\item{kappa}{parameter for the hinge function in the loss function (kappa > +-1.0)} + +\item{weights}{type of instance weights to use. Options are 'unit' for unit +weights and 'group' for group size correction weight (eq. 4 in the paper).} + +\item{kernel}{the kernel type to use in the classifier. It must be one of +'linear', 'poly', 'rbf', or 'sigmoid'. See the section "Kernels in GenSVM" +in \code{\link{gensvm-package}} for more info.} + +\item{gamma}{kernel parameter for the rbf, polynomial, and sigmoid kernel. +If gamma is 'auto', then 1/n_features will be used.} + +\item{coef}{parameter for the polynomial and sigmoid kernel.} + +\item{degree}{parameter for the polynomial kernel} + +\item{kernel.eigen.cutoff}{Cutoff point for the reduced eigendecomposition +used with kernel-GenSVM. Eigenvectors for which the ratio between their +corresponding eigenvalue and the largest eigenvalue is smaller than this +cutoff value will be dropped.} + +\item{verbose}{Turn on verbose output and fit progress} + +\item{random.seed}{Seed for the random number generator (useful for +reproducible output)} + +\item{max.iter}{Maximum number of iterations of the optimization algorithm.} + +\item{seed.V}{Matrix to warm-start the optimization algorithm. This is +typically the output of \code{coef(fit)}. Note that this function will +silently drop seed.V if the dimensions don't match the provided data.} +} +\value{ +A "gensvm" S3 object is returned for which the print, predict, coef, +and plot methods are available. 
It has the following items: +\item{call}{The call that was used to construct the model.} +\item{p}{The value of the lp norm in the loss function} +\item{lambda}{The regularization parameter used in the model.} +\item{kappa}{The hinge function parameter used.} +\item{epsilon}{The stopping criterion used.} +\item{weights}{The instance weights type used.} +\item{kernel}{The kernel function used.} +\item{gamma}{The value of the gamma parameter of the kernel, if applicable} +\item{coef}{The value of the coef parameter of the kernel, if applicable} +\item{degree}{The degree of the kernel, if applicable} +\item{kernel.eigen.cutoff}{The cutoff value of the reduced +eigendecomposition of the kernel matrix.} +\item{verbose}{Whether or not the model was fitted with progress output} +\item{random.seed}{The random seed used to seed the model.} +\item{max.iter}{Maximum number of iterations of the algorithm.} +\item{n.objects}{Number of objects in the dataset} +\item{n.features}{Number of features in the dataset} +\item{n.classes}{Number of classes in the dataset} +\item{classes}{Array with the actual class labels} +\item{V}{Coefficient matrix} +\item{n.iter}{Number of iterations performed in training} +\item{n.support}{Number of support vectors in the final model} +\item{training.time}{Total training time} +\item{X.train}{When training with nonlinear kernels, the training data is +needed to perform prediction. For these kernels it is therefore stored in +the fitted model.} +} +\description{ +Fits the Generalized Multiclass Support Vector Machine model +with the given parameters. See the package documentation +(\code{\link{gensvm-package}}) for more general information about GenSVM. +} +\note{ +This function returns partial results when the computation is interrupted by +the user. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# fit using the default parameters +fit <- gensvm(x, y) + +# fit and show progress +fit <- gensvm(x, y, verbose=T) + +# fit with some changed parameters +fit <- gensvm(x, y, lambda=1e-8) + +# Early stopping defined through epsilon +fit <- gensvm(x, y, epsilon=1e-3) + +# Early stopping defined through max.iter +fit <- gensvm(x, y, max.iter=1000) + +# Nonlinear training +fit <- gensvm(x, y, kernel='rbf') +fit <- gensvm(x, y, kernel='poly', degree=2, gamma=1.0) + +# Setting the random seed and comparing results +fit <- gensvm(x, y, random.seed=123) +fit2 <- gensvm(x, y, random.seed=123) +all.equal(coef(fit), coef(fit2)) + + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{coef}}, \code{\link{print}}, \code{\link{predict}}, +\code{\link{plot}}, and \code{\link{gensvm.grid}}. 
+} + diff --git a/man/gensvm.accuracy.Rd b/man/gensvm.accuracy.Rd new file mode 100644 index 0000000..60a0f89 --- /dev/null +++ b/man/gensvm.accuracy.Rd @@ -0,0 +1,34 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.accuracy.R +\name{gensvm.accuracy} +\alias{gensvm.accuracy} +\title{Compute the accuracy score} +\usage{ +gensvm.accuracy(y.true, y.pred) +} +\arguments{ +\item{y.true}{vector of true labels} + +\item{y.pred}{vector of predicted labels} +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +fit <- gensvm(x, y) +gensvm.accuracy(predict(fit, x), y) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{predict.gensvm.grid}} +} + diff --git a/man/gensvm.generate.cv.idx.Rd b/man/gensvm.generate.cv.idx.Rd new file mode 100644 index 0000000..34f4f64 --- /dev/null +++ b/man/gensvm.generate.cv.idx.Rd @@ -0,0 +1,13 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.generate.cv.idx} +\alias{gensvm.generate.cv.idx} +\title{Generate a vector of cross-validation indices} +\usage{ +gensvm.generate.cv.idx(n, folds) +} +\description{ +This function generates a vector of length \code{n} with values from 0 to +\code{folds-1} to mark train and test splits. +} + diff --git a/man/gensvm.grid.Rd b/man/gensvm.grid.Rd new file mode 100644 index 0000000..6dbec22 --- /dev/null +++ b/man/gensvm.grid.Rd @@ -0,0 +1,161 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.grid} +\alias{gensvm.grid} +\title{Cross-validated grid search for GenSVM} +\usage{ +gensvm.grid(X, y, param.grid = "tiny", refit = TRUE, scoring = NULL, + cv = 3, verbose = 0, return.train.score = TRUE) +} +\arguments{ +\item{X}{training data matrix. We denote the size of this matrix by +n_samples x n_features.} + +\item{y}{training vector of class labes of length n_samples. The number of +unique labels in this vector is denoted by n_classes.} + +\item{param.grid}{String (\code{'tiny'}, \code{'small'}, or \code{'full'}) +or data frame with parameter configurations to evaluate. Typically this is +the output of \code{expand.grid}. For more details, see "Using a Parameter +Grid" below.} + +\item{refit}{boolean variable. If true, the best model from cross validation +is fitted again on the entire dataset.} + +\item{scoring}{metric to use to evaluate the classifier performance during +cross validation. The metric should be an R function that takes two +arguments: y_true and y_pred and that returns a float such that higher +values are better. 
If it is NULL, the accuracy score will be used.} + +\item{cv}{the number of cross-validation folds to use or a vector with the +same length as \code{y} where each unique value denotes a test split.} + +\item{verbose}{integer to indicate the level of verbosity (higher is more +verbose)} + +\item{return.train.score}{whether or not to return the scores on the +training splits} +} +\value{ +A "gensvm.grid" S3 object with the following items: +\item{call}{Call that produced this object} +\item{param.grid}{Sorted version of the parameter grid used in training} +\item{cv.results}{A data frame with the cross validation results} +\item{best.estimator}{If refit=TRUE, this is the GenSVM model fitted with +the best hyperparameter configuration, otherwise it is NULL} +\item{best.score}{Mean cross-validated test score for the model with the +best hyperparameter configuration} +\item{best.params}{Parameter configuration that provided the highest mean +cross-validated test score} +\item{best.index}{Row index of the cv.results data frame that corresponds to +the best hyperparameter configuration} +\item{n.splits}{The number of cross-validation splits} +\item{n.objects}{The number of instances in the data} +\item{n.features}{The number of features of the data} +\item{n.classes}{The number of classes in the data} +\item{classes}{Array with the unique classes in the data} +\item{total.time}{Training time for the grid search} +\item{cv.idx}{Array with cross validation indices used to split the data} +} +\description{ +This function performs a cross-validated grid search of the +model parameters to find the best hyperparameter configuration for a given +dataset. This function takes advantage of GenSVM's ability to use warm +starts to speed up computation. The function uses the GenSVM C library for +speed. +} +\note{ +This function returns partial results when the computation is interrupted by +the user. +} +\section{Using a Parameter Grid}{ + +To evaluate certain paramater configurations, a data frame can be supplied +to the \code{param.grid} argument of the function. Such a data frame can +easily be generated using the R function \code{expand.grid}, or could be +created through other ways to test specific parameter configurations. + +Three parameter grids are predefined: +\describe{ +\item{\code{'tiny'}}{This parameter grid is generated by the function +\code{\link{gensvm.load.tiny.grid}} and is the default parameter grid. It +consists of parameter configurations that are likely to perform well on +various datasets.} +\item{\code{'small'}}{This grid is generated by +\code{\link{gensvm.load.small.grid}} and generates a data frame with 90 +configurations. It is typically fast to train but contains some +configurations that are unlikely to perform well. It is included for +educational purposes.} +\item{\code{'full'}}{This grid loads the parameter grid as used in the +GenSVM paper. It consists of 342 configurations and is generated by the +\code{\link{gensvm.load.full.grid}} function. Note that in the GenSVM paper +cross validation was done with this parameter grid, but the final training +step used \code{epsilon=1e-8}. The \code{\link{gensvm.refit}} function is +useful in this scenario.} +} + +When you provide your own parameter grid, beware that only certain column +names are allowed in the data frame corresponding to parameters for the +GenSVM model. These names are: + +\describe{ +\item{p}{Parameter for the lp norm. Must be in [1.0, 2.0].} +\item{kappa}{Parameter for the Huber hinge function. 
Must be larger than +-1.} +\item{lambda}{Parameter for the regularization term. Must be larger than 0.} +\item{weight}{Instance weight specification. Allowed values are "unit" for +unit weights and "group" for group-size correction weights} +\item{epsilon}{Stopping parameter for the algorithm. Must be larger than 0.} +\item{max.iter}{Maximum number of iterations of the algorithm. Must be +larger than 0.} +\item{kernel}{The kernel to used, allowed values are "linear", "poly", +"rbf", and "sigmoid". The default is "linear"} +\item{coef}{Parameter for the "poly" and "sigmoid" kernels. See the section +"Kernels in GenSVM" in the code{ink{gensvm-package}} page for more info.} +\item{degree}{Parameter for the "poly" kernel. See the section "Kernels in +GenSVM" in the code{ink{gensvm-package}} page for more info.} +\item{gamma}{Parameter for the "poly", "rbf", and "sigmoid" kernels. See the +section "Kernels in GenSVM" in the code{ink{gensvm-package}} page for more +info.} +} + +For variables that are not present in the \code{param.grid} data frame the +default parameter values in the \code{\link{gensvm}} function will be used. + +Note that this function reorders the parameter grid to make the warm starts +as efficient as possible, which is why the param.grid in the result will not +be the same as the param.grid in the input. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# use the default parameter grid +grid <- gensvm.grid(x, y) + +# use a smaller parameter grid +pg <- expand.grid(p=c(1.0, 1.5, 2.0), kappa=c(-0.9, 1.0), epsilon=c(1e-3)) +grid <- gensvm.grid(x, y, param.grid=pg) + +# print the result +print(grid) + +# Using a custom scoring function (accuracy as percentage) +acc.pct <- function(yt, yp) { return (100 * sum(yt == yp) / length(yt)) } +grid <- gensvm.grid(x, y, scoring=acc.pct) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{predict.gensvm.grid}}, \code{\link{print.gensvm.grid}}, and +\code{\link{gensvm}}. +} + diff --git a/man/gensvm.load.full.grid.Rd b/man/gensvm.load.full.grid.Rd new file mode 100644 index 0000000..5398ef7 --- /dev/null +++ b/man/gensvm.load.full.grid.Rd @@ -0,0 +1,35 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.load.full.grid} +\alias{gensvm.load.full.grid} +\title{Load a large parameter grid for the GenSVM grid search} +\usage{ +gensvm.load.full.grid() +} +\description{ +This loads the parameter grid from the GenSVM paper. It +consists of 342 configurations and is constructed from all possible +combinations of the following parameter sets: + +\code{p = c(1.0, 1.5, 2.0)} + +\code{lambda = 2^seq(-18, 18, 2)} + +\code{kappa = c(-0.9, 0.5, 5.0)} + +\code{weight = c('unit', 'group')} +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. 
+} +\seealso{ +\code{\link{gensvm.grid}}, \code{\link{gensvm.load.tiny.grid}}, +\code{\link{gensvm.load.full.grid}}. +} + diff --git a/man/gensvm.load.small.grid.Rd b/man/gensvm.load.small.grid.Rd new file mode 100644 index 0000000..0866f0c --- /dev/null +++ b/man/gensvm.load.small.grid.Rd @@ -0,0 +1,35 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.load.small.grid} +\alias{gensvm.load.small.grid} +\title{Load the default parameter grid for the GenSVM grid search} +\usage{ +gensvm.load.small.grid() +} +\description{ +This function loads a default parameter grid to use for the +GenSVM gridsearch. It contains all possible combinations of the following +parameter sets: + +\code{p = c(1.0, 1.5, 2.0)} + +\code{lambda = c(1e-8, 1e-6, 1e-4, 1e-2, 1)} + +\code{kappa = c(-0.9, 0.5, 5.0)} + +\code{weight = c('unit', 'group')} +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{gensvm.grid}}, \code{\link{gensvm.load.tiny.grid}}, +\code{\link{gensvm.load.small.grid}}. +} + diff --git a/man/gensvm.load.tiny.grid.Rd b/man/gensvm.load.tiny.grid.Rd new file mode 100644 index 0000000..9ef0694 --- /dev/null +++ b/man/gensvm.load.tiny.grid.Rd @@ -0,0 +1,33 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.load.tiny.grid} +\alias{gensvm.load.tiny.grid} +\title{Load a tiny parameter grid for the GenSVM grid search} +\usage{ +gensvm.load.tiny.grid() +} +\description{ +This function returns a parameter grid to use in the GenSVM +grid search. This grid was obtained by analyzing the experiments done for +the GenSVM paper and selecting the configurations that achieve accuracy +within the 95th percentile on over 90% of the datasets. It is a good start +for a parameter search with a reasonably high chance of achieving good +performance on most datasets. + +Note that this grid is only tested to work well in combination with the +linear kernel. +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{gensvm.grid}}, \code{\link{gensvm.load.small.grid}}, +\code{\link{gensvm.load.full.grid}}. +} + diff --git a/man/gensvm.maxabs.scale.Rd b/man/gensvm.maxabs.scale.Rd new file mode 100644 index 0000000..50c6413 --- /dev/null +++ b/man/gensvm.maxabs.scale.Rd @@ -0,0 +1,64 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.maxabs.scale.R +\name{gensvm.maxabs.scale} +\alias{gensvm.maxabs.scale} +\title{Scale each column of a matrix by its maximum absolute value} +\usage{ +gensvm.maxabs.scale(x, x.test = NULL) +} +\arguments{ +\item{x}{a matrix to scale} + +\item{x.test}{(optional) a test matrix to scale as well.} +} +\value{ +if x.test=NULL a scaled matrix where the maximum value of the +columns is 1 and the minimum value of the columns isn't below -1. 
If x.test +is supplied, a list with elements \code{x} and \code{x.test} representing +the scaled datasets. +} +\description{ +Scaling a dataset can creatly decrease the computation time of +GenSVM. This function scales the data by dividing each column of a matrix by +the maximum absolute value of that column. This preserves sparsity in the +data while mapping each column to the interval [-1, 1]. + +Optionally a test dataset can be provided as well. In this case, the scaling +will be computed on the first argument (\code{x}) and applied to the test +dataset. Note that the return value is a list when this argument is +supplied. +} +\examples{ +x <- iris[, -5] + +# check the min and max of the columns +apply(x, 2, min) +apply(x, 2, max) + +# scale the data +x.scale <- gensvm.maxabs.scale(x) + +# check again (max should be 1.0, min shouldn't be below -1) +apply(x.scale, 2, min) +apply(x.scale, 2, max) + +# with a train and test dataset +x <- iris[, -5] +split <- gensvm.train.test.split(x) +x.train <- split$x.train +x.test <- split$x.test +scaled <- gensvm.maxabs.scale(x.train, x.test) +x.train.scl <- scaled$x +x.test.scl <- scaled$x.test + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/gensvm.rank.score.Rd b/man/gensvm.rank.score.Rd new file mode 100644 index 0000000..21d6bcd --- /dev/null +++ b/man/gensvm.rank.score.Rd @@ -0,0 +1,23 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.rank.score} +\alias{gensvm.rank.score} +\title{Compute the ranks for the numbers in a given vector} +\usage{ +gensvm.rank.score(x) +} +\arguments{ +\item{x}{array of numeric values} +} +\details{ +This function computes the ranks for the values in an array. The highest +value gets the smallest rank. Ties are broken by assigning the smallest +value. +} +\examples{ +x <- c(7, 0.1, 0.5, 0.1, 10, 100, 200) +gensvm.rank.score(x) +[ 4 6 5 6 3 2 1 ] + +} + diff --git a/man/gensvm.refit.Rd b/man/gensvm.refit.Rd new file mode 100644 index 0000000..194cde3 --- /dev/null +++ b/man/gensvm.refit.Rd @@ -0,0 +1,57 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.refit.R +\name{gensvm.refit} +\alias{gensvm.refit} +\title{Train an already fitted model on new data} +\usage{ +gensvm.refit(fit, X, y, p = NULL, lambda = NULL, kappa = NULL, + epsilon = NULL, weights = NULL, kernel = NULL, gamma = NULL, + coef = NULL, degree = NULL, kernel.eigen.cutoff = NULL, + max.iter = NULL, verbose = NULL, random.seed = NULL) +} +\arguments{ +\item{fit}{Fitted \code{gensvm} object} + +\item{X}{Data matrix of the new data} + +\item{y}{Label vector of the new data} + +\item{verbose}{Turn on verbose output and fit progress. 
If NULL (the +default) the value from the fitted model is chosen.} +} +\value{ +a new fitted \code{gensvm} model +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# fit a standard model and refit with slightly different parameters +fit <- gensvm(x, y) +fit2 <- gensvm.refit(x, y, epsilon=1e-8) + +# refit a model returned by a grid search +grid <- gensvm.grid(x, y) +fit <- gensvm.refit(fit, x, y, epsilon=1e-8) + +# refit on different data +idx <- runif(nrow(x)) > 0.5 +x1 <- x[idx, ] +x2 <- x[!idx, ] +y1 <- y[idx] +y2 <- y[!idx] + +fit1 <- gensvm(x1, y1) +fit2 <- gensvm.refit(fit1, x2, y2) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/gensvm.train.test.split.Rd b/man/gensvm.train.test.split.Rd new file mode 100644 index 0000000..a99940f --- /dev/null +++ b/man/gensvm.train.test.split.Rd @@ -0,0 +1,62 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.train.test.split.R +\name{gensvm.train.test.split} +\alias{gensvm.train.test.split} +\title{Create a train/test split of a dataset} +\usage{ +gensvm.train.test.split(x, y = NULL, train.size = NULL, test.size = NULL, + shuffle = TRUE, random.state = NULL, return.idx = FALSE) +} +\arguments{ +\item{x}{array to split} + +\item{y}{another array to split (typically this is a vector)} + +\item{train.size}{size of the training dataset. This can be provided as +float or as int. If it's a float, it should be between 0.0 and 1.0 and +represents the fraction of the dataset that should be placed in the training +dataset. If it's an int, it represents the exact number of samples in the +training dataset. If it is NULL, the complement of \code{test.size} will be +used.} + +\item{test.size}{size of the test dataset. Similarly to train.size both a +float or an int can be supplied. If it's NULL, the complement of train.size +will be used. If both train.size and test.size are NULL, a default test.size +of 0.25 will be used.} + +\item{shuffle}{shuffle the rows or not} + +\item{random.state}{seed for the random number generator (int)} +} +\description{ +Often it is desirable to split a dataset into a training and +testing sample. This function is included in GenSVM to make it easy to do +so. The function is inspired by a similar function in Scikit-Learn. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# using the default values +split <- gensvm.train.test.split(x, y) + +# using the split in a GenSVM model +fit <- gensvm(split$x.train, split$y.train) +gensvm.accuracy(split$y.test, predict(fit, split$x.test)) + +# using attach makes the results directly available +attach(gensvm.train.test.split(x, y)) +fit <- gensvm(x.train, y.train) +gensvm.accuracy(y.test, predict(fit, x.test)) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. 
+} + diff --git a/man/plot.gensvm.Rd b/man/plot.gensvm.Rd new file mode 100644 index 0000000..b597e18 --- /dev/null +++ b/man/plot.gensvm.Rd @@ -0,0 +1,68 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/plot.gensvm.R +\name{plot.gensvm} +\alias{plot.gensvm} +\title{Plot the simplex space of the fitted GenSVM model} +\usage{ +\method{plot}{gensvm}(fit, x, y.true = NULL, with.margins = TRUE, + with.shading = TRUE, with.legend = TRUE, center.plot = TRUE, ...) +} +\arguments{ +\item{fit}{A fitted \code{gensvm} object} + +\item{x}{the dataset to plot} + +\item{y.true}{the true data labels. If provided the objects will be colored +using the true labels instead of the predicted labels. This makes it easy to +identify misclassified objects.} + +\item{with.margins}{plot the margins} + +\item{with.shading}{show shaded areas for the class regions} + +\item{with.legend}{show the legend for the class labels} + +\item{center.plot}{ensure that the boundaries and margins are always visible +in the plot} + +\item{...}{further arguments are ignored} +} +\value{ +returns the object passed as input +} +\description{ +This function creates a plot of the simplex space for a fitted +GenSVM model and the given data set, as long as the dataset consists of only +3 classes. For more than 3 classes, the simplex space is too high +dimensional to easily visualize. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# train the model +fit <- gensvm(x, y) + +# plot the simplex space +plot(fit, x) + +# plot and use the true colors (easier to spot misclassified samples) +plot(fit, x, y.true=y) + +# plot only misclassified samples +x.mis <- x[predict(fit, x) != y, ] +y.mis.true <- y[predict(fit, x) != y, ] +plot(fit, x.bad) +plot(fit, x.bad, y.true=y.mis.true) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/plot.gensvm.grid.Rd b/man/plot.gensvm.grid.Rd new file mode 100644 index 0000000..d54196f --- /dev/null +++ b/man/plot.gensvm.grid.Rd @@ -0,0 +1,41 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/plot.gensvm.grid.R +\name{plot.gensvm.grid} +\alias{plot.gensvm.grid} +\title{Plot the simplex space of the best fitted model in the GenSVMGrid} +\usage{ +\method{plot}{gensvm.grid}(grid, x, ...) +} +\arguments{ +\item{grid}{A \code{gensvm.grid} object trained with refit=TRUE} + +\item{x}{the dataset to plot} + +\item{...}{further arguments are passed to the plot function} +} +\value{ +returns the object passed as input +} +\description{ +This is a wrapper which calls the plot function for the best +model in the provided GenSVMGrid object. See the documentation for +\code{\link{plot.gensvm}} for more information. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +grid <- gensvm.grid(x, y) +plot(grid, x) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. 
+} + diff --git a/man/predict.gensvm.Rd b/man/predict.gensvm.Rd new file mode 100644 index 0000000..0c55a43 --- /dev/null +++ b/man/predict.gensvm.Rd @@ -0,0 +1,50 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/predict.gensvm.R +\name{predict.gensvm} +\alias{predict} +\alias{predict.gensvm} +\title{Predict class labels with the GenSVM model} +\usage{ +\method{predict}{gensvm}(fit, x.test, ...) +} +\arguments{ +\item{fit}{Fitted \code{gensvm} object} + +\item{x.test}{Matrix of new values for \code{x} for which predictions need +to be made.} + +\item{\dots}{further arguments are ignored} +} +\value{ +a vector of class labels, with the same type as the original class +labels. +} +\description{ +This function predicts the class labels of new data using a +fitted GenSVM model. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# create a training and test sample +attach(gensvm.train.test.split(x, y)) +fit <- gensvm(x.train, y.train) + +# predict the class labels of the test sample +y.test.pred <- predict(fit, x.test) + +# compute the accuracy with gensvm.accuracy +gensvm.accuracy(y.test, y.test.pred) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/predict.gensvm.grid.Rd b/man/predict.gensvm.grid.Rd new file mode 100644 index 0000000..d4cbd68 --- /dev/null +++ b/man/predict.gensvm.grid.Rd @@ -0,0 +1,49 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/predict.gensvm.grid.R +\name{predict.gensvm.grid} +\alias{predict.gensvm.grid} +\title{Predict class labels from the GenSVMGrid class} +\usage{ +\method{predict}{gensvm.grid}(grid, newx, ...) +} +\arguments{ +\item{grid}{A \code{gensvm.grid} object trained with \code{refit=TRUE}} + +\item{newx}{Matrix of new values for \code{x} for which predictions need to +be computed.} + +\item{\dots}{further arguments are passed to predict.gensvm()} +} +\value{ +a vector of class labels, with the same type as the original class +labels provided to gensvm.grid() +} +\description{ +Predict class labels using the best model from a grid search. +After doing a grid search with the \code{\link{gensvm.grid}} function, this +function can be used to make predictions of class labels. It uses the best +GenSVM model found during the grid search to do the predictions. Note that +this model is only available if \code{refit=TRUE} was specified in the +\code{\link{gensvm.grid}} call (the default). +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# run a grid search +grid <- gensvm.grid(x, y) + +# predict training sample +y.hat <- predict(grid, x) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. 
+} + diff --git a/man/print.gensvm.Rd b/man/print.gensvm.Rd new file mode 100644 index 0000000..75a44b2 --- /dev/null +++ b/man/print.gensvm.Rd @@ -0,0 +1,43 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/print.gensvm.R +\name{print.gensvm} +\alias{print.gensvm} +\title{Print the fitted GenSVM model} +\usage{ +\method{print}{gensvm}(fit, ...) +} +\arguments{ +\item{fit}{A \code{gensvm} object to print} + +\item{\dots}{further arguments are ignored} +} +\value{ +returns the object passed as input. This can be useful for chaining +operations on a fit object. +} +\description{ +Prints a short description of the fitted GenSVM model +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# fit and print the model +fit <- gensvm(x, y) +print(fit) + +# (advanced) use the fact that print returns the fitted model +fit <- gensvm(x, y) +predict(print(fit), x) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/print.gensvm.grid.Rd b/man/print.gensvm.grid.Rd new file mode 100644 index 0000000..8a65370 --- /dev/null +++ b/man/print.gensvm.grid.Rd @@ -0,0 +1,38 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/print.gensvm.grid.R +\name{print.gensvm.grid} +\alias{print.gensvm.grid} +\title{Print the fitted GenSVMGrid model} +\usage{ +\method{print}{gensvm.grid}(grid, ...) +} +\arguments{ +\item{grid}{a \code{gensvm.grid} object to print} + +\item{\dots}{further arguments are ignored} +} +\value{ +returns the object passed as input +} +\description{ +Prints the summary of the fitted GenSVMGrid model +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# fit a grid search and print the resulting object +grid <- gensvm.grid(x, y) +print(grid) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + |
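As a footnote to the kernel documentation in gensvm-package.Rd, the three nonlinear kernel formulas translate directly into R. This is a plain-R sketch for illustration only, not part of the package; the gamma, coef, and degree names mirror the corresponding gensvm() arguments and their documented defaults:

```r
# Kernel functions as given in the "Kernels in GenSVM" section;
# xi and xj are numeric feature vectors of equal length.
rbf.kernel <- function(xi, xj, gamma) {
  exp(-gamma * sum((xi - xj)^2))        # k(xi, xj) = exp(-gamma ||xi - xj||^2)
}
poly.kernel <- function(xi, xj, gamma, coef, degree) {
  (gamma * sum(xi * xj) + coef)^degree  # k(xi, xj) = (gamma xi'xj + coef)^degree
}
sigmoid.kernel <- function(xi, xj, gamma, coef) {
  tanh(gamma * sum(xi * xj) + coef)     # k(xi, xj) = tanh(gamma xi'xj + coef)
}

# Example on two iris rows, using the defaults from gensvm():
# gamma = "auto" (i.e. 1 / n_features), coef = 1, degree = 2.
xi <- as.numeric(iris[1, -5])
xj <- as.numeric(iris[51, -5])
gamma <- 1 / length(xi)

rbf.kernel(xi, xj, gamma)
poly.kernel(xi, xj, gamma, coef = 1, degree = 2)
sigmoid.kernel(xi, xj, gamma, coef = 1)
```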
