| author | Gertjan van den Burg <gertjanvandenburg@gmail.com> | 2018-03-27 12:31:28 +0100 |
|---|---|---|
| committer | Gertjan van den Burg <gertjanvandenburg@gmail.com> | 2018-03-27 12:31:28 +0100 |
| commit | 004941896bac692d354c41a3334d20ee1d4627f7 (patch) | |
| tree | 2b11e42d8524843409e2bf8deb4ceb74c8b69347 /man | |
| parent | updates to GenSVM C library (diff) | |
GenSVM R package
Diffstat (limited to 'man')
| Mode | File | Lines |
|---|---|---|
| -rw-r--r-- | man/coef.gensvm.Rd | 42 |
| -rw-r--r-- | man/coef.gensvm.grid.Rd | 37 |
| -rw-r--r-- | man/gensvm-package.Rd | 111 |
| -rw-r--r-- | man/gensvm.Rd | 136 |
| -rw-r--r-- | man/gensvm.accuracy.Rd | 34 |
| -rw-r--r-- | man/gensvm.generate.cv.idx.Rd | 13 |
| -rw-r--r-- | man/gensvm.grid.Rd | 161 |
| -rw-r--r-- | man/gensvm.load.full.grid.Rd | 35 |
| -rw-r--r-- | man/gensvm.load.small.grid.Rd | 35 |
| -rw-r--r-- | man/gensvm.load.tiny.grid.Rd | 33 |
| -rw-r--r-- | man/gensvm.maxabs.scale.Rd | 64 |
| -rw-r--r-- | man/gensvm.rank.score.Rd | 23 |
| -rw-r--r-- | man/gensvm.refit.Rd | 57 |
| -rw-r--r-- | man/gensvm.train.test.split.Rd | 62 |
| -rw-r--r-- | man/plot.gensvm.Rd | 68 |
| -rw-r--r-- | man/plot.gensvm.grid.Rd | 41 |
| -rw-r--r-- | man/predict.gensvm.Rd | 50 |
| -rw-r--r-- | man/predict.gensvm.grid.Rd | 49 |
| -rw-r--r-- | man/print.gensvm.Rd | 43 |
| -rw-r--r-- | man/print.gensvm.grid.Rd | 38 |
20 files changed, 1132 insertions, 0 deletions
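The pages below document the package's public API. As a quick orientation, here is a minimal sketch of the workflow these man pages describe, assembled from their own examples; it assumes the gensvm package built from this tree is installed and loadable:

```r
library(gensvm)

x <- iris[, -5]
y <- iris[, 5]

# scale each column to [-1, 1] and split into training and test samples
x.scaled <- gensvm.maxabs.scale(x)
split <- gensvm.train.test.split(x.scaled, y, train.size = 0.8, random.state = 42)

# fit a linear GenSVM model with the default parameters and inspect it
fit <- gensvm(split$x.train, split$y.train)
print(fit)
V <- coef(fit)  # (n_features + 1) x (n_classes - 1) coefficient matrix

# predict the test sample and compute the classification accuracy
y.pred <- predict(fit, split$x.test)
gensvm.accuracy(split$y.test, y.pred)

# alternatively, run a cross-validated grid search over the default grid
grid <- gensvm.grid(split$x.train, split$y.train)
gensvm.accuracy(split$y.test, predict(grid, split$x.test))
```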
diff --git a/man/coef.gensvm.Rd b/man/coef.gensvm.Rd new file mode 100644 index 0000000..73d7a9a --- /dev/null +++ b/man/coef.gensvm.Rd @@ -0,0 +1,42 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/coef.gensvm.R +\name{coef.gensvm} +\alias{coef.gensvm} +\title{Get the coefficients of the fitted GenSVM model} +\usage{ +\method{coef}{gensvm}(object, ...) +} +\arguments{ +\item{object}{a \code{gensvm} object} + +\item{\dots}{further arguments are ignored} +} +\value{ +The coefficients of the GenSVM model. This is a matrix of size +\eqn{(n_{features} + 1) x (n_{classes} - 1)}. This matrix is used to project +the input data to a low dimensional space using the equation: \eqn{XW + t} +where \eqn{X} is the input matrix, \eqn{t} is the first row of the matrix +returned by this function, and \eqn{W} is the \eqn{n_{features} x +(n_{classes} - 1)} matrix formed by the remaining rows. +} +\description{ +Returns the model coefficients of the GenSVM object +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +fit <- gensvm(x, y) +V <- coef(fit) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/coef.gensvm.grid.Rd b/man/coef.gensvm.grid.Rd new file mode 100644 index 0000000..b8f8a40 --- /dev/null +++ b/man/coef.gensvm.grid.Rd @@ -0,0 +1,37 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/coef.gensvm.grid.R +\name{coef.gensvm.grid} +\alias{coef.gensvm.grid} +\title{Get the parameter grid from a GenSVM Grid object} +\usage{ +\method{coef}{gensvm.grid}(object, ...) +} +\arguments{ +\item{object}{a \code{gensvm.grid} object} + +\item{\dots}{further arguments are ignored} +} +\value{ +The parameter grid of the GenSVMGrid object as a data frame. +} +\description{ +Returns the parameter grid of a \code{gensvm.grid} object. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +grid <- gensvm.grid(x, y) +pg <- coef(grid) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/gensvm-package.Rd b/man/gensvm-package.Rd new file mode 100644 index 0000000..56e28ac --- /dev/null +++ b/man/gensvm-package.Rd @@ -0,0 +1,111 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm-package.R +\docType{package} +\name{gensvm-package} +\alias{gensvm-package} +\alias{gensvm.package} +\title{GenSVM: A Generalized Multiclass Support Vector Machine} +\description{ +The GenSVM classifier is a generalized multiclass support vector machine +(SVM). This classifier aims to find decision boundaries that separate the +classes with as wide a margin as possible. In GenSVM, the loss functions +that measures how misclassifications are counted is very flexible. This +allows the user to tune the classifier to the dataset at hand and +potentially obtain higher classification accuracy. 
Moreover, this +flexibility means that GenSVM has a number of alternative multiclass SVMs as +special cases. One of the other advantages of GenSVM is that it is trained +in the primal space, allowing the use of warm starts during optimization. +This means that for common tasks such as cross validation or repeated model +fitting, GenSVM can be trained very quickly. +} +\details{ +This package provides functions for training the GenSVM model either as a +separate model or through a cross-validated parameter grid search. In both +cases the GenSVM C library is used for speed. Auxiliary functions for +evaluating and using the model are also provided. +} +\section{GenSVM functions}{ + +The main GenSVM functions are: +\describe{ +\item{\code{\link{gensvm}}}{Fit a GenSVM model for specific model +parameters.} +\item{\code{\link{gensvm.grid}}}{Run a cross-validated grid search for +GenSVM.} +} + +For the GenSVM and GenSVMGrid models the following two functions are +available. When applied to a GenSVMGrid object, the function is applied to +the best GenSVM model. +\describe{ +\item{\code{\link{plot}}}{Plot the low-dimensional \emph{simplex} space +where the decision boundaries are fixed (for problems with 3 classes).} +\item{\code{\link{predict}}}{Predict the class labels of new data using the +GenSVM model.} +} + +Moreover, for the GenSVM and GenSVMGrid models a \code{coef} function is +defined: +\describe{ +\item{\code{\link{coef.gensvm}}}{Get the coefficients of the fitted GenSVM +model.} +\item{\code{\link{coef.gensvm.grid}}}{Get the parameter grid of the GenSVM +grid search.} +} + +The following utility functions are also included: +\describe{ +\item{\code{\link{gensvm.accuracy}}}{Compute the accuracy score between true +and predicted class labels} +\item{\code{\link{gensvm.maxabs.scale}}}{Scale each column of the dataset by +its maximum absolute value, preserving sparsity and mapping the data to [-1, +1]} +\item{\code{\link{gensvm.train.test.split}}}{Split a dataset into a training +and testing sample} +\item{\code{\link{gensvm.refit}}}{Refit a fitted GenSVM model with slightly +different parameters or on a different dataset} +} +} + +\section{Kernels in GenSVM}{ + + +GenSVM can be used for both linear and nonlinear multiclass support vector +machine classification. In general, linear classification will be faster but +depending on the dataset higher classification performance can be achieved +using a nonlinear kernel. + +The following nonlinear kernels are implemented in the GenSVM package: +\describe{ + \item{RBF}{The Radial Basis Function kernel is a well-known kernel function + based on the Euclidean distance between objects. It is defined as + \deqn{ + k(x_i, x_j) = exp( -\gamma || x_i - x_j ||^2 ) + } + } + \item{Polynomial}{A polynomial kernel can also be used in GenSVM. This + kernel function is implemented very generally and therefore takes three + parameters (\code{coef}, \code{gamma}, and \code{degree}). It is defined + as: + \deqn{ + k(x_i, x_j) = ( \gamma x_i' x_j + coef)^{degree} + } + } + \item{Sigmoid}{The sigmoid kernel is the final kernel implemented in + GenSVM. This kernel has two parameters and is implemented as follows: + \deqn{ + k(x_i, x_j) = \tanh( \gamma x_i' x_j + coef) + } + } + } +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). 
\emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/gensvm.Rd b/man/gensvm.Rd new file mode 100644 index 0000000..1db0558 --- /dev/null +++ b/man/gensvm.Rd @@ -0,0 +1,136 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.R +\name{gensvm} +\alias{gensvm} +\title{Fit the GenSVM model} +\usage{ +gensvm(X, y, p = 1, lambda = 1e-08, kappa = 0, epsilon = 1e-06, + weights = "unit", kernel = "linear", gamma = "auto", coef = 1, + degree = 2, kernel.eigen.cutoff = 1e-08, verbose = FALSE, + random.seed = NULL, max.iter = 1e+08, seed.V = NULL) +} +\arguments{ +\item{X}{data matrix with the predictors} + +\item{y}{class labels} + +\item{p}{parameter for the L_p norm of the loss function (1.0 <= p <= 2.0)} + +\item{lambda}{regularization parameter for the loss function (lambda > 0)} + +\item{kappa}{parameter for the hinge function in the loss function (kappa > +-1.0)} + +\item{weights}{type of instance weights to use. Options are 'unit' for unit +weights and 'group' for group size correction weight (eq. 4 in the paper).} + +\item{kernel}{the kernel type to use in the classifier. It must be one of +'linear', 'poly', 'rbf', or 'sigmoid'. See the section "Kernels in GenSVM" +in \code{\link{gensvm-package}} for more info.} + +\item{gamma}{kernel parameter for the rbf, polynomial, and sigmoid kernel. +If gamma is 'auto', then 1/n_features will be used.} + +\item{coef}{parameter for the polynomial and sigmoid kernel.} + +\item{degree}{parameter for the polynomial kernel} + +\item{kernel.eigen.cutoff}{Cutoff point for the reduced eigendecomposition +used with kernel-GenSVM. Eigenvectors for which the ratio between their +corresponding eigenvalue and the largest eigenvalue is smaller than this +cutoff value will be dropped.} + +\item{verbose}{Turn on verbose output and fit progress} + +\item{random.seed}{Seed for the random number generator (useful for +reproducible output)} + +\item{max.iter}{Maximum number of iterations of the optimization algorithm.} + +\item{seed.V}{Matrix to warm-start the optimization algorithm. This is +typically the output of \code{coef(fit)}. Note that this function will +silently drop seed.V if the dimensions don't match the provided data.} +} +\value{ +A "gensvm" S3 object is returned for which the print, predict, coef, +and plot methods are available. 
It has the following items: +\item{call}{The call that was used to construct the model.} +\item{p}{The value of the lp norm in the loss function} +\item{lambda}{The regularization parameter used in the model.} +\item{kappa}{The hinge function parameter used.} +\item{epsilon}{The stopping criterion used.} +\item{weights}{The instance weights type used.} +\item{kernel}{The kernel function used.} +\item{gamma}{The value of the gamma parameter of the kernel, if applicable} +\item{coef}{The value of the coef parameter of the kernel, if applicable} +\item{degree}{The degree of the kernel, if applicable} +\item{kernel.eigen.cutoff}{The cutoff value of the reduced +eigendecomposition of the kernel matrix.} +\item{verbose}{Whether or not the model was fitted with progress output} +\item{random.seed}{The random seed used to seed the model.} +\item{max.iter}{Maximum number of iterations of the algorithm.} +\item{n.objects}{Number of objects in the dataset} +\item{n.features}{Number of features in the dataset} +\item{n.classes}{Number of classes in the dataset} +\item{classes}{Array with the actual class labels} +\item{V}{Coefficient matrix} +\item{n.iter}{Number of iterations performed in training} +\item{n.support}{Number of support vectors in the final model} +\item{training.time}{Total training time} +\item{X.train}{When training with nonlinear kernels, the training data is +needed to perform prediction. For these kernels it is therefore stored in +the fitted model.} +} +\description{ +Fits the Generalized Multiclass Support Vector Machine model +with the given parameters. See the package documentation +(\code{\link{gensvm-package}}) for more general information about GenSVM. +} +\note{ +This function returns partial results when the computation is interrupted by +the user. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# fit using the default parameters +fit <- gensvm(x, y) + +# fit and show progress +fit <- gensvm(x, y, verbose=T) + +# fit with some changed parameters +fit <- gensvm(x, y, lambda=1e-8) + +# Early stopping defined through epsilon +fit <- gensvm(x, y, epsilon=1e-3) + +# Early stopping defined through max.iter +fit <- gensvm(x, y, max.iter=1000) + +# Nonlinear training +fit <- gensvm(x, y, kernel='rbf') +fit <- gensvm(x, y, kernel='poly', degree=2, gamma=1.0) + +# Setting the random seed and comparing results +fit <- gensvm(x, y, random.seed=123) +fit2 <- gensvm(x, y, random.seed=123) +all.equal(coef(fit), coef(fit2)) + + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{coef}}, \code{\link{print}}, \code{\link{predict}}, +\code{\link{plot}}, and \code{\link{gensvm.grid}}. 
+} + diff --git a/man/gensvm.accuracy.Rd b/man/gensvm.accuracy.Rd new file mode 100644 index 0000000..60a0f89 --- /dev/null +++ b/man/gensvm.accuracy.Rd @@ -0,0 +1,34 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.accuracy.R +\name{gensvm.accuracy} +\alias{gensvm.accuracy} +\title{Compute the accuracy score} +\usage{ +gensvm.accuracy(y.true, y.pred) +} +\arguments{ +\item{y.true}{vector of true labels} + +\item{y.pred}{vector of predicted labels} +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +fit <- gensvm(x, y) +gensvm.accuracy(predict(fit, x), y) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{predict.gensvm.grid}} +} + diff --git a/man/gensvm.generate.cv.idx.Rd b/man/gensvm.generate.cv.idx.Rd new file mode 100644 index 0000000..34f4f64 --- /dev/null +++ b/man/gensvm.generate.cv.idx.Rd @@ -0,0 +1,13 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.generate.cv.idx} +\alias{gensvm.generate.cv.idx} +\title{Generate a vector of cross-validation indices} +\usage{ +gensvm.generate.cv.idx(n, folds) +} +\description{ +This function generates a vector of length \code{n} with values from 0 to +\code{folds-1} to mark train and test splits. +} + diff --git a/man/gensvm.grid.Rd b/man/gensvm.grid.Rd new file mode 100644 index 0000000..6dbec22 --- /dev/null +++ b/man/gensvm.grid.Rd @@ -0,0 +1,161 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.grid} +\alias{gensvm.grid} +\title{Cross-validated grid search for GenSVM} +\usage{ +gensvm.grid(X, y, param.grid = "tiny", refit = TRUE, scoring = NULL, + cv = 3, verbose = 0, return.train.score = TRUE) +} +\arguments{ +\item{X}{training data matrix. We denote the size of this matrix by +n_samples x n_features.} + +\item{y}{training vector of class labes of length n_samples. The number of +unique labels in this vector is denoted by n_classes.} + +\item{param.grid}{String (\code{'tiny'}, \code{'small'}, or \code{'full'}) +or data frame with parameter configurations to evaluate. Typically this is +the output of \code{expand.grid}. For more details, see "Using a Parameter +Grid" below.} + +\item{refit}{boolean variable. If true, the best model from cross validation +is fitted again on the entire dataset.} + +\item{scoring}{metric to use to evaluate the classifier performance during +cross validation. The metric should be an R function that takes two +arguments: y_true and y_pred and that returns a float such that higher +values are better. 
If it is NULL, the accuracy score will be used.} + +\item{cv}{the number of cross-validation folds to use or a vector with the +same length as \code{y} where each unique value denotes a test split.} + +\item{verbose}{integer to indicate the level of verbosity (higher is more +verbose)} + +\item{return.train.score}{whether or not to return the scores on the +training splits} +} +\value{ +A "gensvm.grid" S3 object with the following items: +\item{call}{Call that produced this object} +\item{param.grid}{Sorted version of the parameter grid used in training} +\item{cv.results}{A data frame with the cross validation results} +\item{best.estimator}{If refit=TRUE, this is the GenSVM model fitted with +the best hyperparameter configuration, otherwise it is NULL} +\item{best.score}{Mean cross-validated test score for the model with the +best hyperparameter configuration} +\item{best.params}{Parameter configuration that provided the highest mean +cross-validated test score} +\item{best.index}{Row index of the cv.results data frame that corresponds to +the best hyperparameter configuration} +\item{n.splits}{The number of cross-validation splits} +\item{n.objects}{The number of instances in the data} +\item{n.features}{The number of features of the data} +\item{n.classes}{The number of classes in the data} +\item{classes}{Array with the unique classes in the data} +\item{total.time}{Training time for the grid search} +\item{cv.idx}{Array with cross validation indices used to split the data} +} +\description{ +This function performs a cross-validated grid search of the +model parameters to find the best hyperparameter configuration for a given +dataset. This function takes advantage of GenSVM's ability to use warm +starts to speed up computation. The function uses the GenSVM C library for +speed. +} +\note{ +This function returns partial results when the computation is interrupted by +the user. +} +\section{Using a Parameter Grid}{ + +To evaluate certain paramater configurations, a data frame can be supplied +to the \code{param.grid} argument of the function. Such a data frame can +easily be generated using the R function \code{expand.grid}, or could be +created through other ways to test specific parameter configurations. + +Three parameter grids are predefined: +\describe{ +\item{\code{'tiny'}}{This parameter grid is generated by the function +\code{\link{gensvm.load.tiny.grid}} and is the default parameter grid. It +consists of parameter configurations that are likely to perform well on +various datasets.} +\item{\code{'small'}}{This grid is generated by +\code{\link{gensvm.load.small.grid}} and generates a data frame with 90 +configurations. It is typically fast to train but contains some +configurations that are unlikely to perform well. It is included for +educational purposes.} +\item{\code{'full'}}{This grid loads the parameter grid as used in the +GenSVM paper. It consists of 342 configurations and is generated by the +\code{\link{gensvm.load.full.grid}} function. Note that in the GenSVM paper +cross validation was done with this parameter grid, but the final training +step used \code{epsilon=1e-8}. The \code{\link{gensvm.refit}} function is +useful in this scenario.} +} + +When you provide your own parameter grid, beware that only certain column +names are allowed in the data frame corresponding to parameters for the +GenSVM model. These names are: + +\describe{ +\item{p}{Parameter for the lp norm. Must be in [1.0, 2.0].} +\item{kappa}{Parameter for the Huber hinge function. 
Must be larger than +-1.} +\item{lambda}{Parameter for the regularization term. Must be larger than 0.} +\item{weight}{Instance weight specification. Allowed values are "unit" for +unit weights and "group" for group-size correction weights} +\item{epsilon}{Stopping parameter for the algorithm. Must be larger than 0.} +\item{max.iter}{Maximum number of iterations of the algorithm. Must be +larger than 0.} +\item{kernel}{The kernel to used, allowed values are "linear", "poly", +"rbf", and "sigmoid". The default is "linear"} +\item{coef}{Parameter for the "poly" and "sigmoid" kernels. See the section +"Kernels in GenSVM" in the code{ink{gensvm-package}} page for more info.} +\item{degree}{Parameter for the "poly" kernel. See the section "Kernels in +GenSVM" in the code{ink{gensvm-package}} page for more info.} +\item{gamma}{Parameter for the "poly", "rbf", and "sigmoid" kernels. See the +section "Kernels in GenSVM" in the code{ink{gensvm-package}} page for more +info.} +} + +For variables that are not present in the \code{param.grid} data frame the +default parameter values in the \code{\link{gensvm}} function will be used. + +Note that this function reorders the parameter grid to make the warm starts +as efficient as possible, which is why the param.grid in the result will not +be the same as the param.grid in the input. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# use the default parameter grid +grid <- gensvm.grid(x, y) + +# use a smaller parameter grid +pg <- expand.grid(p=c(1.0, 1.5, 2.0), kappa=c(-0.9, 1.0), epsilon=c(1e-3)) +grid <- gensvm.grid(x, y, param.grid=pg) + +# print the result +print(grid) + +# Using a custom scoring function (accuracy as percentage) +acc.pct <- function(yt, yp) { return (100 * sum(yt == yp) / length(yt)) } +grid <- gensvm.grid(x, y, scoring=acc.pct) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{predict.gensvm.grid}}, \code{\link{print.gensvm.grid}}, and +\code{\link{gensvm}}. +} + diff --git a/man/gensvm.load.full.grid.Rd b/man/gensvm.load.full.grid.Rd new file mode 100644 index 0000000..5398ef7 --- /dev/null +++ b/man/gensvm.load.full.grid.Rd @@ -0,0 +1,35 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.load.full.grid} +\alias{gensvm.load.full.grid} +\title{Load a large parameter grid for the GenSVM grid search} +\usage{ +gensvm.load.full.grid() +} +\description{ +This loads the parameter grid from the GenSVM paper. It +consists of 342 configurations and is constructed from all possible +combinations of the following parameter sets: + +\code{p = c(1.0, 1.5, 2.0)} + +\code{lambda = 2^seq(-18, 18, 2)} + +\code{kappa = c(-0.9, 0.5, 5.0)} + +\code{weight = c('unit', 'group')} +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. 
+} +\seealso{ +\code{\link{gensvm.grid}}, \code{\link{gensvm.load.tiny.grid}}, +\code{\link{gensvm.load.full.grid}}. +} + diff --git a/man/gensvm.load.small.grid.Rd b/man/gensvm.load.small.grid.Rd new file mode 100644 index 0000000..0866f0c --- /dev/null +++ b/man/gensvm.load.small.grid.Rd @@ -0,0 +1,35 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.load.small.grid} +\alias{gensvm.load.small.grid} +\title{Load the default parameter grid for the GenSVM grid search} +\usage{ +gensvm.load.small.grid() +} +\description{ +This function loads a default parameter grid to use for the +GenSVM gridsearch. It contains all possible combinations of the following +parameter sets: + +\code{p = c(1.0, 1.5, 2.0)} + +\code{lambda = c(1e-8, 1e-6, 1e-4, 1e-2, 1)} + +\code{kappa = c(-0.9, 0.5, 5.0)} + +\code{weight = c('unit', 'group')} +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{gensvm.grid}}, \code{\link{gensvm.load.tiny.grid}}, +\code{\link{gensvm.load.small.grid}}. +} + diff --git a/man/gensvm.load.tiny.grid.Rd b/man/gensvm.load.tiny.grid.Rd new file mode 100644 index 0000000..9ef0694 --- /dev/null +++ b/man/gensvm.load.tiny.grid.Rd @@ -0,0 +1,33 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.load.tiny.grid} +\alias{gensvm.load.tiny.grid} +\title{Load a tiny parameter grid for the GenSVM grid search} +\usage{ +gensvm.load.tiny.grid() +} +\description{ +This function returns a parameter grid to use in the GenSVM +grid search. This grid was obtained by analyzing the experiments done for +the GenSVM paper and selecting the configurations that achieve accuracy +within the 95th percentile on over 90% of the datasets. It is a good start +for a parameter search with a reasonably high chance of achieving good +performance on most datasets. + +Note that this grid is only tested to work well in combination with the +linear kernel. +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} +\seealso{ +\code{\link{gensvm.grid}}, \code{\link{gensvm.load.small.grid}}, +\code{\link{gensvm.load.full.grid}}. +} + diff --git a/man/gensvm.maxabs.scale.Rd b/man/gensvm.maxabs.scale.Rd new file mode 100644 index 0000000..50c6413 --- /dev/null +++ b/man/gensvm.maxabs.scale.Rd @@ -0,0 +1,64 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.maxabs.scale.R +\name{gensvm.maxabs.scale} +\alias{gensvm.maxabs.scale} +\title{Scale each column of a matrix by its maximum absolute value} +\usage{ +gensvm.maxabs.scale(x, x.test = NULL) +} +\arguments{ +\item{x}{a matrix to scale} + +\item{x.test}{(optional) a test matrix to scale as well.} +} +\value{ +if x.test=NULL a scaled matrix where the maximum value of the +columns is 1 and the minimum value of the columns isn't below -1. 
If x.test +is supplied, a list with elements \code{x} and \code{x.test} representing +the scaled datasets. +} +\description{ +Scaling a dataset can creatly decrease the computation time of +GenSVM. This function scales the data by dividing each column of a matrix by +the maximum absolute value of that column. This preserves sparsity in the +data while mapping each column to the interval [-1, 1]. + +Optionally a test dataset can be provided as well. In this case, the scaling +will be computed on the first argument (\code{x}) and applied to the test +dataset. Note that the return value is a list when this argument is +supplied. +} +\examples{ +x <- iris[, -5] + +# check the min and max of the columns +apply(x, 2, min) +apply(x, 2, max) + +# scale the data +x.scale <- gensvm.maxabs.scale(x) + +# check again (max should be 1.0, min shouldn't be below -1) +apply(x.scale, 2, min) +apply(x.scale, 2, max) + +# with a train and test dataset +x <- iris[, -5] +split <- gensvm.train.test.split(x) +x.train <- split$x.train +x.test <- split$x.test +scaled <- gensvm.maxabs.scale(x.train, x.test) +x.train.scl <- scaled$x +x.test.scl <- scaled$x.test + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/gensvm.rank.score.Rd b/man/gensvm.rank.score.Rd new file mode 100644 index 0000000..21d6bcd --- /dev/null +++ b/man/gensvm.rank.score.Rd @@ -0,0 +1,23 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.grid.R +\name{gensvm.rank.score} +\alias{gensvm.rank.score} +\title{Compute the ranks for the numbers in a given vector} +\usage{ +gensvm.rank.score(x) +} +\arguments{ +\item{x}{array of numeric values} +} +\details{ +This function computes the ranks for the values in an array. The highest +value gets the smallest rank. Ties are broken by assigning the smallest +value. +} +\examples{ +x <- c(7, 0.1, 0.5, 0.1, 10, 100, 200) +gensvm.rank.score(x) +[ 4 6 5 6 3 2 1 ] + +} + diff --git a/man/gensvm.refit.Rd b/man/gensvm.refit.Rd new file mode 100644 index 0000000..194cde3 --- /dev/null +++ b/man/gensvm.refit.Rd @@ -0,0 +1,57 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.refit.R +\name{gensvm.refit} +\alias{gensvm.refit} +\title{Train an already fitted model on new data} +\usage{ +gensvm.refit(fit, X, y, p = NULL, lambda = NULL, kappa = NULL, + epsilon = NULL, weights = NULL, kernel = NULL, gamma = NULL, + coef = NULL, degree = NULL, kernel.eigen.cutoff = NULL, + max.iter = NULL, verbose = NULL, random.seed = NULL) +} +\arguments{ +\item{fit}{Fitted \code{gensvm} object} + +\item{X}{Data matrix of the new data} + +\item{y}{Label vector of the new data} + +\item{verbose}{Turn on verbose output and fit progress. 
If NULL (the +default) the value from the fitted model is chosen.} +} +\value{ +a new fitted \code{gensvm} model +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# fit a standard model and refit with slightly different parameters +fit <- gensvm(x, y) +fit2 <- gensvm.refit(x, y, epsilon=1e-8) + +# refit a model returned by a grid search +grid <- gensvm.grid(x, y) +fit <- gensvm.refit(fit, x, y, epsilon=1e-8) + +# refit on different data +idx <- runif(nrow(x)) > 0.5 +x1 <- x[idx, ] +x2 <- x[!idx, ] +y1 <- y[idx] +y2 <- y[!idx] + +fit1 <- gensvm(x1, y1) +fit2 <- gensvm.refit(fit1, x2, y2) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/gensvm.train.test.split.Rd b/man/gensvm.train.test.split.Rd new file mode 100644 index 0000000..a99940f --- /dev/null +++ b/man/gensvm.train.test.split.Rd @@ -0,0 +1,62 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/gensvm.train.test.split.R +\name{gensvm.train.test.split} +\alias{gensvm.train.test.split} +\title{Create a train/test split of a dataset} +\usage{ +gensvm.train.test.split(x, y = NULL, train.size = NULL, test.size = NULL, + shuffle = TRUE, random.state = NULL, return.idx = FALSE) +} +\arguments{ +\item{x}{array to split} + +\item{y}{another array to split (typically this is a vector)} + +\item{train.size}{size of the training dataset. This can be provided as +float or as int. If it's a float, it should be between 0.0 and 1.0 and +represents the fraction of the dataset that should be placed in the training +dataset. If it's an int, it represents the exact number of samples in the +training dataset. If it is NULL, the complement of \code{test.size} will be +used.} + +\item{test.size}{size of the test dataset. Similarly to train.size both a +float or an int can be supplied. If it's NULL, the complement of train.size +will be used. If both train.size and test.size are NULL, a default test.size +of 0.25 will be used.} + +\item{shuffle}{shuffle the rows or not} + +\item{random.state}{seed for the random number generator (int)} +} +\description{ +Often it is desirable to split a dataset into a training and +testing sample. This function is included in GenSVM to make it easy to do +so. The function is inspired by a similar function in Scikit-Learn. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# using the default values +split <- gensvm.train.test.split(x, y) + +# using the split in a GenSVM model +fit <- gensvm(split$x.train, split$y.train) +gensvm.accuracy(split$y.test, predict(fit, split$x.test)) + +# using attach makes the results directly available +attach(gensvm.train.test.split(x, y)) +fit <- gensvm(x.train, y.train) +gensvm.accuracy(y.test, predict(fit, x.test)) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. 
+} + diff --git a/man/plot.gensvm.Rd b/man/plot.gensvm.Rd new file mode 100644 index 0000000..b597e18 --- /dev/null +++ b/man/plot.gensvm.Rd @@ -0,0 +1,68 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/plot.gensvm.R +\name{plot.gensvm} +\alias{plot.gensvm} +\title{Plot the simplex space of the fitted GenSVM model} +\usage{ +\method{plot}{gensvm}(fit, x, y.true = NULL, with.margins = TRUE, + with.shading = TRUE, with.legend = TRUE, center.plot = TRUE, ...) +} +\arguments{ +\item{fit}{A fitted \code{gensvm} object} + +\item{x}{the dataset to plot} + +\item{y.true}{the true data labels. If provided the objects will be colored +using the true labels instead of the predicted labels. This makes it easy to +identify misclassified objects.} + +\item{with.margins}{plot the margins} + +\item{with.shading}{show shaded areas for the class regions} + +\item{with.legend}{show the legend for the class labels} + +\item{center.plot}{ensure that the boundaries and margins are always visible +in the plot} + +\item{...}{further arguments are ignored} +} +\value{ +returns the object passed as input +} +\description{ +This function creates a plot of the simplex space for a fitted +GenSVM model and the given data set, as long as the dataset consists of only +3 classes. For more than 3 classes, the simplex space is too high +dimensional to easily visualize. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# train the model +fit <- gensvm(x, y) + +# plot the simplex space +plot(fit, x) + +# plot and use the true colors (easier to spot misclassified samples) +plot(fit, x, y.true=y) + +# plot only misclassified samples +x.mis <- x[predict(fit, x) != y, ] +y.mis.true <- y[predict(fit, x) != y, ] +plot(fit, x.bad) +plot(fit, x.bad, y.true=y.mis.true) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/plot.gensvm.grid.Rd b/man/plot.gensvm.grid.Rd new file mode 100644 index 0000000..d54196f --- /dev/null +++ b/man/plot.gensvm.grid.Rd @@ -0,0 +1,41 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/plot.gensvm.grid.R +\name{plot.gensvm.grid} +\alias{plot.gensvm.grid} +\title{Plot the simplex space of the best fitted model in the GenSVMGrid} +\usage{ +\method{plot}{gensvm.grid}(grid, x, ...) +} +\arguments{ +\item{grid}{A \code{gensvm.grid} object trained with refit=TRUE} + +\item{x}{the dataset to plot} + +\item{...}{further arguments are passed to the plot function} +} +\value{ +returns the object passed as input +} +\description{ +This is a wrapper which calls the plot function for the best +model in the provided GenSVMGrid object. See the documentation for +\code{\link{plot.gensvm}} for more information. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +grid <- gensvm.grid(x, y) +plot(grid, x) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. 
+} + diff --git a/man/predict.gensvm.Rd b/man/predict.gensvm.Rd new file mode 100644 index 0000000..0c55a43 --- /dev/null +++ b/man/predict.gensvm.Rd @@ -0,0 +1,50 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/predict.gensvm.R +\name{predict.gensvm} +\alias{predict} +\alias{predict.gensvm} +\title{Predict class labels with the GenSVM model} +\usage{ +\method{predict}{gensvm}(fit, x.test, ...) +} +\arguments{ +\item{fit}{Fitted \code{gensvm} object} + +\item{x.test}{Matrix of new values for \code{x} for which predictions need +to be made.} + +\item{\dots}{further arguments are ignored} +} +\value{ +a vector of class labels, with the same type as the original class +labels. +} +\description{ +This function predicts the class labels of new data using a +fitted GenSVM model. +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# create a training and test sample +attach(gensvm.train.test.split(x, y)) +fit <- gensvm(x.train, y.train) + +# predict the class labels of the test sample +y.test.pred <- predict(fit, x.test) + +# compute the accuracy with gensvm.accuracy +gensvm.accuracy(y.test, y.test.pred) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/predict.gensvm.grid.Rd b/man/predict.gensvm.grid.Rd new file mode 100644 index 0000000..d4cbd68 --- /dev/null +++ b/man/predict.gensvm.grid.Rd @@ -0,0 +1,49 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/predict.gensvm.grid.R +\name{predict.gensvm.grid} +\alias{predict.gensvm.grid} +\title{Predict class labels from the GenSVMGrid class} +\usage{ +\method{predict}{gensvm.grid}(grid, newx, ...) +} +\arguments{ +\item{grid}{A \code{gensvm.grid} object trained with \code{refit=TRUE}} + +\item{newx}{Matrix of new values for \code{x} for which predictions need to +be computed.} + +\item{\dots}{further arguments are passed to predict.gensvm()} +} +\value{ +a vector of class labels, with the same type as the original class +labels provided to gensvm.grid() +} +\description{ +Predict class labels using the best model from a grid search. +After doing a grid search with the \code{\link{gensvm.grid}} function, this +function can be used to make predictions of class labels. It uses the best +GenSVM model found during the grid search to do the predictions. Note that +this model is only available if \code{refit=TRUE} was specified in the +\code{\link{gensvm.grid}} call (the default). +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# run a grid search +grid <- gensvm.grid(x, y) + +# predict training sample +y.hat <- predict(grid, x) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. 
+} + diff --git a/man/print.gensvm.Rd b/man/print.gensvm.Rd new file mode 100644 index 0000000..75a44b2 --- /dev/null +++ b/man/print.gensvm.Rd @@ -0,0 +1,43 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/print.gensvm.R +\name{print.gensvm} +\alias{print.gensvm} +\title{Print the fitted GenSVM model} +\usage{ +\method{print}{gensvm}(fit, ...) +} +\arguments{ +\item{fit}{A \code{gensvm} object to print} + +\item{\dots}{further arguments are ignored} +} +\value{ +returns the object passed as input. This can be useful for chaining +operations on a fit object. +} +\description{ +Prints a short description of the fitted GenSVM model +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# fit and print the model +fit <- gensvm(x, y) +print(fit) + +# (advanced) use the fact that print returns the fitted model +fit <- gensvm(x, y) +predict(print(fit), x) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + diff --git a/man/print.gensvm.grid.Rd b/man/print.gensvm.grid.Rd new file mode 100644 index 0000000..8a65370 --- /dev/null +++ b/man/print.gensvm.grid.Rd @@ -0,0 +1,38 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/print.gensvm.grid.R +\name{print.gensvm.grid} +\alias{print.gensvm.grid} +\title{Print the fitted GenSVMGrid model} +\usage{ +\method{print}{gensvm.grid}(grid, ...) +} +\arguments{ +\item{grid}{a \code{gensvm.grid} object to print} + +\item{\dots}{further arguments are ignored} +} +\value{ +returns the object passed as input +} +\description{ +Prints the summary of the fitted GenSVMGrid model +} +\examples{ +x <- iris[, -5] +y <- iris[, 5] + +# fit a grid search and print the resulting object +grid <- gensvm.grid(x, y) +print(grid) + +} +\author{ +Gerrit J.J. van den Burg, Patrick J.F. Groenen \cr +Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com> +} +\references{ +Van den Burg, G.J.J. and Groenen, P.J.F. (2016). \emph{GenSVM: A Generalized +Multiclass Support Vector Machine}, Journal of Machine Learning Research, +17(225):1--42. URL \url{http://jmlr.org/papers/v17/14-526.html}. +} + |
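As a footnote to the kernel documentation in gensvm-package.Rd, the three nonlinear kernel formulas translate directly into R. This is a plain-R sketch for illustration only, not part of the package; the gamma, coef, and degree names mirror the corresponding gensvm() arguments and their documented defaults:

```r
# Kernel functions as given in the "Kernels in GenSVM" section;
# xi and xj are numeric feature vectors of equal length.
rbf.kernel <- function(xi, xj, gamma) {
  exp(-gamma * sum((xi - xj)^2))        # k(xi, xj) = exp(-gamma ||xi - xj||^2)
}
poly.kernel <- function(xi, xj, gamma, coef, degree) {
  (gamma * sum(xi * xj) + coef)^degree  # k(xi, xj) = (gamma xi'xj + coef)^degree
}
sigmoid.kernel <- function(xi, xj, gamma, coef) {
  tanh(gamma * sum(xi * xj) + coef)     # k(xi, xj) = tanh(gamma xi'xj + coef)
}

# Example on two iris rows, using the defaults from gensvm():
# gamma = "auto" (i.e. 1 / n_features), coef = 1, degree = 2.
xi <- as.numeric(iris[1, -5])
xj <- as.numeric(iris[51, -5])
gamma <- 1 / length(xi)

rbf.kernel(xi, xj, gamma)
poly.kernel(xi, xj, gamma, coef = 1, degree = 2)
sigmoid.kernel(xi, xj, gamma, coef = 1)
```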
