% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/irlba.R
\name{irlba}
\alias{irlba}
\title{Find a few approximate singular values and corresponding singular
vectors of a matrix.}
\usage{
irlba(A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE,
  tol = 1e-05, v = NULL, right_only = FALSE, verbose = FALSE,
  scale = NULL, center = NULL, shift = NULL, mult = NULL,
  fastpath = TRUE, svtol = tol, smallest = FALSE, ...)
}
\arguments{
\item{A}{numeric real- or complex-valued matrix or real-valued sparse matrix.}

\item{nv}{number of right singular vectors to estimate.}

\item{nu}{number of left singular vectors to estimate (defaults to \code{nv}).}

\item{maxit}{maximum number of iterations.}

\item{work}{working subspace dimension; larger values can speed convergence
at the cost of more memory use.}

\item{reorth}{if \code{TRUE}, apply full reorthogonalization to both SVD
bases, otherwise only apply reorthogonalization to the right SVD basis
vectors; the latter case is cheaper per iteration but may require more
iterations overall to converge. Automatically \code{TRUE} when
\code{fastpath=TRUE} (see below).}

\item{tol}{convergence is determined when
\eqn{\|A^TU - VS\| < tol\|A\|}{||A^T U - VS|| < tol*||A||}, and when the
maximum relative change in estimated singular values from one iteration to
the next is less than \code{svtol = tol} (see \code{svtol} below), where
the spectral norm ||A|| is approximated by the largest estimated singular
value, and U, V, S are the matrices corresponding to the estimated left and
right singular vectors, and diagonal matrix of estimated singular values,
respectively.}

\item{v}{optional starting vector, or the output of a previous run of
\code{irlba} used to restart the algorithm from where it left off (see the
notes).}

\item{right_only}{logical value indicating whether to return only the right
singular vectors (\code{TRUE}) or both sets of vectors (\code{FALSE}).
The \code{right_only} option can be cheaper to compute and use much less
memory when \code{nrow(A) >> ncol(A)}, but note that the obtained solutions
typically lose accuracy due to lack of re-orthogonalization in the
algorithm, and that \code{right_only = TRUE} sets \code{fastpath = FALSE}
(only use this option for very large problems that run out of memory and
when \code{nrow(A) >> ncol(A)}). Consider increasing the \code{work} option
to improve accuracy with \code{right_only=TRUE}.}

\item{verbose}{logical value that when \code{TRUE} prints status messages
during the computation.}

\item{scale}{optional column scaling vector whose values divide each column
of \code{A}; must be as long as the number of columns of \code{A} (see
notes).}

\item{center}{optional column centering vector whose values are subtracted
from each column of \code{A}; must be as long as the number of columns of
\code{A} and may not be used together with the deflation options below (see
notes).}

\item{shift}{optional shift value (square matrices only, see notes).}

\item{mult}{DEPRECATED optional custom matrix multiplication function
(default is \code{\%*\%}, see notes).}

\item{fastpath}{try a fast C algorithm implementation if possible; set
\code{fastpath=FALSE} to use the reference R implementation. See the notes
for more details.}

\item{svtol}{additional stopping tolerance on the maximum allowed absolute
relative change in each estimated singular value between iterations. The
default value of this parameter is \code{tol}. You can set
\code{svtol=Inf} to effectively disable this stopping criterion. Setting
\code{svtol=Inf} allows the method to terminate on the first Lanczos
iteration if it finds an invariant subspace, but with less certainty that
the converged subspace is the desired one. (It may, for instance, miss some
of the largest singular values in difficult problems.)}

\item{smallest}{set \code{smallest=TRUE} to estimate the smallest singular
values and associated singular vectors.
WARNING: this option is somewhat experimental and may produce poor
estimates for ill-conditioned matrices.}

\item{...}{optional additional arguments used to support experimental and
deprecated features.}
}
\value{
Returns a list with entries:
\describe{
  \item{d:}{ max(nu, nv) approximate singular values}
  \item{u:}{ nu approximate left singular vectors (only when right_only=FALSE)}
  \item{v:}{ nv approximate right singular vectors}
  \item{iter:}{ The number of Lanczos iterations carried out}
  \item{mprod:}{ The total number of matrix vector products carried out}
}
}
\description{
The augmented implicitly restarted Lanczos bidiagonalization algorithm
(IRLBA) finds a few approximate largest (or, optionally, smallest) singular
values and corresponding singular vectors of a sparse or dense matrix using
a method of Baglama and Reichel. It is a fast and memory-efficient way to
compute a partial SVD.
}
\note{
The syntax of \code{irlba} partially follows \code{svd}, with an important
exception. The usual R \code{svd} function always returns a complete set of
singular values, even if the number of singular vectors \code{nu} or
\code{nv} is set less than the maximum. The \code{irlba} function returns a
number of estimated singular values equal to the maximum of the number of
specified singular vectors \code{nu} and \code{nv}.

Use the optional \code{scale} parameter to implicitly scale each column of
the matrix \code{A} by the values in the \code{scale} vector, computing the
truncated SVD of the column-scaled \code{sweep(A, 2, scale, FUN=`/`)}, or
equivalently, \code{A \%*\% diag(1 / scale)}, without explicitly forming
the scaled matrix. \code{scale} must be a non-zero vector of length equal
to the number of columns of \code{A}.

Use the optional \code{center} parameter to implicitly subtract the values
in the \code{center} vector from each column of \code{A}, computing the
truncated SVD of \code{sweep(A, 2, center, FUN=`-`)}, without explicitly
forming the centered matrix.
\code{center} must be a vector of length equal to the number of columns of
\code{A}. This option may be used to efficiently compute principal
components without explicitly forming the centered matrix (which can,
importantly, preserve sparsity in the matrix). See the examples.

The optional \code{shift} scalar-valued argument applies only to square
matrices; use it to estimate the partial SVD of
\code{A + diag(shift, nrow(A), nrow(A))} (without explicitly forming the
shifted matrix).

(Deprecated) Specify an optional alternative matrix multiplication operator
in the \code{mult} parameter. \code{mult} must be a function of two
arguments, and must handle both cases where one argument is a vector and
the other a matrix. This option is deprecated and will be removed in a
future version. The preferred method now simply uses R itself to define a
custom matrix class with your user-defined matrix multiplication operator.
See the examples.

Use the \code{v} option to supply a starting vector for the iterative
method. A random vector is used by default (precede with \code{set.seed()}
for reproducibility). Optionally set \code{v} to the output of a previous
run of \code{irlba} to restart the method, adding additional singular
values/vectors without recomputing the solution subspace. See the examples.

The function may generate the following warnings:
\itemize{
  \item{"did not converge--results might be invalid!; try increasing work
or maxit" means that the algorithm did not converge -- this is potentially
a serious problem and the returned results may not be valid. \code{irlba}
reports a warning here instead of an error so that you can inspect whatever
is returned. If this happens, carefully heed the warning and inspect the
result. You may also try setting \code{fastpath=FALSE}.}
  \item{"You're computing a large percentage of total singular values,
standard svd might work better!"
\code{irlba} is designed to efficiently compute a few of the largest
singular values and associated singular vectors of a matrix. The standard
\code{svd} function will be more efficient than \code{irlba} for computing
large numbers of singular values.}
  \item{"convergence criterion below machine epsilon" means that the
product of \code{tol} and the largest estimated singular value is so small
that the normal convergence criterion is only met up to round-off error.}
}

The function might return an error for several reasons, including when the
starting vector \code{v} is near the null space of the matrix. In that
case, try a different \code{v}.

The \code{fastpath=TRUE} option only supports real-valued matrices and
sparse matrices of type \code{dgCMatrix} (for now). Other problems fall
back to the reference R implementation.
}
\examples{
set.seed(1)
A <- matrix(runif(400), nrow=20)
S <- irlba(A, 3)
S$d

# Compare with svd
svd(A)$d[1:3]

# Restart the algorithm to compute more singular values
# (starting with an existing solution S)
S1 <- irlba(A, 5, v=S)

# Estimate smallest singular values
irlba(A, 3, smallest=TRUE)$d

# Compare with
tail(svd(A)$d, 3)

# Principal components (see also prcomp_irlba)
P <- irlba(A, nv=1, center=colMeans(A))

# Compare with prcomp and prcomp_irlba (might vary up to sign)
cbind(P$v, prcomp(A)$rotation[, 1], prcomp_irlba(A)$rotation[, 1])

# A custom matrix multiplication function that scales the columns of A
# (cf the scale option). This function scales the columns of A to unit norm.
col_scale <- sqrt(apply(A, 2, crossprod))
setClass("scaled_matrix", contains="matrix", slots=c(scale="numeric"))
setMethod("\%*\%", signature(x="scaled_matrix", y="numeric"),
   function(x, y) x@.Data \%*\% (y / x@scale))
setMethod("\%*\%", signature(x="numeric", y="scaled_matrix"),
   function(x, y) (x \%*\% y@.Data) / y@scale)
a <- new("scaled_matrix", A, scale=col_scale)
irlba(a, 3)$d

# Compare with:
svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3]
}
\references{
Baglama, James, and Lothar Reichel. "Augmented implicitly restarted Lanczos
bidiagonalization methods." SIAM Journal on Scientific Computing 27.1
(2005): 19-42.
}
\seealso{
\code{\link{svd}}, \code{\link{prcomp}}, \code{\link{partial_eigen}},
\code{\link{svdr}}
}
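The implicit \code{scale} and \code{shift} equivalences described in the notes can be checked numerically. Below is a supplementary R sketch (not part of the roxygen2-generated man page; it assumes the irlba package is installed and attached, and the variable names are illustrative only):

```r
# Supplementary sketch: numerically check the implicit scale and shift
# equivalences described in the notes. Assumes the irlba package is installed.
library(irlba)

set.seed(1)
A <- matrix(runif(400), nrow=20)

# scale: irlba(A, ..., scale=s) computes the truncated SVD of
# sweep(A, 2, s, FUN=`/`) without explicitly forming the scaled matrix.
s <- sqrt(apply(A, 2, crossprod))
d_implicit <- irlba(A, 3, scale=s)$d
d_explicit <- svd(sweep(A, 2, s, FUN=`/`))$d[1:3]
all.equal(d_implicit, d_explicit, tolerance=1e-4)

# shift (square matrices only): irlba(A, ..., shift=sigma) estimates the
# partial SVD of A + diag(sigma, nrow(A), nrow(A)) without forming it.
sigma <- 0.5
e_implicit <- irlba(A, 3, shift=sigma)$d
e_explicit <- svd(A + diag(sigma, nrow(A)))$d[1:3]
all.equal(e_implicit, e_explicit, tolerance=1e-4)
```

The comparisons agree only up to the convergence tolerance \code{tol}, so a loose tolerance is used in \code{all.equal} rather than exact equality.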