% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/order.R
\name{order-radix}
\alias{order-radix}
\alias{vec_order_radix}
\alias{vec_sort_radix}
\title{Order and sort vectors}
\usage{
vec_order_radix(
  x,
  ...,
  direction = "asc",
  na_value = "largest",
  nan_distinct = FALSE,
  chr_proxy_collate = NULL
)

vec_sort_radix(
  x,
  ...,
  direction = "asc",
  na_value = "largest",
  nan_distinct = FALSE,
  chr_proxy_collate = NULL
)
}
\arguments{
\item{x}{A vector}

\item{...}{These dots are for future extensions and must be empty.}

\item{direction}{Direction to sort in.
\itemize{
\item A single \code{"asc"} or \code{"desc"} for ascending or descending order
respectively.
\item For data frames, a length \code{1} or \code{ncol(x)} character vector containing
only \code{"asc"} or \code{"desc"}, specifying the direction for each column.
}}

\item{na_value}{Ordering of missing values.
\itemize{
\item A single \code{"largest"} or \code{"smallest"} for ordering missing values as the
largest or smallest values respectively.
\item For data frames, a length \code{1} or \code{ncol(x)} character vector containing
only \code{"largest"} or \code{"smallest"}, specifying how missing values should
be ordered within each column.
}}

\item{nan_distinct}{A single logical specifying whether or not \code{NaN} should be
considered distinct from \code{NA} for double and complex vectors. If \code{TRUE},
\code{NaN} will always be ordered between \code{NA} and non-missing numbers.}

\item{chr_proxy_collate}{A function generating an alternate representation
of character vectors to use for collation, often used for locale-aware
ordering.
\itemize{
\item If \code{NULL}, no transformation is done.
\item Otherwise, this must be a function of one argument. If the input contains
a character vector, it will be passed to this function after it has been
translated to UTF-8. This function should return a character vector with
the same length as the input.
The result should sort as expected in the C-locale, regardless of encoding.
}

For data frames, \code{chr_proxy_collate} will be applied to all character
columns.

Common transformation functions include: \code{tolower()} for case-insensitive
ordering and \code{stringi::stri_sort_key()} for locale-aware ordering.}
}
\value{
\itemize{
\item \code{vec_order_radix()}: an integer vector the same size as \code{x}.
\item \code{vec_sort_radix()}: a vector with the same size and type as \code{x}.
}
}
\description{
\code{vec_order_radix()} computes the order of \code{x}. For data frames, the order is
computed along the rows by computing the order of the first column and
using subsequent columns to break ties.

\code{vec_sort_radix()} sorts \code{x}. It is equivalent to \code{vec_slice(x, vec_order_radix(x))}.
}
\section{Differences with \code{order()}}{


Unlike the \code{na.last} argument of \code{order()}, which decides the positions of
missing values irrespective of the \code{decreasing} argument, the \code{na_value}
argument of \code{vec_order_radix()} interacts with \code{direction}. If missing values
are considered the largest value, they will appear last in ascending order,
and first in descending order.

Character vectors are ordered in the C-locale. This is different from
\code{base::order()}, which respects \code{base::Sys.setlocale()}. Sorting in a
consistent locale can produce more reproducible results between different
sessions and platforms; however, the results of sorting in the C-locale
can be surprising. For example, capital letters sort before lower case
letters. Sorting \code{c("b", "C", "a")} with \code{vec_sort_radix()} will return
\code{c("C", "a", "b")}, whereas ordering with \code{base::order()} will give
\code{c("a", "b", "C")} unless \code{base::order(method = "radix")} is explicitly
set, which also uses the C-locale. While sorting with the C-locale can be
useful for algorithmic efficiency, in many real-world uses it can be the
cause of data analysis mistakes.
To balance these trade-offs, you can supply a \code{chr_proxy_collate} function to
transform character vectors into an alternative representation that orders
in the C-locale in a less surprising way. For example, providing
\code{\link[base:chartr]{base::tolower()}} as a transform will order the original vector in a
case-insensitive manner. Locale-aware ordering can be achieved by providing
\code{stringi::stri_sort_key()} as a transform, setting the collation options as
appropriate for your locale.

Character vectors are always translated to UTF-8 before ordering, and before
any transform is applied by \code{chr_proxy_collate}.

For complex vectors, if either the real or imaginary component is \code{NA} or
\code{NaN}, then the entire observation is considered missing.
}

\section{Dependencies of \code{vec_order_radix()}}{

\itemize{
\item \code{\link[=vec_proxy_order]{vec_proxy_order()}}
}
}

\section{Dependencies of \code{vec_sort_radix()}}{

\itemize{
\item \code{\link[=vec_order_radix]{vec_order_radix()}}
\item \code{\link[=vec_slice]{vec_slice()}}
}
}

\examples{
if (FALSE) {

x <- round(sample(runif(5), 9, replace = TRUE), 3)
x <- c(x, NA)

vec_order_radix(x)
vec_sort_radix(x)
vec_sort_radix(x, direction = "desc")

# Can also handle data frames
df <- data.frame(g = sample(2, 10, replace = TRUE), x = x)
vec_order_radix(df)
vec_sort_radix(df)
vec_sort_radix(df, direction = "desc")

# For data frames, `direction` and `na_value` are allowed to be vectors
# with length equal to the number of columns in the data frame
vec_sort_radix(
  df,
  direction = c("desc", "asc"),
  na_value = c("largest", "smallest")
)

# Character vectors are ordered in the C locale, which orders capital letters
# below lowercase ones
y <- c("B", "A", "a")
vec_sort_radix(y)

# To order in a case-insensitive manner, provide a `chr_proxy_collate`
# function that transforms the strings to all lowercase
vec_sort_radix(y, chr_proxy_collate = tolower)

}
}
\keyword{internal}