% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/match.R
\name{vec_locate_matches}
\alias{vec_locate_matches}
\title{Locate observations matching specified conditions}
\usage{
vec_locate_matches(
  needles,
  haystack,
  ...,
  condition = "==",
  filter = "none",
  incomplete = "compare",
  no_match = NA_integer_,
  remaining = "drop",
  multiple = "all",
  nan_distinct = FALSE,
  chr_proxy_collate = NULL,
  needles_arg = "",
  haystack_arg = "",
  error_call = current_env()
)
}
\arguments{
\item{needles, haystack}{Vectors used for matching.
\itemize{
\item \code{needles} represents the vector to search for.
\item \code{haystack} represents the vector to search in.
}

Prior to comparison, \code{needles} and \code{haystack} are coerced to the same type.}

\item{...}{These dots are for future extensions and must be empty.}

\item{condition}{Condition controlling how \code{needles} should be compared against \code{haystack} to identify a successful match.
\itemize{
\item One of: \code{"=="}, \code{">"}, \code{">="}, \code{"<"}, or \code{"<="}.
\item For data frames, a length \code{1} or \code{ncol(needles)} character vector containing only the above options, specifying how matching is determined for each column.
}}

\item{filter}{Filter to be applied to the matched results.
\itemize{
\item \code{"none"} doesn't apply any filter.
\item \code{"min"} returns only the minimum haystack value matching the current needle.
\item \code{"max"} returns only the maximum haystack value matching the current needle.
\item For data frames, a length \code{1} or \code{ncol(needles)} character vector containing only the above options, specifying a filter to apply to each column.
}

Filters don't have any effect on \code{"=="} conditions, but are useful for computing "rolling" matches with other conditions.

A filter can return multiple haystack matches for a particular needle if the maximum or minimum haystack value is duplicated in \code{haystack}.
These can be further controlled with \code{multiple}.}

\item{incomplete}{Handling of missing values and \link[=vec_detect_complete]{incomplete} observations in \code{needles}.
\itemize{
\item \code{"compare"} uses \code{condition} to determine whether or not a missing value in \code{needles} matches a missing value in \code{haystack}. If \code{condition} is \code{==}, \code{>=}, or \code{<=}, then missing values will match.
\item \code{"match"} always allows missing values in \code{needles} to match missing values in \code{haystack}, regardless of the \code{condition}.
\item \code{"drop"} drops incomplete observations in \code{needles} from the result.
\item \code{"error"} throws an error if any \code{needles} are incomplete.
\item If a single integer is provided, this represents the value returned in the \code{haystack} column for observations of \code{needles} that are incomplete. If \code{no_match = NA}, setting \code{incomplete = NA} forces incomplete observations in \code{needles} to be treated like unmatched values.
}

\code{nan_distinct} determines whether an \code{NA} is allowed to match a \code{NaN}.}

\item{no_match}{Handling of \code{needles} without a match.
\itemize{
\item \code{"drop"} drops \code{needles} with zero matches from the result.
\item \code{"error"} throws an error if any \code{needles} have zero matches.
\item If a single integer is provided, this represents the value returned in the \code{haystack} column for observations of \code{needles} that have zero matches. The default represents an unmatched needle with \code{NA}.
}}

\item{remaining}{Handling of \code{haystack} values that \code{needles} never matched.
\itemize{
\item \code{"drop"} drops remaining \code{haystack} values from the result. Typically, this is the desired behavior if you only care when \code{needles} has a match.
\item \code{"error"} throws an error if there are any remaining \code{haystack} values.
\item If a single integer is provided (often \code{NA}), this represents the value returned in the \code{needles} column for the remaining \code{haystack} values that \code{needles} never matched. Remaining \code{haystack} values are always returned at the end of the result.
}}

\item{multiple}{Handling of \code{needles} with multiple matches. For each needle:
\itemize{
\item \code{"all"} returns all matches detected in \code{haystack}.
\item \code{"any"} returns any match detected in \code{haystack} with no guarantees on which match will be returned. It is often faster than \code{"first"} and \code{"last"} if you just need to detect if there is at least one match.
\item \code{"first"} returns the first match detected in \code{haystack}.
\item \code{"last"} returns the last match detected in \code{haystack}.
\item \code{"warning"} throws a warning if multiple matches are detected, but otherwise falls back to \code{"all"}.
\item \code{"error"} throws an error if multiple matches are detected.
}}

\item{nan_distinct}{A single logical specifying whether or not \code{NaN} should be considered distinct from \code{NA} for double and complex vectors. If \code{TRUE}, \code{NaN} will always be ordered between \code{NA} and non-missing numbers.}

\item{chr_proxy_collate}{A function generating an alternate representation of character vectors to use for collation, often used for locale-aware ordering.
\itemize{
\item If \code{NULL}, no transformation is done.
\item Otherwise, this must be a function of one argument. If the input contains a character vector, it will be passed to this function after it has been translated to UTF-8. This function should return a character vector with the same length as the input. The result should sort as expected in the C-locale, regardless of encoding.
}

For data frames, \code{chr_proxy_collate} will be applied to all character columns.
Common transformation functions include: \code{tolower()} for case-insensitive ordering and \code{stringi::stri_sort_key()} for locale-aware ordering.}

\item{needles_arg, haystack_arg}{Argument tags for \code{needles} and \code{haystack} used in error messages.}

\item{error_call}{The execution environment of a currently running function, e.g. \code{caller_env()}. The function will be mentioned in error messages as the source of the error. See the \code{call} argument of \code{\link[rlang:abort]{abort()}} for more information.}
}
\value{
A two column data frame containing the locations of the matches.
\itemize{
\item \code{needles} is an integer vector containing the location of the needle currently being matched.
\item \code{haystack} is an integer vector containing the location of the corresponding match in the haystack for the current needle.
}
}
\description{
\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#experimental}{\figure{lifecycle-experimental.svg}{options: alt='[Experimental]'}}}{\strong{[Experimental]}}

\code{vec_locate_matches()} is a more flexible version of \code{\link[=vec_match]{vec_match()}} used to identify locations where each observation of \code{needles} matches one or multiple observations in \code{haystack}. Unlike \code{vec_match()}, \code{vec_locate_matches()} returns all matches by default, and can match on binary conditions other than equality, such as \code{>}, \code{>=}, \code{<}, and \code{<=}.
}
\details{
\code{\link[=vec_match]{vec_match()}} is identical to (but often slightly faster than):

\if{html}{\out{