% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils-encoding.R
\name{as_utf8_character}
\alias{as_utf8_character}
\title{Coerce to a character vector and attempt encoding conversion}
\usage{
as_utf8_character(x)
}
\arguments{
\item{x}{An object to coerce.}
}
\description{
\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#experimental}{\figure{lifecycle-experimental.svg}{options: alt='[Experimental]'}}}{\strong{[Experimental]}}

Unlike specifying the \code{encoding} argument in \code{as_string()} and
\code{as_character()}, which is only declarative, this function
actually attempts to convert the encoding of its input. There are
two possible cases:
\itemize{
\item The string is tagged as UTF-8 or latin1, the only two encodings
for which R has specific support. In this case, converting to the
same encoding is a no-op, and converting to native works as
expected as long as the native encoding (the one specified by the
\code{LC_CTYPE} locale) supports all characters occurring in the
strings. Unrepresentable characters are serialised as Unicode code
points: "<U+xxxx>".
\item The string is not tagged. R assumes that it is encoded in the
native encoding. Conversion to native is a no-op, and conversion
to UTF-8 should work as long as the string is actually encoded in
the locale codeset.
}

When translating to UTF-8, the strings are parsed for serialised
Unicode code points (e.g. substrings looking like "<U+xxxx>") with
\code{\link[=chr_unserialise_unicode]{chr_unserialise_unicode()}}. This helps to alleviate the effects of
character-to-symbol-to-character roundtrips on systems with
non-UTF-8 native encoding.
}
\examples{
# Let's create a string marked as UTF-8 (which is guaranteed by the
# Unicode escaping in the string):
utf8 <- "caf\uE9"
Encoding(utf8)
charToRaw(utf8)
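
# A hedged sketch using base R only. It illustrates the difference
# between declaring and converting an encoding; it is not necessarily
# how as_utf8_character() is implemented. The variable names below
# are illustrative.

# Build the same string from code points so this part stands alone
# (233 is the code point of e-acute):
cafe <- intToUtf8(c(99L, 97L, 102L, 233L))

# iconv() actually re-encodes the bytes and marks the result:
latin1 <- iconv(cafe, from = "UTF-8", to = "latin1")
Encoding(latin1)
charToRaw(latin1)

# Re-declaring the latin1 bytes as UTF-8 is only declarative: the raw
# bytes are unchanged, so the string is now mislabelled.
declared <- latin1
Encoding(declared) <- "UTF-8"
identical(charToRaw(declared), charToRaw(latin1))

# enc2utf8() converts the marked latin1 string back to UTF-8 bytes:
converted <- enc2utf8(latin1)
identical(charToRaw(converted), charToRaw(cafe))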
}
\keyword{internal}