
parse_factor() is similar to factor(), but generates a warning if levels have been specified and some elements of x are not found in those levels.


parse_factor(
  x,
  levels = NULL,
  ordered = FALSE,
  na = c("", "NA"),
  locale = default_locale(),
  include_na = TRUE,
  trim_ws = TRUE
)

col_factor(levels = NULL, ordered = FALSE, include_na = FALSE)
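col_factor() is the column-specification counterpart of parse_factor(), meant to be passed through the col_types argument of the read_*() functions. The following is a minimal sketch, assuming readr 2.0.0 or later (where I() marks a string as literal data); the column name and the data are made up for illustration.

```r
library(readr)

# Hypothetical inline data with a single column named "size".
csv <- I("size\nsmall\nlarge\nsmall\n")

# Declare the allowed levels up front so that unused levels are kept.
df <- read_csv(
  csv,
  col_types = cols(size = col_factor(levels = c("small", "medium", "large")))
)

is.factor(df$size)
levels(df$size)
```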



x
Character vector of values to parse.


levels
Character vector of the allowed levels. When levels = NULL (the default), levels are discovered from the unique values of x, in the order in which they appear in x.
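The ordering rule matters when comparing with base R, since base::factor() sorts the unique values rather than keeping their order of appearance. A small sketch with made-up values:

```r
library(readr)

x <- c("pear", "apple", "pear", "fig")

# parse_factor() keeps the order of first appearance.
levels(parse_factor(x))
#> [1] "pear"  "apple" "fig"

# base::factor() sorts the unique values instead.
levels(factor(x))
#> [1] "apple" "fig"   "pear"
```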


ordered
Logical. If TRUE, the result is an ordered factor.
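As an illustration, an ordered factor allows comparisons between levels. The values and level names below are hypothetical:

```r
library(readr)

# The order of `levels` defines the ranking: low < mid < high.
rating <- parse_factor(
  c("low", "high", "mid"),
  levels = c("low", "mid", "high"),
  ordered = TRUE
)

is.ordered(rating)
#> [1] TRUE

# Comparison operators work on ordered factors.
rating < "high"
#> [1]  TRUE FALSE  TRUE
```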


na
Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.
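A short sketch of the difference between the default and na = character(), using made-up input. The results are converted with as.character() so that missing values are easy to see regardless of include_na:

```r
library(readr)

x <- c("a", "NA", "b")

# Default: the string "NA" is interpreted as a missing value.
with_default <- as.character(parse_factor(x))

# na = character(): no string is treated as missing, so "NA" stays a value.
no_missing <- as.character(parse_factor(x, na = character()))

is.na(with_default)
is.na(no_missing)
```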


locale
The locale controls defaults that vary from place to place. The default locale is US-centric (like R), but you can use locale() to create your own locale that controls things like the default time zone, encoding, decimal mark, big mark, and day/month names.


include_na
If TRUE and x contains at least one NA, then NA is included in the levels of the constructed factor.


trim_ws
Should leading and trailing whitespace (ASCII spaces and tabs) be trimmed from each field before parsing it?
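The effect of trimming is easiest to see when levels are discovered from the data. The sketch below uses made-up input in which the same value appears with different padding:

```r
library(readr)

x <- c("a", " a", "a\t")

# With trimming (the default), the padded strings collapse to one level.
trimmed <- parse_factor(x)
levels(trimmed)

# Without trimming, each padded variant is kept as a distinct level.
untrimmed <- parse_factor(x, trim_ws = FALSE)
levels(untrimmed)
```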


# discover the levels from the data
parse_factor(c("a", "b"))
#> [1] a b
#> Levels: a b
parse_factor(c("a", "b", "-99"))
#> [1] a   b   -99
#> Levels: a b -99
parse_factor(c("a", "b", "-99"), na = c("", "NA", "-99"))
#> [1] a    b    <NA>
#> Levels: a b <NA>
parse_factor(c("a", "b", "-99"), na = c("", "NA", "-99"), include_na = FALSE)
#> [1] a    b    <NA>
#> Levels: a b

# provide the levels explicitly
parse_factor(c("a", "b"), levels = letters[1:5])
#> [1] a b
#> Levels: a b c d e

x <- c("cat", "dog", "caw")
animals <- c("cat", "dog", "cow")

# base::factor() silently converts elements that do not match any levels to
# NA
factor(x, levels = animals)
#> [1] cat  dog  <NA>
#> Levels: cat dog cow

# parse_factor() generates the same factor as base::factor() but issues a
# warning and records the problems
parse_factor(x, levels = animals)
#> Warning: 1 parsing failure.
#> row col           expected actual
#>   3  -- value in level set    caw
#> [1] cat  dog  <NA>
#> attr(,"problems")
#> # A tibble: 1 × 4
#>     row   col expected           actual
#>   <int> <int> <chr>              <chr> 
#> 1     3    NA value in level set caw   
#> Levels: cat dog cow
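
The recorded failures can also be retrieved programmatically with readr's problems() function instead of being read off the printed output. A minimal sketch that reuses the example values above and suppresses the warning for brevity:

```r
library(readr)

x <- c("cat", "dog", "caw")
animals <- c("cat", "dog", "cow")

# Parse quietly; the failures are still attached to the result.
f <- suppressWarnings(parse_factor(x, levels = animals))

# problems() returns one row per parsing failure.
probs <- problems(f)
probs$row
probs$actual
```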