
vetr

Project Status: WIP - Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

Trust, but Verify

Easily

When you write functions that operate on S3 or unclassed objects you can
either trust that your inputs will be structured as expected, or
tediously check that they are.

vetr takes the tedium out of structure verification so that you can
trust, but verify. It lets you express structural requirements
declaratively with templates, and it auto-generates human-friendly error
messages as needed.

Quickly

vetr is written in C to minimize overhead from parameter checks in
your functions. It has no dependencies.

Declarative Checks with Templates

Templates

Declare a template that an object should conform to, and let vetr take
care of the rest:

library(vetr)
tpl <- numeric(1L)
vet(tpl, 1:3)
## [1] "`length(1:3)` should be 1 (is 3)"
vet(tpl, "hello")
## [1] "`\"hello\"` should be type \"numeric\" (is \"character\")"
vet(tpl, 42)
## [1] TRUE

The template concept is based on vapply, but generalizes to all S3
objects and adds some special features to facilitate comparison. For
example, zero length templates match any length:

tpl <- integer()
vet(tpl, 1L:3L)
## [1] TRUE
vet(tpl, 1L)
## [1] TRUE

And for convenience short (<= 100 length) integer-like numerics are
considered integer:

tpl <- integer(1L)
vet(tpl, 1)       # this is a numeric, not an integer
## [1] TRUE
vet(tpl, 1.0001)
## [1] "`1.0001` should be type \"integer-like\" (is \"double\")"

vetr can compare recursive objects such as lists, or data.frames:

tpl.iris <- iris[0, ]      # 0 row DF matches any number of rows in object
iris.fake <- iris
levels(iris.fake$Species)[3] <- "sibirica"   # tweak levels

vet(tpl.iris, iris)
## [1] TRUE
vet(tpl.iris, iris.fake)
## [1] "`levels(iris.fake$Species)[3]` should be \"virginica\" (is \"sibirica\")"

From our declared template iris[0, ], vetr infers all the required
checks. In this case, vet(iris[0, ], iris.fake, stop=TRUE) is
equivalent to:

stopifnot_iris <- function(x) {
  stopifnot(
    is.data.frame(x),
    is.list(x),
    length(x) == length(iris),
    identical(lapply(x, class), lapply(iris, class)),
    is.integer(attr(x, 'row.names')),
    identical(names(x), names(iris)),
    identical(typeof(x$Species), "integer"),
    identical(levels(x$Species), levels(iris$Species))
  )
}
stopifnot_iris(iris.fake)
## Error in stopifnot_iris(iris.fake): identical(levels(x$Species), levels(iris$Species)) is not TRUE

vetr saved us typing, and the time and thought needed to come up with
what needs to be compared.

You could just as easily have created templates for nested lists, or
data frames in lists. Templates are compared to objects with the alike
function. For a thorough description of templates and how they work see
the alike vignette. For template examples see example(alike).
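
As a rough illustration of the nested case, the sketch below builds a template for a list that contains a data frame. The names tpl.nested, ok and bad, and the element names data, label and dims, are made up for this example and are not part of vetr:

```r
library(vetr)

# Hypothetical template: a data frame with iris's columns, a scalar label,
# and an inner list holding an integer vector of any length.
tpl.nested <- list(
  data = iris[0, ],         # zero rows: any number of rows is accepted
  label = character(1L),    # exactly one string
  dims = list(integer())    # zero length: any length is accepted
)
ok  <- list(data = iris, label = "flowers", dims = list(1:2))
bad <- list(data = iris, label = c("a", "b"), dims = list(1:2))

isTRUE(vet(tpl.nested, ok))    # structure conforms to the template
isTRUE(vet(tpl.nested, bad))   # `label` has length 2, so it does not conform
```

Wrapping the call in isTRUE() turns the result into a plain logical, since vet returns a message rather than FALSE on failure.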

Auto-Generated Error Messages

Let’s revisit the error message:

vet(tpl.iris, iris.fake)
## [1] "`levels(iris.fake$Species)[3]` should be \"virginica\" (is \"sibirica\")"

It tells us:

  • The reason for the failure
  • What structure would be acceptable instead
  • The location of failure levels(iris.fake$Species)[3]

vetr does what it can to reduce the time from error to resolution. The
location of failure is generated such that you can easily copy it in
part or full to the R prompt for further examination.
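
A small sketch of that workflow follows. It rebuilds iris.fake so the block stands alone, and the variable name msg is ours:

```r
library(vetr)

# Rebuild the modified copy of iris used earlier in this README
iris.fake <- iris
levels(iris.fake$Species)[3] <- "sibirica"

# On failure vet returns the message as a character vector
msg <- vet(iris[0, ], iris.fake)
is.character(msg)

# The back-quoted location in the message is ordinary R code, so it can
# be pasted at the prompt to inspect the offending value
levels(iris.fake$Species)[3]   # what the object has
levels(iris$Species)[3]        # what the template expects
```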

Vetting Expressions

You can combine templates with && / ||:

vet(numeric(1L) || NULL, NULL)
## [1] TRUE
vet(numeric(1L) || NULL, 42)
## [1] TRUE
vet(numeric(1L) || NULL, "foo")
## [1] "`\"foo\"` should be `NULL`, or type \"numeric\" (is \"character\")"

Templates only check structure. When you need to check values use . to
refer to the object:

vet(numeric(1L) && . > 0, -42)  # strictly positive scalar numeric
## [1] "`-42 > 0` is not TRUE (FALSE)"
vet(numeric(1L) && . > 0, 42)
## [1] TRUE

If you do use the . symbol in your vetting expressions in your
packages, you will need to include utils::globalVariables(".") as a
top-level call to avoid the “no visible binding for global variable ‘.’”
R CMD check NOTE.
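
A minimal sketch of how this might look in a package source file. The function name is_pos_scalar is hypothetical, and where you place the globalVariables call within your R/ directory is up to you:

```r
library(vetr)

# Top-level call, placed in any file under R/ in your package; it tells
# R CMD check that `.` is used deliberately
utils::globalVariables(".")

# Hypothetical package function that refers to the object via `.`
is_pos_scalar <- function(x) isTRUE(vet(numeric(1L) && . > 0, x))

is_pos_scalar(42)    # passes both the template and the value check
is_pos_scalar(-42)   # fails the `. > 0` check
```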

You can compose vetting expressions as language objects and combine
them:

scalar.num.pos <- quote(numeric(1L) && . > 0)
foo.or.bar <- quote(character(1L) && . %in% c('foo', 'bar'))
vet.exp <- quote(scalar.num.pos || foo.or.bar)

vet(vet.exp, 42)
## [1] TRUE
vet(vet.exp, "foo")
## [1] TRUE
vet(vet.exp, "baz")
## [1] "At least one of these should pass:"                         
## [2] "  - `\"baz\" %in% c(\"foo\", \"bar\")` is not TRUE (FALSE)" 
## [3] "  - `\"baz\"` should be type \"numeric\" (is \"character\")"

all_bw is available for value range checks (~10x faster than
isTRUE(all(. >= x & . <= y)) for large vectors):

vet(all_bw(., 0, 1), runif(5) + 1)
## [1] "`all_bw(runif(5) + 1, 0, 1)` is not TRUE (is chr: \"`1.643241` at index 1 not in `[0,1]`\")"
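
For comparison with the base R idiom, here is a short sketch with fixed inputs rather than random ones. The vectors x.in and x.out are made up for illustration:

```r
library(vetr)

x.in  <- c(0, 0.25, 1)      # every value lies in [0, 1]
x.out <- c(0, 0.25, 1.5)    # last value is out of range

# all_bw returns TRUE on success and a descriptive string on failure,
# so wrap it in isTRUE() before comparing with the base R expression
isTRUE(all_bw(x.in, 0, 1))  == isTRUE(all(x.in >= 0 & x.in <= 1))
isTRUE(all_bw(x.out, 0, 1)) == isTRUE(all(x.out >= 0 & x.out <= 1))

# Combined with a template inside a vetting expression
vet(numeric() && all_bw(., 0, 1), x.in)
```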

There are a number of predefined vetting tokens you can use in your
vetting expressions such as:

vet(NUM.POS, -runif(5))    # positive numeric; see `?vet_token` for others
## [1] "`-runif(5)` should contain only positive values, but has negatives"
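
Tokens combine with && / || in the same way as templates. The sketch below assumes the predefined tokens INT.1 (scalar integer-like) and CHR.1 (scalar character) alongside NUM.POS. Check `?vet_token` for the authoritative list, as the exact set may differ between versions:

```r
library(vetr)

vet(INT.1, 1L)                      # scalar integer passes
vet(NUM.POS, c(1, 2, 3))            # all values positive

# Accept either a scalar integer-like value or a scalar string
isTRUE(vet(INT.1 || CHR.1, "a"))    # matches CHR.1
isTRUE(vet(INT.1 || CHR.1, 1:2))    # length 2, matches neither token
```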

Vetting expressions are designed to be intuitive to use, but their
implementation is complex. We recommend you look at example(vet) for
usage ideas, or at the “Non Standard Evaluation” section of the
vignette for the gory details.

vetr in Functions

If you are vetting function inputs, you can use the vetr function,
which works just like vet except that it is streamlined for use within
functions:

fun <- function(x, y) {
  vetr(numeric(1L), logical(1L))
  TRUE   # do work...
}
fun(1:2, "foo")
## Error in fun(x = 1:2, y = "foo"): For argument `x`, `length(1:2)` should be 1 (is 2)
fun(1, "foo")
## Error in fun(x = 1, y = "foo"): For argument `y`, `"foo"` should be type "logical" (is "character")

vetr automatically matches the vetting expressions to the
corresponding arguments and fetches the argument values from the
function environment.

See the vignette for additional details on how the vetr function works.
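
The sketch below shows vetr with vetting expressions supplied by argument name and mixed with value checks. The function scale_by and its arguments are hypothetical, and we assume named matching behaves like ordinary argument matching:

```r
library(vetr)

# Hypothetical function; each vetting expression is named after the
# argument it validates
scale_by <- function(x, mult, verbose) {
  vetr(
    x = numeric(),                  # numeric vector of any length
    mult = numeric(1L) && . > 0,    # strictly positive scalar
    verbose = logical(1L)           # single logical flag
  )
  if (verbose) message("scaling ", length(x), " values")
  x * mult
}

scale_by(c(1, 2, 3), 2, FALSE)

# A failed check raises an ordinary R error that callers can trap
res <- tryCatch(scale_by(c(1, 2, 3), -1, FALSE), error = function(e) e)
inherits(res, "error")
```

Because the checks sit in a single vetr call at the top of the function, the body can assume well-formed inputs from that point on.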

Additional Documentation

Development Status

vetr is still in development, although most of the features are
considered mature. The most likely area of change is the treatment of
function and language templates (e.g. alike(sum, max)), and more
flexible treatment of list templates (e.g. in future lists may be
allowed to be different lengths so long as every named element in the
template exists in the object).

Installation

This package is available on CRAN:

install.packages('vetr')

It has no runtime dependencies.

For the development version use
remotes::install_github('brodieg/vetr@development') or:

f.dl <- tempfile()
f.uz <- tempfile()
github.url <- 'https://github.com/brodieG/vetr/archive/development.zip'
download.file(github.url, f.dl)
unzip(f.dl, exdir=f.uz)
install.packages(file.path(f.uz, 'vetr-development'), repos=NULL, type='source')
unlink(c(f.dl, f.uz))

The master branch typically mirrors CRAN and should be stable.

Alternatives

There are many alternatives available to vetr. We survey several of them
in our parameter validation functions review.

The following packages also perform related tasks, although we have not
used them and do not review them:

  • valaddin v0.1.0 by Eugene Ha,
    a framework for augmenting existing functions with validation
    contracts. Currently the package is undergoing a major overhaul so
    we will add it to the comparison once the new release (v0.3.0) is
    out.
  • ensurer v1.1 by Stefan M.
    Bache, a framework for flexibly creating and combining validation
    contracts. The development version adds an experimental method for
    creating type safe functions, but it is not published to CRAN so we
    do not test it here.
  • validate by Mark van
    der Loo and Edwin de Jonge, with a primary focus on validating data
    in data frames and similar data structures.
  • assertr by Tony
    Fischetti, also focused on data validation in data frames and
    similar structures.
  • types by Jim Hester, which
    implements but does not enforce type hinting.
  • argufy by Gábor Csárdi,
    which implements parameter validation via roxygen tags (not released
    to CRAN).
  • typed by Antoine
    Fabri, which enforces types of symbols, function parameters, and
    return values.
  • erify by Renfei Mao, with a
    focus on readable error messages.

Acknowledgments

Thank you to:

  • R Core for developing and maintaining such a wonderful language.
  • CRAN maintainers, for patiently shepherding packages onto CRAN and
    maintaining the repository, and Uwe Ligges in particular for
    maintaining Winbuilder.
  • Users and others who have reported bugs and/or helped contribute
    fixes (see NEWS.md).
  • Tomas Kalibera for rchk and
    rcnst to help detect errors in compiled code, and in particular for
    his infinite patience in helping me resolve the issues he identified
    for me.
  • Jim Hester because
    covr rocks.
  • Dirk Eddelbuettel and Carl Boettiger for the rocker project, and
    Gábor Csárdi and the R-consortium for Rhub, without which testing
    bugs on R-devel and other platforms would be a nightmare.
  • Winston Chang for the
    r-debug docker container,
    in particular because of the valgrind level 2 instrumented version
    of R.
  • Hadley Wickham and Peter Danenberg for roxygen2.
  • Yihui Xie for knitr and J.J. Allaire et al. for rmarkdown, and by
    extension John MacFarlane for pandoc.
  • Michel Lang for pushing me to implement
    all_bw to compete with his own package
    checkmate.
  • Eugene Ha for pointing me to several other relevant packages, which
    in turn led to the survey of related packages.
  • Stefan M. Bache for the idea of having a function for testing
    objects directly (originally vetr only worked with function
    arguments), which I took from ensurer.
  • Olaf Mersmann for microbenchmark, because microseconds matter, and
    Joshua Ulrich for making it lightweight.
  • All open source developers out there that make their work freely
    available for others to use.
  • GitHub, Codecov, Vagrant, Docker, Ubuntu, and Brew for providing
    infrastructure that greatly simplifies open source development.
  • Free Software Foundation for developing the GPL license and
    promoting the free software movement.

About the Author

Brodie Gaslam is a hobbyist programmer based on the US East Coast.
