This package incorporates C++ Standard Template Library containers into R. In doing so, it meets various objectives and targets different groups of R users.
First, it speeds up development workflows. If you previously wanted to make use of C++ containers, you had to write C++ code and wait for compilation every time you changed it. To evaluate the data held in C++, you also had to write print or export functions for each setup. cppcontainers does these steps for you. It wraps C++ containers, their methods, and helper functions and compiles them when the package is installed. There is no need to write C++ code or wait for compilation after the package has been installed. You can use the containers as interactively and without waiting time as R’s native data frames.
Second, it allows user who do not administer their machines without an installed compiler (e.g., without Rtools on Windows) to use C++ containers, through the binary version of the package.
Third, it makes C++ easily accessible to R users. Much of the R community does not understand C++ and does not want to undertake the effort of learning it. This can impede collaboration between team members who are familiar with C++ and those who are not. cppcontainers circumvents this problem by abstracting away much of the underlying C++ code, framing everything in an R-style syntax (incl. 1-based indices), and explaining functions in a language intuitive to R users.
Fourth, it pushes R towards a general purpose programming language. One of the big differences between R and general purpose languages is the availability of data structures. In R, almost everything is some sort of vector; a vector of primitives, a vector of pointers, etc. Even scalars are vectors of length one. This simplicity makes R very accessible and easy to learn for its audience, that is largely comprised of people with a background in quantitative, empirical methods but without training in software engineering. However, users who come from a general purpose programming language, such as C++ or Python, likely miss a larger collection of data structures. cppcontainers fills this gap.
Fifth, this package brings the performance of C++ Standard Template
Library containers to R. These data structures are tailored towards
certain use cases and can speed up execution times. Of course, there is
some overhead involved compared to entirely writing the code in C++.
Yet, by providing a thin wrapper around Rcpp::XPtr
,
cppcontainers attempts to minimize this overhead.
It is important to keep this motivation in mind. There is a lot more what you can do with C++ and consequently Rcpp, than what this package implements. This, e.g., includes nested containers. The aim of this package is not to incorporate all features of an entire other language into R. It adds functionality in one specific domain.
cppcontainers wraps many of the Standard Template Library’s container types. The following list briefly introduces them in terms that are easy to grasp for users with little technical expertise. It avoids a discussion of performance characteristics or thread-safety implications.
Associative containers
Container adapters
Sequence containers
There is a lot to know on the efficiency of element access at different positions, element insertion and removal, space efficiency, etc. It is too much to cover in this vignette. The list above primarily elaborates on uniqueness and sorting. It grants a first idea of what container types the package wraps and on which container types to read up on in other sources of documentation.
In this package, containers can hold primitives of four different types: integers, doubles, strings, and booleans. These correspond to the integer, numeric, character, and logical types in R. Native C++ strings are different from R strings, which is why the string type requires comparatively more conversions and can be a little less efficient to work with than the alternatives. Booleans are also differently implemented, but are converted more implicitly than strings are.
Much of STL containers’ power lies in nested structures. In C++, you can, e.g., assemble a vector of sets or a queue of multimaps. This increases the universe of potential setups exponentially. Since an R wrapper package needs to hardcode all options, cppcontainers does not implement nested compositions and only targets primitive elements of the four mentioned types.
This package covers many of the container types’ methods and maintains
their original names. To make them more intuitive for R users, it
replaces the object-oriented notation (x.size()
), with the
in R more commonly used functional notation (size(x)
).
cppcontainers omits methods that return values such as pointers or
iterators, which are not directly utilizable in R. One example is the
equal_range
method for maps.
A few methods are modified to make them more convenient for R users.
This particularly includes indexing and vector inputs. Methods like
count
and contains
only accept a single value
in C++, but allow for a vector in this package. cppcontainers solves it
with iterative procedures on the C++ side.
An exception from the naming convention are the functions creating the
objects. To avoid clashes with, e.g., base::vector
and
utils::stack
, this package prefixes the respective methods
with cpp_
. cpp_set
creates a set,
cpp_vector
creates a vector, etc. Another outlier is
remove
, which the package implements as
remove.
(with a dot suffix) to avoid compatibility problems
with base::remove
.
The following table illustrates which methods apply to which container type. Italicized names do not refer to original C++ methods, but to functions added by this package. They aid in accessing the objects’ data from R.
method | set | unordered set | multiset | unordered multiset | map | unordered map | multimap | unordered multimap |
---|---|---|---|---|---|---|---|---|
== | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
[ | ✓ | ✓ | ||||||
assign | ||||||||
at | ✓ | ✓ | ||||||
back | ||||||||
bucket_count | ✓ | ✓ | ✓ | ✓ | ||||
capacity | ||||||||
clear | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
contains | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
count | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
emplace | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
emplace_after | ||||||||
emplace_back | ||||||||
emplace_front | ||||||||
empty | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
erase | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
erase_after | ||||||||
flip | ||||||||
front | ||||||||
insert | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
insert_after | ||||||||
insert_or_assign | ✓ | ✓ | ||||||
load_factor | ✓ | ✓ | ✓ | ✓ | ||||
max_bucket_count | ✓ | ✓ | ✓ | ✓ | ||||
max_load_factor | ✓ | ✓ | ✓ | ✓ | ||||
max_size | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
merge | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
pop | ||||||||
pop_back | ||||||||
pop_front | ||||||||
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
push | ||||||||
push_back | ||||||||
push_front | ||||||||
rehash | ✓ | ✓ | ✓ | ✓ | ||||
remove. | ||||||||
reserve | ✓ | ✓ | ✓ | ✓ | ||||
resize | ||||||||
reverse | ||||||||
shrink_to_fit | ||||||||
size | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
sort | ||||||||
sorting | ||||||||
splice | ||||||||
splice_after | ||||||||
to_r | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
top | ||||||||
try_emplace | ✓ | ✓ | ||||||
type | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
unique |
method | stack | queue | priority queue |
---|---|---|---|
== | ✓ | ✓ | |
[ | |||
assign | |||
at | |||
back | ✓ | ||
bucket_count | |||
capacity | |||
clear | |||
contains | |||
count | |||
emplace | ✓ | ✓ | ✓ |
emplace_after | |||
emplace_back | |||
emplace_front | |||
empty | ✓ | ✓ | ✓ |
erase | |||
erase_after | |||
flip | |||
front | ✓ | ||
insert | |||
insert_after | |||
insert_or_assign | |||
load_factor | |||
max_bucket_count | |||
max_load_factor | |||
max_size | |||
merge | |||
pop | ✓ | ✓ | ✓ |
pop_back | |||
pop_front | |||
✓ | ✓ | ✓ | |
push | ✓ | ✓ | ✓ |
push_back | |||
push_front | |||
rehash | |||
remove. | |||
reserve | |||
resize | |||
reverse | |||
shrink_to_fit | |||
size | ✓ | ✓ | ✓ |
sort | |||
sorting | ✓ | ||
splice | |||
splice_after | |||
to_r | ✓ | ✓ | ✓ |
top | ✓ | ✓ | |
try_emplace | |||
type | ✓ | ✓ | ✓ |
unique |
method | vector | deque | forward list | list |
---|---|---|---|---|
== | ✓ | ✓ | ✓ | ✓ |
[ | ✓ | ✓ | ||
assign | ✓ | ✓ | ✓ | ✓ |
at | ✓ | ✓ | ||
back | ✓ | ✓ | ✓ | |
bucket_count | ||||
capacity | ✓ | |||
clear | ✓ | ✓ | ✓ | ✓ |
contains | ||||
count | ||||
emplace | ✓ | ✓ | ✓ | |
emplace_after | ✓ | |||
emplace_back | ✓ | ✓ | ✓ | |
emplace_front | ✓ | ✓ | ✓ | |
empty | ✓ | ✓ | ✓ | ✓ |
erase | ✓ | ✓ | ✓ | |
erase_after | ✓ | |||
flip | ✓ | |||
front | ✓ | ✓ | ✓ | ✓ |
insert | ✓ | ✓ | ✓ | |
insert_after | ✓ | |||
insert_or_assign | ||||
load_factor | ||||
max_bucket_count | ||||
max_load_factor | ||||
max_size | ✓ | ✓ | ✓ | ✓ |
merge | ✓ | ✓ | ||
pop | ||||
pop_back | ✓ | ✓ | ✓ | |
pop_front | ✓ | ✓ | ✓ | |
✓ | ✓ | ✓ | ✓ | |
push | ||||
push_back | ✓ | ✓ | ✓ | |
push_front | ✓ | ✓ | ✓ | |
rehash | ||||
remove. | ✓ | ✓ | ||
reserve | ✓ | |||
resize | ✓ | ✓ | ✓ | ✓ |
reverse | ✓ | ✓ | ||
shrink_to_fit | ✓ | ✓ | ||
size | ✓ | ✓ | ✓ | |
sort | ✓ | ✓ | ||
sorting | ||||
splice | ✓ | |||
splice_after | ✓ | |||
to_r | ✓ | ✓ | ✓ | ✓ |
top | ||||
try_emplace | ||||
type | ✓ | ✓ | ✓ | ✓ |
unique | ✓ | ✓ |
The manual pages provide ample examples on the classes and methods in this package. The subsequent code just showcases a few of the functionalities outside of any meaningful workflow.
## set
s <- cpp_set(c(4, 2, 3))
s
#> 2 3 4
insert(s, c(4, 6))
s
#> 2 3 4 6
print(s, n = -2)
#> 6 4
## unordered set
s <- cpp_unordered_set(3:5)
s
#> 5 4 3
emplace(s, 2)
s
#> 2 5 4 3
type(s)
#> [1] "integer"
## multiset
s <- cpp_multiset(c(3, 3, 6, 2))
s
#> 2 3 3 6
contains(s, 2)
#> [1] TRUE
to_r(s)
#> [1] 2 3 3 6
## unordered multiset
s <- cpp_unordered_multiset(c(3, 3, 6, 2))
s
#> 6 2 3 3
count(s, 3)
#> [1] 2
rehash(s)
## map
m <- cpp_map(c("Alice", "Bob"), c(TRUE, FALSE))
m
#> ["Alice",TRUE] ["Bob",FALSE]
m["Bob"]
#> [1] FALSE
insert_or_assign(m, FALSE, "Alice")
m
#> ["Alice",FALSE] ["Bob",FALSE]
## unordered map
m1 <- cpp_unordered_map(c("Alice", "Bob"), 3:4)
m1
#> ["Bob",4] ["Alice",3]
m2 <- cpp_unordered_map(c("Bob", "Jane"), 6:7)
m2
#> ["Jane",7] ["Bob",6]
merge(m1, m2)
m1
#> ["Jane",7] ["Bob",4] ["Alice",3]
m2
#> ["Bob",6]
at(m1, "Jane")
#> [1] 7
## multimap
m <- cpp_multimap(c("Alice", "Bob", "Bob"), 3:5)
m
#> ["Alice",3] ["Bob",4] ["Bob",5]
clear(m)
empty(m)
#> [1] TRUE
## unordered multimap
m <- cpp_multimap(c("Alice", "Bob", "Bob"), 3:5)
m
#> ["Alice",3] ["Bob",4] ["Bob",5]
size(m)
#> [1] 3
erase(m, "Bob")
m
#> ["Alice",3]
## stack
s1 <- cpp_stack(3:4)
s1
#> Top element: 4
s2 <- cpp_stack(3:4)
s2
#> Top element: 4
s1 == s2
#> [1] TRUE
pop(s1)
s1
#> Top element: 3
## queue
q <- cpp_queue(c(2.1, 3.3))
q
#> First element: 2.1
push(q, 2.7)
q
#> First element: 2.1
back(q)
#> [1] 2.7
## priority queue
p <- cpp_priority_queue(c(2.4, 1.3, 4.2, 1.5))
p
#> First element: 4.2
top(p)
#> [1] 4.2
sorting(p)
#> [1] "descending"
## vector
v <- cpp_vector(2:4)
v
#> 2 3 4
reserve(v, 10)
capacity(v)
#> [1] 10
## deque
d <- cpp_deque(c("Alice", "Bob", "Bob"))
d
#> "Alice" "Bob" "Bob"
push_front(d, "Jane")
d
#> "Jane" "Alice" "Bob" "Bob"
pop_front(d)
d
#> "Alice" "Bob" "Bob"
## forward list
l <- cpp_forward_list(c(4, 2, 6))
l
#> 4 2 6
sort(l)
l
#> 2 4 6
resize(l, 7)
l
#> 2 4 6 0 0 0 0
## list
l <- cpp_list(c("Alice", "Bob", "Bob", "Alice"))
unique(l)
#> [1] 1
l
#> "Alice" "Bob" "Alice"
print(l, from = 3)
#> "Alice" "Bob" "Alice"