The R data.table package is an extension of data.frame built for fast in-memory data analysis. Use the dt tag for the DataTables package with Shiny (DT).
R's data.table package provides an enhanced version of data.frame including fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast overlapping range joins, fast add/modify/delete of columns by reference by group using no copies at all, and a fast file reader: fread. It has a natural syntax: DT[where|order, select|update, by]. SQL-inspired syntax enables joins within [] by using on to specify matching columns. These queries can be chained together just by adding another one on the end: DT[...][...].
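As a minimal sketch of the DT[where, select, by] form, a join with on, and chaining, the following may help. The tables orders and regions and their columns are hypothetical, made up purely for illustration.

```r
library(data.table)

# Hypothetical example data; table and column names are illustrative only.
orders <- data.table(
  customer = c("a", "b", "a", "c", "b"),
  amount   = c(10, 20, 30, 40, 50)
)
regions <- data.table(
  customer = c("a", "b", "c"),
  region   = c("north", "south", "north")
)

# DT[where, select, by]: keep rows with amount > 10, then sum per customer.
totals <- orders[amount > 10, .(total = sum(amount)), by = customer]

# Join inside [] using `on`: look up each order's region,
# then chain two more queries (aggregate by region, sort descending).
by_region <- regions[orders, on = "customer"][
  , .(total = sum(amount)), by = region][order(-total)]

print(totals)
print(by_region)
```

Here regions[orders, on = "customer"] returns one row per row of orders, with the matching region attached; each further [...] operates on the result of the previous one.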
The aggregation features are analogous to stats::ave, plyr::ddply, dplyr::group_by and Python's pandas, but faster.
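To make the analogy with stats::ave concrete, here is a small sketch under assumed data (the table DT and its columns g and v are hypothetical). It shows a grouped summary and a per-group column added by reference with :=, and checks that the latter agrees with ave(). It does not attempt to demonstrate the speed claim.

```r
library(data.table)

# Hypothetical data; names are illustrative only.
DT <- data.table(g = c("x", "y", "x", "y"), v = c(1, 2, 3, 4))

# Grouped summary: one row per group.
means <- DT[, .(mean_v = mean(v)), by = g]

# Add a per-group column by reference with `:=`;
# this plays the same role as stats::ave(), which returns
# the group mean aligned to the original rows.
DT[, group_mean := mean(v), by = g]

# Compare with base R's ave() (default FUN is mean).
same <- isTRUE(all.equal(DT$group_mean, ave(DT$v, DT$g)))

print(means)
print(DT)
```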
Repositories
- Development version on GitHub (issues, wiki, speed benchmarks, Articles, Presentations). Install with install.packages('data.table', type = 'source', repos = 'http://Rdatatable.github.io/data.table').
- Stable version on CRAN
Detailed HTML vignettes
- Introduction to data.table
- Reference Semantics
- Keys and fast binary search based subsets
- Efficient reshaping using data.table's dcast and melt
- Secondary sorting keys/indexing
- Tips for benchmarking data.table
- Tips for using data.table functionality in your package
Other vignettes to follow, see here and feel free to voice support for your most-wanted!
Other resources
- DataCamp's cheat sheet
- Posts on R-bloggers
- Cheat Sheet by Erik Petrovski
Other operations to be benchmarked.