In technical computing, getting data into and out of your code can be a pain. Data comes in all shapes, sizes and formats, with many different locations and storage access mechanisms.
DataSets.jl is a new package for describing data declaratively and mapping it neatly into your programs. We aim to make your code portable between data environments and remove the cruft of local paths and data access wrappers which litter technical analysis code.
DataSets.jl is an open source package for describing data format and location declaratively so that one can better separate data deserialization and access from the domain-specific analysis code which consumes that data.
To quote from the package documentation available at https://juliacomputing.github.io/DataSets.jl/dev :
DataSets.jl exists to help manage data and reduce the amount of data wrangling code you need to write. It's annoying to constantly rewrite
DataSets.jl provides scaffolding to make this kind of code more reusable. We want to make it easy to relocate an algorithm between different data environments without code changes. For example from your laptop to the cloud, to another user's machine, or to an HPC system.