Quilt - Data Engineering Infrastructure
With Quilt you can build, push, and install data packages. Data packages are versioned, reusable data structures that can be loaded into Python. Quilt is designed to support reproducible, auditable, and compliant workflows. It consists of three source-level components: a data catalog, a data registry, and a data compiler.
Its features include:
- Versioning and storage of large data
- Transformation of a variety of file formats into data frames (via pandas and pyarrow)
- De-duplication of repeated data for reduced disk and network footprint
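The de-duplication feature can be pictured with a small content-addressed store. The sketch below is an illustration only and does not use Quilt's API or reproduce its internals: the `ContentStore` class, its methods, and the package names are all hypothetical. It assumes data is keyed by a SHA-256 hash of its bytes, so identical content shared by two package versions is stored once.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical, not Quilt's implementation)."""

    def __init__(self):
        self.objects = {}   # SHA-256 hex digest -> raw bytes, stored once
        self.packages = {}  # package name -> {logical path: digest}

    def add(self, package, path, data):
        """Record `data` under `package`/`path`, reusing stored bytes if seen before."""
        digest = hashlib.sha256(data).hexdigest()
        # setdefault keeps the existing copy when the digest is already present
        self.objects.setdefault(digest, data)
        self.packages.setdefault(package, {})[path] = digest
        return digest

    def get(self, package, path):
        """Resolve a logical path in a package back to its bytes."""
        return self.objects[self.packages[package][path]]


store = ContentStore()
# Two package versions that share one unchanged file
first = store.add("demo/v1", "data.csv", b"x,y\n1,2\n")
second = store.add("demo/v2", "data.csv", b"x,y\n1,2\n")
third = store.add("demo/v2", "extra.csv", b"x,y\n3,4\n")

print(len(store.objects))  # number of distinct blobs actually stored
```

Three files are registered across two versions, but only two distinct blobs are kept, because the unchanged `data.csv` hashes to the same digest in both. Under the same assumption, a client that already holds a digest would not need to transfer those bytes again.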
https://quiltdata.com
https://github.com/quiltdata/quilt
License:
Tech:
Tags: