Welcome to Piperoni’s documentation!
Piperoni is a lightweight ETL framework for any data type that lets you make, track, and visualize atomic data transformations. Unlike some ETL tools, Piperoni transforms data in memory, which makes it well suited to complex, heterogeneous datasets rather than “big data” workloads.
Piperoni checks that the expected types are passed from one transformation to the next and lets you inspect the state of the data at any step. This visibility makes it a good fit for collaborative data pipelines, where seeing exactly how data was transformed is key.
Piperoni allows you to:
Chain custom and built-in Operators together to transform data.
Use custom or built-in Extractors to pull data from any format and transform it.
Use custom or built-in Transformers to manipulate data for featurization, normalization, etc.
Create custom Casters to explicitly cast from one datatype to another (e.g. dict to pandas DataFrame).
Log every operation before and after transformation.
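The ideas in the list above (chained steps, type checks between steps, and logging before and after each transformation) can be illustrated with a minimal, self-contained sketch. It does not use Piperoni's actual API: the names Step, run_chain, parse_rows, normalize, and to_mapping are hypothetical and exist only for this illustration.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline_sketch")


@dataclass
class Step:
    """Hypothetical stand-in for an operator: one atomic transformation."""

    name: str
    input_type: type
    output_type: type
    func: Callable[[Any], Any]

    def run(self, data: Any) -> Any:
        # Check the incoming type before transforming.
        if not isinstance(data, self.input_type):
            raise TypeError(
                f"{self.name} expected {self.input_type.__name__}, "
                f"got {type(data).__name__}"
            )
        logger.info("before %s: %r", self.name, data)
        result = self.func(data)
        # Check the outgoing type so the next step can rely on it.
        if not isinstance(result, self.output_type):
            raise TypeError(
                f"{self.name} should return {self.output_type.__name__}, "
                f"got {type(result).__name__}"
            )
        logger.info("after %s: %r", self.name, result)
        return result


def run_chain(steps: List[Step], data: Any) -> Any:
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step.run(data)
    return data


def parse_rows(text: str) -> list:
    """Extract: parse "name,score" lines into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        name, score = line.split(",")
        rows.append({"name": name, "score": float(score)})
    return rows


def normalize(rows: list) -> list:
    """Transform: scale scores so the largest becomes 1.0."""
    top = max(row["score"] for row in rows)
    return [{**row, "score": row["score"] / top} for row in rows]


def to_mapping(rows: list) -> dict:
    """Cast: convert the list of rows into a dict keyed by name."""
    return {row["name"]: row["score"] for row in rows}


steps = [
    Step("extract", str, list, parse_rows),
    Step("normalize", list, list, normalize),
    Step("cast", list, dict, to_mapping),
]

result = run_chain(steps, "a,2\nb,4\n")
```

Because every step declares its input and output types, passing the wrong kind of data (for example an integer instead of a string) fails at the first step with a TypeError instead of surfacing later in the chain.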
Installation
You can install piperoni via pip:
$ pip install piperoni