This section covers the foundational concepts of Pachyderm's data versioning and pipeline semantics, organized around two main components:

  • Pachyderm File System (PFS) manages Pachyderm's data and versioning system.
  • Pachyderm Pipeline System (PPS) enables you to perform various transformations on your data.

After you have a good grasp of the basics, move on to the advanced concepts and features.

In particular, you will learn:

Versioned Data Concepts     

Learn about the main abstractions that you will work with when using Pachyderm.

Pipeline Concepts    

Learn the main concepts of the Pachyderm pipeline system.

Advanced Concepts     

Learn more about Pachyderm abstractions: Global IDs, deferred processing, and distributed computing.

Last update: July 23, 2022