The Pachyderm platform combines version control for data with the tools to build scalable end-to-end ML/AI pipelines, while letting users develop their code in any language, framework, or tool of their choice. It is designed as a foundation for teams that want to use ML and AI to solve real-world problems reliably.
The Pachyderm platform includes the following main components:
- Pachyderm File System (PFS)
- Pachyderm pipelines
To start, you need to understand the foundational concepts of Pachyderm's data versioning and pipeline semantics. After you have a good grasp of the basics, you can use advanced concepts and features for more complicated challenges.
This section describes the following Pachyderm concepts:
- Versioned Data Concepts: the main Pachyderm abstractions that you work with when using Pachyderm.
- The main concepts of the Pachyderm pipeline system.
- More about Pachyderm abstractions: Global IDs, deferred processing, and distributed computing.
Last update: November 1, 2021