A local installation helps you learn some of the Pachyderm basics and experiment. It is not designed to be a production environment.
This guide walks you through the steps to install Pachyderm on macOS®, Linux®, or Microsoft® Windows®.
To install Pachyderm on Windows, take a look at Deploy Pachyderm on Windows first.
We offer two ways to deploy Pachyderm on a local Kubernetes cluster.
- The first uses Pachyderm's client, pachctl, and the command pachctl deploy local.
- The second uses the deployment tool Helm.
- Helm support in Pachyderm is a beta release. See our supported releases documentation for details.
pachctl deploy local is designed for a single-node cluster. This cluster uses local storage on disk and does not create a PersistentVolume (PV). If you want to deploy a production multi-node cluster, follow the instructions for your cloud provider or on-prem installation as described in Deploy Pachyderm. New Kubernetes nodes cannot be added to this single-node cluster.
- Pachyderm supports the Docker runtime only. If you want to deploy Pachyderm on a system that uses another container runtime, ask for advice in our Slack channel.
Before you deploy Pachyderm, make sure that you have installed:
- A Kubernetes cluster running on your local environment, either Minikube or Kubernetes on Docker Desktop (both described below)
- Pachyderm Command Line Interface
- Helm, depending on your installation choice.
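As a quick way to check the prerequisites listed above, the following sketch reports which commands are missing from your PATH. The helper name missing_tools is hypothetical, and the list of command names (kubectl, pachctl, helm) is an assumption based on the tools this guide mentions; helm only matters if you choose the Helm route.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Command names assumed from the prerequisites in this guide.
missing=$(missing_tools kubectl pachctl helm)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "not found on PATH:" $missing
else
  echo "all prerequisite commands found"
fi
```

This only confirms that the commands exist; it does not check their versions or whether a Kubernetes cluster is actually running.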
On your local machine, you can run Pachyderm in a minikube virtual machine. Minikube is a tool that creates a single-node Kubernetes cluster. This limited installation is sufficient to try basic Pachyderm functionality and complete the Beginner Tutorial.
To configure Minikube, follow these steps:
- Install minikube and VirtualBox on your operating system as described in the Kubernetes documentation.
Any time you want to stop and restart Pachyderm, run minikube delete and minikube start. Minikube is not meant to be a production environment and does not handle being restarted well without a full wipe.
Using Kubernetes on Docker Desktop
If you are using Minikube, skip this section.
You can use Kubernetes on Docker Desktop instead of Minikube on macOS or Linux by following these steps:
In the Docker Desktop Preferences, enable Kubernetes:
From the command prompt, confirm that Kubernetes is running:
kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   56d
To reset your Kubernetes cluster that runs on Docker Desktop, click the Reset Kubernetes cluster button. See image above.
pachctl is a command-line tool that you can use to interact with a Pachyderm cluster in your terminal.
You need to have pachctl installed on your machine to deploy Pachyderm using the pachctl deploy local command.
Run the corresponding steps for your operating system:
- For macOS, run:
brew tap pachyderm/tap && brew install pachyderm/tap/pachctl@1.13
- For a Debian-based Linux 64-bit or Windows 10 or later running on WSL:
curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.13.0/pachctl_1.13.0_amd64.deb && sudo dpkg -i /tmp/pachctl.deb
- For all other Linux flavors:
curl -o /tmp/pachctl.tar.gz -L https://github.com/pachyderm/pachyderm/releases/download/v1.13.0/pachctl_1.13.0_linux_amd64.tar.gz && tar -xvf /tmp/pachctl.tar.gz -C /tmp && sudo cp /tmp/pachctl_1.13.0_linux_amd64/pachctl /usr/local/bin
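If you script the download, the release URLs above can be derived from a version number. The sketch below is a minimal illustration: the helper name pachctl_release_url and its deb/tar arguments are hypothetical, and it assumes that other releases follow the same file naming as the 1.13.0 URLs shown above.

```shell
# Hypothetical helper: print the pachctl download URL for a version and package kind.
# The URL pattern is copied from the install commands in this guide; whether every
# release follows it is an assumption.
pachctl_release_url() {
  version="$1"
  kind="$2"
  base="https://github.com/pachyderm/pachyderm/releases/download/v${version}"
  case "$kind" in
    deb) echo "${base}/pachctl_${version}_amd64.deb" ;;
    tar) echo "${base}/pachctl_${version}_linux_amd64.tar.gz" ;;
    *)   echo "unknown package kind: $kind" >&2; return 1 ;;
  esac
}

pachctl_release_url 1.13.0 deb
pachctl_release_url 1.13.0 tar
```

The printed URL can then be passed to curl -L in place of the hard-coded address.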
Verify that the installation was successful by running pachctl version --client-only:
pachctl version --client-only
COMPONENT           VERSION
pachctl             1.13.0
If you run pachctl version without the flag --client-only, the command times out. This is expected behavior because Pachyderm has not been deployed yet (pachd is not yet running).
A look at Pachyderm's high-level architecture diagram will help you build a mental image of Pachyderm's various architectural components.
If you choose to install Pachyderm using Helm, follow this installation guide.
When done with the Prerequisites, deploy Pachyderm on your local cluster by following these steps:
If you are new to Pachyderm, try Pachyderm Shell. This add-on tool suggests pachctl commands as you type. It will help you learn Pachyderm's main commands faster.
For macOS or Linux, run:
pachctl deploy local
This command generates a Pachyderm manifest and deploys Pachyderm on Kubernetes.
Try the following dry run to visualize your manifest:
pachctl deploy local --dry-run > pachyderm.json
For Windows, follow these steps:
- Start Windows Subsystem for Linux.
In WSL, run:
pachctl deploy local --dry-run > pachyderm.json
Copy the pachyderm.json file into your working directory.
From the same directory, run:
kubectl create -f ./pachyderm.json
Get the Repo Info:
$ helm repo add pachyderm https://pachyderm.github.io/helmchart
$ helm repo update
Edit a values file
Find a baseline file for local deployments in this example repository and set the
Install the Pachyderm Helm chart (Helm v3):
$ helm install pachd -f my_pachyderm_values.yaml pachyderm/pachyderm
Check your install
Check the status of the Pachyderm pods by periodically running kubectl get pods. When Pachyderm is ready for use, all Pachyderm pods must be in the Running status.
Because Pachyderm needs to pull the Pachyderm Docker image from DockerHub, it might take a few minutes for the status of the Pachyderm pods to change to Running.
kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
dash-6c9dc97d9c-vb972    2/2     Running   0          6m
etcd-7dbb489f44-9v5jj    1/1     Running   0          6m
pachd-6c878bbc4c-f2h2c   1/1     Running   0          6m
If you see a few restarts on the pachd pods, that means that Kubernetes tried to bring up those pods before etcd was ready. Therefore, Kubernetes restarted those pods. You can safely ignore those restarts.
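Rather than reading the pod listing by eye, you can check the STATUS column with a small filter. The sketch below is illustrative only: the function name all_pods_running is hypothetical, the sample input mirrors the listing above, and it assumes the default kubectl get pods column order (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and succeeds
# only if at least one pod is listed and every pod row has STATUS "Running".
all_pods_running() {
  awk 'NR > 1 { seen = 1; if ($3 != "Running") bad = 1 }
       END { exit ((seen && !bad) ? 0 : 1) }'
}

# Sample input mirroring the listing in this guide.
sample='NAME                     READY   STATUS    RESTARTS   AGE
dash-6c9dc97d9c-vb972    2/2     Running   0          6m
etcd-7dbb489f44-9v5jj    1/1     Running   0          6m
pachd-6c878bbc4c-f2h2c   1/1     Running   0          6m'

if printf '%s\n' "$sample" | all_pods_running; then
  echo "all pods running"
else
  echo "still waiting"
fi
```

Against a live cluster, the same check would be kubectl get pods | all_pods_running. Note that it looks only at STATUS and ignores the READY and RESTARTS columns.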
Run pachctl version to verify that pachd has been deployed.
$ pachctl version
COMPONENT           VERSION
pachctl             1.13.0
pachd               1.13.0
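To confirm in a script that the client and server report the same version, you can compare the two rows of this output. This is a minimal sketch: the function name versions_match is hypothetical, and it assumes the two-column COMPONENT/VERSION layout shown above.

```shell
# Hypothetical helper: reads `pachctl version` output on stdin and succeeds
# when the pachctl and pachd rows report the same version string.
versions_match() {
  awk '$1 == "pachctl" { client = $2 "" }   # appending "" forces string comparison
       $1 == "pachd"   { server = $2 "" }
       END { exit ((client != "" && client == server) ? 0 : 1) }'
}

# Sample input mirroring the output in this guide.
sample='COMPONENT           VERSION
pachctl             1.13.0
pachd               1.13.0'

if printf '%s\n' "$sample" | versions_match; then
  echo "client and server versions match"
else
  echo "version mismatch or pachd not reachable"
fi
```

Against a running deployment, the same check would be pachctl version | versions_match.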
Open a new terminal window.
Use port forwarding to access the Pachyderm dashboard (Pachyderm UI).
pachctl port-forward
This command runs continuously and does not exit unless you interrupt it.
Minikube users: you can alternatively set up Pachyderm to directly connect to the Minikube instance:
Get your Minikube IP address:
Configure Pachyderm to connect directly to the Minikube instance:
pachctl config update context `pachctl config get active-context` --pachd-address=<minikube ip>:30080
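The two Minikube steps above can be combined so that the IP address is filled in automatically. The sketch below is an illustration under stated assumptions: the helper name pachd_address is hypothetical, the IP in the example call is a placeholder rather than a value from a real cluster, and the commented command assumes that minikube ip prints the address of your Minikube instance.

```shell
# Hypothetical helper: build the pachd address for a given Minikube IP.
# Port 30080 is the port used in the command in this guide.
pachd_address() {
  echo "${1}:30080"
}

# Placeholder IP for illustration only.
pachd_address 192.168.99.100

# On a machine with minikube and pachctl installed, the steps combine as:
#   pachctl config update context "$(pachctl config get active-context)" \
#     --pachd-address="$(pachd_address "$(minikube ip)")"
```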
Complete the Beginner Tutorial to learn the basics of Pachyderm, such as adding data and building analysis pipelines.
Explore the Pachyderm Dashboard. By default, Pachyderm deploys the Pachyderm Enterprise dashboard. You can use a FREE trial token to experiment with the dashboard. Point your browser to port 30080 on your minikube IP. Alternatively, if you cannot connect directly, enable port forwarding by running pachctl port-forward, and then point your browser to