This guide walks you through the steps to install Pachyderm on macOS®, Linux®, or Microsoft® Windows®. Local installation helps you to learn some of the Pachyderm basics and is not designed to be a production environment.
pachctl deploy local is designed for a single-node cluster. This cluster uses local storage on disk and does not create a PersistentVolume (PV). If you want to deploy a production multi-node cluster, follow the instructions for your cloud provider or on-prem installation as described in Deploy Pachyderm. New Kubernetes nodes cannot be added to this single-node cluster.
Pachyderm supports the Docker runtime only. If you want to deploy Pachyderm on a system that uses another container runtime, ask for advice in our Slack channel.
Before you can deploy Pachyderm, make sure you have the following programs installed on your computer:
If you want to install Pachyderm on Windows, follow the instructions in Deploy Pachyderm on Windows.
On your local machine, you can run Pachyderm in a minikube virtual machine. Minikube is a tool that creates a single-node Kubernetes cluster. This limited installation is sufficient to try basic Pachyderm functionality and complete the Beginner Tutorial.
To configure Minikube, follow these steps:
- Install minikube and VirtualBox on your operating system as described in the Kubernetes documentation.
Any time you want to stop and restart Pachyderm, run minikube delete and minikube start. Minikube is not meant to be a production environment and does not handle being restarted well without a full wipe.
If you are using Minikube, skip this section and proceed to Install pachctl.
You can use Docker Desktop instead of Minikube on macOS or Linux by following these steps:
- In the Docker Desktop settings, verify that Kubernetes is enabled:
- From the command prompt, confirm that Kubernetes is running:
kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   56d
- To reset your Kubernetes cluster that runs on Docker Desktop, click the Reset button in the Preferences sub-menu.
pachctl is a command-line utility that you can use to interact with a Pachyderm cluster.
To deploy Pachyderm locally, install pachctl on your machine by following these steps:
Run the corresponding steps for your operating system:
For macOS, run:
brew tap pachyderm/tap && brew install pachyderm/tap/pachctl@1.12
For a Debian-based 64-bit Linux distribution, or Windows 10 or later running WSL:
curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.12.1/pachctl_1.12.1_amd64.deb && sudo dpkg -i /tmp/pachctl.deb
For all other Linux flavors:
curl -o /tmp/pachctl.tar.gz -L https://github.com/pachyderm/pachyderm/releases/download/v1.12.1/pachctl_1.12.1_linux_amd64.tar.gz && tar -xvf /tmp/pachctl.tar.gz -C /tmp && sudo cp /tmp/pachctl_1.12.1_linux_amd64/pachctl /usr/local/bin
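The two Linux download commands above differ only in the version number and the package file name. As a minimal sketch, the helper below builds either URL from a single version argument so that the version is stated in one place. The function name `pachctl_release_url` and the `deb`/`tar` labels are hypothetical names chosen for this example, and the sketch assumes that other releases follow the same URL pattern as v1.12.1.

```shell
# Builds the GitHub release URL for a pachctl package.
# $1: version without the leading "v" (for example 1.12.1)
# $2: package kind, "deb" or "tar" (labels invented for this sketch)
pachctl_release_url() {
  version="$1"
  kind="$2"
  base="https://github.com/pachyderm/pachyderm/releases/download/v${version}"
  case "$kind" in
    deb) echo "${base}/pachctl_${version}_amd64.deb" ;;
    tar) echo "${base}/pachctl_${version}_linux_amd64.tar.gz" ;;
    *)   echo "unknown package kind: $kind" >&2; return 1 ;;
  esac
}
```

For example, the Debian install could then be written as `curl -o /tmp/pachctl.deb -L "$(pachctl_release_url 1.12.1 deb)" && sudo dpkg -i /tmp/pachctl.deb`.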
Verify that the installation was successful by running pachctl version --client-only:
pachctl version --client-only
COMPONENT           VERSION
pachctl             1.12.1
If you run pachctl version without --client-only, the command times out. This is expected behavior because pachd is not yet running.
After you configure all the Prerequisites, deploy Pachyderm by following these steps:
If you are new to Pachyderm, try Pachyderm Shell. This handy tool suggests pachctl commands as you type and helps you learn Pachyderm faster.
- For macOS or Linux, run:
pachctl deploy local
This command generates a Pachyderm manifest and deploys Pachyderm on Kubernetes.
In WSL, run:
pachctl deploy local --dry-run > pachyderm.json
Copy the pachyderm.json file into your working directory.
From the same directory, run:
kubectl create -f ./pachyderm.json
Because Pachyderm needs to pull the Pachyderm Docker image from DockerHub, it might take a few minutes for the status of the Pachyderm pods to change to Running.
- Check the status of the Pachyderm pods by periodically running kubectl get pods. When Pachyderm is ready for use, all Pachyderm pods must be in the Running status.
kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
dash-6c9dc97d9c-vb972    2/2     Running   0          6m
etcd-7dbb489f44-9v5jj    1/1     Running   0          6m
pachd-6c878bbc4c-f2h2c   1/1     Running   0          6m
If you see a few restarts on the pachd pods, Kubernetes tried to bring up those pods before etcd was ready and therefore restarted them. You can safely ignore these restarts.
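Instead of rerunning kubectl get pods by hand, the check can be scripted. The following is a minimal sketch, not part of Pachyderm: `all_pods_running` and `wait_for_pods` are hypothetical helper names, the sketch assumes the default `kubectl get pods` table layout shown above (STATUS in the third column), and the 5-second interval and 60 attempts are arbitrary illustrative values.

```shell
# Succeeds only if every pod row read from stdin reports STATUS "Running".
# Expects the table printed by `kubectl get pods` (STATUS is column 3).
all_pods_running() {
  awk '
    NR == 1 || NF == 0 { next }   # skip the header row and blank lines
    { rows++ }                    # count pod rows
    $3 != "Running" { bad++ }     # count pods that are not Running yet
    END { if (rows == 0 || bad > 0) exit 1 }
  '
}

# Polls the cluster until all pods are Running or the attempts run out.
# $1: maximum number of attempts (defaults to 60, an arbitrary value)
wait_for_pods() {
  attempts="${1:-60}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if kubectl get pods | all_pods_running; then
      echo "All pods are Running."
      return 0
    fi
    i=$((i + 1))
    sleep 5
  done
  echo "Timed out waiting for pods." >&2
  return 1
}
```

Calling `wait_for_pods` after the deploy step blocks until the cluster is ready. An empty table is treated as "not ready", so the loop also keeps waiting while no pods are listed yet.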
Run pachctl version to verify that pachd has been deployed.
COMPONENT           VERSION
pachctl             1.12.1
pachd               1.12.1
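The output above lists the client and server versions side by side. As a rough sketch, the comparison can be automated by parsing that table. `versions_match` is a hypothetical helper name, and the sketch assumes the two-column COMPONENT/VERSION layout shown above.

```shell
# Reads `pachctl version` output on stdin and succeeds only if the
# pachctl and pachd rows are both present and report the same version.
versions_match() {
  awk '
    $1 == "pachctl" { client = $2 "" }   # appending "" forces string comparison
    $1 == "pachd"   { server = $2 "" }
    END {
      if (client == "" || server == "" || client != server) exit 1
    }
  '
}
```

A possible use is `pachctl version | versions_match && echo "client and pachd versions match"`. With only the client row present, as in the --client-only output, the check fails because the pachd row is missing.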
Open a new terminal window.
Use port forwarding to access the Pachyderm dashboard:
pachctl port-forward
This command runs continuously and does not exit unless you interrupt it.
Alternatively, you can set up Pachyderm to directly connect to the Minikube instance:
Get your Minikube IP address:
minikube ip
Configure Pachyderm to connect directly to the Minikube instance:
pachctl config update context `pachctl config get active-context` --pachd-address=`minikube ip`:30650
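The same configuration step can be written with `$(...)` command substitution, which nests and quotes more predictably than backticks. This is only a sketch: `pachd_address` and `connect_to_minikube` are hypothetical helper names, and the port 30650 is taken from the command above.

```shell
# Prints the pachd address for a host.
# $1: host or IP address
# $2: port (optional; defaults to 30650, the port used in this guide)
pachd_address() {
  echo "${1}:${2:-30650}"
}

# Points the active pachctl context at the Minikube instance.
# Requires pachctl and minikube on PATH; nothing runs until it is called.
connect_to_minikube() {
  pachctl config update context "$(pachctl config get active-context)" \
    --pachd-address="$(pachd_address "$(minikube ip)")"
}
```

Running `connect_to_minikube` has the same effect as the one-line command above. Keeping the address in its own function makes it easy to print and inspect the value before it is written to the context.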
After you install and configure Pachyderm, continue exploring Pachyderm:
Complete the Beginner Tutorial to learn the basics of Pachyderm, such as adding data and building analysis pipelines.
Explore the Pachyderm Dashboard. By default, Pachyderm deploys the Pachyderm Enterprise dashboard. You can use a FREE trial token to experiment with the dashboard. Point your browser to port 30080 on your minikube IP. Alternatively, if you cannot connect directly, enable port forwarding by running pachctl port-forward, and then point your browser to