Helm Deployment

Currently, the pachctl deploy command is the authoritative deployment method for Pachyderm. However, you can now deploy Pachyderm using the package manager Helm.

Note

  • Helm support for Pachyderm is a beta release. See our supported releases documentation for details.
  • Changes coming with Helm: For improved security, Pachyderm services are now exposed on the cluster-internal IP (ClusterIP) instead of each node's IP (NodePort). These changes do not apply to local Helm installations, where services remain accessible through NodePorts.

This page gives you a high-level view of the steps to follow to install Pachyderm using Helm. Find our charts on Artifact Hub.

Install

Prerequisites

  1. Install Helm.

  2. Choose the deployment guidelines that apply to you:

    • Find the deployment page that applies to your cloud provider (or to your custom or on-premises deployment). It lists the installation prerequisites, Kubernetes deployment instructions, and kubectl installation steps that fit your use case.

      For example, if your Cloud provider is Google Cloud Platform, follow the Prerequisites and Deploy Kubernetes sections of the deployment on Google Cloud Platform page.

    • Those instructions also help you configure the elements (persistent volume, object store, credentials, and so on) that your deployment needs. With pachctl deploy, the values of those parameters are ultimately passed as arguments and flags of the command.

      For example, a generic pachctl deploy google command looks like this:

      pachctl deploy google <bucket-name> <disk-size> [<credentials-file>] [flags]
      
      This command comes with an exhaustive list of available flags.

      With a Helm installation, those same parameter values are instead specified in a YAML configuration file, as described in the next section.

Edit a values.yaml file

Create your own my_pachyderm_values.yaml from one of the files in this example repository. Pick the example that fits your target deployment and update the relevant fields with the parameters gathered in the previous step.

See the conversion table at the end of this page. It maps each pachctl deploy argument and flag to its counterpart attribute in values.yaml.

See also the reference values.yaml for an exhaustive list of all parameters.
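
As a rough illustration of this step, the following shell snippet writes a small my_pachyderm_values.yaml. The attribute names are taken from the conversion table at the end of this page. The values (for instance the storage class name standard) are placeholders rather than recommendations, and the file is not a complete configuration: a real deployment should start from the example file that matches your target environment.

```shell
# Write an illustrative values file. Attribute names come from the
# conversion table on this page; every value here is a placeholder.
cat > my_pachyderm_values.yaml <<'EOF'
etcd:
  storageClass: "standard"   # hypothetical storage class name
  resources:
    requests:
      cpu: "1"
      memory: "2G"
pachd:
  numShards: 16
  resources:
    requests:
      cpu: "1"
      memory: "2G"
  storage:
    putFileConcurrencyLimit: 100
    uploadConcurrencyLimit: 100
EOF

# Show the result for review before passing it to Helm.
cat my_pachyderm_values.yaml
```

Attributes that are left out of the file keep the chart defaults, so the file only needs to list what you want to change.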

Install the Pachyderm Helm Chart

  1. Get your Helm Repo Info

    $ helm repo add pachyderm https://pachyderm.github.io/helmchart
    $ helm repo update
    

  2. Install Pachyderm

    You are ready to deploy Pachyderm on the environment of your choice.

    $ helm install pachd -f my_pachyderm_values.yaml pachyderm/pachyderm
    
    You can choose a specific Helm chart version by adding a --version flag (for example, --version 0.3.0). Each chart version is associated with a given version of Pachyderm. If you omit the flag, Helm installs the latest available chart version, and with it the associated version of Pachyderm. Artifact Hub lists all available chart versions and their associated Pachyderm versions.
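
A minimal sketch of pinning the chart version: the helper below only assembles and prints the install command from this page, so that it can be reviewed before it is run. The function name build_install_cmd is hypothetical; the release name pachd, the values file name, and the --version flag come from the steps above.

```shell
# Print (rather than run) the install command, optionally pinned to a
# chart version. Usage: build_install_cmd [chart-version] [values-file]
build_install_cmd() {
  version="$1"
  values="${2:-my_pachyderm_values.yaml}"
  cmd="helm install pachd -f $values pachyderm/pachyderm"
  if [ -n "$version" ]; then
    cmd="$cmd --version $version"
  fi
  printf '%s\n' "$cmd"
}

# To list the chart versions known to your local repo cache, you can run:
#   helm search repo pachyderm/pachyderm --versions

build_install_cmd 0.3.0   # pinned to chart version 0.3.0
build_install_cmd         # no pin: Helm picks the latest chart version
```

Pinning the version in scripts keeps repeated installs reproducible, since an unpinned install follows whatever chart version is latest at the time.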

Check your installation

  1. Check your deployment

    $ kubectl get pods
    

    System Response:

    NAME                     READY     STATUS    RESTARTS   AGE
    dash-6c9dc97d9c-89dv9    2/2       Running   0          1m
    etcd-0                   1/1       Running   0          4m
    pachd-65fd68d6d4-8vjq7   1/1       Running   0          4m
    
  2. Verify that the Pachyderm cluster is up and running

    $ pachctl version
    

    System Response:

    COMPONENT           VERSION
    pachctl             1.13.0
    pachd               1.13.0
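
The two checks above can be scripted. The sketch below assumes the default tabular output of kubectl get pods shown in the sample response; the function name all_pods_ready is hypothetical. It reads that output on standard input and succeeds only when every listed pod is Running with all of its containers ready.

```shell
# Reads `kubectl get pods` output on stdin. Exits 0 only if at least one
# pod is listed and every pod is Running with all containers ready.
all_pods_ready() {
  awk '
    NR == 1 || NF == 0 { next }          # skip the header row and blank lines
    {
      count++
      split($2, ready, "/")              # READY column, e.g. "2/2"
      if (ready[1] != ready[2] || $3 != "Running") {
        print "not ready: " $1
        bad = 1
      }
    }
    END { exit (bad || count == 0) ? 1 : 0 }
  '
}

# Demo on the sample response from this page. Against a live cluster you
# would run: kubectl get pods | all_pods_ready && pachctl version
if all_pods_ready <<'EOF'
NAME                     READY     STATUS    RESTARTS   AGE
dash-6c9dc97d9c-89dv9    2/2       Running   0          1m
etcd-0                   1/1       Running   0          4m
pachd-65fd68d6d4-8vjq7   1/1       Running   0          4m
EOF
then
  echo "all pods ready"
fi
```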
    

Uninstall the Pachyderm Helm Chart

Helm uninstalls a release as easily as it installs one.

$ helm uninstall pachd 

Conversion table

FLAG OPTION                            values.yaml ATTRIBUTE                          DEFAULT

common
--image-pull-secret                    imagePullSecret                                ""

dash
--dash-image                           dash.image.repository                          pachyderm/dash
--registry                             dash.image.tag                                 "0.5.57"

etcd
--dynamic-etcd-nodes                   etcd.dynamicNodes                              1
--etcd-storage-class                   etcd.storageClass                              ""
--etcd-cpu-request                     etcd.resources.requests.cpu                    "1"
--etcd-memory-request                  etcd.resources.requests.memory                 "2G"

pachd
--block-cache-size                     pachd.blockCacheBytes                          "1G"
--cluster-deployment-id                pachd.clusterDeploymentID                      ""
inverse of --no-expose-docker-socket   pachd.exposeDockerSocket                       false
--expose-object-api                    pachd.exposeObjectAPI                          false
--pachd-cpu-request                    pachd.resources.requests.cpu                   "1"
--pachd-memory-request                 pachd.resources.requests.memory                "2G"
--require-critical-servers-only        pachd.requireCriticalServersOnly               false
--shards                               pachd.numShards                                16
--worker-service-account               pachd.workerServiceAccount.name                pachyderm-worker

pachd.storage
--put-file-concurrency-limit           pachd.storage.putFileConcurrencyLimit          100
--upload-concurrency-limit             pachd.storage.uploadConcurrencyLimit           100

pachd.storage.amazon
--cloudfront-distribution              pachd.storage.amazon.cloudFrontDistribution    ""
--credentials                          pachd.storage.amazon.id                        ""
                                       pachd.storage.amazon.secret                    ""
                                       pachd.storage.amazon.token                     ""
--disable-ssl                          pachd.storage.amazon.disableSSL                false
--iam-role                             pachd.storage.amazon.iamRole                   ""
--max-upload-parts                     pachd.storage.amazon.maxUploadParts            10000
--no-verify-ssl                        pachd.storage.amazon.noVerifySSL               false
--obj-log-options                      pachd.storage.amazon.logOptions                ""
--part-size                            pachd.storage.amazon.partSize                  5242880
--retries                              pachd.storage.amazon.retries                   10
--reverse                              pachd.storage.amazon.reverse                   true
--timeout                              pachd.storage.amazon.timeout                   "5m"
--upload-acl                           pachd.storage.amazon.uploadACL                 bucket-owner-full-control

pachd.storage.local
--host-path                            pachd.storage.local.hostPath                   "/var/pachyderm/"

rbac
--local-roles                          rbac.clusterRBAC                               true
--no-rbac                              opposite of rbac.create                        true

tls
--tls                                  tls.crt                                        ""
                                       tls.key                                        ""

Notes:

  • --credentials: id, together with secret and token, implements the functionality of the --credentials argument to pachctl deploy. All three default to "".
  • --obj-log-options: Comma-separated list containing zero or more of: 'Debug', 'Signing', 'HTTPBody', 'RequestRetries', 'RequestErrors', 'EventStreamBody', or 'all' (case-insensitive). See the 'AWS SDK for Go' docs for details.
  • --tls: tls.crt and tls.key both default to "".
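
To illustrate how the table can be used, here is a small sketch that translates a few pachctl deploy flags into Helm --set arguments. The function names flag_to_attribute and to_set_arg are hypothetical, and only a subset of the one-to-one rows is covered. Inverted rows such as --no-rbac are deliberately left out because their values must be flipped by hand.

```shell
# Map a pachctl deploy flag to its values.yaml attribute, using a few
# one-to-one rows of the conversion table.
flag_to_attribute() {
  case "$1" in
    --image-pull-secret)    echo "imagePullSecret" ;;
    --dynamic-etcd-nodes)   echo "etcd.dynamicNodes" ;;
    --etcd-storage-class)   echo "etcd.storageClass" ;;
    --etcd-cpu-request)     echo "etcd.resources.requests.cpu" ;;
    --etcd-memory-request)  echo "etcd.resources.requests.memory" ;;
    --pachd-cpu-request)    echo "pachd.resources.requests.cpu" ;;
    --pachd-memory-request) echo "pachd.resources.requests.memory" ;;
    --shards)               echo "pachd.numShards" ;;
    *) echo "no direct mapping known for $1" >&2; return 1 ;;
  esac
}

# Turn "<flag> <value>" into a "--set attribute=value" argument for helm.
to_set_arg() {
  attr=$(flag_to_attribute "$1") || return 1
  printf '%s\n' "--set $attr=$2"
}

to_set_arg --shards 16
to_set_arg --etcd-storage-class ssd   # "ssd" is a placeholder class name
```

The printed arguments can be appended to the helm install command as an alternative to editing the values file. For values that must remain strings, such as a CPU request of "1", Helm's --set-string flag may be the safer choice.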

pachctl deploy flag deprecation

Deprecation notice

With the addition of the Helm chart, the following pachctl deploy flags are deprecated and will be removed in the future:

  • dash-image
  • dashboard-only
  • no-dashboard
  • expose-object-api
  • storage-v2
  • shards
  • no-rbac
  • no-guaranteed
  • static-etcd-volume
  • disable-ssl
  • max-upload-parts
  • no-verify-ssl
  • obj-log-options
  • part-size
  • retries
  • reverse
  • timeout
  • upload-acl


Last update: April 5, 2021