Egress PPS

Push the results of a Pipeline to an external data store or an SQL Database.

March 29, 2023

For a single-page view of all PPS options, go to the PPS series page.

Spec


"egress": {
    // Egress to an object store
    "URL": "s3://bucket/dir",
    // Egress to a database
    "sql_database": {
        "url": string,
        "file_format": {
            "type": string,
            "columns": [string]
        },
        "secret": {
            "name": string,
            "key": "PACHYDERM_SQL_PASSWORD"
        }
    }
},
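As a minimal sketch of the object-store case, the fragment below fills in only the `URL` field from the spec above. The bucket name and path are hypothetical placeholders, not values from this reference.

```json
{
    "egress": {
        // Hypothetical bucket and prefix; replace with your own object store location.
        "URL": "s3://example-bucket/pipeline-output"
    }
}
```

With a fragment like this in a pipeline spec, the pipeline's output would be written to the given object store location rather than to a SQL database.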

Attributes

| Attribute | Description |
| --- | --- |
| URL | The URL of the object store where the pipeline’s output data should be written. |
| sql_database | An optional field that is used to specify how the pipeline should write output data to a SQL database. |
| url | The URL of the SQL database, in the format postgresql://user:password@host:port/database. |
| file_format | The file format of the output data, which can be specified as csv or tsv. This field also includes the column names that should be included in the output. |
| secret | The name and key of the Kubernetes secret that contains the password for the SQL database. |
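The attributes above might combine as follows for the SQL database case. This is an illustrative sketch only: the host, user, database name, column names, and secret name are all hypothetical, and the `url` value simply follows the format stated for that attribute. The `key` value is the one shown in the spec.

```json
{
    "egress": {
        "sql_database": {
            // Hypothetical connection URL in the documented format.
            "url": "postgresql://user:password@db.example.com:5432/results",
            "file_format": {
                "type": "csv",
                // Hypothetical column names for the output table.
                "columns": ["id", "label", "score"]
            },
            "secret": {
                // Hypothetical name of a Kubernetes secret holding the database password.
                "name": "example-sql-secret",
                "key": "PACHYDERM_SQL_PASSWORD"
            }
        }
    }
}
```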

Behavior

The egress field in a Pachyderm Pipeline Spec is used to specify how the pipeline should write the output data. The egress field supports two types of outputs: writing to an object store and writing to a SQL database.

Data is pushed after the user code finishes running but before the job is marked as successful. For more information, see Egress Data to an object store or Egress Data to a database.

The egress field is required only if the pipeline needs to write output data to an external storage system; otherwise it can be omitted.

When to Use

Use the egress field in a Pachyderm Pipeline Spec when your pipeline's output data must be written to an external storage system, such as an object store or a SQL database.

Example scenarios: