CLI Reference¶

Bodywork is distributed as a Python 3 package that exposes a CLI for interacting with your Kubernetes cluster. Using the Bodywork CLI, you can deploy Bodywork-compatible ML projects packaged as Git repositories hosted on GitHub, GitLab, Azure DevOps or Bitbucket. This page is a reference for all Bodywork CLI commands.

Get Version¶

$ bodywork --version

Prints the Bodywork package version to stdout.

Validate Configuration File¶

The bodywork.yaml file can be checked for errors by issuing the following command,

$ bodywork validate --check-files

The optional --check-files flag will check if all executable_module_path paths map to files that exist and can be reached by Bodywork, from the root directory where bodywork.yaml is located. This command assumes that bodywork.yaml is in the current working directory - if this is not the case, use the --file option to specify the path of bodywork.yaml. Validation errors are printed to stdout.
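As a minimal sketch of the --file option described above, the snippet below validates a configuration file that is not in the current working directory. The path my-project/bodywork.yaml and the helper validate_config are hypothetical, and passing the path as a separate argument (--file PATH) is an assumption - check bodywork validate --help for the exact form.

```shell
# Hypothetical path to a project's config file; replace with your own.
CONFIG_FILE="my-project/bodywork.yaml"

validate_config() {
    # $1: path to bodywork.yaml
    if [ ! -f "$1" ]; then
        echo "config file not found: $1" >&2
        return 1
    fi
    # --check-files also verifies that each executable_module_path exists.
    # Assumption: --file accepts the path as a separate argument.
    bodywork validate --file "$1" --check-files
}

# Only attempt validation when the Bodywork CLI is installed.
if command -v bodywork >/dev/null 2>&1; then
    validate_config "$CONFIG_FILE" || echo "validation failed for $CONFIG_FILE"
else
    echo "bodywork CLI not found on PATH; skipping validation"
fi
```

Checking for the file first gives a clearer message than letting the CLI fail on a missing path.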

Configure Namespace¶

$ bodywork setup-namespace YOUR_NAMESPACE

Create and prepare a Kubernetes namespace for running Bodywork workflows - see Preparing a Namespace for use with Bodywork for more information. This command will also work with namespaces created by other means - e.g. kubectl create ns YOUR_NAMESPACE - where it will not seek to recreate the existing namespace, only to ensure that it is correctly configured.

Run Workflow¶

$ bodywork workflow \
    --namespace=YOUR_NAMESPACE \
    REMOTE_GIT_REPO_URL \
    REMOTE_GIT_REPO_BRANCH

Clones the chosen branch of a Git repository containing a Bodywork ML project and then executes the workflow configured within it. This will start a Bodywork workflow-controller wherever the command is called. If you are working with private remote repositories you will need to use the SSH protocol and ensure that the appropriate private-key is available within a secret - see Working with Private Git Repositories using SSH for more information.
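Because private repositories require the SSH protocol, it can help to check the URL form before starting a workflow. The following is a sketch only: the helper is_ssh_url, the namespace, the repository URL and the branch are all hypothetical, and the final command is printed rather than executed.

```shell
# Return 0 if the Git URL looks like an SSH URL, 1 otherwise.
# Hypothetical helper; it only inspects the URL's shape.
is_ssh_url() {
    case "$1" in
        git@*:*|ssh://*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical values; replace with your own.
NAMESPACE="bodywork-dev"
REPO_URL="git@github.com:my-org/my-ml-project.git"
BRANCH="main"

if is_ssh_url "$REPO_URL"; then
    echo "SSH URL: a secret holding the private key must be available"
else
    echo "non-SSH URL: per the text above, private repositories need SSH"
fi

# Printed rather than executed; drop the leading 'echo' to run it.
echo bodywork workflow --namespace="$NAMESPACE" "$REPO_URL" "$BRANCH"
```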

Run Stage¶

$ bodywork stage \
    REMOTE_GIT_REPO_URL \
    REMOTE_GIT_REPO_BRANCH \
    STAGE_NAME

Clones the chosen branch of a Git repository containing a Bodywork ML project and then executes the named stage. This is equivalent to installing all the third-party Python package requirements specified in the stage's requirements.txt file, and then executing python NAME_OF_EXECUTABLE_PYTHON_MODULE.py as defined for the stage in bodywork.yaml. See Configuring Stages for more information. The Bodywork stage-runner will be started wherever the command is called.

This command is intended for use by Bodywork containers and it is not recommended for use during Bodywork project development on your local machine.
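To make the equivalence described above concrete, the sketch below prints the manual steps that a stage run corresponds to. It does not execute anything, and it is not how Bodywork itself is implemented. The helper print_stage_steps, the repository URL, the stage directory, the module name and the requirements file name are all assumptions made for illustration.

```shell
# Print (not run) the manual steps that a stage run is described as
# being equivalent to. All argument values are hypothetical placeholders.
print_stage_steps() {
    repo_url=$1
    branch=$2
    stage_dir=$3
    module=$4
    echo "git clone --branch $branch $repo_url project"
    echo "cd project/$stage_dir"
    # Assumed file name for the stage's Python requirements.
    echo "pip install -r requirements.txt"
    echo "python $module"
}

print_stage_steps "https://github.com/my-org/my-ml-project" main stage_1 train_model.py
```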

Manage Deployments¶

A deployment is defined as a workflow-controller running as a job within the cluster (as opposed to a workflow-controller running locally). The workflow-controller deploys projects by executing the workflow defined in the project's DAG.

Get Deployments¶

$ bodywork deployment display \
    --namespace=YOUR_NAMESPACE

Will list all workflow-controller jobs that have run within YOUR_NAMESPACE, whether or not they have been successful. All workflow-controller jobs are deleted after they have been in a completed state for 15 minutes.

Create Deployments¶

$ bodywork deployment create \
    --namespace=YOUR_NAMESPACE \
    --name=DEPLOYMENT_NAME \
    --git-repo-url=REMOTE_GIT_REPO_URL \
    --git-repo-branch=REMOTE_GIT_REPO_BRANCH \
    --retries=NUMBER_OF_TIMES_TO_RETRY_ON_FAILURE \
    --local-workflow-controller

Will immediately deploy your project by starting a workflow-controller job in your cluster, unless the --local-workflow-controller flag is used, in which case this command becomes an alias for the Run Workflow command and runs the workflow-controller locally for easy testing.

Delete Deployment-Jobs¶

$ bodywork deployment delete_job \
    --namespace=YOUR_NAMESPACE \
    --name=DEPLOYMENT_NAME

When a deployment is created, a workflow-controller job is started in your cluster. Not all clusters are configured to clean up these jobs automatically, in which case you may have to delete them manually.

Get Deployment Workflow Logs¶

$ bodywork deployment logs \
    --namespace=YOUR_NAMESPACE \
    --name=DEPLOYMENT_NAME

Streams the workflow logs from the workflow-controller job to your terminal's standard output stream.
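The deployment commands above can be chained into a single create, inspect and clean-up sequence. The following is a sketch under stated assumptions: the namespace, deployment name, repository URL, branch and retry count are hypothetical, and the BODYWORK variable is an illustrative device that lets the sequence be printed (by setting it to echo) instead of executed. Whether logs are available immediately after creation depends on how quickly the job starts, so a short wait may be needed in practice.

```shell
# Hypothetical values; replace with your own.
NAMESPACE="bodywork-dev"
DEPLOYMENT="my-ml-project-v1"
REPO_URL="https://github.com/my-org/my-ml-project"
BRANCH="main"

# Command used to invoke the CLI; set BODYWORK=echo for a dry run.
BODYWORK="${BODYWORK:-bodywork}"

deploy_and_follow() {
    $BODYWORK deployment create \
        --namespace="$NAMESPACE" \
        --name="$DEPLOYMENT" \
        --git-repo-url="$REPO_URL" \
        --git-repo-branch="$BRANCH" \
        --retries=2 || return 1
    # Stream the workflow logs from the workflow-controller job.
    $BODYWORK deployment logs --namespace="$NAMESPACE" --name="$DEPLOYMENT"
    # List the workflow-controller jobs, then delete this one manually
    # in case the cluster does not clean it up.
    $BODYWORK deployment display --namespace="$NAMESPACE"
    $BODYWORK deployment delete_job --namespace="$NAMESPACE" --name="$DEPLOYMENT"
}

# Dry run in a subshell: print the four commands instead of running them.
( BODYWORK=echo; deploy_and_follow )
```

To run the sequence for real, call deploy_and_follow with BODYWORK left at its default.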

Manage Secrets¶

Secrets are used to pass credentials to containers running workflow stages that require authentication with 3rd party services (e.g. cloud storage providers). See Managing Credentials and Other Secrets and Injecting Secrets into Stage Containers for more information.

Create Secrets¶

$ bodywork secret create \
    --namespace=YOUR_NAMESPACE \
    --name=SECRET_NAME \
    --data SECRET_KEY_1=secret-value-1 SECRET_KEY_2=secret-value-2
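As a hedged example, the sketch below creates a secret whose values are read from environment variables, so that the credentials are not typed literally into the command. The secret name aws-credentials, the namespace, the helper create_secret_from_env and the choice of AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as keys are illustrative assumptions; any KEY=value pairs can be passed to --data in the same way.

```shell
NAMESPACE="bodywork-dev"        # hypothetical
SECRET_NAME="aws-credentials"   # hypothetical

create_secret_from_env() {
    # Fail early if either variable is unset or empty.
    if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY first" >&2
        return 1
    fi
    # Each --data item is a KEY=value pair, as in the command above.
    bodywork secret create \
        --namespace="$NAMESPACE" \
        --name="$SECRET_NAME" \
        --data "AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID" \
               "AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY"
}

# Usage (requires the Bodywork CLI and access to a cluster):
#   create_secret_from_env
```

Quoting each pair keeps values that contain spaces or shell metacharacters intact.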

Delete Secrets¶

$ bodywork secret delete \
    --namespace=YOUR_NAMESPACE \
    --name=SECRET_NAME

Get Secrets¶

$ bodywork secret display \
    --namespace=YOUR_NAMESPACE

Will print all secrets in YOUR_NAMESPACE to stdout.

$ bodywork secret display \
    --namespace=YOUR_NAMESPACE \
    --name=SECRET_NAME

Will only print SECRET_NAME to stdout.

Manage Services¶

Unlike batch stages that have a discrete lifetime, service deployments live indefinitely and may need to be managed as your project develops.

Get Services¶

$ bodywork service display \
    --namespace=YOUR_NAMESPACE

Will list information on all active service deployments available in YOUR_NAMESPACE, including their internal cluster URLs.

Delete Services¶

$ bodywork service delete \
    --namespace=YOUR_NAMESPACE \
    --name=SERVICE_NAME

Delete an active service deployment - e.g. one that is no longer required for a project.

Manage Cronjobs¶

Workflows can be executed on a schedule using Bodywork cronjobs. Scheduled workflows will be managed by workflow-controller jobs that Bodywork starts automatically on your cluster.

Get Cronjobs¶

$ bodywork cronjob display \
    --namespace=YOUR_NAMESPACE

Will list all active cronjobs within YOUR_NAMESPACE.

Create Cronjob¶

$ bodywork cronjob create \
    --namespace=YOUR_NAMESPACE \
    --name=CRONJOB_NAME \
    --schedule=CRON_SCHEDULE \
    --git-repo-url=REMOTE_GIT_REPO_URL \
    --git-repo-branch=REMOTE_GIT_REPO_BRANCH \
    --retries=NUMBER_OF_TIMES_TO_RETRY_ON_FAILURE \
    --history-limit=MIN_NUMBER_OF_WORKFLOW_CONTROLLER_JOBS_TO_RETAIN

Will create a cronjob whose schedule must be a valid cron expression - e.g. 0 * * * * will run the workflow every hour. Use the --history-limit option to set the minimum number of historical workflow-controller jobs that are retained at any given moment in time.
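When passing a cron expression on the command line, it needs to be quoted so that the shell does not expand each * into file names. The sketch below shows this together with a basic sanity check that the expression has five fields. The helper cron_field_count and every value other than the hourly schedule are hypothetical, the check does not validate the fields themselves, and the command is printed rather than executed.

```shell
# Count whitespace-separated fields with globbing disabled,
# so that "*" stays literal. Hypothetical helper.
cron_field_count() {
    ( set -f; set -- $1; echo "$#" )
}

SCHEDULE="0 * * * *"   # every hour, as in the text above

if [ "$(cron_field_count "$SCHEDULE")" -eq 5 ]; then
    # Printed rather than executed; drop the leading 'echo' to run it.
    echo bodywork cronjob create \
        --namespace=bodywork-dev \
        --name=hourly-train \
        "--schedule='$SCHEDULE'" \
        --git-repo-url=https://github.com/my-org/my-ml-project \
        --git-repo-branch=main \
        --retries=2 \
        --history-limit=5
else
    echo "expected 5 cron fields in: $SCHEDULE" >&2
fi
```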

Delete Cronjob¶

$ bodywork cronjob delete \
    --namespace=YOUR_NAMESPACE \
    --name=CRONJOB_NAME

Will also delete all historic workflow-controller jobs associated with this cronjob.

Get Cronjob History¶

$ bodywork cronjob history \
    --namespace=YOUR_NAMESPACE \
    --name=CRONJOB_NAME

Display all workflow-controller jobs that were created by a cronjob.

Get Cronjob Workflow Logs¶

$ bodywork cronjob logs \
    --namespace=YOUR_NAMESPACE \
    --name=HISTORICAL_CRONJOB_WORKFLOW_EXECUTION_JOB_NAME

Streams the workflow logs from a historical workflow-controller job to your terminal's standard output stream.

Debug¶

$ bodywork debug SECONDS

Runs the Python time.sleep function for SECONDS. This is intended for use with the Bodywork image and kubectl, to deploy a container on which shell access can be opened for advanced debugging. For example, the following command deploys the Bodywork container and runs bodywork debug SECONDS within it,

$ kubectl create deployment DEBUG_DEPLOYMENT_NAME \
    -n YOUR_NAMESPACE \
    --image=bodyworkml/bodywork-core:latest \
    -- bodywork debug SECONDS

While the container is sleeping, a shell can be started on the container in this deployment. To do this, first find the pod's name using,

$ kubectl get pods -n YOUR_NAMESPACE | grep DEBUG_DEPLOYMENT_NAME

And then open a shell to the container within this pod using,

$ kubectl exec DEBUG_DEPLOYMENT_POD_NAME -n YOUR_NAMESPACE -it -- /bin/bash

Once you're finished debugging, the deployment can be shut down using,

$ kubectl delete deployment DEBUG_DEPLOYMENT_NAME -n YOUR_NAMESPACE
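The debugging steps above can be combined into one function. This is a sketch, not a Bodywork feature: the namespace, deployment name and sleep duration are hypothetical, and the KUBECTL variable is an illustrative device for printing the commands (by setting it to echo) instead of running them. In place of grep, it waits with kubectl rollout status and selects the pod with the app=DEPLOYMENT_NAME label that kubectl create deployment applies.

```shell
NAMESPACE="bodywork-dev"      # hypothetical
DEBUG_NAME="bodywork-debug"   # hypothetical deployment name
SLEEP_SECONDS=600

# Command used to invoke kubectl; set KUBECTL=echo for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

debug_session() {
    $KUBECTL create deployment "$DEBUG_NAME" -n "$NAMESPACE" \
        --image=bodyworkml/bodywork-core:latest \
        -- bodywork debug "$SLEEP_SECONDS" || return 1
    # Wait until the deployment's pod is ready.
    $KUBECTL rollout status "deployment/$DEBUG_NAME" -n "$NAMESPACE" --timeout=120s
    # 'kubectl create deployment' labels its pods app=<deployment name>.
    pod=$($KUBECTL get pods -n "$NAMESPACE" -l "app=$DEBUG_NAME" \
        -o "jsonpath={.items[0].metadata.name}")
    $KUBECTL exec "$pod" -n "$NAMESPACE" -it -- /bin/bash
    # Clean up once the shell exits.
    $KUBECTL delete deployment "$DEBUG_NAME" -n "$NAMESPACE"
}

# Usage (requires kubectl and access to a cluster):
#   debug_session
```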