Setting up a default HPCC Systems cluster on Microsoft Azure Cloud using Kubernetes
HPCC Systems has long been available for download for bare metal installations on our website.
The first preview of the new design for providing a cloud native HPCC Systems platform was released with the 7.8.x series. Containerization makes the system much easier to manage while taking advantage of cloud infrastructure that provides the ability to scale up and down cluster size based on demand.
If you want to find out more about the cloud related changes included in the initial HPCC Systems 7.8.x release or follow the steps involved in starting a simple test HPCC Systems cluster using the default helm chart, read Richard Chapman’s blog, HPCC Systems and the Path to the Cloud.
In this blog, Jake Smith provides a step by step tutorial demonstrating how to setup a default HPCC Systems cluster on Microsoft Azure, including some useful additional notes to help you navigate the process easily.
Introduction
The aim of this blog is to walk you through the steps required to setup and run the HPCC helm charts using the Azure Kubernetes Service (AKS).
It will initially walk through the required steps to create the Azure resources required, before showing you how to launch HPCC’s helm charts on AKS.
And finally, it will show you how to interact with ECL Watch and run a test query on the newly provisioned HPCC cluster instance.
If you encounter any unexpected errors or issues whilst running the steps described, please refer to the “Additional tips” section at the bottom of this blog for hints and suggestions.
Preparing the Azure shell
To interact with Azure, you can either use Azure’s CLI from within a shell or scripts, or the Azure Portal. This tutorial focuses on using the Azure CLI.
You have two choices here: either install the Azure CLI on your own machine, which will require other prerequisites to also be installed, or use the Azure Cloud Shell from within a web browser to interact with Azure.
Please follow either 1. or 2. below based on your preference.
1. Install the Azure CLI
Kubectl and Helm are prerequisites that must be installed, as well as the Azure CLI itself. Install each of the following:
- Kubectl – to interact with and inspect Kubernetes clusters. Follow this guide to install kubectl: https://kubernetes.io/docs/tasks/tools/install-kubectl/
- Helm – to install, upgrade and delete the HPCC Helm charts. Follow this guide to install helm: https://helm.sh/docs/intro/install/
- Azure CLI – to interact with the Azure Cloud platform. Follow this guide: https://docs.microsoft.com/en-us/cli/azure/install-azure-cli
After the prerequisites are installed, log in to your Azure account by running:
az login
This will open a webpage asking you to log in to your Azure account. Once you have done so, the webpage can be closed.
If you have multiple subscriptions associated with the account, you will need to choose which subscription to use.
First list the subscriptions with:
az account list -o table
Then set the subscription to use with:
az account set --subscription <subscription-name>
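If you want to confirm which subscription is now active, you can check with the standard Azure CLI command:
az account show -o table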
Now skip forward to the Creating a resource group section below.
2. Use Azure’s Cloud Shell
The Azure Cloud Shell will give you a Linux shell within a browser window, which you can use to interact with Azure in practically the same way you would from your local machine. Start by going to:
https://shell.azure.com
If this is the first time you have accessed the cloud shell, Azure will tell you that some storage is needed for the cloud shell to persist the account settings and files.
Click Create Storage. After a few seconds, you should be presented with a Linux shell. At this point, the cloud shell will already be logged in to your Azure account. It will also have some other prerequisites installed for you, namely, kubectl and helm.
Creating a resource group
Unless you have already created a resource group, or have been told to use a specific pre-existing resource group, you will need to create one. To create a new named resource group, you must choose a name and an Azure location.
The following example creates a new resource group called rg-hpcc in Azure location westus:
az group create --name rg-hpcc --location westus
If successful your output will be similar to this:
{ "id": "/subscriptions/a019d133-0644-4de0-9028-ca5e600abc55/resourceGroups/rg-hpcc", "location": "westus", "managedBy": null, "name": "rg-hpcc", "properties": { "provisioningState": "Succeeded" }, "tags": null, "type": "Microsoft.Resources/resourceGroups" }
Note: A useful way to list available locations is to run:
az account list-locations -o table
Some locations appear either not to support creating AKS clusters (see later steps), or not to support many of the standard VM node types.
Creating an AKS cluster
Next create a named Kubernetes cluster. You can choose any name for your Kubernetes cluster; the following example uses aks-hpcc. To create a Kubernetes cluster, run the following command:
az aks create --resource-group rg-hpcc --name aks-hpcc --node-vm-size Standard_D2 --node-count 1
Note: --node-vm-size and --node-count are optional, but may be needed depending on your subscription restrictions.
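If you would like the cluster to add nodes automatically when pods cannot be scheduled (see the auto-scaling note in the Additional tips section), az aks create also accepts cluster autoscaler options. For example, with illustrative node counts:
az aks create --resource-group rg-hpcc --name aks-hpcc --node-vm-size Standard_D2 --node-count 1 --enable-cluster-autoscaler --min-count 1 --max-count 3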
This step can take a few minutes, since it needs to provision a few different resources. While waiting, view the progress of these resources in the Azure portal:
https://portal.azure.com/#blade/HubsExtension/BrowseAll
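Alternatively, the provisioning state can be checked from the CLI, for example:
az aks show --resource-group rg-hpcc --name aks-hpcc --query provisioningState -o tsv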
Setting up kubectl so that it can communicate and authenticate with AKS
So that kubectl and helm can interact with Azure, you need to configure the kube client credentials. To do this, run the following command:
az aks get-credentials --resource-group rg-hpcc --name aks-hpcc --admin
You are now ready to interact with Kubernetes in Azure, and start installing the HPCC helm charts.
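To confirm that kubectl is now pointing at the new AKS cluster, you can run, for example:
kubectl config current-context
kubectl get nodes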
Installing the Helm charts
Use the following steps to fetch, modify and deploy the HPCC Systems charts:
Add the hpcc helm chart repository using the following command:
helm repo add hpcc https://hpcc-systems.github.io/helm-chart/
Install the charts using the following command:
helm install mycluster hpcc/hpcc --set global.image.version=latest --set storage.dllStorage.storageClass=azurefile --set storage.daliStorage.storageClass=azurefile --set storage.dataStorage.storageClass=azurefile
If successful, your output will be similar to this:
NAME: mycluster
LAST DEPLOYED: Wed Apr 15 09:41:38 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
At this point, Kubernetes should start provisioning the hpcc pods. To see their current status run:
kubectl get pods
Note: If this is the first time helm install has been run, it will take some time for the pods to get to a Running state, since Azure will need to pull the container images from Docker Hub.
Once all the pods are running, the HPCC cluster is ready to be used.
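Rather than re-running kubectl get pods repeatedly, you can also watch the pods change state as they come up, for example:
kubectl get pods --watch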
Customization
If further customization is required, it is recommended that you use a customized version of the hpcc values.yaml file.
To do that, first extract it from the hpcc chart, using the following command:
helm show values hpcc/hpcc > myvalues.yaml
Then edit myvalues.yaml to change the settings required.
Once you have your finished edited version, start the hpcc helm chart like this:
helm install mycluster hpcc/hpcc --values myvalues.yaml
NB: in this example, it is assumed you have set the storageClass settings to "azurefile" within myvalues.yaml, instead of passing them via --set.
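If you want to sanity-check your edits before installing, helm can render the chart locally or perform a dry-run against the cluster, for example:
helm template mycluster hpcc/hpcc --values myvalues.yaml
helm install mycluster hpcc/hpcc --values myvalues.yaml --dry-run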
If you have already installed the hpcc helm chart, you will either need to delete it with 'helm uninstall' before re-installing it,
or, if you have only updated some values, you can use 'helm upgrade' (with the same parameters as the 'helm install' you previously used) to update the running chart.
New settings will not necessarily take immediate effect in existing running pods; some will only take effect when pods restart.
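For example, assuming the chart was originally installed with myvalues.yaml as above, an upgrade would look like:
helm upgrade mycluster hpcc/hpcc --values myvalues.yaml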
Accessing ECL Watch
To access ECL Watch, the external IP of the ESP service running ECL Watch is required. This will be listed as the eclwatch LoadBalancer service, and can be viewed by running the following command:
kubectl get svc
Your output should be similar to:
NAME          TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)          AGE
eclservices   ClusterIP      10.0.44.11     <none>          8010/TCP         11m
eclwatch      LoadBalancer   10.0.21.16     13.87.156.208   8010:30190/TCP   11m
kubernetes    ClusterIP      10.0.0.1       <none>          443/TCP          4h28m
mydali        ClusterIP      10.0.195.229   <none>          7070/TCP         11m
Using the EXTERNAL-IP for the eclwatch service, open a browser and go to http://<external-ip>:8010/, for example in this case, http://13.87.156.208:8010. If everything is working as expected, the ECL Watch landing page will be displayed.
NB: it may take a couple of minutes after the EXTERNAL-IP becomes available for the eclwatch service to respond, depending on the availability of other dependent services.
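The external IP of the eclwatch service (as named in the example output above) can also be extracted directly, for example:
kubectl get svc eclwatch -o jsonpath='{.status.loadBalancer.ingress[0].ip}'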
Running a test query
Inside ECL Watch, press the ECL button and go to the Playground tab. From here you can use the example ECL or enter other test queries, and pick from the available clusters to submit your workunits.
Note: Since the configuration is set up to run some components on demand, there can be a delay between submitting a job and it beginning to execute.
Decommissioning the installation
To check which helm charts are currently installed, run:
helm list
To stop the HPCC Systems pods, use helm to uninstall:
helm uninstall mycluster
This will stop the cluster and delete the pods. With the default settings and persistent volumes, it will also delete the storage used.
However, other resource objects, including the Kubernetes cluster itself and the node pool created for it, will remain within your Azure resource group.
To delete everything, you can delete the resource group that you created at the beginning. Doing so will delete every resource associated with it, including the Kubernetes cluster and the associated node pool.
To delete the resource group run the following command:
az group delete --name rg-hpcc
This will take a few minutes whilst Azure clears up.
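If you prefer to skip the confirmation prompt and not wait for the deletion to complete, az group delete also accepts the --yes and --no-wait options:
az group delete --name rg-hpcc --yes --no-wait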
Additional tips
- The hpcc charts in this walkthrough will make Persistent Volume Claims based on the configured azurefile storageClass. By default, Azure and the azurefile storageClass mean that storage associated with pod claims is automatically deleted (reclaim policy = Delete) when the pods are deleted, so any files or storage created will not be persisted after the helm charts are uninstalled.
- The Azure node pool will remain up even when no pods are running, as the minimum node pool scale size is 1. As a consequence, some minimal costs will continue to accrue. However, it is possible to deallocate node pools via the Azure portal, which may be desirable for testing purposes where the pool would otherwise remain idle for extended periods.
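Depending on your Azure CLI version, you may also be able to stop and later restart the whole AKS cluster from the command line, for example:
az aks stop --resource-group rg-hpcc --name aks-hpcc
az aks start --resource-group rg-hpcc --name aks-hpcc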
- When creating the AKS cluster, it is possible that the machine type (e.g. 'Standard_D2') is not available in the region you have selected. If you encounter an error whilst running the 'az aks create' command, try another machine type. A list of machine types in a region can be obtained using this command (replacing 'westus' with your selected region):
az vm list-sizes --location westus -o table
- After starting the HPCC helm charts and viewing the pod statuses (using: kubectl get pods), the pods should transition through various states before reaching the Running state.
The first time they are run on new nodes, the images will be fetched from the Docker Hub repository. This can take some time.
If there are insufficient resources in your AKS cluster, e.g. insufficient CPU or memory, then some or all pods will remain in a Pending state until those resources become available. Depending on your AKS configuration (see az aks create -h), the required resources may be provisioned automatically via auto-scaling. This can take some minutes. You can diagnose why a pod is in a particular state by running:
kubectl describe pod <pod-name>
- Or, to diagnose why a Persistent Volume Claim is not attached, list the PVCs and examine them with:
kubectl get pvc
kubectl describe pvc <pvc-name>
- Depending on how your subscription was created, you may need to register the "Microsoft.Storage" provider with your subscription. If your PVCs are stuck in a 'Pending' state, their status may contain a message like "The subscription is not registered to use namespace 'Microsoft.Storage'". If so, register the provider with your subscription with:
az provider register -n Microsoft.Storage --subscription <your-subscription-id> --wait
- If you use kubernetes or helm with multiple cloud providers, for example, if you already use Minikube or Docker Desktop, you will need to switch between kubernetes contexts to use them. After following this guide, and in particular, after running the ‘az aks get-credentials’ step, kubectl’s context will be configured to talk to the new AKS cluster. To see a list of your kubernetes contexts use:
kubectl config get-contexts
- To switch to a particular kubernetes context use:
kubectl config use-context <name-of-context>