Deploying in Kubernetes¶
The Pravega Operator manages Pravega clusters deployed to Kubernetes and automates tasks related to operating a Pravega cluster.
Requirements¶
- Kubernetes 1.8+
- An existing Apache Zookeeper 3.5 cluster. This can be easily deployed using our Zookeeper Operator.
Usage¶
Install the Pravega Operator¶
Note: If you are running on Google Kubernetes Engine (GKE), please check this first.
Run the following command to install the PravegaCluster custom resource definition (CRD), create the pravega-operator service account, roles, and bindings, and deploy the Pravega Operator.
$ kubectl create -f deploy
Verify that the Pravega Operator is running.
$ kubectl get deploy
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
pravega-operator   1         1         1            1           17s
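You can also confirm that the PravegaCluster custom resource definition was registered. The CRD name below is inferred from the API group and kind used later in this guide and may differ across operator versions.
$ kubectl get crd pravegaclusters.pravega.pravega.io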
Deploy a sample Pravega cluster¶
Pravega requires a long term storage provider known as Tier 2 storage. The following Tier 2 storage providers are supported:
- Filesystem (NFS)
- Google Filestore
- DellEMC ECS
- HDFS (must support Append operation)
The following example uses an NFS volume provisioned by the NFS Server Provisioner helm chart to provide Tier 2 storage.
$ helm install stable/nfs-server-provisioner
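Note: if you are using Helm 3, a release name is required and the deprecated stable chart repository must be added explicitly. The release name nfs below is only an example.
$ helm repo add stable https://charts.helm.sh/stable
$ helm install nfs stable/nfs-server-provisioner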
Verify that the nfs storage class is now available.
$ kubectl get storageclass
NAME   PROVISIONER                                              AGE
nfs    cluster.local/elevated-leopard-nfs-server-provisioner   24s
...
Note: This is ONLY intended as a demo and should NOT be used for production deployments.
Once the NFS server provisioner is installed, you can create a PersistentVolumeClaim that will be used as Tier 2 for Pravega. Create a pvc.yaml file with the following content.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pravega-tier2
spec:
  storageClassName: "nfs"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
$ kubectl create -f pvc.yaml
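You can check that the claim has been provisioned; it should report a Bound status once the NFS provisioner has created the backing volume.
$ kubectl get pvc pravega-tier2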
Create a pravega.yaml file with the following content.
apiVersion: "pravega.pravega.io/v1alpha1" kind: "PravegaCluster" metadata: name: "example" spec: version: 0.4.0 zookeeperUri: [ZOOKEEPER_HOST]:2181 bookkeeper: replicas: 3 image: repository: pravega/bookkeeper autoRecovery: true pravega: controllerReplicas: 1 segmentStoreReplicas: 3 image: repository: pravega/pravega tier2: filesystem: persistentVolumeClaim: claimName: pravega-tier2
where:
- [ZOOKEEPER_HOST] is the host or IP address of your Zookeeper deployment.
Deploy the Pravega cluster.
$ kubectl create -f pravega.yaml
Verify that the cluster instance and its components are being created.
$ kubectl get PravegaCluster
NAME      VERSION   DESIRED MEMBERS   READY MEMBERS   AGE
example   0.4.0     7                 0               25s
After a couple of minutes, all cluster members should become ready.
$ kubectl get PravegaCluster
NAME      VERSION   DESIRED MEMBERS   READY MEMBERS   AGE
example   0.4.0     7                 7               2m
$ kubectl get all -l pravega_cluster=example
NAME                                              READY   STATUS    RESTARTS   AGE
pod/example-bookie-0                              1/1     Running   0          2m
pod/example-bookie-1                              1/1     Running   0          2m
pod/example-bookie-2                              1/1     Running   0          2m
pod/example-pravega-controller-64ff87fc49-kqp9k   1/1     Running   0          2m
pod/example-pravega-segmentstore-0                1/1     Running   0          2m
pod/example-pravega-segmentstore-1                1/1     Running   0          1m
pod/example-pravega-segmentstore-2                1/1     Running   0          30s

NAME                                            TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)              AGE
service/example-bookie-headless                 ClusterIP   None          <none>        3181/TCP             2m
service/example-pravega-controller              ClusterIP   10.23.244.3   <none>        10080/TCP,9090/TCP   2m
service/example-pravega-segmentstore-headless   ClusterIP   None          <none>        12345/TCP            2m

NAME                                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/example-pravega-controller-64ff87fc49    1         1         1       2m

NAME                                            DESIRED   CURRENT   AGE
statefulset.apps/example-bookie                 3         3         2m
statefulset.apps/example-pravega-segmentstore   3         3         2m
By default, a PravegaCluster instance is only accessible within the cluster through the Controller ClusterIP service. From within the Kubernetes cluster, a client can connect to Pravega at:
tcp://<pravega-name>-pravega-controller.<namespace>:9090
And the REST management interface is available at:
http://<pravega-name>-pravega-controller.<namespace>:10080/
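For example, from inside the Kubernetes cluster you could query the Controller REST API of the example cluster deployed in the default namespace; the /v1/scopes endpoint should list the existing scopes. Adjust the cluster name and namespace to match your deployment.
$ kubectl run curl --rm -it --image=curlimages/curl --restart=Never -- \
    curl http://example-pravega-controller.default:10080/v1/scopes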
See the Enable External Access section below to enable external access to a Pravega cluster.
Scale a Pravega Cluster¶
You can scale Pravega components independently by modifying their corresponding field in the Pravega resource spec. You can either kubectl edit the cluster or kubectl patch it. If you edit it, update the number of replicas for BookKeeper, Controller, and/or Segment Store and save the updated spec.
Example of patching the Pravega resource to scale the Segment Store instances to 4.
$ kubectl patch PravegaCluster example --type='json' -p='[{"op": "replace", "path": "/spec/pravega/segmentStoreReplicas", "value": 4}]'
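Similarly, assuming the spec layout shown earlier, BookKeeper could be scaled with a merge patch (the value 4 is only an example):
$ kubectl patch PravegaCluster example --type='merge' -p '{"spec":{"bookkeeper":{"replicas":4}}}'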
Upgrade a Pravega Cluster¶
Check out the Upgrade Guide.
Uninstall the Pravega cluster¶
$ kubectl delete -f pravega.yaml
$ kubectl delete -f pvc.yaml
Uninstall the Pravega Operator¶
Note that the Pravega clusters managed by the Pravega operator will NOT be deleted even if the operator is uninstalled.
To delete all clusters, delete all cluster CR objects before uninstalling the Pravega Operator.
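For example, assuming the CRD is still installed, all PravegaCluster resources in the current namespace can be removed with:
$ kubectl delete PravegaCluster --all
Then uninstall the operator.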
$ kubectl delete -f deploy
Configuration¶
Use non-default service accounts¶
You can optionally configure non-default service accounts for the Bookkeeper, Pravega Controller, and Pravega Segment Store pods.
For BookKeeper, set the serviceAccountName field under the bookkeeper block.
...
spec:
bookkeeper:
serviceAccountName: bk-service-account
...
For Pravega, set the controllerServiceAccountName and segmentStoreServiceAccountName fields under the pravega block.
...
spec:
pravega:
controllerServiceAccountName: ctrl-service-account
segmentStoreServiceAccountName: ss-service-account
...
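These service accounts must exist before the pods are created. As a minimal sketch, assuming the account names used in the examples above, they could be created with:
$ kubectl create serviceaccount bk-service-account
$ kubectl create serviceaccount ctrl-service-account
$ kubectl create serviceaccount ss-service-account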
If external access is enabled in your Pravega cluster, Segment Store pods will require access to some Kubernetes API endpoints to obtain the external IP and port. Make sure that the service account you are using for the Segment Store has, at least, the following permissions.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pravega-components
  namespace: "pravega-namespace"
rules:
- apiGroups: ["pravega.pravega.io"]
  resources: ["*"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["pods", "services"]
  verbs: ["get"]
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pravega-components
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get"]
Replace the namespace with your own namespace.
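The Role and ClusterRole above must also be bound to the Segment Store service account. A minimal sketch, assuming the ss-service-account name and pravega-namespace namespace used in the examples above:
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pravega-components
  namespace: "pravega-namespace"
subjects:
- kind: ServiceAccount
  name: ss-service-account
  namespace: "pravega-namespace"
roleRef:
  kind: Role
  name: pravega-components
  apiGroup: rbac.authorization.k8s.io
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pravega-components
subjects:
- kind: ServiceAccount
  name: ss-service-account
  namespace: "pravega-namespace"
roleRef:
  kind: ClusterRole
  name: pravega-components
  apiGroup: rbac.authorization.k8s.io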
Installing on a Custom Namespace with RBAC enabled¶
Create the namespace.
$ kubectl create namespace pravega-io
Update the namespace configured in the deploy/role_binding.yaml file.
$ sed -i -e 's/namespace: default/namespace: pravega-io/g' deploy/role_binding.yaml
Apply the changes.
$ kubectl -n pravega-io apply -f deploy
Note that the Pravega Operator only monitors the PravegaCluster resources which are created in the same namespace, pravega-io in this example. Therefore, before creating a PravegaCluster resource, make sure an Operator exists in that namespace.
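For example, you can confirm that the operator deployment is present in that namespace before creating the PravegaCluster resource (pravega-operator is the deployment name used earlier in this guide):
$ kubectl -n pravega-io get deploy pravega-operator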
$ kubectl -n pravega-io create -f example/cr.yaml
$ kubectl -n pravega-io get pravegaclusters
NAME      AGE
pravega   28m
$ kubectl -n pravega-io get pods -l pravega_cluster=pravega
NAME                                          READY   STATUS    RESTARTS   AGE
pravega-bookie-0                              1/1     Running   0          29m
pravega-bookie-1                              1/1     Running   0          29m
pravega-bookie-2                              1/1     Running   0          29m
pravega-pravega-controller-6c54fdcdf5-947nw   1/1     Running   0          29m
pravega-pravega-segmentstore-0                1/1     Running   0          29m
pravega-pravega-segmentstore-1                1/1     Running   0          29m
pravega-pravega-segmentstore-2                1/1     Running   0          29m
Use Google Filestore Storage as Tier 2¶
Refer to https://cloud.google.com/filestore/docs/accessing-fileshares for more information.
- Create a pv.yaml file with the PersistentVolume specification to provide Tier 2 storage.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pravega-volume
spec:
  capacity:
    storage: 1T
  accessModes:
    - ReadWriteMany
  nfs:
    path: /[FILESHARE]
    server: [IP_ADDRESS]
where:
- [FILESHARE] is the name of the fileshare on the Cloud Filestore instance (e.g. vol1)
- [IP_ADDRESS] is the IP address for the Cloud Filestore instance (e.g. 10.123.189.202)
- Deploy the PersistentVolume specification.
$ kubectl create -f pv.yaml
- Create and deploy a PersistentVolumeClaim to consume the volume created.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pravega-tier2
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
$ kubectl create -f pvc.yaml
Use the same pravega.yaml above to deploy the Pravega cluster.
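You can verify that the claim bound to the Filestore-backed volume before deploying the cluster; pravega-volume and pravega-tier2 are the names used in the examples above.
$ kubectl get pv pravega-volume
$ kubectl get pvc pravega-tier2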
Tune Pravega configuration¶
Pravega has many configuration options for setting up metrics, tuning, etc. The available options can be found in the Pravega documentation and are expressed through the pravega/options part of the resource specification. All values must be expressed as Strings.
...
spec:
  pravega:
    options:
      metrics.enableStatistics: "true"
      metrics.statsdHost: "telegraph.default"
      metrics.statsdPort: "8125"
...
Enable External Access¶
By default, a Pravega cluster uses ClusterIP services which are only accessible from within Kubernetes. However, when creating the Pravega cluster resource, you can opt to enable external access.
In Pravega, clients initiate communication with the Pravega Controller, a stateless component fronted by a Kubernetes service that load-balances requests to the backend pods. Clients then discover the individual Segment Store instances, to which they read and write data directly. Clients need to be able to reach each and every Segment Store pod in the Pravega cluster.
If your Pravega cluster needs to be consumed by clients from outside Kubernetes (or from another Kubernetes deployment), you can enable external access in two ways, depending on your environment constraints and requirements. Both ways will create one service for all Controllers, and one service for each Segment Store pod.
- Via LoadBalancer service type.
- Via NodePort service type.
For more information, please check the Kubernetes documentation.
Example of configuration for using LoadBalancer service types:
...
spec:
  externalAccess:
    enabled: true
    type: LoadBalancer
...
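Alternatively, the same block can request NodePort services instead of load balancers:
...
spec:
  externalAccess:
    enabled: true
    type: NodePort
...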
Clients will need to connect to the external Controller address and will automatically discover the external address of all Segment Store pods.
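Once the services have been provisioned, the external addresses and ports can be inspected with the label selector used earlier (example is the cluster name from this guide):
$ kubectl get svc -l pravega_cluster=example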
Releases¶
The latest Pravega releases can be found on the GitHub Releases page.