Manual Installation¶
This page describes the prerequisites and installation steps to deploy Pravega in a multi-node production environment.
Prerequisites¶
HDFS¶
Set up an HDFS storage cluster running HDFS version 2.7+. HDFS is used as Tier 2 storage and must have sufficient capacity to store the contents of all streams. We recommend running the storage cluster alongside Pravega, but on separate nodes.
Java¶
Install the latest Java 8 from java.oracle.com. Packages are available for all major operating systems.
Zookeeper¶
Pravega requires Zookeeper 3.5.1-alpha+. At least 3 Zookeeper nodes are recommended for a quorum. No special configuration is required for Zookeeper, but a dedicated cluster for Pravega is recommended.
This specific version of Zookeeper can be downloaded from Apache at zookeeper-3.5.1-alpha.tar.gz.
To install Zookeeper, see the Getting Started Guide.
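The recommendation of at least 3 nodes can be illustrated with the majority arithmetic behind a Zookeeper quorum: an ensemble of n servers stays available while a majority is up, so it tolerates (n - 1) / 2 failures, rounded down. The function name below is hypothetical and exists only to show the arithmetic.

```shell
# Hypothetical helper: how many server failures a Zookeeper ensemble of
# the given size can tolerate while still keeping a majority.
zk_tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 4 5; do
  echo "ensemble of $n: tolerates $(zk_tolerated_failures "$n") failure(s)"
done
```

A 4-node ensemble tolerates no more failures than a 3-node one, which is why odd ensemble sizes are the usual choice.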
Bookkeeper¶
Pravega requires Bookkeeper 4.4.0+. At least 3 Bookkeeper servers are recommended for a quorum.
This specific version of Bookkeeper can be downloaded from Apache at bookkeeper-server-4.4.0-bin.tar.gz.
To install Bookkeeper, see the Getting Started Guide.
Some Pravega-specific instructions are shown below. All steps assume they are run from the bookkeeper-server-4.4.0 directory.
Bookkeeper Configuration¶
The following configuration options should be changed in the conf/bk_server.conf file.
# Comma separated list of <zk-ip>:<port> for all ZK servers
zkServers=localhost:2181

# Alternatively specify a different path to the storage for /bk
journalDirectory=/bk/journal
ledgerDirectories=/bk/ledgers
indexDirectories=/bk/index

zkLedgersRootPath=/pravega/bookkeeper/ledgers
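For a multi-node ensemble, the zkServers value is a comma separated list of address:port pairs. As a minimal sketch, the helper below assembles that list from a set of Zookeeper IPs. The function name `zk_servers` and the sample IPs are hypothetical; port 2181 is the one used in the example above.

```shell
# Hypothetical helper: join Zookeeper IPs into "ip:port,ip:port,...".
# Usage: zk_servers <port> <ip> [<ip> ...]
zk_servers() {
  local port="$1"; shift
  local out="" ip
  for ip in "$@"; do
    # Prefix a comma only when the list already has an entry.
    out="${out:+$out,}${ip}:${port}"
  done
  printf '%s\n' "$out"
}

# Sample IPs are placeholders, not values from this guide.
echo "zkServers=$(zk_servers 2181 10.0.0.1 10.0.0.2 10.0.0.3)"
```

The printed line can be pasted into conf/bk_server.conf in place of the localhost entry.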
Initializing Zookeeper paths¶
The following paths need to be created in Zookeeper. From the zookeeper-3.5.1-alpha directory on the Zookeeper servers, run:
bin/zkCli.sh -server $ZK_URL create /pravega
bin/zkCli.sh -server $ZK_URL create /pravega/bookkeeper
Replace $ZK_URL with the address of the Zookeeper nodes, for example <zk-ip>:2181.
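The two create commands are issued parent first because Zookeeper's create does not create missing parent znodes. As a rough sketch of that ordering, the hypothetical helper `znode_chain` below prints every ancestor of a path, parents first, and the loop prints (rather than runs) the matching zkCli.sh commands. The localhost:2181 default for ZK_URL is only a placeholder.

```shell
# Hypothetical helper: print every ancestor of a znode path, parents first.
znode_chain() {
  local path="${1#/}" current="" part
  local IFS='/'
  for part in $path; do
    current="${current}/${part}"
    printf '%s\n' "$current"
  done
}

# Print the zkCli.sh commands instead of running them.
# ZK_URL falls back to a placeholder; set it to your <zk-ip>:2181.
ZK_URL="${ZK_URL:-localhost:2181}"
for znode in $(znode_chain /pravega/bookkeeper); do
  echo "bin/zkCli.sh -server $ZK_URL create $znode"
done
```

Once the printed commands look right, they can be run from the zookeeper-3.5.1-alpha directory.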
Running Bookkeeper¶
The bookie needs to be formatted before it is started:
bin/bookkeeper shell metaformat -nonInteractive
Start the bookie:
bin/bookkeeper bookie
Installing Pravega¶
For non-production systems, you can use the containers provided by the docker installation to run non-production HDFS, Zookeeper or Bookkeeper.
There are two key components of Pravega that need to be run:
- Controller - Control plane for Pravega. Installation requires at least one controller. Two or more are recommended for HA.
- Segment Store - Storage node for Pravega. Installation requires at least one segment store.
Before you start, download the latest Pravega release. You can find it on the GitHub releases page.
Recommendations¶
If you are getting started with a simple 3-node cluster, you may want to lay out your services like this:
| | Node 1 | Node 2 | Node 3 |
|---|---|---|---|
| Zookeeper | X | X | X |
| Bookkeeper | X | X | X |
| Pravega Controller | X | X | |
| Pravega Segment Store | X | X | X |
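The suggested layout can also be written down as a small lookup, which may be handy in a provisioning script. This is only a sketch: the function name and the service labels are hypothetical and are not names used by the Pravega distribution.

```shell
# Hypothetical helper: list the services placed on a node (1, 2 or 3)
# according to the suggested layout. Service names are illustrative labels.
services_for_node() {
  case "$1" in
    1|2) echo "zookeeper bookkeeper controller segmentstore" ;;
    3)   echo "zookeeper bookkeeper segmentstore" ;;
    *)   echo "unknown node: $1" >&2; return 1 ;;
  esac
}

for node in 1 2 3; do
  echo "Node $node: $(services_for_node "$node")"
done
```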
All Nodes¶
On each node extract the distribution package to your desired directory:
tar xfvz pravega-0.1.0.tgz
cd pravega-0.1.0
Installing the Controller¶
The controller can be run using the following command. Replace <zk-ip> with the IP address of the Zookeeper nodes:
ZK_URL=<zk-ip>:2181 bin/pravega-controller
Alternatively, instead of specifying this on each startup, you can edit the conf/controller.conf file and change the Zookeeper URL there:
zk {
  url = "<zk-ip>:2181"
  ...
}
Then you can run the controller with:
bin/pravega-controller
Installing the Segment Store¶
Edit the conf/config.properties file. The following properties need to be changed. Replace <zk-ip>, <controller-ip> and <hdfs-ip> with the IPs of the respective services:
pravegaservice.zkURL=<zk-ip>:2181
bookkeeper.zkAddress=<zk-ip>:2181
autoScale.controllerUri=tcp://<controller-ip>:9090

# Settings required for HDFS
hdfs.hdfsUrl=<hdfs-ip>:8020
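When the same settings have to be applied on several segment store nodes, it can help to generate them from the three IPs. The sketch below prints the properties listed above with the placeholders filled in. The function name `segmentstore_props` and the sample IPs are hypothetical; the property names and ports are the ones shown in this section.

```shell
# Hypothetical helper: print the segment store settings from this section.
# Usage: segmentstore_props <zk-ip> <controller-ip> <hdfs-ip>
segmentstore_props() {
  local zk_ip="$1" controller_ip="$2" hdfs_ip="$3"
  printf 'pravegaservice.zkURL=%s:2181\n' "$zk_ip"
  printf 'bookkeeper.zkAddress=%s:2181\n' "$zk_ip"
  printf 'autoScale.controllerUri=tcp://%s:9090\n' "$controller_ip"
  printf '# Settings required for HDFS\n'
  printf 'hdfs.hdfsUrl=%s:8020\n' "$hdfs_ip"
}

# Sample IPs are placeholders, not values from this guide.
segmentstore_props 10.0.0.1 10.0.0.2 10.0.0.4
```

The output is meant to be reviewed and merged into conf/config.properties by hand, since that file contains other settings as well.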
Once the configuration changes have been made you can start the segment store with:
bin/pravega-segmentstore