Terminology

Here is a glossary of terms related to Pravega:

Term Definition
Pravega A data storage primitive based on append-only logs and tiered storage.
Stream A persistent, unbounded, append-only collection of Events.
A Stream is identified by a name and a Scope.
A Stream is composed of one or more Stream Segments.
Stream Segment A shard of a Stream.
The number of Stream Segments in a Stream might vary over time according to load and Scaling Policy.
In the absence of a Scale Event, Events written to a Stream with the same Routing Key are stored in the same Stream Segment and are totally ordered.
When a Scale Event occurs, the set of Stream Segments of a Stream changes, and Events written with a given Routing Key K before the Scale Event are stored in a different Stream Segment from Events written with the same Routing Key K after the Scale Event.
In conjunction with Reader Groups, the number of Stream Segments is the maximum amount of read parallelism of a Stream.
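The mapping from Routing Key to Stream Segment described in this entry can be illustrated with a minimal Python sketch. It assumes that each Stream Segment owns a contiguous range of a key space [0.0, 1.0) and that a Routing Key is hashed to a point in that space. The hash function (SHA-256) and the names `key_position` and `segment_for` are illustrative assumptions, not part of the Pravega API.

```python
import hashlib
from bisect import bisect_right
from typing import Sequence


def key_position(routing_key: str) -> float:
    """Hash a routing key to a point in [0.0, 1.0) (hypothetical scheme)."""
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    # Keep 53 bits so the quotient is exactly representable and stays below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def segment_for(routing_key: str, upper_bounds: Sequence[float]) -> int:
    """Return the index of the segment whose key range contains the key.

    upper_bounds holds the sorted, exclusive upper bound of each segment's
    range; the last entry must be 1.0.
    """
    return bisect_right(upper_bounds, key_position(routing_key))


# One segment covers the whole key space, so every key maps to segment 0.
before_scale = [1.0]
# After a hypothetical split, two segments share the key space.
after_scale = [0.5, 1.0]
```

While the set of segments is unchanged, `segment_for` always returns the same index for a given key, which is what keeps Events with that key in one segment. Once the bounds change, as in `after_scale`, the same key may resolve to a different index.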
Scope A namespace for Stream names.
A Stream name must be unique within a Scope.
Event A collection of bytes within a Stream.
An Event is associated with a Routing Key.
Routing Key A property of an Event used to route messages to Readers.
Two Events with the same Routing Key will be read by Readers in exactly the same order they were written.
Reader A software application that reads data from one or more Streams.
Writer A software application that writes data to one or more Streams.
Pravega Java Client Library A Java library that applications use to interface with Pravega.
Reader Group A named collection of one or more Readers that read from a Stream in parallel.
Pravega assigns Stream Segments to the Readers, making sure that all Stream Segments are assigned to at least one Reader and that they are balanced across the Readers.
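The balancing property of a Reader Group can be sketched as a simple round-robin assignment. This is only an illustration of the invariant (every segment is covered and per-Reader counts differ by at most one), not a description of how Pravega coordinates Readers internally; the function name `assign_segments` is hypothetical.

```python
def assign_segments(segments, readers):
    """Distribute segment ids across readers in round-robin order."""
    if not readers:
        raise ValueError("a Reader Group needs at least one Reader")
    assignment = {reader: [] for reader in readers}
    for index, segment in enumerate(sorted(segments)):
        assignment[readers[index % len(readers)]].append(segment)
    return assignment


# Five segments shared by two readers.
balanced = assign_segments([0, 1, 2, 3, 4], ["reader-a", "reader-b"])

# More readers than segments: the extra reader has nothing to read.
oversubscribed = assign_segments([0, 1], ["reader-a", "reader-b", "reader-c"])
```

The second call shows why the number of Stream Segments bounds read parallelism: a third Reader receives no segment when only two exist.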
Position An offset within a Stream, representing a type of recovery point for a Reader.
If a Reader crashes, a Position can be used to initialize the failed Reader's replacement so that the replacement resumes processing the Stream from where the failed Reader left off.
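A minimal sketch of recovery from a Position follows. It assumes, purely for illustration, that a Position can be modelled as a mapping from segment id to the next offset to read, and that segments are in-memory lists; the `Reader` class here is hypothetical and is not the Pravega client class.

```python
class Reader:
    """Toy reader over in-memory segments (dict: segment id -> list of events)."""

    def __init__(self, segments, position=None):
        self._segments = segments
        # Start from the supplied position, or from offset 0 in every segment.
        self._position = dict(position or {seg: 0 for seg in segments})

    def read_next(self):
        """Return the next unread event, or None when everything is consumed."""
        for seg in sorted(self._segments):
            offset = self._position.get(seg, 0)
            if offset < len(self._segments[seg]):
                self._position[seg] = offset + 1
                return self._segments[seg][offset]
        return None

    def position(self):
        """Return a copy of the current position, suitable for persisting."""
        return dict(self._position)


segments = {0: ["a", "b"], 1: ["c"]}
original = Reader(segments)
first = original.read_next()
saved = original.position()  # persisted before the reader "crashes"

replacement = Reader(segments, saved)
remaining = []
while (event := replacement.read_next()) is not None:
    remaining.append(event)
```

The replacement picks up at the saved offsets, so the event already processed by the failed Reader is not read again.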
Tier 1 Storage Short term, low-latency, data storage that guarantees the durability of data written to Streams.
The current implementation of Tier 1 uses Apache BookKeeper.
Tier 1 storage keeps the most recent appends to Streams in Pravega.
As data in Tier 1 ages, it is moved out of Tier 1 into Tier 2.
Tier 2 Storage A portion of Pravega storage based on cheap and deep persistent storage technology such as HDFS, DellEMC's Isilon or DellEMC's Elastic Cloud Storage.
Pravega Server A component of Pravega that implements the Pravega data plane API for operations such as reading from and writing to Pravega Streams.
The data plane of Pravega, also called the Segment Store, is composed of one or more Pravega Server instances.
Segment Store A collection of Pravega Servers that in aggregate form the data plane of a Pravega cluster.
Controller A component of Pravega that implements the Pravega control plane API for operations such as creating and retrieving information about Streams.
The control plane of Pravega is composed of one or more Controller instances coordinated by ZooKeeper.
Auto Scaling A Pravega concept that allows the number of Stream Segments in a Stream to change over time, based on Scaling Policy.
Scaling Policy A configuration item of a Stream that determines how the number of Stream Segments in the Stream should change over time.
There are three kinds of Scaling Policy; a Stream has exactly one of these at any given time:
- Fixed number of Stream Segments
- Change the number of Stream Segments based on the number of bytes per second written to the Stream
- Change the number of Stream Segments based on the number of Events per second written to the Stream
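The three kinds of Scaling Policy can be modelled with a short Python sketch. The names `PolicyKind`, `ScalingPolicy` and `desired_segments`, and the rule "one segment per `target_rate` units of load", are simplifying assumptions made for illustration; they are not the Pravega configuration API or its actual scaling algorithm.

```python
import math
from dataclasses import dataclass
from enum import Enum


class PolicyKind(Enum):
    FIXED = "fixed"
    BYTES_PER_SEC = "bytes_per_sec"
    EVENTS_PER_SEC = "events_per_sec"


@dataclass(frozen=True)
class ScalingPolicy:
    kind: PolicyKind
    min_segments: int
    target_rate: float = 0.0  # desired load per segment; unused for FIXED


def desired_segments(policy, bytes_per_sec, events_per_sec):
    """Return how many segments the policy would ask for under the given load."""
    if policy.kind is PolicyKind.FIXED:
        return policy.min_segments
    if policy.target_rate <= 0:
        raise ValueError("rate-based policies need a positive target_rate")
    if policy.kind is PolicyKind.BYTES_PER_SEC:
        load = bytes_per_sec
    else:
        load = events_per_sec
    return max(policy.min_segments, math.ceil(load / policy.target_rate))


fixed = ScalingPolicy(PolicyKind.FIXED, min_segments=3)
by_events = ScalingPolicy(PolicyKind.EVENTS_PER_SEC, min_segments=2, target_rate=100)
by_bytes = ScalingPolicy(PolicyKind.BYTES_PER_SEC, min_segments=1, target_rate=1000)
```

A fixed policy ignores load entirely, while the two rate-based policies differ only in which measurement they consult.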
Scale Event There are two types of Scale Event: Scale-Up Event and Scale-Down Event. A Scale Event triggers Auto Scaling.
A Scale-Up Event is a situation where an increase in load causes one or more Stream Segments to be split, increasing the number of Stream Segments in the Stream.
A Scale-Down Event is a situation where a decrease in load causes one or more Stream Segments to be merged, reducing the number of Stream Segments in the Stream.
Transaction A collection of Stream write operations that are applied atomically to the Stream.
Either all of the bytes in a Transaction are written to the Stream or none of them are.
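The all-or-nothing behaviour of a Transaction can be sketched by buffering writes and applying them in a single step on commit. The `Stream` and `Transaction` classes below are toy, in-memory stand-ins written for this illustration; they are assumptions and do not mirror the Pravega client classes.

```python
class Stream:
    """Toy append-only stream backed by a list."""

    def __init__(self):
        self.events = []


class Transaction:
    """Buffers writes and applies them to the stream atomically on commit."""

    def __init__(self, stream):
        self._stream = stream
        self._buffer = []
        self._open = True

    def write_event(self, routing_key, payload):
        if not self._open:
            raise RuntimeError("transaction is no longer open")
        self._buffer.append((routing_key, payload))

    def commit(self):
        if not self._open:
            raise RuntimeError("transaction is no longer open")
        # All buffered events become visible together.
        self._stream.events.extend(self._buffer)
        self._open = False

    def abort(self):
        # Nothing buffered ever reaches the stream.
        self._buffer.clear()
        self._open = False


stream = Stream()

aborted = Transaction(stream)
aborted.write_event("k1", b"lost")
aborted.abort()

committed = Transaction(stream)
committed.write_event("k1", b"first")
committed.write_event("k2", b"second")
committed.commit()
```

After these calls the stream holds only the two committed events; the aborted write never appears.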
State Synchronizer An abstraction built on top of Pravega that enables the implementation of replicated state, using a Pravega Stream Segment to record the state transformations.
A State Synchronizer allows a piece of data to be shared between multiple processes with strong consistency and optimistic concurrency.
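Optimistic concurrency over a shared log of state transformations can be sketched as follows. The sketch assumes a log that accepts an append only when the writer has seen every earlier update; `SharedLog`, `Synchronizer` and their methods are hypothetical names chosen for this illustration, not the Pravega State Synchronizer API.

```python
class SharedLog:
    """Toy append-only log of state transformations (callables)."""

    def __init__(self):
        self.updates = []

    def conditional_append(self, expected_version, update):
        """Append only if the caller has seen all existing updates."""
        if expected_version != len(self.updates):
            return False  # caller is stale and must refetch first
        self.updates.append(update)
        return True


class Synchronizer:
    """Keeps a local copy of the state in step with the shared log."""

    def __init__(self, log, initial_state):
        self._log = log
        self.state = initial_state
        self.version = 0  # number of log entries already applied locally

    def fetch_updates(self):
        for update in self._log.updates[self.version:]:
            self.state = update(self.state)
        self.version = len(self._log.updates)

    def update_state(self, update):
        # Optimistic loop: refetch and retry until the append is accepted.
        while True:
            self.fetch_updates()
            if self._log.conditional_append(self.version, update):
                self.fetch_updates()
                return


log = SharedLog()
process_a = Synchronizer(log, initial_state=0)
process_b = Synchronizer(log, initial_state=0)

process_a.update_state(lambda s: s + 1)
process_b.update_state(lambda s: s + 10)  # first catches up with process_a's update
process_a.fetch_updates()
```

Both processes apply the same transformations in the same order, so they converge on the same state without holding a lock.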
Checkpoint A kind of Event that signals all Readers within a Reader Group to persist their state.