Change Data Capture (CDC) is becoming a popular technique for interconnecting disparate systems, replicating state across traditional boundaries, decomposing existing monoliths into microservices, and recording audit trails. CDC is the idea of emitting a changelog of all INSERTs, UPDATEs, DELETEs, and schema changes performed on a database. Debezium.io is an […]

Introduction: Pravega is a storage system based on the stream abstraction, providing the ability to process tail data (low-latency streaming) and historical data (catch-up and batch reads). Relatedly, Apache Flink is a widely used real-time computing engine that provides unified batch and stream processing. Flink provides high-throughput, low-latency streaming data processing, as well as support for complex event […]

We are pleased to announce Pravega 0.9.0, our first release since Pravega became part of the CNCF (Cloud Native Computing Foundation). This release continues to expand the Pravega feature set, improves the performance of mission-critical use cases, and, of course, brings improved stability overall. In 2020, the Pravega community delivered several significant releases. We introduced Streaming Cache […]

Raúl Gracia and Flavio Junqueira. Introduction: Streaming applications commonly ingest data from a wide range of elements – e.g., sensors, users, servers – concurrently to form a single stream of events. Using a single stream to capture the parallel data flows generated by multiple such elements enables applications to better reason about data and even […]

Raúl Gracia and Flavio Junqueira. Introduction: Streaming systems continuously ingest and process data from a variety of data sources. They build on append-only data structures to enable efficient write and read access, targeting low end-to-end latency. As more of the data sources in applications are machines, the expected volume of continuously generated data has been growing […]

Introduction: The fundamentals of stream semantics in Pravega are learned through familiarity with its client APIs. In this article, we will give an overview of Pravega’s client APIs with a handful of simple examples. As we reach the end, you should see Pravega in action, understand the guarantees afforded by Pravega streams, and have some familiarity with several […]

Pravega is a storage system for data streams that has an innovative design and an attractive set of features to cope with today’s stream processing requirements (e.g., event ordering, scalability, performance). The project has plenty of documentation and great blog posts that explain in detail every technical aspect of Pravega. But, if you are […]

Traditional cache solutions treat each entry as an immutable blob of data, which poses problems for the append-heavy ingestion workloads that are common in Pravega. Each Event appended to a Stream would either require its own cache entry or need an expensive read-modify-write operation to be included in the Cache. To enable high-performance ingestion of […]

The ability to pipeline Events to the Segment Store is a key technique that the Pravega Client uses to achieve high throughput, even when dealing with small writes. A Writer appends an Event to its corresponding Segment as soon as it is received, without waiting for previous ones to be acknowledged. To guarantee ordering and […]