Welcome to this issue of Activation Function. Every other week, I introduce you to a new and exciting open-source backend technology (that you’ve probably only kind of heard about…) and explain it to you in 5 minutes or less so you can make better technical decisions moving forward.
In this issue, we’ll explore Redpanda, a Kafka-API-compatible distributed data streaming platform. Redpanda bills itself as a simple, high-performance, cost-effective drop-in replacement for Kafka.
I know… I know… high-performance, drop-in replacement, cost-effective, all those are the usual suspects every new vendor throws around. Is Redpanda the real deal? Let’s find out.
Note: Data stream processing, event stream processing, event data stream processing; I’m sure some of you can argue the nuances of these, but I don’t want to be part of that conversation, so I’ll treat them all as interchangeable.
- Redpanda is a new streaming platform implemented in C++ and optimized to leverage modern hardware like NVMe drives and processing techniques like a thread-per-core execution model to improve throughput and latency.
- Redpanda has a simplified architecture with no external dependencies like ZooKeeper or the Quorum-based Controller in Kafka's KRaft implementation and can be deployed as a single binary.
- Redpanda provides high throughput, low latency, resiliency, fault tolerance, and high availability, making it suitable for mission-critical applications.
- Redpanda runs in production at companies including Cisco, Akamai, Lacework, Vodafone, Moody’s, Midjourney, Activision, and many others.
Evolution of Stream Processing
Stream processing isn’t anything new. It’s been an active research field for over 20 years, but over the past ten years it has moved from a niche to a mainstream technology, driven mainly by technological advancements (e.g., MapReduce) and new requirements (e.g., real-time data feeds at scale).
Without going down the rabbit hole, here’s a quick and oversimplified rundown of how we got to where we are today:
- Initially, we had systems that relied on a simple request-response model. The client would issue a request to your service and sit patiently waiting for the response. As the number of clients and requests increased, your services became overwhelmed, the response time ballooned, a backlog built up on the client side, and, sooner or later, the system crashed.
- Message brokers introduced asynchronous communication where a client can send a request and return later for the response without being held hostage waiting for the response.
- Eventually, we ran into scalability issues when data volumes grew beyond a certain point due to how message brokers serialized data, how they persisted data to disk, and so on.
- In 2011, Kafka emerged from LinkedIn as a messaging system that can handle huge data throughput by bypassing many latency-inducing mechanisms. For example, it wrote data directly to the memory cache before flushing it to disk, natively stored data in a binary format, distributed payloads across nodes (aka horizontal scalability), etc.
Cool… So why do we need alternatives?
The simple answer is that things have evolved, and it can be hard to adapt older technologies to new needs (e.g., ML workloads, cloud-native apps, etc.) or leverage new technology advancement (e.g., new-generation hardware). Eventually, people start running into bottlenecks and start thinking about new ways of getting things done.
To some, Kafka’s “outdated” architecture makes it difficult and labor-intensive to deploy, manage, and optimize.
That’s what Redpanda is looking to capitalize on.
Redpanda (formerly Vectorized) was launched in 2019 to build a modern data streaming platform that is a more performant, less complex, and cost-effective alternative to Kafka. How? By building from the ground up and going low-level to fully leverage modern hardware.
Before we delve into the differences between Redpanda and Kafka, let’s briefly go over the similarities:
- They are both distributed pub/sub streaming platforms.
- They both provide high throughput and low latency.
- They both provide data integrity and fault tolerance.
- They both offer tiered storage options.
- In both systems, consumers pull messages from the broker.
- Redpanda is Kafka API compatible, meaning existing Kafka clients should generally work with Redpanda without any changes. There are a few edge cases, but the Redpanda team is addressing them.
[Note] Redpanda has NOT achieved 100% API compatibility with Kafka, so you shouldn’t expect 100% of the same behavior as with Kafka.
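To make the "no client changes" idea concrete, here is a minimal sketch in Python. I'm assuming the confluent-kafka client here (any Kafka client should follow the same pattern), and the broker addresses, client ID, and helper names are hypothetical placeholders of mine. The only thing that differs between the two configurations is the bootstrap address.

```python
def client_config(bootstrap_servers: str, client_id: str = "demo-client") -> dict:
    """Build a librdkafka-style config; only the broker address is cluster-specific."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "client.id": client_id,
    }


def send_one(conf: dict, topic: str, payload: bytes) -> None:
    """Produce a single message with a standard Kafka client and wait for delivery."""
    # Imported lazily so the config helper works even without the client installed.
    from confluent_kafka import Producer

    producer = Producer(conf)
    producer.produce(topic, value=payload)
    producer.flush()  # block until the message is delivered or fails


# Hypothetical addresses: swap in your own brokers.
kafka_conf = client_config("kafka-broker.example.internal:9092")
redpanda_conf = client_config("redpanda-broker.example.internal:9092")
```

With a reachable broker and confluent-kafka installed, something like `send_one(redpanda_conf, "demo-topic", b"hello")` would use exactly the same code path as it would against Kafka. Given the compatibility caveat above, test the specific client features you depend on rather than taking this for granted.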
Redpanda draws its performance advantage from a few factors:
- It is written in C++, meaning better control over memory management, hardware access, and system resource control. If you’re interested in implementation details, check out this presentation.
- It’s not dependent on Java Virtual Machine (JVM) and, therefore, doesn’t suffer from the typical performance bottlenecks of Java-based systems such as Kafka (e.g., Garbage collection).
[Thought] I wonder how much of the C++ performance advantage is negated if you use Java-based components like Kafka Connect, which Redpanda integrates with.
- It’s optimized to leverage nonvolatile memory express (NVMe) storage, improving both throughput and latency.
- It uses a thread-per-core execution model where each core handles its own set of tasks. This reduces cross-core communication and locking and should provide predictable performance. It accomplishes this by leveraging ScyllaDB’s Seastar framework under the hood.
[Note] If you want to go down the rabbit hole of thread-per-core architecture, start here.
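If the thread-per-core idea feels abstract, here is a toy sketch of the underlying pattern: each worker owns its own slice of the data, and work reaches it through message passing instead of shared, locked state. This is not how Redpanda or Seastar is implemented. It is plain Python, the threads are not pinned to cores, and all names (`NUM_SHARDS`, `shard_for`, `count_events`) are mine.

```python
import queue
import threading
import zlib

NUM_SHARDS = 4  # stand-in for "one shard per CPU core"


def shard_for(key: str) -> int:
    """Route a key to exactly one shard, so only that shard ever touches it."""
    return zlib.crc32(key.encode()) % NUM_SHARDS


def shard_worker(inbox: queue.Queue, counts: dict) -> None:
    """Own `counts` exclusively; the state needs no lock because nobody else writes it."""
    while True:
        key = inbox.get()
        if key is None:  # sentinel: shut down
            break
        counts[key] = counts.get(key, 0) + 1


def count_events(keys: list) -> dict:
    inboxes = [queue.Queue() for _ in range(NUM_SHARDS)]
    states = [{} for _ in range(NUM_SHARDS)]
    workers = [
        threading.Thread(target=shard_worker, args=(inboxes[i], states[i]))
        for i in range(NUM_SHARDS)
    ]
    for worker in workers:
        worker.start()
    for key in keys:
        inboxes[shard_for(key)].put(key)  # message passing, not shared memory
    for inbox in inboxes:
        inbox.put(None)
    for worker in workers:
        worker.join()
    merged = {}
    for state in states:
        merged.update(state)  # shards hold disjoint keys, so nothing is overwritten
    return merged
```

The design choice to notice is that a given key always lands on the same shard, so per-key state never needs coordination between workers.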
All of this means that, in theory, Redpanda should outperform Kafka, but as we know, it’s a bit more complicated than that. Long story short, you’ll need to find out by running benchmarks for your specific use case.
But just in case you’re interested in existing benchmarks, the Redpanda team released these benchmarks in 2022, to which Jack Vanlightly (Staff Technologist at Confluent) responded with his own benchmarks.
- Redpanda benchmarks show that for high-throughput workloads, they deliver better performance, more predictable latency, and need 3x fewer nodes.
- Confluent benchmarks show that Kafka outperformed Redpanda when the number of producers and consumers increased and experienced less performance degradation when running under constant loads for prolonged periods.
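If you do end up running your own benchmarks, you'll probably want to compare latency percentiles rather than averages, since "predictable latency" is really a claim about the tail. Here is a small helper I would start from. It uses the nearest-rank definition of a percentile, and the function names are my own. The latency samples are whatever your benchmark tool records.

```python
import math


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(latencies_ms: list) -> dict:
    """Report the median, the tail, and the worst case for one benchmark run."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }
```

Run the same workload against both systems and compare the `p99` and `max` values side by side, ideally over a long enough run to catch degradation under sustained load.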
Ease of use
This area is easier to assess, and many agree that Redpanda has the upper hand when it comes to self-managed environments. This is mainly because Redpanda boasts a more streamlined architecture than Kafka and doesn’t rely on external components like the JVM or dedicated ZooKeeper servers.
Redpanda can be set up in a cluster with one or multiple nodes, similar to Kafka. These clusters can span various availability zones or regions. However, unlike Kafka, every Redpanda node operates using the same binary, taking on different roles, such as acting as a data broker or providing supplementary services like an HTTP proxy or schema registry.
Each node inherently uses the Raft consensus algorithm, eliminating the need for external services like the Quorum-based Controller in Kafka's KRaft.
[Technical Note] Kafka moved away from ZooKeeper to KRaft to reduce operational overhead and improve scalability.
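For a feel of what Raft's majority rule means when you size a cluster, here is a simplified sketch of the quorum arithmetic. It ignores the rest of the algorithm (leader election, terms, log matching), and the function names are mine.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    if cluster_size < 1:
        raise ValueError("cluster must have at least one node")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can be down while a majority is still reachable."""
    return cluster_size - quorum_size(cluster_size)


def is_committed(acks: int, cluster_size: int) -> bool:
    """Simplified rule: an entry is committed once a majority has stored it
    (the leader's own copy counts as one ack)."""
    return acks >= quorum_size(cluster_size)
```

This is why three- and five-node clusters are common choices: a three-node cluster keeps working with one node down, a five-node cluster with two, and adding a fourth node to a three-node cluster buys no extra fault tolerance.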
Total cost of ownership (TCO)
Just like performance, TCO is hard to generalize, so you’ll have to do your own benchmarking. Redpanda claims that its self-hosted version is much cheaper than Kafka as it requires fewer nodes to achieve similar performance (aka smaller infrastructure footprint) and requires less administrative overhead.
That being said, remember the TCO includes other factors like labor cost, which can tip the scales in favor of Kafka, as it’s safe to assume that finding engineers who know Kafka is cheaper and easier given the much larger expert community.
Also, remember that even if you decide to go with the hosted version, you’ll have many more vendors and pricing options to choose from when going with Kafka. On the other hand, Redpanda is the only vendor offering a fully managed version of its platform.
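To keep yourself honest when comparing options, it helps to write the TCO arithmetic down. Here is a deliberately oversimplified sketch that only adds infrastructure and labor. The function and parameter names are mine, and the numbers in the usage note below are made-up placeholders, not measurements of either platform.

```python
def annual_tco(
    nodes: int,
    node_cost_per_year: float,
    admin_hours_per_year: float,
    hourly_rate: float,
) -> float:
    """Yearly cost = infrastructure (nodes) + labor (people running them)."""
    infrastructure = nodes * node_cost_per_year
    labor = admin_hours_per_year * hourly_rate
    return infrastructure + labor


def cheaper_option(options: dict) -> str:
    """Return the name of the option with the lowest yearly cost."""
    return min(options, key=options.get)
```

For example, with placeholder inputs, `annual_tco(3, 10_000, 300, 150)` gives 75,000 and `annual_tco(9, 10_000, 500, 100)` gives 140,000. Bump the hourly rate on the smaller cluster high enough and the result flips, which is exactly the labor-cost effect described above. Plug in your own quotes and salaries before drawing any conclusion.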
Other quick facts
As you can imagine, there are many more technical nuances to unpack for such a technical product. So here are a few quick facts you should be aware of:
- Redpanda supports several Kafka Connect connectors, although I’m not 100% sure what behavior you should expect out of the box.
- Redpanda comes with a pretty slick console interface.
- Redpanda has built a Wasm engine that lets you deploy custom code, written in any language that compiles to WebAssembly, to perform operations on “in-flight” data.
- Unlike Kafka, which is open-sourced under the Apache 2.0 license, Redpanda is released under the more restrictive Business Source License (BSL), a source-available license.
- Redpanda is still a fairly new platform, so documentation, resources, and community support are limited.
- As with any immature infrastructure product, Redpanda has some bugs here and there.
Redpanda in Production
Redpanda is still in its early days but has already amassed a solid customer base with names like Cisco, Akamai, Lacework, Vodafone, Moody’s, Midjourney, Activision, and many others.
Redpanda is used in a variety of use cases where latency is important, including real-time cybersecurity monitoring, log stream processing, real-time trade risk management, real-time market pricing and analytics, and many more.
When to use Redpanda?
Based on the case studies listed above, it looks like Redpanda is a good fit in the following situations:
- You’re looking to avoid the JVM to gain a slight performance boost.
- You’re looking to reduce operational complexity by avoiding external dependencies and deploying everything using a single binary when on the self-managed version.
- You prefer working with a C++ infrastructure.
- And, of course, if you ran your own benchmarks and demonstrated a superior performance/dollar ratio.
When not to use Redpanda?
Many of these will be obvious, and I’ve already mentioned a few, but reasons you wouldn't want to use Redpanda include:
- You’re looking for a truly open-source streaming solution. Redpanda is published under the Business Source License (BSL), a source-available license.
- You’re already working with or prefer to work with a Java/JVM infrastructure.
- You’re seeking a solution with a large community of experts to get support and hire from.
- You’re looking to have optionality when choosing a hosting provider.
- You’re new to streaming technology and would like to have abundant resources to learn from.
Will Redpanda replace Kafka?
This one is a tricky question to answer, and my opinion is solely based on my experience working on Memgraph (a C++ wire-compatible graph DB alternative to Java-based Neo4j) and my observation of ScyllaDB (a C++, API-compatible replacement for Java-based Cassandra).
Many organizations have sunk a lot of resources into Kafka, and it will make more sense for them to keep throwing money at Confluent to deal with operational complexity, performance optimization, etc. That said, today’s focus on cost-cutting might push a few to replace some of their Kafka clusters to test the waters.
I think new startups or even new projects within larger organizations will adopt Redpanda to speed up development, reduce complexity, and achieve better performance for more demanding use cases. But this will be a slow adoption driven by a few discerning engineers.
I might sound bearish, but I’m excited about watching Redpanda grow and reach new heights. I think the team has built a great piece of technology, and as they grow their customer portfolio, community, resources, and tooling ecosystem, I think they’ll give Confluent a run for its money.
Where to go from here?
Alright, so now that I hopefully got you intrigued by Redpanda, here are a few resources you should check out to dig a little deeper:
Articles & Tutorials
- Getting started with Redpanda
- Stream processing with Redpanda
- Cluster operations with Redpanda
- Redpanda + Kafka Connect/Schema Registry
Videos & talks
People & companies to follow
That’s it, folks! I hope this gave you an overview of Redpanda and how it compares to Kafka on a high level. As always, there is a lot to learn and many nuances to consider for your specific needs. The only way to determine if this is for you is to take it for a quick spin, which should be simple, thanks to the single binary deployment.
Until next time!