Kafka is no longer exclusively the domain of high-velocity Big Data use cases. Today, it is used by workloads and companies of all sizes, supporting asynchronous communication even between small groups of microservices.
But this expanded usage has led to problems with cost creep that threaten many companies’ bottom lines. And due to the complexity of the platform, many leaders are unsure how best to manage their costs without compromising data quality.
This article reveals exactly why Kafka costs spike – and how you can keep them under control.
Why Kafka Optimization Matters in 2024
A series of changes have sparked a renewed requirement for Kafka optimization:
- Higher Data Usage: In the past, roughly 80% of data would either be archived or deleted – with just 20% stored and routinely used. However, with the growing use of AI and other data-intensive solutions, this has flipped in recent years. Today, around 80% of data is stored, analyzed, fetched, and queried – with just 20% archived.
- Increased Data Complexity: While companies used to rely on simpler text-based formats like JSON and XML, these are now being replaced with Protocol Buffers (Protobuf) and Avro. These binary, schema-based formats encode data differently and carry more complex events, placing greater demands on your system.
- Faster Dataflow: Not only is there a larger volume of more complex data, but most use cases require it to be delivered faster and with higher performance.
Kafka is the centerpiece that enables this new scale of high-speed, high-quality data streaming. However, Kafka costs are a constant challenge for every company that uses it, because costs can suddenly spike and get out of hand far faster than with other infrastructure components and traditional databases.
What Leads to Sudden Kafka Cost Increases?
Kafka usage is charged based on the compute power and data storage you consume. As a result, there are a handful of common factors that lead to higher costs:
- Increased Data Volume: A sudden increase in the volume of data input will inevitably increase costs. This often occurs when Kafka is introduced, and suddenly, all developers throughout an organization start using it.
- Number of Consumer Groups: Within a single consumer group, each message on a topic is read only once; every additional consumer group, however, reads its own copy of each message. So, as you add more consumer groups to a single topic, the data read is multiplied by the number of groups (see the sketch after this list).
- Latency Requirements: As companies strive to lower latency and increase throughput, they typically increase the number of brokers they use and make those brokers more robust. But this requires more CPU and memory – both of which increase Kafka cost.
- Retention Policies: Poorly optimized retention policies lead to unnecessary increases in static storage.
- Number of Partitions: The number of partitions in the system directly impacts CPU cycles, CPU utilization, memory utilization, and static storage.
- Number of Connections: While there is a system in place to kill idle connections in Kafka, cutting a large volume of connections at once can cause a CPU spike and increase costs.
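To make the consumer group effect concrete, here is a minimal sketch in Java using the standard kafka-clients library, assuming a hypothetical broker at localhost:9092, a hypothetical topic named orders, and two hypothetical group IDs. Each group receives its own full copy of every message, so read volume grows linearly with the number of groups.

```java
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ConsumerGroupFanOut {

    // Each distinct group.id receives its own full copy of the topic's messages,
    // so total read volume scales linearly with the number of consumer groups.
    static KafkaConsumer<String, String> consumerFor(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", groupId);
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("auto.offset.reset", "earliest");
        return new KafkaConsumer<>(props);
    }

    public static void main(String[] args) {
        try (KafkaConsumer<String, String> analytics = consumerFor("analytics-group");
             KafkaConsumer<String, String> billing = consumerFor("billing-group")) {
            // Both groups subscribe to the same (hypothetical) topic: every message
            // published to "orders" is delivered once to EACH group, doubling reads.
            analytics.subscribe(List.of("orders"));
            billing.subscribe(List.of("orders"));

            ConsumerRecords<String, String> a = analytics.poll(Duration.ofSeconds(5));
            ConsumerRecords<String, String> b = billing.poll(Duration.ofSeconds(5));
            System.out.printf("analytics-group read %d records, billing-group read %d records%n",
                    a.count(), b.count());
        }
    }
}
```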
Strikingly, many of these factors also play a key role in damaging Kafka’s performance, from “one-size-fits-all” configurations that cause streaming delays to poorly optimized consumer groups. The good news: organizations can take a series of steps to reduce their Kafka spend and improve performance at the same time.
Five Ways to Optimize Kafka Costs and Performance
1. Set Appropriate Data Retention Periods
Don’t use a single retention policy for all topics; this can lead to data being stored either for too long (which wastes spend) or too briefly (which may impact future performance).
Instead, profile each topic individually to find its access patterns and create custom policies that reflect those trends. While this can take a lot of manual effort, it is more than worthwhile for the cost and performance improvements it produces.
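As an illustration, here is a minimal sketch of applying different retention periods per topic with Kafka’s AdminClient. The broker address, the topic names, and the retention values are hypothetical examples of topics with different access patterns; substitute whatever your own profiling suggests.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class PerTopicRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker

        try (Admin admin = Admin.create(props)) {
            // Hypothetical topics with different access patterns: clickstream data is
            // only consumed within hours, while audit events are replayed for a week.
            ConfigResource clickstream = new ConfigResource(ConfigResource.Type.TOPIC, "clickstream");
            ConfigResource audit = new ConfigResource(ConfigResource.Type.TOPIC, "audit-events");

            Map<ConfigResource, Collection<AlterConfigOp>> updates = Map.of(
                    clickstream, List.of(new AlterConfigOp(
                            new ConfigEntry("retention.ms", "21600000"),   // 6 hours
                            AlterConfigOp.OpType.SET)),
                    audit, List.of(new AlterConfigOp(
                            new ConfigEntry("retention.ms", "604800000"),  // 7 days
                            AlterConfigOp.OpType.SET)));

            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```

The same change can be made with the kafka-configs CLI; scripting it through the AdminClient simply makes it easier to roll out per-topic policies at scale.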
2. Tiered Storage
A relatively recent Kafka feature, Tiered Storage, offers another way to avoid needless data retention costs. Using your understanding of individual topics, you can offload cold, rarely-read segments to cheap object storage instead of wasting expensive broker storage on topics that don’t require it.
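As a sketch, assuming your brokers run a Kafka version with Tiered Storage enabled cluster-wide (a remote storage plugin configured and remote.log.storage.system.enable set to true), a topic can be created so that only recent data stays on local disks while the full retention window is served from object storage. The topic name, partition count, and retention values below are hypothetical.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class TieredStorageTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker

        try (Admin admin = Admin.create(props)) {
            // Keep only the last hour of data on expensive broker disks, while the
            // full 30-day retention window is served from cheap remote object storage.
            NewTopic logs = new NewTopic("app-logs", 6, (short) 3) // hypothetical topic
                    .configs(Map.of(
                            "remote.storage.enable", "true",  // offload closed segments
                            "local.retention.ms", "3600000",  // 1 hour on local disk
                            "retention.ms", "2592000000"      // 30 days in total
                    ));
            admin.createTopics(List.of(logs)).all().get();
        }
    }
}
```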
3. Ditch JSON
Companies can reduce their payload sizes by up to 50% while improving performance simply by switching to binary formats such as Protobuf and Avro. Better still, this switch will not jeopardize your clients’ CPU utilization.
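The payload difference is easy to measure. The sketch below serializes the same hypothetical event once as a JSON string and once as an Avro binary record (using the Apache Avro library), then prints both byte counts; because Avro keeps field names and types in a schema outside the message, the bytes on the wire shrink dramatically.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class JsonVsAvroSize {
    public static void main(String[] args) throws Exception {
        // The same (hypothetical) event as JSON text and as Avro binary.
        String json = "{\"userId\":123456,\"event\":\"page_view\",\"durationMs\":842}";

        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"PageView\",\"fields\":["
                + "{\"name\":\"userId\",\"type\":\"long\"},"
                + "{\"name\":\"event\",\"type\":\"string\"},"
                + "{\"name\":\"durationMs\",\"type\":\"int\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("userId", 123456L);
        record.put("event", "page_view");
        record.put("durationMs", 842);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();

        // Avro drops field names and quoting from the payload; the schema lives
        // outside the message, so the bytes on the wire are far smaller.
        System.out.println("JSON bytes: " + json.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("Avro bytes: " + out.size());
    }
}
```

In production, the schema would typically be managed in a schema registry rather than embedded in code, so producers and consumers can evolve it safely.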
4. Locate Inactive Partitions and Topics
Inactive partitions and topics can significantly affect memory, storage, and CPU utilization. Even in self-hosted Kafka deployments, they consume broker compute resources and impact performance.
Companies should, therefore, take proactive steps to identify and eliminate these inactive partitions and topics – generating immediate savings and helping to avoid underutilizing resources.
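One practical way to find candidates is to ask every partition for the offset of the first record produced after a cutoff timestamp; if no partition of a topic has received anything since the cutoff, that topic is effectively inactive. The sketch below does this with Kafka’s AdminClient; the broker address and the seven-day window are hypothetical choices.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.ListOffsetsResult.ListOffsetsResultInfo;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

public class InactiveTopicFinder {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        long cutoff = Instant.now().minus(Duration.ofDays(7)).toEpochMilli();

        try (Admin admin = Admin.create(props)) {
            Set<String> topics = admin.listTopics().names().get();
            Map<String, TopicDescription> descriptions =
                    admin.describeTopics(topics).allTopicNames().get();

            // Ask every partition for the offset of the first record at or after the
            // cutoff timestamp; a negative offset means nothing was produced since then.
            Map<TopicPartition, OffsetSpec> query = new HashMap<>();
            descriptions.forEach((topic, desc) ->
                    desc.partitions().forEach(p ->
                            query.put(new TopicPartition(topic, p.partition()),
                                      OffsetSpec.forTimestamp(cutoff))));

            Map<TopicPartition, ListOffsetsResultInfo> offsets =
                    admin.listOffsets(query).all().get();

            for (String topic : topics) {
                boolean inactive = offsets.entrySet().stream()
                        .filter(e -> e.getKey().topic().equals(topic))
                        .allMatch(e -> e.getValue().offset() < 0);
                if (inactive) {
                    System.out.println("No writes in the last 7 days: " + topic);
                }
            }
        }
    }
}
```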
5. Use a Managed Solution
One of the largest Kafka costs is actually the human labor required to manage the platform. With fewer engineering hours, less management effort, and auto-scaling capacity, a managed Kafka solution is an instant cost saver and performance enhancer.
Optimize Your Kafka Costs with GlobalDots
There is no silver bullet for Kafka optimization. While the steps we’ve discussed will help most organizations cut costs and improve performance, your unique challenges and requirements will determine their real impact.
That is why so many companies choose to work with GlobalDots: a true innovation partner with over 20 years of expertise. We battle-test every product and strategy on the market, then work closely with you to select the most impactful approach for your organization.
Want to discuss how we could help you optimize your Kafka costs and performance?