Access Control Lists (ACLs) are Kafka’s native mechanism for controlling who is allowed to do what within a cluster. They define which principals (e.g. users and applications) are allowed to perform which operations (e.g. produce, consume, etc.) on which resources (e.g. topics, consumer groups, etc.).
ACLs are an important tool for securing Kafka clusters. Without them, any authenticated client can read and write any piece of data freely, as well as perform destructive actions like deleting entire topics.
At first glance, enabling ACLs in a live Kafka cluster seems like a simple affair:
Seems straightforward enough, but step 3 poses a challenge. Once ACL enforcement is enabled, Kafka defaults to a “deny all” policy: any request from a non-superuser is denied unless an explicit ACL grants access. Whilst great for security, this makes enabling ACLs on a production Kafka cluster for the first time terrifying. A single missing permission or misconfigured ACL can halt producers, stall consumers, and disrupt connectors, so operators are understandably cautious.
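To make the “deny all” default concrete, here is a minimal sketch in Python. It is not Kafka's authorizer: the `Acl` class, the `is_authorized` function, and the principal and topic names are all hypothetical, and the model ignores deny rules, hosts, and pattern types. It only shows that a request with no matching ACL is rejected unless the caller is a superuser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    """Hypothetical, simplified allow rule: who may do what on which resource."""
    principal: str  # e.g. "User:orders-service"
    operation: str  # e.g. "Write"
    resource: str   # e.g. "Topic:orders"


def is_authorized(principal, operation, resource, acls, super_users=frozenset()):
    """Deny by default: allow only superusers or requests with an explicit ACL."""
    if principal in super_users:
        return True
    return Acl(principal, operation, resource) in acls


# Illustrative data only: a single rule for a single producer.
acls = {Acl("User:orders-service", "Write", "Topic:orders")}

print(is_authorized("User:orders-service", "Write", "Topic:orders", acls))
print(is_authorized("User:billing", "Read", "Topic:orders", acls))
```

In this sketch the second call is refused simply because nobody wrote a rule for `User:billing`, which is the situation that catches live clients out when enforcement is first switched on.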
The risk is amplified by how easy ACLs are to misconfigure. Wildcards are a common source of errors. For example, <span class="codeinline">Prefixed("*")</span> matches nothing, despite appearing to be a catch-all, because a prefixed pattern only matches resource names that literally begin with the given string; <span class="codeinline">Literal("*")</span> is required for a true wildcard. Other mistakes, like forgetting that principal names are case-sensitive, can similarly cause unexpected denials and production issues.
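The wildcard pitfall is easier to see in code. The following Python sketch mimics the literal and prefixed matching rules described above. The function names and the example topic and principal names are assumptions made for illustration, not part of any Kafka or WarpStream API.

```python
def pattern_matches(pattern_type, pattern_name, resource_name):
    """Hypothetical model of how a resource pattern is matched against a name."""
    if pattern_type == "LITERAL":
        # A literal "*" is the one true wildcard; otherwise names must be equal.
        return pattern_name == "*" or pattern_name == resource_name
    if pattern_type == "PREFIXED":
        # A prefix is compared character by character, so "*" has no special
        # meaning here: it only matches names that start with an asterisk.
        return resource_name.startswith(pattern_name)
    raise ValueError(f"unknown pattern type: {pattern_type}")


def principal_matches(acl_principal, request_principal):
    """Principal names are compared exactly, so case differences do not match."""
    return acl_principal == request_principal


print(pattern_matches("LITERAL", "*", "orders"))            # catch-all
print(pattern_matches("PREFIXED", "*", "orders"))           # matches nothing useful
print(pattern_matches("PREFIXED", "orders-", "orders-eu"))  # intended prefix use
print(principal_matches("User:Alice", "User:alice"))        # case mismatch
```

Under these assumptions, a rule written as a prefixed `*` never covers a topic such as `orders`, and a rule for `User:Alice` never applies to a client that authenticates as `User:alice`.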
These sorts of subtleties are easy to miss during ACL creation and are nearly impossible to notice until enforcement begins – after which it’s already too late.
Even perfectly written ACLs may not behave as expected. Since ACLs apply to authenticated principals, any misconfiguration in SASL, Kerberos, certificates, or identity mapping can result in clients not being recognised or being treated as the wrong user. At enablement time, Kafka will simply deny access, with operators left to deal with the consequences.
Given how risky it is to turn on ACL enforcement blindly, what we really need is a way to see how our ACLs would behave before they start blocking real clients. If we could run ACLs in a mode where they are evaluated but not enforced, we would be able to surface misconfigurations early, without putting production traffic at risk, and free operators from the typical enable-and-pray predicament.
That idea led us to build ACL Shadowing for WarpStream, a way to validate your ACLs and simulate authorization decisions safely on a live cluster before they’re enforced.
With the flip of a toggle in the console UI, you can enable ACL Shadowing on any Kafka cluster. Once it’s active, the cluster begins running your ACL rules in a shadow mode. ACLs are fully evaluated against live traffic, but their results are not enforced.
In other words, your WarpStream cluster continues operating normally, but you gain full visibility into what would happen if authorization were turned on.
Whenever an operation would be denied by your ACLs, the system surfaces this immediately, in a manner similar to how a real authorization failure would be communicated: deny logs are generated for every failing check and a diagnostic is emitted showing the principal, operation, and resource that would have been blocked.
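The idea of evaluating rules without enforcing them can be sketched in a few lines of Python. This is not WarpStream's implementation: the `ShadowAuthorizer` class, its fields, the logger name, and the log format are all hypothetical, and the rule set is reduced to exact-match tuples. It only illustrates the shape of the behaviour: every request is checked, would-be denials are recorded with principal, operation, and resource, and the request still goes through while shadowing is on.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("acl_shadow")  # hypothetical logger name


@dataclass
class ShadowAuthorizer:
    """Hypothetical authorizer that can evaluate ACLs without enforcing them."""
    allowed: set                      # set of (principal, operation, resource)
    enforce: bool = False             # False = shadow mode, True = real enforcement
    denials: list = field(default_factory=list)

    def authorize(self, principal, operation, resource):
        request = (principal, operation, resource)
        if request in self.allowed:
            return True
        # The rules say "deny": record it the same way in both modes.
        self.denials.append(request)
        logger.warning(
            "ACL deny (enforced=%s): principal=%s operation=%s resource=%s",
            self.enforce, principal, operation, resource,
        )
        # In shadow mode the request is still let through.
        return self.enforce is False


shadow = ShadowAuthorizer(allowed={("User:orders-service", "Write", "Topic:orders")})
print(shadow.authorize("User:billing", "Read", "Topic:orders"))  # allowed, but logged
print(shadow.denials)
```

Flipping `enforce` to `True` in this sketch turns the same recorded denials into real rejections, which is why reviewing them beforehand is useful.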
Because ACL Shadowing never interferes with real traffic, teams can inspect and fix any issues like incorrect principals, wildcard mistakes, or misaligned authentication, without risking an outage.
By the time you enable ACLs, you have much greater visibility into how your rules are likely to behave. That means fewer surprises, less firefighting, and a lower risk of impacting production, so developers can go live with confidence.
ACL Shadowing is available now for all WarpStream clusters. To get started, navigate to the ACLs tab for your cluster. If you have any questions, contact us.