
Visualizing Avro Kafka Data in Grafana: Streaming Real-Time Schemas


Hey everyone, if you've been following along from my last post on streaming JSON into Grafana with the Kafka Datasource plugin, you know how awesome it is to watch real-time data light up your dashboards. Today, we're leveling up to Avro: the compact, schema-driven format so many Kafka setups swear by. I'll walk you through the plugin's Avro support, whether you're pulling from a Schema Registry or embedding schemas inline. It's straightforward, and you'll be visualizing those binary messages in no time.

Before we jump in

If you haven't checked out my previous article on the JSON feature, I'd highly recommend giving it a quick read first. It covers the foundational concepts of how this plugin works: connecting to Kafka, picking topics, authentication, configuring partitions and offsets, and mapping message fields to Grafana panels. The Avro support builds on the exact same mental model: Kafka topic → live stream → fields in Grafana. So you'll feel right at home.

The only real difference? Avro messages are binary and need a schema to be decoded into something Grafana can work with. Once decoded, everything else (queries, panels, alerts) works identically to JSON.

What is Avro (and why teams use it)?

Apache Avro is a schema-based serialization format that encodes data compactly, often much smaller than JSON, and relies on a schema to describe what each message contains. Think of it as a contract between your producer and consumer: the schema defines field names, types, and structure, and the binary payload is just the raw values packed efficiently.

Here's a quick example to see the difference. Below is an Avro schema for a sensor reading:

{
  "type": "record",
  "name": "SensorReading",
  "fields": [
    {"name": "sensor_id", "type": "string"},
    {"name": "temperature", "type": "double"},
    {"name": "humidity", "type": ["null", "int"], "default": null},
    {"name": "timestamp", "type": "long"}
  ]
}

The same data in Avro binary (27 bytes, strongly typed):

\x12sensor-42\x66\x66\x66\x66\x66\x66\x37\x40\x02\x88\x01\x80\x88\xd6\xe5\xb5\x63

Same data in JSON (84 bytes minified, no type safety):

{
  "sensor_id": "sensor-42",
  "temperature": 23.4,
  "humidity": 68,
  "timestamp": 1708027200000
}

↑ Field names repeated in every message plus UTF-8 text encoding make JSON roughly 3x larger than Avro's compact binary. JSON types are also implicit: you won't know whether temperature is a float, a double, or a string until runtime.
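If you're curious where those byte counts come from, here's a hand-rolled Python sketch of Avro's binary encoding rules for this one record (for illustration only; real producers use an Avro library): strings are length-prefixed UTF-8, doubles are 8 little-endian IEEE-754 bytes, and ints/longs use zig-zag varints.

```python
import struct

def zigzag_varint(n: int) -> bytes:
    """Avro encodes int/long as zig-zag integers in variable-length base-128."""
    z = (n << 1) ^ (n >> 63)  # zig-zag maps small magnitudes to small codes
    out = bytearray()
    while True:
        b = z & 0x7F
        z >>= 7
        if z:
            out.append(b | 0x80)  # high bit set: more bytes follow
        else:
            out.append(b)
            return bytes(out)

def encode_sensor_reading(sensor_id, temperature, humidity, timestamp):
    """Encode one SensorReading record per the schema above."""
    buf = bytearray()
    raw = sensor_id.encode("utf-8")
    buf += zigzag_varint(len(raw)) + raw   # string: length + UTF-8 bytes
    buf += struct.pack("<d", temperature)  # double: 8 little-endian bytes
    if humidity is None:
        buf += zigzag_varint(0)            # union branch 0 = "null"
    else:
        buf += zigzag_varint(1)            # union branch 1 = "int"
        buf += zigzag_varint(humidity)
    buf += zigzag_varint(timestamp)        # long
    return bytes(buf)

msg = encode_sensor_reading("sensor-42", 23.4, 68, 1708027200000)
print(len(msg))  # 27 bytes, no field names anywhere in the payload
```

Notice that the field names never appear in the payload; the schema alone tells the consumer how to slice those 27 bytes back into named, typed fields.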

Decoded by Grafana plugin (what you see in panels):

{
  "sensor_id": "sensor-42",
  "temperature": 23.4,
  "humidity": 68,
  "timestamp": 1708027200000
}

↑ Ready to map to Grafana fields: temperature becomes a time series, sensor_id a label

The big wins are efficiency and type safety: Avro doesn't repeat field names in every message (huge savings at scale), and the schema enforces types at write time, no surprise strings where you expected numbers. Plus, schema evolution lets you add fields and maintain backward/forward compatibility when you follow the rules, which is why Avro is commonly paired with a Schema Registry in Kafka ecosystems. Teams love it because streaming millions of messages per second becomes cheaper, safer, and more manageable than with JSON.
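To make the evolution point concrete, here's a hypothetical next version of the SensorReading schema from earlier, with one new field added (battery_pct is my invention for illustration):

```json
{
  "type": "record",
  "name": "SensorReading",
  "fields": [
    {"name": "sensor_id", "type": "string"},
    {"name": "temperature", "type": "double"},
    {"name": "humidity", "type": ["null", "int"], "default": null},
    {"name": "timestamp", "type": "long"},
    {"name": "battery_pct", "type": ["null", "double"], "default": null}
  ]
}
```

Because the new field has a default, readers using this schema can still decode messages written with the old one; that's the backward compatibility a Schema Registry checks for you when you register a new version.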

Big picture: what the plugin does for Avro

This Grafana Kafka datasource plugin can consume Kafka messages and turn them into Grafana-friendly fields in real time, supporting both JSON and Avro payloads. For Avro specifically, the plugin handles the heavy lifting:

  • Fetches schemas from a Schema Registry (like Confluent's) automatically based on topic/subject naming conventions

  • Accepts inline schemas you paste directly into the query editor or upload as a schema file (great for quick tests or environments without a registry)

  • Deserializes binary messages on the fly and flattens nested records into dot-notation fields that Grafana understands

So whether your Avro messages come from IoT sensors, clickstream events, or service logs, the plugin decodes them, and you build dashboards just like you would with JSON data.

↑ A sample of the decoded Avro data in Grafana using the plugin

Prerequisites

Before we dive into configs, make sure you have:

  • Grafana 10.2+ installed and running

  • Plugin version 1.2.0+ (Avro support landed here): install via grafana-cli plugins install hamedkarbasi93-kafka-datasource or grab the latest zip from the GitHub releases

  • Kafka broker (v0.9+ works great)

  • (Optional) Schema Registry running at something like http://localhost:8081 if you want the registry approach

No Schema Registry? Totally fine! Inline schemas work perfectly for demos, local dev, and quick debugging.

Data source setup for Avro

Head to Connections > Data sources > Add new data source and search for "Kafka". Fill in the basics:

  • Bootstrap Servers: e.g., localhost:9092

  • SASL/SSL toggles: if your cluster requires auth, flip these on and add credentials

  • Schema Registry URL, plus username/password if your registry requires auth: e.g., http://schema-registry:8081

For inline schema mode, you don't need to set the registry URL at the datasource level; you'll paste the schema directly in the data query panel. But if you're using a registry in production, configuring it once here means all your Avro queries auto-fetch schemas without extra config.

Hit Save & Test. A green check means you're connected and the plugin can reach Kafka.

Two ways to decode Avro

Schema Registry (recommended for production)

If you already have a Schema Registry (common with Confluent-style setups), you point the datasource at the registry URL and let it resolve schemas automatically. After hitting Test Connection and receiving the "The schema registry is accessible" confirmation, you can be confident the plugin can reach your registry successfully. The plugin follows Kafka's subject naming convention (e.g., <topic-name>-value for message values), fetches the latest schema version, and deserializes each message.

This keeps dashboards clean because you're not pasting schemas into Grafana queries, and it naturally supports teams that evolve schemas over time: just register a new version and the plugin picks it up.
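Under the hood, registry-encoded Kafka messages follow Confluent's wire format: a magic byte 0x00, a big-endian 4-byte schema ID, and then the raw Avro body. A minimal Python sketch (an illustration, not the plugin's actual Go code) of splitting that header looks like this:

```python
import struct

def parse_confluent_header(msg: bytes):
    """Split a registry-encoded message into (schema_id, avro_body).

    Confluent wire format: 1 magic byte (0x00) + 4-byte big-endian
    schema ID + Avro-encoded payload.
    """
    if len(msg) < 5 or msg[0] != 0:
        raise ValueError("not Confluent wire format")
    (schema_id,) = struct.unpack(">I", msg[1:5])
    return schema_id, msg[5:]
```

The schema ID is what lets a consumer (or this plugin) look up the exact writer schema in the registry before decoding the body.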

Inline schema (great for demos and quick debugging)

If you don't have a registry (or you're just prototyping), you can paste the full Avro schema JSON directly into the query editor, and the plugin will use it to decode messages. Alternatively, you can upload an Avro schema file (.avsc) for convenience. The plugin validates the inline schema automatically to ensure it's properly formatted before attempting to deserialize messages.
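The kind of sanity check that validation involves can be sketched in a few lines of Python (a hypothetical helper, not the plugin's actual implementation): parse the pasted text as JSON, then confirm it describes an Avro record:

```python
import json

def validate_inline_schema(text: str) -> dict:
    """Parse a pasted Avro schema and check the minimal record shape."""
    try:
        schema = json.loads(text)
    except json.JSONDecodeError as e:
        raise ValueError(f"schema is not valid JSON: {e}") from e
    if schema.get("type") != "record":
        raise ValueError("top-level Avro schema must be a record")
    for key in ("name", "fields"):
        if key not in schema:
            raise ValueError(f"record schema is missing '{key}'")
    return schema
```

Failing fast here beats discovering a malformed schema only when the first binary message refuses to decode.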

Building your first Avro query

Create a new panel, pick your Kafka datasource, and configure the query:

  1. Format: Switch from JSON to Avro

  2. Topic: Type or autocomplete your Avro topic name (e.g., server-metrics)

  3. Partitions: Click "Fetch" to list partitions, then pick specific ones or select all

  4. Offset: Choose "Latest" for live streams, or "Last N" to replay recent messages

  5. Schema Mode:

    • Schema Registry: The plugin auto-derives the subject (e.g., server-metrics-value) and fetches the schema

    • Inline Schema: Paste the full Avro schema JSON in the text box

Once you save, the plugin starts consuming, deserializing on the fly, and populating fields in the panel's data frame.

Understanding field flattening with nested schemas

One standout feature is how the plugin handles nested Avro records: it flattens them into dot-notation field names so Grafana can work with them naturally. Let's use a real example to see this in action.

Here's a nested schema representing server metrics with host info, multi-level metrics, nullable fields, and arrays:

{
    "type": "record",
    "name": "NestedMessage",
    "fields": [
        {
            "name": "host",
            "type": {
                "type": "record",
                "name": "Host",
                "fields": [
                    {"name": "name", "type": "string"},
                    {"name": "ip", "type": "string"}
                ]
            }
        },
        {
            "name": "metrics",
            "type": {
                "type": "record",
                "name": "Metrics",
                "fields": [
                    {
                        "name": "cpu",
                        "type": {
                            "type": "record",
                            "name": "CPU",
                            "fields": [
                                {"name": "load", "type": ["null", "double"], "default": null},
                                {"name": "temp", "type": "double"}
                            ]
                        }
                    },
                    {
                        "name": "mem",
                        "type": {
                            "type": "record",
                            "name": "Memory",
                            "fields": [
                                {"name": "used", "type": "int"},
                                {"name": "free", "type": "int"}
                            ]
                        }
                    }
                ]
            }
        },
        {"name": "value1", "type": ["null", "double"], "default": null},
        {"name": "value2", "type": ["null", "double"], "default": null},
        {
            "name": "tags",
            "type": {"type": "array", "items": "string"}
        },
        {
            "name": "alerts",
            "type": {
                "type": "array",
                "items": {
                    "type": "record",
                    "name": "Alert",
                    "fields": [
                        {"name": "type", "type": "string"},
                        {"name": "severity", "type": "string"},
                        {"name": "value", "type": "double"}
                    ]
                }
            }
        },
        {
            "name": "processes",
            "type": {"type": "array", "items": "string"}
        }
    ]
}

When the plugin decodes a message with this schema, you'll see flattened fields like:

  • host.name (string) → label/dimension

  • host.ip (string) → label/dimension

  • metrics.cpu.load (double, nullable) → time series value

  • metrics.cpu.temp (double) → time series value

  • metrics.mem.used (int) → time series value

  • metrics.mem.free (int) → time series value

  • value1, value2 (nullable doubles) → time series values

  • tags → JSON string ["prod","edge"]

  • alerts → JSON string [{"severity":"warning","type":"cpu_high","value":98.99}]

  • processes → JSON string ["nginx","mysql","redis"]

This means you can directly reference metrics.cpu.temp in a Graph panel or use host.name as a grouping dimension in a Table; no manual parsing is needed.
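To make the flattening behavior concrete, here's a rough Python equivalent (an illustration, not the plugin's actual Go code), including a configurable depth cutoff and the arrays-as-JSON-strings behavior:

```python
import json

def flatten(record, max_depth=5, prefix="", out=None):
    """Flatten nested dicts into dot-notation keys.

    Lists (and dicts past max_depth) are serialized as compact JSON strings,
    mirroring how the plugin surfaces Avro arrays in the data frame.
    """
    if out is None:
        out = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict) and max_depth > 0:
            flatten(value, max_depth - 1, f"{name}.", out)
        elif isinstance(value, (list, dict)):
            out[name] = json.dumps(value, separators=(",", ":"))
        else:
            out[name] = value
    return out

decoded = {
    "host": {"name": "web-01", "ip": "10.0.0.5"},
    "metrics": {"cpu": {"load": None, "temp": 71.3},
                "mem": {"used": 2048, "free": 1024}},
    "tags": ["prod", "edge"],
}
print(flatten(decoded))
# e.g. {'host.name': 'web-01', ..., 'metrics.cpu.temp': 71.3, 'tags': '["prod","edge"]'}
```

Nullable values (like metrics.cpu.load here) pass through as-is and simply show up as gaps in the resulting time series.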

Practical tips for nested schemas

  • Nullable unions (like ["null", "double"]) are handled gracefully; null values just show up as gaps in the time series.

  • Arrays are serialized as JSON strings in the data frame (e.g., ["nginx","mysql"] or [{"severity":"warning","type":"cpu_high","value":98.99}]). You can parse them further with Grafana's JSON transform or display them as-is in Table panels.

  • Deep nesting is flattened up to depth 5 by default; beyond that, the plugin won't flatten further to avoid performance issues.

  • You can adjust both the flatten depth and the maximum field limit (defaults: depth 5, fields 1000) in the Advanced Settings section of the Config Editor if your schema needs more room.

  • Use field aliases in Grafana's Transform tab to rename metrics.cpu.temp to something friendlier, like "CPU Temperature" in legends.

Testing it out with live data

The repo includes a Go-based producer that can publish Avro messages to your local Kafka. Fire it up:

go run ./example/go \
  -broker localhost:9094 \
  -topic server-metrics \
  -interval 500 \
  -format avro \
  -schema-registry http://localhost:8081

This pushes a message every 500ms with randomized metrics. Now create a panel in Grafana, point it at the server-metrics topic with Avro format, and watch the fields populate in real time. Build a Stat panel for metrics.cpu.temp, a Graph for metrics.mem.used over time, or a Table grouped by host.name; it all just works.

Wrapping up

There you have it! Avro streaming unlocked in Grafana. Whether you're running a full Confluent Platform with Schema Registry or just need to decode some binary messages locally, this plugin makes it dead simple to visualize Avro data alongside your other telemetry.

Questions? Found a bug? Feature idea? Hit up the GitHub issues or drop a comment below. And if this saved you a few hours of head-scratching, toss the repo a star so more folks can find it!

Happy streaming! 🚀