🟩 Azure Event Hubs

Hyper-scale event streaming platform — ingest millions of events per second with replay


Table of Contents

  1. Product Overview
  2. Core Concepts
    1. Event Hub (Topic)
    2. Partitions
    3. Consumer Groups
    4. Offset & Replay
  3. Capacity Model
    1. Standard & Premium: Throughput Units / Processing Units
  4. SKU Tiers
  5. SLA
  6. Event Hubs Capture
  7. Kafka Compatibility
  8. Geo-Disaster Recovery
  9. Schema Registry
  10. Security
  11. Common Exam Scenarios

Product Overview

Azure Event Hubs is a big-data streaming platform and event ingestion service. It can receive and process millions of events per second with low latency. Event Hubs is designed for telemetry ingestion, log streaming, clickstream data, IoT data pipelines, and any scenario requiring high-throughput, ordered, replayable event streams.

Event Hubs is Azure’s Apache Kafka-compatible streaming service. Kafka clients can connect to Event Hubs without code changes.

```mermaid
flowchart LR
    subgraph Producers["Producers (Publishers)"]
        IoT["IoT Devices"]
        App["Application Logs"]
        Click["Clickstream / Web"]
        Kafka["Kafka Clients"]
    end
    subgraph EH["Azure Event Hubs Namespace"]
        direction TB
        Hub["Event Hub (topic)\n(partitioned log)"]
        P0["Partition 0"]
        P1["Partition 1"]
        P2["Partition 2"]
        P3["Partition 3"]
        Hub --> P0
        Hub --> P1
        Hub --> P2
        Hub --> P3
    end
    subgraph Consumers["Consumer Groups (Independent Readers)"]
        CG1["Consumer Group: Analytics\n(Stream Analytics)"]
        CG2["Consumer Group: ML\n(Azure Databricks)"]
        CG3["Consumer Group: Archive\n(Event Hubs Capture)"]
    end
    Producers -->|AMQP / HTTPS / Kafka| Hub
    P0 & P1 & P2 & P3 --> CG1
    P0 & P1 & P2 & P3 --> CG2
    P0 & P1 & P2 & P3 --> CG3
```

Core Concepts

Event Hub (Topic)

An Event Hub is a named stream inside a namespace — analogous to a Kafka topic. Events are appended to a partitioned, time-ordered log.

Partitions

Partitions are the unit of parallelism in Event Hubs. Each partition is an independent, ordered sequence of events. Key properties:

| Property | Detail |
|---|---|
| Default partitions | 4 (Standard/Premium) |
| Max partitions | 32 (Standard) / 100 (Premium) / 2,000 (Dedicated) |
| Partition count | Set at creation; can only be increased, and only on Premium/Dedicated |
| Ordering guarantee | Guaranteed within a partition |
| Partition routing | By partition key (hash) or round-robin |

⚠️ Exam Caveat: Partition count can never be decreased after creation, and on Basic/Standard it cannot be changed at all. If a scenario requires more partitions later on those tiers, a new Event Hub must be created (Premium and Dedicated allow increasing the count). Plan capacity upfront.

Consumer Groups

A consumer group is an independent view of the entire stream — each consumer group maintains its own offset (position in the stream). This enables multiple independent consumers to read the same data at their own pace without interfering with each other.

| SKU | Consumer Groups Limit |
|---|---|
| Basic | 1 consumer group |
| Standard | Up to 20 consumer groups |
| Premium | Up to 100 consumer groups |
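
The "independent view" idea can be sketched with a shared log and per-group cursors (the group names are illustrative): reading never removes events, and advancing one group's cursor leaves every other group untouched.

```python
# Minimal sketch of consumer-group independence: one append-only log,
# one offset per consumer group. Unlike a queue, reading removes nothing.
log = ["e0", "e1", "e2", "e3"]          # one partition's event log
offsets = {"analytics": 0, "ml": 0}     # per-consumer-group cursors

def read(group: str, max_events: int) -> list[str]:
    start = offsets[group]
    batch = log[start:start + max_events]
    offsets[group] = start + len(batch)  # only this group's cursor moves
    return batch

assert read("analytics", 4) == ["e0", "e1", "e2", "e3"]
assert offsets["ml"] == 0               # the ML group is unaffected
assert read("ml", 2) == ["e0", "e1"]    # and reads at its own pace
```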

Offset & Replay

Unlike queues, Event Hubs retains events for a configurable retention period. Consumers can reset their offset to any point within the retention window and replay events — a key differentiator from Service Bus.

| Property | Detail |
|---|---|
| Default retention | 1 day |
| Max retention (Standard) | 7 days |
| Max retention (Premium) | 90 days |
| Max retention (Dedicated) | 90 days |
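
Replay can be pictured as rewinding a cursor over a retained log. A minimal sketch (offsets here are plain integers, whereas real Event Hubs offsets are opaque strings; the function name is illustrative):

```python
# Events still inside the retention window, as (offset, payload) pairs.
events = [(i, f"event-{i}") for i in range(10)]

def replay_from(offset: int) -> list[str]:
    """Re-read everything at or after `offset`, which must still be
    inside the retention window -- older events have been trimmed."""
    retained = {o for o, _ in events}
    if offset not in retained:
        raise ValueError("offset fell outside the retention window")
    return [payload for o, payload in events if o >= offset]

# A consumer that had processed through offset 7 can rewind to 4
# after a bug fix and reprocess from there:
assert replay_from(4) == [f"event-{i}" for i in range(4, 10)]
```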

⚠️ Exam Caveat: The ability to replay events is unique to Event Hubs among the four messaging services. If a scenario mentions replaying events for reprocessing or recovery, the answer is Event Hubs.


Capacity Model

Standard & Premium: Throughput Units / Processing Units

| SKU | Capacity Unit | Ingress | Egress |
|---|---|---|---|
| Standard | Throughput Unit (TU) | 1 MB/s or 1,000 events/s per TU | 2 MB/s per TU |
| Premium | Processing Unit (PU) | ~5–10 MB/s per PU (workload-dependent) | ~10–20 MB/s per PU |
| Dedicated | Capacity Unit (CU) | ~1 GB/s per CU | Proportional |

Standard limits:

  • Max 40 TUs per namespace (soft limit; can be raised)
  • Auto-inflate: ✅ Automatically scales TUs up to a configured max

Premium limits:

  • 1–16 PUs per namespace
  • Fixed price per PU regardless of usage volume
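
Sizing Standard TUs is a max across the dimensions above, since whichever limit is hit first forces another TU. A small helper illustrates the arithmetic:

```python
import math

def tus_needed(ingress_mb_s: float, events_per_s: float, egress_mb_s: float) -> int:
    """Throughput Units required on Standard: each TU buys 1 MB/s or
    1,000 events/s of ingress and 2 MB/s of egress. The binding
    constraint is whichever dimension demands the most TUs."""
    return max(
        math.ceil(ingress_mb_s / 1),      # 1 MB/s ingress per TU
        math.ceil(events_per_s / 1000),   # 1,000 events/s ingress per TU
        math.ceil(egress_mb_s / 2),       # 2 MB/s egress per TU
    )

# 8 MB/s in, 12,000 events/s, 25 MB/s out: the egress dimension dominates.
assert tus_needed(8, 12_000, 25) == 13
```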

⚠️ Exam Caveat: Standard caps at 40 TUs per namespace (40 MB/s ingress), and only by raising the default quota. For higher throughput, use Premium or Dedicated. Auto-inflate is only available on Standard.


SKU Tiers

| Feature | Basic | Standard | Premium | Dedicated |
|---|---|---|---|---|
| Consumer groups | 1 | 20 | 100 | Unlimited |
| Brokered connections | 100 | 1,000 | 10,000 | Unlimited |
| Retention | 1 day | 1–7 days | 1–90 days | 1–90 days |
| Capture | ❌ | ✅ | ✅ | ✅ |
| Kafka surface | ❌ | ✅ | ✅ | ✅ |
| Schema Registry | ❌ | ✅ | ✅ | ✅ |
| VNet / Private Endpoint | ❌ | ❌ | ✅ | ✅ |
| Geo-DR | ❌ | ✅ | ✅ | ✅ |
| Availability Zones | ❌ | ❌ | ✅ (auto) | ✅ (auto) |
| Customer-managed keys | ❌ | ❌ | ✅ | ✅ |
| Max partitions | 32 | 32 | 100 | 2,000 |
| Dedicated cluster | ❌ | ❌ | ❌ | ✅ |
| Pricing | Per TU | Per TU | Per PU | Per CU (hourly) |

SLA

| SKU | Uptime SLA |
|---|---|
| Basic | 99.9% |
| Standard | 99.9% |
| Premium | 99.95% |
| Dedicated | 99.99% |

⚠️ Exam Caveat: Event Hubs Dedicated is the only tier with a 99.99% SLA — achieved through Availability Zone support in its isolated single-tenant cluster.
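
Those percentages translate into monthly downtime budgets; a quick calculation (assuming a 30-day month):

```python
def downtime_minutes_per_month(sla_percent: float) -> float:
    """Maximum downtime per 30-day month permitted by an uptime SLA."""
    return 30 * 24 * 60 * (1 - sla_percent / 100)

# 99.9%  allows ~43.2 min/month; 99.95% ~21.6 min; 99.99% ~4.3 min.
assert round(downtime_minutes_per_month(99.9), 1) == 43.2
assert round(downtime_minutes_per_month(99.99), 2) == 4.32
```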


Event Hubs Capture

Capture automatically delivers the streaming data to Azure Blob Storage or Azure Data Lake Storage Gen2 in Avro or Parquet format — without writing any consumer code. This is the primary way to persist event streams for long-term analytics.

| Property | Detail |
|---|---|
| Output format | Apache Avro (default) or Parquet |
| Destination | Azure Blob Storage or ADLS Gen2 |
| Trigger | Time window (min 1 min) or size window (min 10 MB) |
| Availability | Standard, Premium, Dedicated (NOT Basic) |
| Cost | Charged per Capture-hour plus storage costs |
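
The trigger is "whichever window fills first". A sketch of that rule, assuming portal defaults of a 5-minute / 300 MB window (both window values are configurable and assumed here, not taken from the table above):

```python
def should_flush(elapsed_s: float, buffered_bytes: int,
                 window_s: float = 300,                  # assumed 5-min default
                 window_bytes: int = 300 * 1024 * 1024   # assumed 300 MB default
                 ) -> bool:
    """A Capture window closes when EITHER the time window elapses or
    the size window fills, whichever happens first."""
    return elapsed_s >= window_s or buffered_bytes >= window_bytes

assert should_flush(300, 0)                   # time elapsed, little data
assert should_flush(10, 300 * 1024 * 1024)    # size filled first
assert not should_flush(10, 1024)             # neither window reached
```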

⚠️ Exam Caveat: Event Hubs Capture requires Standard tier or above — it is not available on Basic.


Kafka Compatibility

Event Hubs exposes a Kafka endpoint on port 9093 (TLS). Kafka clients can produce and consume events without code changes — only the connection string is different. This makes Event Hubs a managed replacement for self-hosted Kafka clusters.

| Kafka Concept | Event Hubs Equivalent |
|---|---|
| Broker | Event Hubs Namespace |
| Topic | Event Hub |
| Partition | Partition |
| Consumer Group | Consumer Group |
| Offset | Offset |
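
Pointing an existing Kafka client at Event Hubs mostly means swapping connection settings. A sketch of the typical configuration (the namespace name and key are placeholders; shown as a Python dict of standard Kafka client properties):

```python
# Typical Kafka client settings for the Event Hubs Kafka endpoint.
# "mynamespace" and the connection string are placeholders.
kafka_config = {
    "bootstrap.servers": "mynamespace.servicebus.windows.net:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "$ConnectionString",   # literal string, not a variable
    "sasl.password": "Endpoint=sb://mynamespace.servicebus.windows.net/;"
                     "SharedAccessKeyName=RootManageSharedAccessKey;"
                     "SharedAccessKey=<key>",
}
```

Everything else in the client (topic names, consumer groups, offsets) carries over unchanged, per the mapping table above.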

⚠️ Exam Caveat: Kafka surface is available from Standard tier and above — not Basic.


Geo-Disaster Recovery

Event Hubs Geo-DR pairs a primary and secondary namespace. Like Service Bus Geo-DR, it replicates metadata only (Event Hub definitions, consumer groups) — not messages in flight. On failover, the secondary becomes the active namespace via an alias FQDN.

Active Geo-Replication (message-level): Must be implemented at the application layer using dual-write producers.
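
What "application-layer dual-write" means can be sketched as follows (the in-memory lists stand in for producer clients bound to two regional namespaces; in real code each `append` would be a `send` to that region):

```python
# Hedged sketch of dual-write: since Geo-DR replicates metadata only,
# the producer itself sends every event to both regions.
primary, secondary = [], []   # stand-ins for two regional producer clients

def dual_write(event: str) -> None:
    failures = []
    for region in (primary, secondary):
        try:
            region.append(event)          # real code: send to that namespace
        except Exception as exc:
            failures.append(exc)
    if len(failures) == 2:                # fail only if BOTH regions rejected
        raise RuntimeError("event lost in both regions")

dual_write("telemetry-1")
assert primary == secondary == ["telemetry-1"]
```

Deduplication on the consumer side is the usual price of this pattern, since a partial failure can leave an event in only one region.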


Schema Registry

Event Hubs includes a built-in Schema Registry (Standard and above) for managing and enforcing Avro, JSON Schema, or Protobuf schemas on event producers and consumers. This enables schema evolution without breaking consumers.


Security

| Mechanism | Notes |
|---|---|
| SAS tokens | Namespace or entity-level; Send, Listen, Manage claims |
| Microsoft Entra ID (RBAC) | Preferred; assign Azure Event Hubs Data Sender/Receiver roles |
| Managed Identity | Allows producers/consumers to authenticate without secrets |
| Private Endpoints | Premium and Dedicated only |
| Customer-managed keys | Premium and Dedicated only |
| IP filtering | Available on all tiers |
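
A SAS token is an HMAC-SHA256 signature over the URL-encoded resource URI plus an expiry timestamp, per the documented shared-access-signature scheme. A minimal sketch using only the standard library (the namespace URI and key are placeholders):

```python
import base64
import hashlib
import hmac
import time
import urllib.parse

def make_sas_token(resource_uri: str, key_name: str, key: str,
                   ttl_s: int = 3600) -> str:
    """Build an Event Hubs SAS token: sign the URL-encoded resource URI,
    a newline, and the expiry with the shared access key (HMAC-SHA256)."""
    expiry = str(int(time.time()) + ttl_s)
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    to_sign = (encoded_uri + "\n" + expiry).encode()
    sig = base64.b64encode(
        hmac.new(key.encode(), to_sign, hashlib.sha256).digest()
    ).decode()
    return (
        f"SharedAccessSignature sr={encoded_uri}"
        f"&sig={urllib.parse.quote_plus(sig)}&se={expiry}&skn={key_name}"
    )

# Placeholder namespace and key, for illustration only:
token = make_sas_token("sb://mynamespace.servicebus.windows.net/myhub",
                       "RootManageSharedAccessKey", "<key>")
```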

Common Exam Scenarios

| Scenario | Answer |
|---|---|
| Ingest 1 million IoT events/second | Event Hubs (Premium or Dedicated) |
| Replay events after a consumer bug | Event Hubs (unique to this service) |
| Stream telemetry into Azure Stream Analytics | Event Hubs as the input source |
| Replace self-managed Kafka cluster | Event Hubs (Kafka-compatible endpoint, Standard+) |
| Archive streaming data to ADLS Gen2 | Event Hubs Capture |
| Need > 7-day event retention | Event Hubs Premium or Dedicated (up to 90 days) |
| Isolate Event Hubs from public internet | Event Hubs Premium + Private Endpoint |
| Multiple independent analytics pipelines on same stream | Event Hubs Consumer Groups |
| Guaranteed event ordering across all partitions | NOT possible in Event Hubs — ordering is per-partition only |