IoT & Data Engineering

MQTT-Based IoT Data Pipeline

Case study: High-throughput MQTT data pipeline processing real-time telemetry from thousands of connected devices. Built for reliability, low latency, and horizontal scale.

Connected devices: thousands
Telemetry ingestion: low-latency
Pipeline uptime: 99.9%
High-throughput MQTT broker
Real-time telemetry processing
Time-series storage and querying
MQTT, Apache Kafka, Python, TimescaleDB, Docker, AWS
[Figure: MQTT IoT data pipeline architecture]

Project Overview

The client shipped hardware to end customers and needed visibility into how those devices were performing in the field. Their existing telemetry setup was a single MQTT broker writing directly to a relational database — fine at a few hundred devices, but unstable under the burst traffic that came with fleet growth. Devices reconnecting after a network outage would flood the broker simultaneously, the database write queue would back up, and telemetry gaps would appear in monitoring dashboards at exactly the moments they were most needed.

We redesigned the telemetry pipeline from broker to storage, introducing a durable message bus between ingestion and persistence and replacing the relational database with a purpose-built time-series store. The result handles burst traffic without shedding messages, scales horizontally as the fleet grows, and queries efficiently against months of high-frequency device data.

Technical Architecture

The pipeline is designed around the principle that no single component should be a bottleneck or single point of failure:

  • MQTT Broker Cluster: Clustered EMQX deployment with QoS 1 (at-least-once) delivery; devices connect with persistent sessions so messages queued during outages are delivered when the device reconnects, not lost
  • Kafka Bridge: MQTT-to-Kafka bridge decouples ingestion rate from processing rate; the broker acknowledges messages immediately and Kafka provides the durable buffer that absorbs burst traffic without back-pressure to devices
  • Stream Processor: Python consumers read from Kafka, validate and enrich telemetry records (device metadata, unit conversions, anomaly flags), and write to storage — horizontally scalable by adding consumer instances
  • TimescaleDB: PostgreSQL extension optimized for time-series data; continuous aggregates pre-compute common queries (hourly averages, daily min/max) so dashboard queries stay fast as the dataset grows
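
The validate-and-enrich step performed by the stream processor can be sketched in plain Python. This is a minimal illustration, not the client's code: the record fields (`device_id`, `ts`, `temp_f`), the `DEVICE_METADATA` lookup, and the `MAX_TEMP_C` threshold are all assumptions made for the example, and the Kafka read and TimescaleDB write are left out so the logic stays self-contained.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical device registry; a real deployment would load this from a metadata service.
DEVICE_METADATA = {"dev-001": {"model": "TX-100", "site": "plant-a"}}

# Hypothetical anomaly threshold, in degrees Celsius.
MAX_TEMP_C = 85.0


@dataclass
class EnrichedRecord:
    device_id: str
    ts: float
    temp_c: float
    model: str
    site: str
    anomaly: bool


def fahrenheit_to_celsius(temp_f: float) -> float:
    return (temp_f - 32.0) * 5.0 / 9.0


def process(raw: dict) -> Optional[EnrichedRecord]:
    """Validate a raw telemetry dict, convert units, attach metadata, flag anomalies.

    Returns None for records that fail validation so the caller can route
    them to a dead-letter destination instead of storage.
    """
    try:
        device_id = str(raw["device_id"])
        ts = float(raw["ts"])
        temp_f = float(raw["temp_f"])
    except (KeyError, TypeError, ValueError):
        return None  # missing or malformed field

    meta = DEVICE_METADATA.get(device_id)
    if meta is None:
        return None  # unknown device

    temp_c = round(fahrenheit_to_celsius(temp_f), 2)
    return EnrichedRecord(
        device_id=device_id,
        ts=ts,
        temp_c=temp_c,
        model=meta["model"],
        site=meta["site"],
        anomaly=temp_c > MAX_TEMP_C,
    )
```

Because `process` is a pure function of one record, any number of consumer instances can run it in parallel, which is what makes the stage horizontally scalable.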

Scale and Reliability

The redesigned pipeline removed the failure modes that caused telemetry gaps and gave the team the operational headroom to grow the fleet without rearchitecting again:

  • Burst absorption: Devices reconnecting after a network event now flood Kafka, not the database — the broker stays healthy and message delivery guarantees hold regardless of downstream processing speed
  • No data loss on failure: If a stream processor instance fails, Kafka consumer group rebalancing redistributes its partitions to healthy instances; messages are replayed from the last committed offset, so no telemetry is skipped (a record may be processed twice, which idempotent writes absorb)
  • Efficient historical queries: TimescaleDB’s chunk-based storage and continuous aggregates keep query times consistent even as months of per-device per-second data accumulate
  • Operational visibility: Kafka consumer lag metrics give the team a leading indicator of pipeline health — a growing lag signals a problem before it manifests as missing data in a dashboard
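
The replay and lag behaviour described above can be illustrated with a small in-memory model. This is a sketch under stated assumptions rather than the production consumer: `PartitionLog` is a hypothetical stand-in for a single Kafka partition, the `store` dict stands in for the database, and the `(device_id, ts)` upsert key is an assumption chosen so that replayed messages overwrite rather than duplicate.

```python
class PartitionLog:
    """In-memory stand-in for one Kafka partition (illustrative only)."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # offset the consumer group resumes from after a restart

    @property
    def lag(self):
        # Consumer lag: messages written to the log but not yet committed as processed.
        return len(self.messages) - self.committed


def consume(log, store, fail_at=None):
    """Process from the last committed offset, committing only after each write."""
    for offset in range(log.committed, len(log.messages)):
        msg = log.messages[offset]
        # Idempotent upsert: a replayed message overwrites the same key.
        store[(msg["device_id"], msg["ts"])] = msg["value"]
        if offset == fail_at:
            raise RuntimeError("simulated crash after write, before commit")
        log.committed = offset + 1


log = PartitionLog({"device_id": "d1", "ts": t, "value": t * 1.5} for t in range(5))
store = {}

try:
    consume(log, store, fail_at=2)  # instance dies mid-stream
except RuntimeError:
    pass
lag_after_crash = log.lag           # uncommitted messages remain, so lag is non-zero

consume(log, store)                 # a healthy instance resumes from the committed offset
```

The message at the failed offset is written twice, once before the crash and once on replay, but the store ends up with exactly one row per reading. A lag that stays above zero is the early-warning signal: data is waiting in the log, not missing.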

Challenges

  • Handling burst traffic from large device fleets
  • Ensuring message delivery guarantees at scale
  • Storing and querying high-cardinality time-series data

Solutions

  • Deployed clustered MQTT broker with QoS tiering
  • Bridged MQTT to Kafka for durable stream processing
  • Used TimescaleDB for compressed time-series storage

Results & Impact

Stable ingestion across full device fleet

Sub-second telemetry visibility in dashboards

Horizontally scalable pipeline with no single point of failure
