IoT & Data Engineering

MQTT-Based IoT Data Pipeline

Case study: High-throughput MQTT data pipeline processing real-time telemetry from thousands of connected devices. Built for reliability, low latency, and horizontal scale.

Thousands

Connected Devices

Low-latency

Telemetry Ingestion

99.9%

Pipeline Uptime

High-throughput MQTT broker

Real-time telemetry processing

Time-series storage and querying

MQTTApache KafkaPythonTimescaleDBDockerAWS

Project Overview

The client shipped hardware to end customers and needed visibility into how those devices were performing in the field. Their existing telemetry setup was a single MQTT broker writing directly to a relational database — fine at a few hundred devices, but unstable under the burst traffic that came with fleet growth. Devices reconnecting after a network outage would flood the broker simultaneously, the database write queue would back up, and telemetry gaps would appear in monitoring dashboards at exactly the moments they were most needed.

We redesigned the telemetry pipeline from broker to storage, introducing a durable message bus between ingestion and persistence and replacing the relational database with a purpose-built time-series store. The result handles burst traffic without shedding messages, scales horizontally as the fleet grows, and queries efficiently against months of high-frequency device data.

Technical Architecture

The pipeline is designed around the principle that no single component should be a bottleneck or single point of failure:

MQTT Broker Cluster: Clustered EMQX deployment with QoS 1 message delivery guarantees — devices connect with persistent sessions so messages queued during outages are delivered when the device reconnects, not lost
Kafka Bridge: MQTT-to-Kafka bridge decouples ingestion rate from processing rate; the broker acknowledges messages immediately and Kafka provides the durable buffer that absorbs burst traffic without back-pressure to devices
Stream Processor: Python consumers read from Kafka, validate and enrich telemetry records (device metadata, unit conversions, anomaly flags), and write to storage — horizontally scalable by adding consumer instances
TimescaleDB: PostgreSQL extension optimized for time-series data; continuous aggregations pre-compute common queries (hourly averages, daily min/max) so dashboard queries stay fast as the dataset grows

Scale and Reliability

The redesigned pipeline removed the failure modes that caused telemetry gaps and gave the team the operational headroom to grow the fleet without rearchitecting again:

Burst absorption: Devices reconnecting after a network event now flood Kafka, not the database — the broker stays healthy and message delivery guarantees hold regardless of downstream processing speed
No data loss on failure: If a stream processor instance fails, Kafka consumer group rebalancing redistributes its partitions to healthy instances; messages are replayed from last committed offset with no gaps
Efficient historical queries: TimescaleDB’s chunk-based storage and continuous aggregations keep query times consistent even as months of per-device per-second data accumulates
Operational visibility: Kafka consumer lag metrics give the team a leading indicator of pipeline health — a growing lag signals a problem before it manifests as missing data in a dashboard

Challenges

Handling burst traffic from large device fleets
Ensuring message delivery guarantees at scale
Storing and querying high-cardinality time-series data

Solutions

Deployed clustered MQTT broker with QoS tiering
Bridged MQTT to Kafka for durable stream processing
Used TimescaleDB for compressed time-series storage

Results & impact

Stable ingestion across full device fleet

Sub-second telemetry visibility in dashboards

Horizontally scalable pipeline with no single point of failure

Want similar clarity in your systems?

Start with a Systems Assessment