AI and IoT for air pollution: monitoring, prediction, and control

Air pollution is a measurement problem before it is a policy problem. You cannot reduce what you cannot see at the resolution that matters — block by block, hour by hour, source by source. That is the gap a combined AI and IoT system actually closes. A dense mesh of low-cost sensors produces continuous readings of particulate matter, nitrogen dioxide, ozone, and carbon monoxide; machine-learning models then turn that stream into something operators can act on: source attribution, short-horizon forecasts, and exposure maps. The pairing is not novel, but the way it scales has changed what cities and industrial operators can do in practice.

We have worked on enough environmental and infrastructure monitoring projects to know where this falls apart, and where it does not. The interesting questions are not about whether AI “can” help — they are about which signals are reliable, what the models can and cannot infer, and where human judgment still has to sit in the loop.

What does an AI and IoT air quality system actually do?

At the hardware layer, an IoT air quality system is a fleet of networked sensors. Each node typically carries a PMS5003 or SPS30 particulate sensor, an electrochemical cell for NO₂ or O₃, a temperature and humidity probe, and a microcontroller that timestamps and transmits readings over LoRaWAN, NB-IoT, or Wi-Fi. The data lands in a time-series database — InfluxDB, TimescaleDB, or a managed equivalent — and downstream services consume it.

The AI layer does four jobs that raw telemetry cannot:

Calibration. Low-cost sensors drift. ML models trained against reference-grade monitors correct that drift in software, which is what makes a dense low-cost network operationally useful at all.
Source attribution. Given a spike at sensor 47 at 18:30, was it traffic, a nearby construction site, or transported smoke from a fire 40 km upwind? Gradient-boosted models trained on wind, traffic, and historical patterns assign probabilities to each.
Short-horizon forecasting. Sequence models (LSTM or temporal convolutional networks) predict pollution levels 1–24 hours ahead, with usable accuracy in the 1–6 hour band.
Exposure mapping. Spatial interpolation — kriging, graph neural networks, or learned land-use regression — fills the gaps between sensors so the output is a continuous field, not 200 isolated dots.
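
To make the exposure-mapping idea concrete, here is a minimal sketch using inverse-distance weighting. This is a deliberately simpler interpolator than the kriging, graph neural network, or land-use regression approaches named above, and it stands in for them only to show how a value between sensors can be estimated. The sensor coordinates, readings, and the function name idw_estimate are illustrative assumptions.

```python
import math

def idw_estimate(sensors, x, y, power=2.0):
    """Estimate a pollutant value at (x, y) from (sx, sy, value) tuples
    using inverse-distance weighting (a stand-in for kriging or a GNN)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in sensors:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # query point sits exactly on a sensor
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no sensors supplied")
    return weighted_sum / weight_total

# Hypothetical PM2.5 readings (µg/m³) at positions given in km.
sensors = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
midpoint = idw_estimate(sensors, 1.0, 0.0)  # equidistant, so the plain average
```

Inverse-distance weighting ignores wind, street layout, and land use, which is exactly the information the learned spatial models are meant to add.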

The structured answer that engineering teams actually need looks like this:

Layer	Component	What it produces
Sensing	IoT nodes (PM, NO₂, O₃, met)	Raw 1-minute telemetry
Transport	LoRaWAN / NB-IoT / cellular	Time-stamped messages
Storage	Time-series DB + object store	Queryable history
Calibration	Regression / Gaussian process models	Drift-corrected readings
Inference	Gradient boosting, LSTM, GNN	Source attribution, forecasts, maps
Action	Dashboards, alerts, control loops	Traffic signals, advisories, HVAC setpoints

Each layer fails in its own way. Sensor drift is the most common silent failure; transport gaps the most common loud one; model staleness the most common slow one.
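
Transport gaps are the loud failure because they are easy to detect from timestamps alone. The sketch below assumes the 1-minute reporting interval from the table; the tolerance factor, the function name find_gaps, and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected_interval=timedelta(minutes=1), tolerance=2.0):
    """Return (gap_start, gap_end) pairs where consecutive readings are
    further apart than tolerance * expected_interval."""
    limit = expected_interval * tolerance
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]

# Hypothetical node that goes silent between minute 2 and minute 10.
start = datetime(2024, 1, 1, 18, 0)
timestamps = [start + timedelta(minutes=m) for m in (0, 1, 2, 10, 11)]
gaps = find_gaps(timestamps)
```

Drift and model staleness need reference data or held-out observations to detect, which is why they stay silent for longer.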

Why low-cost sensor networks change the picture

For decades, official air-quality data came from a handful of reference-grade stations — accurate, expensive, and sparse. A city of ten million might have twenty. That works for compliance reporting but tells you almost nothing about exposure on a specific street, near a specific school, at school-pickup time.

IoT changed the economics. A reference monitor costs tens of thousands of dollars; a calibrated low-cost node costs a few hundred. That ratio is the entire story. It is now feasible to deploy hundreds or thousands of nodes across a metropolitan area, and the missing accuracy is recovered statistically: each cheap sensor is co-located with or cross-referenced against a reference station, and a learned correction maps its readings into the reference scale.
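
As a rough illustration of a learned correction, the sketch below fits a straight line that maps a low-cost node's readings onto a co-located reference monitor. It is a minimal example with made-up numbers; production calibration typically also conditions on humidity and temperature, and may use the Gaussian process models listed in the table. The function names are assumptions.

```python
def fit_linear_correction(raw, reference):
    """Ordinary least squares fit of reference ~ slope * raw + offset."""
    n = len(raw)
    if n != len(reference) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_raw = sum(raw) / n
    mean_ref = sum(reference) / n
    sxx = sum((x - mean_raw) ** 2 for x in raw)
    if sxx == 0.0:
        raise ValueError("raw readings have no variance")
    sxy = sum((x - mean_raw) * (y - mean_ref) for x, y in zip(raw, reference))
    slope = sxy / sxx
    offset = mean_ref - slope * mean_raw
    return slope, offset

def apply_correction(value, slope, offset):
    """Map a raw low-cost reading onto the reference scale."""
    return slope * value + offset

# Hypothetical co-location data (µg/m³): the cheap sensor over-reads.
raw = [10.0, 20.0, 30.0, 40.0]
reference = [8.0, 14.0, 20.0, 26.0]
slope, offset = fit_linear_correction(raw, reference)
corrected = apply_correction(50.0, slope, offset)
```

Because sensors drift, a fit like this has to be repeated periodically against fresh reference data.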

Trading per-sensor accuracy for network density is a pattern we have observed across deployments rather than the result of a single benchmark. The trade-off is real: low-cost sensors are noisier and degrade faster. But the gain in spatial resolution more than compensates for most operational use cases (advisories, traffic management, hotspot identification). For epidemiological studies or legal compliance, reference instruments still anchor the network.

What does AI add that simple thresholds cannot?

A naive system would alert when PM2.5 crosses 35 µg/m³. That works, and it is where most early deployments stop. The reason to add machine learning is not to detect pollution — sensors already do that — but to answer the questions a threshold cannot:

What is causing this? Source attribution turns a spike into an actionable signal. A traffic-caused spike calls for signal-timing changes; a construction-caused spike calls for an inspector visit; a wildfire-smoke spike calls for indoor-air advisories.
What happens next? A six-hour forecast lets schools cancel outdoor activity before the air gets bad, not after. The forecast horizon that matters depends on the decision: hospitals plan respiratory staffing days ahead, while traffic centres respond within minutes.
Where, exactly? Spatial models turn a sparse sensor field into a continuous exposure surface. A parent does not care about the average reading across the borough; they care about the route to school.
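
For contrast, the naive threshold baseline is only a few lines. This sketch uses the 35 µg/m³ figure quoted above; the sensor IDs and readings are made up. Note what it returns: which sensors are over the line, and nothing about cause, trajectory, or the space between sensors.

```python
PM25_ALERT_THRESHOLD = 35.0  # µg/m³, the threshold quoted in the text

def threshold_alerts(readings, threshold=PM25_ALERT_THRESHOLD):
    """Return the (sensor_id, value) pairs that exceed the threshold."""
    return [(sensor_id, value) for sensor_id, value in readings if value > threshold]

# Hypothetical snapshot of three sensors.
readings = [("sensor-12", 18.4), ("sensor-47", 61.0), ("sensor-03", 35.0)]
alerts = threshold_alerts(readings)
```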

Models we have used in similar geospatial-inference work — gradient boosting for tabular features, temporal convolutional networks for sequences, and graph neural networks for spatially-correlated sensors — all sit on top of standard tooling (PyTorch, scikit-learn, XGBoost, ONNX for deployment). None of this requires exotic infrastructure. What it requires is clean data, good co-location with reference stations, and ongoing model maintenance.

Edge inference versus cloud inference

The architectural question that comes up early is whether inference runs at the edge — on the sensor node or a nearby gateway — or in the cloud. The honest answer is that both are needed, and the split depends on latency and bandwidth.

Calibration and anomaly flagging belong at the edge. They are cheap to compute, they reduce the volume of garbage data flowing upstream, and they keep the system responsive when connectivity drops. We have seen networks where naive cloud-only designs collapsed under their own telemetry volume; an edge filter that drops obviously bad readings before transmission solved it.
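
An edge filter of this kind can be very small. The sketch below drops missing, NaN, and out-of-range PM2.5 values before transmission; the plausibility limits and the function name are assumptions that would need to be set per sensor model.

```python
import math

def drop_bad_readings(values, lo=0.0, hi=1000.0):
    """Keep only PM2.5 readings (µg/m³) that are present, finite,
    and within a plausible range. The limits are assumed, not vendor specs."""
    good = []
    for value in values:
        if value is None or math.isnan(value):
            continue  # missing sample or failed sensor read
        if lo <= value <= hi:
            good.append(value)
    return good

# Hypothetical minute of samples including obvious garbage.
samples = [12.5, -4.0, float("nan"), None, 5000.0, 48.0]
clean = drop_bad_readings(samples)
```

A filter like this only removes readings that are impossible; subtler problems such as slow drift still have to be caught by calibration against reference data.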

Forecasting and spatial inference belong in the cloud. They need the full picture across all sensors, wind data from external feeds, and traffic data from city systems. The output flows back down to dashboards and control systems.

Frameworks like TensorFlow Lite Micro make edge inference practical on the ARM Cortex-M class microcontrollers used in sensor nodes, while ONNX Runtime is a better fit for the Linux-class gateways that sit above them. The harder problem is power: a battery-powered node running continuous inference burns through its capacity in weeks. Most production designs run inference in short scheduled bursts, not continuously.
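
The power argument is easy to check with back-of-envelope arithmetic. Every figure below (battery capacity, active and sleep currents, burst length) is a hypothetical placeholder rather than a measurement from a real node, and the calculation ignores battery self-discharge, radio transmissions, and the particulate sensor's fan.

```python
def battery_life_days(capacity_mah, active_ma, sleep_ma, active_s_per_hour):
    """Estimate battery life from a simple duty-cycle model."""
    if not 0 <= active_s_per_hour <= 3600:
        raise ValueError("active_s_per_hour must be between 0 and 3600")
    duty = active_s_per_hour / 3600.0
    average_ma = active_ma * duty + sleep_ma * (1.0 - duty)
    return capacity_mah / average_ma / 24.0

# Assumed node: 10,000 mAh battery, 20 mA while inferring, 0.05 mA asleep.
continuous = battery_life_days(10_000, 20.0, 0.05, 3600)  # always on: about three weeks
bursts = battery_life_days(10_000, 20.0, 0.05, 36)        # 36 s of inference per hour
```

Under these assumptions the burst schedule stretches battery life from weeks to years on paper, which is why scheduled inference is the usual design.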

Worked examples

A few deployments illustrate the shape of working systems:

London runs the Breathe London network, with hundreds of low-cost sensors complementing the official LAQN reference network. AI models fuse the two streams and feed school- and street-level advisories.
Beijing has integrated dense sensor coverage with traffic and industrial-emission data. The output drives same-day restrictions during high-pollution episodes.
California deploys mobile and fixed sensors to track wildfire smoke, with forecast models that warn communities hours before the plume arrives. The PurpleAir network is the most visible community-operated example.
Singapore ties air-quality sensing into its smart-traffic system, so signal timing and route recommendations respond to measured exposure, not just congestion.

The common thread is not a single algorithm. It is the discipline of running calibrated sensors continuously, attributing sources rigorously, and connecting the output to a decision someone is actually empowered to make.

What this still cannot do

Two limits are worth naming. First, sensor networks measure what they measure — they will miss pollutants they were not designed for. A network optimised for PM2.5 will not catch a benzene leak. Pollutant coverage is a design choice, not a model property.

Second, prediction degrades fast beyond about six hours for fine-grained spatial forecasts. Day-ahead forecasts work for regional averages; street-level forecasts at 24 hours are not reliable, and presenting them as such erodes trust in the whole system. Operators who run these systems well are explicit about the uncertainty band on every forecast.
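
One simple way to attach an uncertainty band is to reuse the errors the model made in the past at the same horizon. The sketch below is an assumption about how that could be done, not a description of any particular operator's method; the residuals are invented to show the band widening with horizon.

```python
import statistics

def forecast_band(point_forecast, past_errors):
    """Return (low, high) covering roughly the central 80% of past errors.

    past_errors are signed residuals (observed minus forecast) collected
    for the same forecast horizon.
    """
    if len(past_errors) < 2:
        raise ValueError("need at least two past errors")
    deciles = statistics.quantiles(past_errors, n=10)  # nine cut points
    low = max(0.0, point_forecast + deciles[0])  # concentrations cannot be negative
    high = point_forecast + deciles[-1]
    return low, high

# Hypothetical residuals (µg/m³) at two horizons.
errors_1h = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
errors_12h = [-12.0, -8.0, -5.0, -1.0, 2.0, 6.0, 9.0, 14.0]
band_1h = forecast_band(30.0, errors_1h)
band_12h = forecast_band(30.0, errors_12h)
```

Publishing the band alongside the point forecast makes the loss of skill at longer horizons visible to the person making the decision.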

The role of human judgment does not disappear. Models surface candidate sources and forecast ranges; people decide whether to close a road, issue an advisory, or open an investigation. The system is a sensor and an analyst, not a decision-maker.

For teams building toward this kind of deployment, the engineering questions — sensor selection, calibration protocols, edge-cloud split, model maintenance — matter more than the AI marketing. We work with operators who need the system to be reliable five years from deployment, not impressive on launch day. Sustained calibration discipline is what separates a network that keeps producing decision-grade data from one that quietly degrades into noise.

If you are scoping a sensing-and-inference project of this shape — environmental, industrial, or logistics — TechnoLynx can help with the architecture and the model layer. The hardest part is rarely the AI; it is the data plumbing that keeps the AI honest.
