Big data as a tool to improve air quality

Jamiil Touré Ali

Air pollution receives little discussion, yet it affects our health and sustainable development around the world. To date, the Air Quality Index (AQI) is the metric used to report air quality. It is a unitless measure of daily air quality that tracks air pollutants such as ground-level ozone, particulate matter (PM2.5 and PM10), carbon monoxide, sulfur dioxide, and nitrogen dioxide, whose concentrations are measured in µg/m3. Depending on the country, region, or locality, governments set national air quality standards to protect public health; in the United States, for example, they are set by the Environmental Protection Agency (EPA). Expressed on a scale from 0 to 500 or as a color code, the AQI describes air quality and its health effects.
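
To make the 0 to 500 scale concrete, here is a minimal sketch of how a PM2.5 concentration can be mapped to an index value by linear interpolation between breakpoints. The breakpoint table is an assumption for illustration (it follows the US EPA PM2.5 table in use before its 2024 revision), and the function names are hypothetical; other countries use different tables and scales.

```python
# Illustrative PM2.5 breakpoints (24-hour mean, in µg/m3) mapped to AQI ranges.
# Assumption: values follow the US EPA table in use before its 2024 revision;
# check the current official table before relying on them.
PM25_BREAKPOINTS = [
    # (C_low, C_high, I_low, I_high)
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def sub_index(concentration, breakpoints=PM25_BREAKPOINTS):
    """Linearly interpolate a concentration onto the 0-500 index scale."""
    # Round to one decimal so the value falls inside a band rather than
    # in the gap between two bands (a simplification for this sketch).
    c = round(concentration, 1)
    for c_low, c_high, i_low, i_high in breakpoints:
        if c_low <= c <= c_high:
            slope = (i_high - i_low) / (c_high - c_low)
            return round(slope * (c - c_low) + i_low)
    raise ValueError(f"concentration {concentration} is outside the table")


def overall_aqi(sub_indices):
    """The reported AQI is the highest sub-index among the pollutants."""
    return max(sub_indices)
```

Each pollutant gets its own sub-index from its own breakpoint table, and the value reported to the public is the largest of them, which is why a single pollutant can drive the AQI on a given day.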

According to the World Health Organization (WHO), in 2018, 90% of people worldwide breathed polluted air, despite the actions countries were taking. This is an alarming health situation and a cause of premature deaths. IQAir reported that 2020, compared with 2019, saw an unprecedented improvement in air quality across major cities such as Beijing (-11%), Chicago (-13%), Delhi (-15%), London (-16%), Paris (-17%) and Seoul (-16%), as COVID-19 lockdowns and behavioral changes lowered global particulate matter (PM2.5) levels (IQAir, 2020). However, a resurgence in air pollution is expected in 2021 as human activity resumes. So, how can we improve the quality of the air we breathe?

Big data for cleaner air

Computing the AQI relies on huge amounts of data collected by sensors at monitoring stations, which measure air pollutant concentrations hourly. For instance, data compiled from various monitoring stations over 3 years represents approximately 15 million observations, which are difficult to process using conventional methods (Knowledge Discovery and Data Mining, 2015). Thanks to big data analytics, we can process historical and continuous sensor data to compute the AQI per region or locality within given time slots, as well as forecast its future values using machine learning. The resulting information is then visualized and serves as a tool for prevention. For instance, in Delhi, India, vehicle exhaust is one of the main causes of high PM2.5 concentrations in the air (Centre for Policy Research, 2017). As a result, decision-makers could take action on public transport usage to reduce air pollution, hence contributing to the attainment of SDG target 3.9: By 2030, substantially reduce the number of deaths and illnesses from hazardous chemicals and air, water, and soil pollution and contamination.
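
As a rough illustration of that processing step, the sketch below averages hourly readings into one value per station and day, then produces a naive forecast. The record layout and function names are hypothetical, and the moving average is only a simple baseline standing in for the machine learning models mentioned above; a real pipeline at the scale of millions of observations would run on a distributed framework rather than in-memory Python.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(readings):
    """Average hourly readings into one value per (station, date).

    `readings` is an iterable of (station_id, ISO timestamp, concentration)
    tuples; this record layout is an assumption made for the example.
    """
    buckets = defaultdict(list)
    for station, timestamp, value in readings:
        day = datetime.fromisoformat(timestamp).date()
        buckets[(station, day)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def moving_average_forecast(history, window=3):
    """Predict the next value as the mean of the last `window` values.

    A deliberately naive baseline, not a trained forecasting model.
    """
    if not history:
        raise ValueError("history must not be empty")
    return mean(history[-window:])
```

For example, calling `daily_means` on a list of hourly PM2.5 tuples for a station returns a dictionary keyed by station and calendar day, and the sequence of daily values for one station can then be passed to `moving_average_forecast` to get a first estimate for the next day.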

Big data combined with the Internet of Things (IoT) can also help improve air quality through the smart city concept. A smart city integrates information and communication technology with an IoT network to facilitate interaction between citizens and community infrastructure and to monitor changes in the city, such as the air quality index. For instance, in the UK, the city of Coventry uses IoT sensor data to send alerts advising pedestrians and drivers to consider alternative routes when entering an area with a high concentration of air pollution (EarthSense, 2019). Furthermore, IoT sensor data allows analysis in areas that monitoring stations do not cover, making it possible to determine air quality in a very specific area rather than relying on point-of-interest data computed at a monitoring station. This supports compliance with SDG target 11.6: By 2030, reduce the adverse per capita environmental impact of cities, including by paying special attention to air quality, municipal and other waste management.
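
The alerting idea can be sketched in a few lines. This is not how the Coventry system is implemented; the zone names, the threshold value, and the function names are all hypothetical and chosen only to show the principle of comparing local sensor readings against a limit.

```python
def zones_to_avoid(latest_readings, threshold=35.0):
    """Return the zones whose latest PM2.5 reading exceeds `threshold`.

    `latest_readings` maps a zone name to its most recent PM2.5 value in
    µg/m3. The default threshold is an arbitrary illustrative value.
    """
    return {zone for zone, value in latest_readings.items() if value > threshold}


def route_alert(route, latest_readings, threshold=35.0):
    """Build an alert message if `route` crosses a zone above the threshold."""
    flagged = zones_to_avoid(latest_readings, threshold)
    # Keep the order in which the zones appear along the route.
    polluted = [zone for zone in route if zone in flagged]
    if not polluted:
        return None
    return (
        "High air pollution in: "
        + ", ".join(polluted)
        + ". Consider an alternative route."
    )
```

Because each sensor reports for its own small zone, the check works at street level instead of depending on the reading from the nearest monitoring station.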

Final words

Big data can provide informed insights to policymakers through the analysis of historical or real-time air pollutant concentrations recorded at monitoring stations. Moreover, combining IoT sensor data with big data provides further local insights for monitoring changes in air quality at the citizen and community level. Nevertheless, big data will never replace human innovation or environmental responsibility; it can only provide the tools and knowledge we need to reduce pollution and improve air quality. Air quality should be a public health concern for all humankind, so that we can breathe cleaner air and meet SDG targets 3.9 and 11.6.


[1] fighting-air-pollution

[2] household-air-pollution-and-health

[3] world-health-organization-releases-new-global-air-pollution-data






[9] 9-out-of-10-people-worldwide-breathe-polluted-air-but-more-countries-are-taking-action 

[10] forecasting-fine-grained-air-quality-based-on-big-data


[12] zephyr-sensors-send-alerts-to-divert-traffic-from-pollution-hotspots-in-coventry


About the author

Jamiil Touré Ali

Design Engineer in Electricity and Master's graduate in Mathematical Sciences with good knowledge of programming languages such as Python and R for data science, machine learning, and deep learning tasks and projects.