Journalist Olaya Argüeso Pérez shares techniques, tools, data sources and practical tips on how to find out which companies are contributing to worsening air quality across Europe.
by Olaya Argüeso Pérez
I grew up in an industrial town where smoking chimneys were part of the landscape, but it was not until years later that I realised how this pollution might have affected my health as a child. So when I discovered an official database containing data on air pollution from large European factories, I couldn't help but search for my hometown. Sure enough, the factory where my father used to work was one of the biggest polluters, not only in the region, but in the entire country.
I searched the database out of personal curiosity, but anyone — be it a journalist, a researcher, an activist, or a concerned citizen — can use this resource to find out which companies are contributing to worsening air quality across Europe.
Background information to get you started
Let’s start with the good news: According to the European Environment Agency (EEA), air pollution emissions have declined in the last two decades, which has improved air quality in Europe and helped reduce the number of deaths related to it. However, air pollution remains the largest environmental health risk for citizens on the continent. In 2022, for example, exposure to a harmful pollutant known as fine particulate matter caused an estimated 239,000 premature deaths, according to the World Health Organization. To put this into perspective, in a single year, air pollution prevented hundreds of thousands of people from sharing more time with their loved ones and contributing to their communities.
The damaging effects of air pollution are not limited to premature deaths. It also reduces quality of life, makes people sick, and places a financial burden on the healthcare sector. Air pollution does not harm everyone equally: Lower socio-economic groups tend to be exposed to higher levels of air pollution, while the most vulnerable in our society (older people, children, and those with pre-existing health conditions) suffer the most.
In addition, air pollution damages European ecosystems.
While there are various sources of air pollutants, like road traffic or domestic heating, industry emissions take a heavy toll on nature and human health. The costs of air pollution caused by Europe’s largest industrial plants averaged between EUR 268 billion and EUR 428 billion per year, according to an EEA analysis. In 2021, these costs corresponded to about 2% of the EU’s Gross Domestic Product (GDP).
For more on the adverse effects of air pollution, read “How air pollution affects our health” from the EEA. You can also check how clean the air you’re breathing right now is by accessing the European Air Quality Index here: https://www.eea.europa.eu/en/analysis/maps-and-charts/index.
How is pollution controlled?
These high levels of air pollution might be even higher if not for industrial emissions controls set by the EU.
The main regulation is the Industrial Emissions Directive (IED, Directive 2010/75/EU), which requires facilities undertaking certain industrial activities to ask authorities in their EU Member State for a permit to operate.
Other relevant pollution control efforts include the European Pollutant Release and Transfer Register (E-PRTR) (regulated by Regulation (EC) No 166/2006 and Commission Implementing Decision 2019/1741), which requires industrial facilities in certain sectors to disclose their emissions. This register currently covers more than 60,000 industrial sites in Europe across the following sectors:
- energy
- production and processing of metals
- mineral industry
- chemical industry
- waste and waste water management
- paper and wood production and processing
- intensive livestock production and aquaculture
- animal and vegetable products from the food and beverage sector
- other activities
Facilities covered by the E-PRTR must disclose their air, water, and land emissions whenever they surpass certain thresholds, which vary by pollutant.
Beginning in 2028, an updated Regulation (Regulation 2024/1244), which sets additional reporting requirements for industrial facilities, will replace the E-PRTR Regulation. Until then, the E-PRTR Regulation will continue to apply.
The E-PRTR includes 91 pollutants, which belong to one of these seven groups:
- greenhouse gases
- other gases
- heavy metals
- pesticides
- chlorinated organic substances
- other organic substances
- inorganic substances
Substances released into the air account for almost three-quarters of all pollutants in the E-PRTR.
Where to find the data
The underlying data feeding the E-PRTR can be found on the European Industrial Emissions Portal, run by the European Environment Agency (EEA). This website contains extensive information about industrial pollution, the regulations that control it, as well as maps and tables. You can also learn more about the effects of each pollutant on human health and ecosystems. The wealth of data is a treasure trove for researchers eager to dive deeper.
The emissions portal is also the source for the dataset we are going to work with. However, it can be a bit tricky to get to it, so let me guide you through!
In the upper menu, there are several options to choose from. One of them is "Download", which is also available on the right of the landing page. Go to this section.
Screenshot from the European Industrial Emissions Portal – Source: EEA
Once there, below "Download datasets", you will find the "Industrial reporting dataset". Click on the "DOWNLOAD" button.
Screenshot from the European Industrial Emissions Portal – Source: EEA
Here it gets a bit trickier. At the bottom of this not-very-user-friendly page, a series of screenshots presents older versions of our dataset alongside the newest version. The most recent version is the only one not tagged as “superseded”. As I am writing this, it covers the years 2007-2023. Click on it.
Screenshot from the European Industrial Emissions Portal – Source: EEA
Now navigate to the middle of the page and click on "Direct download". We’re almost there!
You will be shown a list of folders. Choose "User friendly Excel file", or "User friendly .csv file" if you know how to use this format. Within this folder, click on "EEA_Industry_Dataset_EPRTR_Air_Releases.xlsx". It is a large file, so it may take some time to download.
What you will find in the dataset — or not
There are certain limitations to the data that you need to be aware of so as not to misunderstand it. Please make sure you read this before diving into the dataset.
1. Not every industrial facility is listed
As mentioned above, the E-PRTR only requires certain activities to report their emissions. They can be found in Annex I of the regulation. Additionally, the regulation establishes certain ‘capacity thresholds’ that limit which facilities need to report. In general, only the largest industrial facilities must disclose their emissions. The E-PRTR covers over 60,000 industrial sites across Europe.
2. Not every emission must be disclosed
Just as there are thresholds for facilities’ size, Annex II of the E-PRTR regulation establishes certain limits above which emissions must be reported. That means that, even if a pollutant is on the list, facilities do not need to disclose emissions below a certain threshold. This matters when analysing the data. More on that below.
3. The dataset is not perfect
This register and its data are better than nothing but they do contain mistakes. So if you find a figure that is too large or too small compared to a facility’s historical data, consider that a red flag and reach out to the EEA. They are helpful and appreciate it when errors are spotted.
Additionally, the EEA warns that there are gaps in the data, since some countries have not reported their information for certain years. You can find their full disclaimer here.
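If you prefer to screen for such red flags with a script, one possible approach is to compare each yearly figure with the facility’s own median. The sketch below is only a rough screening aid, not a method used by the EEA: the factor of 10 is an arbitrary assumption, and the sample figures are invented for a hypothetical facility.

```python
from statistics import median


def flag_suspicious(releases_by_year, factor=10.0):
    """Return the years whose value is far above or below the series median.

    releases_by_year maps a year to kilograms per year, or to None when
    nothing was reported. `factor` is an arbitrary screening assumption.
    """
    reported = {y: v for y, v in releases_by_year.items() if v is not None}
    if len(reported) < 3:
        return []  # too few data points to judge
    mid = median(reported.values())
    if mid <= 0:
        return []
    return sorted(
        year
        for year, value in reported.items()
        if value > mid * factor or value < mid / factor
    )


# Invented figures for a hypothetical facility (kg/year).
sample = {2019: 1200.0, 2020: 1150.0, 2021: None, 2022: 1_180_000.0, 2023: 1100.0}
print(flag_suspicious(sample))
```

A flagged year is only a prompt to ask questions. Whether the figure is a reporting error or a real event is something the EEA or the operator would have to confirm.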
4. This is emissions data, not air quality data
You might be tempted to use this data to hold companies accountable for air quality where you live, and you may be able to do so, but only indirectly. The emissions of air pollutants, as per Annex II of the E-PRTR regulation, are measured in kilograms per year (source: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32006R0166#anx_%C2%A0II / see columns 1a, 1b and 1c).
Air quality data, meanwhile, are usually calculated as concentrations in micrograms per cubic meter (source: https://airindex.eea.europa.eu/AQI/index.html / see “About the European Air Quality Index”. Under “Bands of concentrations and index levels”, there is a chart where the index shows “pollutant concentrations in µg/m3”.) Because the two are measured in different ways, emissions data and air quality data cannot be compared directly.
How to use the data
Now that you have downloaded the data following the instructions above, it’s time to start working with your dataset. I advise you to first get familiar with the spreadsheet before jumping to conclusions.
This article assumes that the reader can work with spreadsheets. If you need help with that, there are plenty of tutorials online that show how to carry out the basic operations that I describe here.
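Readers who prefer scripting to spreadsheets can get the same first overview with a few lines of Python. This is a minimal sketch using only the standard library; the file name in the comment and the sample columns are assumptions, since the exact layout of the .csv download is not described here.

```python
import csv
import io
from itertools import islice


def preview_csv(handle, n_rows=3):
    """Return the column names and the first n_rows of a CSV file."""
    reader = csv.DictReader(handle)
    rows = list(islice(reader, n_rows))
    return reader.fieldnames, rows


# Stand-in for a real file. With the download you would pass something like
# open("Air_Releases_National.csv", newline="", encoding="utf-8")
# (hypothetical file name).
demo = io.StringIO("Country,Pollutant,2022,2023\nExampleland,Ammonia,1200,\n")
columns, rows = preview_csv(demo)
print(columns)
print(rows[0])
```

Looking at the column names first makes it easier to adapt any later filter to the names the file really uses.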
The Excel (or .csv) file consists of four tabs: "Air_Releases_National", "Air_Releases_Sector", "Air_Releases_AnnexIActivity" and "Air_Releases_Facilities". As you may have guessed, they provide different degrees of detail, from aggregated country-wide data to more granular information on individual facilities within countries. Here’s a breakdown of the four tabs:
1. "Air_Releases_National"
This sheet presents aggregated data for each country, pollutant and year. It’s important to remember the database’s limitations mentioned above when analysing the data, especially that only emissions above a certain threshold must be disclosed. So, an empty cell does not necessarily mean that there were no emissions for that pollutant; it only shows that, if there were any, they remained below the reporting level.
Screenshot from the E-PRTR database for air releases – Source: EEA
Let’s see an example.
Between 2007 and 2023, Austria only declared emissions for the pollutant naphthalene for four years (source: Dataset "EEA_Industry_Dataset_EPRTR_Air_Releases.xlsx" / row 21 shows the emissions for naphthalene), when the releases to the air surpassed the limit of 100 kilograms per year for this substance (source: https://industry.eea.europa.eu/pollutants/pollutant-index / in the display list, choose naphthalene and go to the tab “Pollutant thresholds”; the limit for air is 100 kg/year). The figures for 2014, 2015, 2018 and 2019 show a wide range, from 101 to 1,140 kilograms per year. It may be worth investigating further, and that’s what we will do, using the next sheets.
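The point about empty cells matters just as much when you process the data with code: a blank should stay "unknown or below the threshold", never become zero. Here is a small sketch of that idea. The row is invented, and the assumption that each year has its own column may not match the real file.

```python
NAPHTHALENE_AIR_THRESHOLD_KG = 100.0  # reporting threshold cited in the text


def parse_release(cell):
    """Turn a spreadsheet cell into kg/year, keeping blanks as None (not 0)."""
    text = "" if cell is None else str(cell).strip().replace(",", "")
    return float(text) if text else None


def reported_years(row, year_columns):
    """Map each year that has a reported figure to its value in kg/year."""
    values = {year: parse_release(row.get(year)) for year in year_columns}
    return {year: v for year, v in values.items() if v is not None}


# Invented row; the year-named columns are an assumption about the layout.
row = {"Country": "Exampleland", "Pollutant": "Naphthalene",
       "2013": "", "2014": "150", "2015": "1,020", "2016": ""}
years = ["2013", "2014", "2015", "2016"]
found = reported_years(row, years)
print(found)
print(all(v > NAPHTHALENE_AIR_THRESHOLD_KG for v in found.values()))
```

Keeping blanks as `None` means that any average or total you calculate later covers only the years that were actually reported.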
2. "Air_Releases_Sector"
The data in this tab go a layer deeper, adding information about the sector. Here you can find out not only how much of each pollutant was emitted in every country for each year, but also which sector or sectors are responsible for those emissions. Every sector has a descriptive name (column ‘EPRTR_SectorName’) and an identifying code (column ‘EPRTR_SectorCode’).
Screenshot from the E-PRTR database for air releases – Source: EEA
Going back to our example, apply a filter on the "Country" column to select Austria and another one on the "Pollutant" column to select naphthalene: this operation will reveal that the mineral industry, whose E-PRTR code is "3", is the only sector responsible for naphthalene emissions in Austria.
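The same two filters can be written in a few lines of Python. This sketch uses the column names quoted above ("Country", "Pollutant", "EPRTR_SectorCode", "EPRTR_SectorName"); the sample rows, the country names and the exact spelling of the pollutant are invented for illustration.

```python
import csv
import io


def sectors_for(rows, country, pollutant):
    """Return the distinct (sector code, sector name) pairs for one country and pollutant."""
    return sorted({
        (row["EPRTR_SectorCode"], row["EPRTR_SectorName"])
        for row in rows
        if row["Country"] == country and row["Pollutant"] == pollutant
    })


# Invented sample that reuses the column names quoted in the text.
sample_csv = io.StringIO(
    "Country,Pollutant,EPRTR_SectorCode,EPRTR_SectorName\n"
    "Exampleland,Naphthalene,3,Mineral industry\n"
    "Exampleland,Ammonia,7,Intensive livestock production and aquaculture\n"
    "Otherland,Naphthalene,1,Energy\n"
)
sector_rows = list(csv.DictReader(sample_csv))
print(sectors_for(sector_rows, "Exampleland", "Naphthalene"))
```

If the result contains a single pair, one sector accounts for every reported release of that pollutant in that country.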
3. "Air_Releases_AnnexIActivity"
Each E-PRTR sector includes different activities, all of them described in Annex I of the E-PRTR regulation. This tab adds that information to the dataset, which allows for a more nuanced analysis. The extra column on this sheet provides information on the main activities within each sector. Unfortunately, it only does so in the form of a code ("EPRTRAnnexIMainActivity"), consisting of the sector’s number and a letter. To find out which code corresponds to which activity, check out Annex I mentioned above.
Screenshot from the E-PRTR database for air releases – Source: EEA
Applying filters will be useful here too: Repeat the operation above and the sheet will show that, again, just one activity, 3(c), accounts for all the naphthalene emissions in Austria. According to Annex I, that is the code for different types of cement and lime plants (source: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32006R0166#anx%C2%A0I / under column "No", 3 corresponds to "Mineral industry", and (c) contains "installations for the production of (i) Cement clinker in rotary kilns, (ii) Lime in rotary kilns, and (iii) Cement clinker or lime in other furnaces").
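Because the activity appears only as a code, a small lookup table can save repeated trips to Annex I. The sketch below splits a code such as 3(c) into its sector number and letter and looks up a description. The exact way codes are written in the file is an assumption, and the table holds only the one entry discussed here, so you would extend it from Annex I yourself.

```python
import re

# Only the entry quoted in the text, paraphrased; extend from Annex I as needed.
ANNEX_I_ACTIVITIES = {
    ("3", "c"): "Production of cement clinker or lime in rotary kilns or other furnaces",
}

# Accepts spellings such as "3(c)", "3c" or "3.c" (assumed variants).
CODE_PATTERN = re.compile(r"^\s*(\d+)\s*\.?\s*\(?([a-z])\)?", re.IGNORECASE)


def split_activity_code(code):
    """Split a code such as '3(c)' into (sector number, activity letter)."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"Unrecognised activity code: {code!r}")
    return match.group(1), match.group(2).lower()


def describe_activity(code):
    """Look up a description, or say so when the code is not in the table."""
    return ANNEX_I_ACTIVITIES.get(split_activity_code(code), "not in lookup table")


print(split_activity_code("3(c)"))
print(describe_activity("3(c)"))
```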
4. "Air_Releases_Facilities"
The final tab is the most interesting if you are investigating emissions from a certain facility or a certain pollutant. Here you can see how much of each pollutant each facility emitted every year between 2007 and 2023, along with the facility’s name, city and coordinates. The arcane number in the "FacilityInspireId" column is a unique identifier for each facility, which comes in handy for tracking a facility’s history: Since the database covers almost two decades, some facilities may have changed their names, but their IDs remain the same.
Screenshot from the E-PRTR database for air releases – Source: EEA
If we reapply the filters for Austria and naphthalene, there seems to be just one facility emitting that pollutant in the country: Leube Zement GmbH, in the city of Hallein. The figures reported by this factory show a very broad range, from 101 to 1,140 kilograms per year, so it may be worth checking why this happened. The fact that it reported for only a few years is also notable: What was going on during the rest of the time series? Maybe the factory was not operating, or there were events that forced the facility to emit more naphthalene than usual. Here’s where you can pick it up!
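For a facility-level check like this one, it can help to group rows by "FacilityInspireId" so that a name change does not split a facility’s history. The following sketch collects the names, the lowest and highest reported figures, and the years without a report. "FacilityName" and the year-named columns are hypothetical column names, and the rows are invented.

```python
from collections import defaultdict


def summarise_facilities(rows, year_columns):
    """Group rows by FacilityInspireId and summarise each facility's series."""
    grouped = defaultdict(lambda: {"names": set(), "values": {}})
    for row in rows:
        entry = grouped[row["FacilityInspireId"]]
        entry["names"].add(row["FacilityName"])  # hypothetical column name
        for year in year_columns:
            cell = (row.get(year) or "").strip()
            if cell:  # blank cells stay missing instead of becoming zero
                entry["values"][year] = float(cell)
    result = {}
    for facility_id, entry in grouped.items():
        values = entry["values"]
        result[facility_id] = {
            "names": sorted(entry["names"]),
            "min": min(values.values()) if values else None,
            "max": max(values.values()) if values else None,
            "missing_years": [y for y in year_columns if y not in values],
        }
    return result


# Invented rows: one facility that changed its name between reports.
facility_rows = [
    {"FacilityInspireId": "XX.EXAMPLE/0001", "FacilityName": "Old Cement Works",
     "2014": "120", "2015": "", "2016": ""},
    {"FacilityInspireId": "XX.EXAMPLE/0001", "FacilityName": "New Cement GmbH",
     "2014": "", "2015": "", "2016": "950"},
]
summary = summarise_facilities(facility_rows, ["2014", "2015", "2016"])
print(summary["XX.EXAMPLE/0001"])
```

The list of missing years is a starting point for questions, not an answer: a gap may mean emissions below the threshold, a closed plant, or a country that did not report.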
Use cases
The extent and variety of E-PRTR emissions data allow for different kinds of analyses. If you need inspiration, you can take a look at how some local reporters across Europe have used them to report on their communities.
Previous investigations
CORRECTIV.Europe, a network of local journalists who work together on European data-driven investigations, maintains a website where you can find examples of local stories based on this data. (Full disclosure: I led the project until September 2024.)
While the examples displayed there only show how journalists have used the data, they could also be helpful for other types of investigators. See, for example, this story on "Industrial air pollution costs Europe 265 billion euros in one year" or this one on "Hard To Breathe: Livestock Emissions And The Long Road Towards Sustainability".
Some of the network’s reporting combined the emissions data with another dataset mentioned above, one also produced by the EEA, that tabulates the costs associated with each facility, based on their emissions. According to the EEA’s estimations, these costs reached at least EUR 268 billion per year. While the dataset itself is not public, the EEA has produced some materials, like maps, where it displays the most harmful facilities in Europe and their costs in terms of damage to human health and the environment. This information may prove very useful for local communities and activists fighting for accountability of large polluters.
Ammonia: a pollutant worth checking
As the list of sectors above shows, the E-PRTR emissions dataset also covers livestock farming as an industrial activity, under "intensive livestock production and aquaculture". This may seem odd, but agriculture is responsible for over 90% of ammonia emissions in the EU.
The pungent gas does not directly harm human health, but it contributes to the formation of particulate matter, a pollutant with a significant potential to harm human health. While the EU has been able to reduce its emissions for almost all pollutants, ammonia levels have remained stubbornly stable for the last 20 years, although they should be reduced by 19% by 2030 (see https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:02016L2284-20240206#anx_II, table B).
Go to https://industry.eea.europa.eu/pollutants/pollutant-index and check the various types of pollutants from the drop-down list to see how they affect the environment.
Screenshot from https://industry.eea.europa.eu/pollutants/pollutant-index.
As with the overall data, there are also limitations for ammonia emissions. The E-PRTR Regulation excludes cattle farms from the obligation to report their emissions. Only large poultry and pig farms must disclose how much ammonia, among other pollutants, they release. Even under the updated Regulation 2024/1244, cattle have not been included in the reporting obligations (source: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32006R0166#anx_%C2%A0I / see list in Annex 1, number 7).
So if there is a large cattle farm close to where you live, you won’t find its emissions in this database.
Conclusion
At this point, I would understand if the E-PRTR register’s limitations and other caveats I mentioned have curbed your initial enthusiasm for looking into the data. You may even wonder whether it’s worth the effort. I honestly think it is. Despite all its pitfalls, the data remains a powerful tool for citizens, activists and journalists to hold companies accountable for the pollution they release into our environment. The use cases above are good examples of not letting the perfect be the enemy of the good.
I have tried to give you a sneak peek into what is possible, without getting too technical. However, some depth is required for the sake of accuracy. But now let me leave you with these parting words: there are many different ways to use the data to serve your interests, so use your imagination!
Credits and Licensing
- Author: Olaya Argüeso Pérez
- Editorial support & copy-editing: Tyler McBrien, Laura Ranca, Jasmine Erkan
- Illustration & design: Exposing the Invisible
CC BY-SA 4.0 - This article is published by Tactical Tech's Exposing the Invisible (ETI) project, and licensed under a Creative Commons Attribution-ShareAlike 4.0 International license
Contact us with questions or suggestions: eti-at-tacticaltech.org (GPG Key / fingerprint: BD30 C622 D030 FCF1 38EC C26D DD04 627E 1411 0C02).
About the author: Olaya Argüeso Pérez has extensive experience in cross-border investigations. She has been editor-in-chief at CORRECTIV, where she led its international investigations beginning in 2019. After more than a decade of reporting about the economy, business and finance at Spain's most important radio network (Cadena SER), Olaya expanded her expertise into data journalism and joined the Lede Program at Columbia University. She then joined CORRECTIV as a reporter and participated in cross-border investigations like The CumEx Files and Grand Theft Europe, which exposed multi-billion tax frauds happening all over Europe. Between 2022 and 2024, she led CORRECTIV.Europe, a pioneering network of European local journalists who worked together in cross-border investigations. She is now a freelance investigative journalist and trainer, covering the societal impact of AI.
This content is part of the resources produced under the Collaborative and Investigative Journalism Initiative.
Disclaimer:
Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Education and Culture Executive Agency (EACEA). Neither the European Union nor EACEA can be held responsible for them.