This story was originally published on HackerNoon at:
https://hackernoon.com/how-we-built-a-per-plant-co2-dataset-for-4551-power-stations-worldwide.
An open dataset of 4,551 power stations: measured + modelled CO2, fuel, owner, capacity and climate zone. How we built it in Python, and the honest limits.
This story was written by:
@dmytroah.
The authors built and openly published a dataset covering 4,551 power stations worldwide, combining emissions, ownership, capacity, fuel type, and climate-zone data into a single schema. The project's central finding is that only about 15% of plant-level emissions data comes from direct measurements, while the remaining 85% relies on modelled estimates, making provenance and transparency critical for anyone working with emissions datasets.
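To make the provenance point concrete, here is a minimal sketch of how a measured-versus-modelled share could be computed over a single-schema table. The field names (`co2_source`, `capacity_mw`, and so on), the `"measured"`/`"modelled"` labels, and the toy rows are all assumptions for illustration; they are not taken from the published dataset, whose actual schema may differ.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PlantRecord:
    # Hypothetical field names; the published schema may differ.
    plant_id: str
    fuel: str
    owner: str
    capacity_mw: float
    climate_zone: str
    co2_tonnes: float
    co2_source: str  # assumed labels: "measured" or "modelled"


def provenance_share(records):
    """Return the fraction of records for each co2_source value."""
    counts = Counter(r.co2_source for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {source: n / total for source, n in counts.items()}


# Made-up toy rows for illustration only; not real plants or real emissions.
toy_records = [
    PlantRecord("P1", "coal", "Owner A", 600.0, "Cfa", 3_100_000.0, "measured"),
    PlantRecord("P2", "gas", "Owner B", 400.0, "Dfb", 900_000.0, "modelled"),
    PlantRecord("P3", "coal", "Owner C", 1200.0, "BSk", 6_500_000.0, "modelled"),
    PlantRecord("P4", "oil", "Owner D", 150.0, "Aw", 400_000.0, "modelled"),
]

shares = provenance_share(toy_records)
print(shares)
```

Keeping the source label as a column on every row, rather than in separate documentation, is one way to let downstream users filter or weight by provenance before aggregating.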