Need help with mapping country regions for COVID tracking #98

Open
hyperknot opened this issue Mar 24, 2020 · 8 comments

Comments

@hyperknot

hyperknot commented Mar 24, 2020

Hi, I'm a contributor to coronadatascraper, an open source project aiming to scrape official websites all around the world for COVID numbers.

My problem is that I'm trying to find a system which we can use to define different hierarchies within countries. I created the country-levels project from the Natural Earth dataset, but I'm not happy with it because, for example, it contains bad Admin 1 divisions in Spain.

Can you help us with your experience? My aim is to make a system based on short codes, like hasc:ES.CL or something similar, which we can use to refer to a region. We'd need to have a GeoJSON for each region plus a Wikidata link for population fetching. Is this possible somehow with your project?

Here is a relevant issue which I've opened, if you could contribute to the discussion it'd be great!
https://github.com/lazd/coronadatascraper/issues/286

@antoine-de
Contributor

Hi,

nice project!

Cosmogony might indeed help you, as it creates hierarchies (country, country region, state, state district, ...) using OSM data.

I'm not sure I understand what you are trying to do, but maybe you can either select a hierarchy level (maybe country region?) or select the first level below country (since sometimes a country can have no country regions, only states or even cities).

Since the wikidata id is often filled in OSM, you can easily get the population and other metadata (I don't know the proportion of zones with a wikidata id, though).

For your id, maybe you can use ISO 3166-2?
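A minimal sketch of that ID idea, assuming each zone exposes its OSM tags as a plain dict and that the relevant OSM keys are `ISO3166-2` and `wikidata`. The function names, the `iso3166-2:` prefix, and the sample tag values are hypothetical and only serve as illustration:

```python
from typing import Optional


def region_code(tags: dict, prefix: str = "iso3166-2") -> Optional[str]:
    """Build a short code such as 'iso3166-2:ES-CL' from OSM tags.

    Returns None when the zone carries no ISO 3166-2 tag, which will
    happen for some zones.
    """
    iso = tags.get("ISO3166-2")
    if not iso:
        return None
    return f"{prefix}:{iso}"


def wikidata_url(tags: dict) -> Optional[str]:
    """Return the Wikidata page URL for a zone, or None if it is untagged."""
    qid = tags.get("wikidata")
    if not qid or not qid.startswith("Q"):
        return None
    return f"https://www.wikidata.org/wiki/{qid}"


# Made-up tags for illustration; "Q0" is a placeholder, not a real item.
sample_tags = {"name": "Example Region", "ISO3166-2": "ES-CL", "wikidata": "Q0"}

code = region_code(sample_tags)
url = wikidata_url(sample_tags)
```

Zones without an ISO 3166-2 tag would need a fallback identifier, for instance the OSM id.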

It would be easier for you to use an already generated cosmogony file, but I don't have an up-to-date one readily available. Maybe @amatissart or @prhod can help you get a cosmogony file?

@amatissart
Member

amatissart commented Mar 24, 2020

I have just uploaded an extract from the cosmogony dataset generated from the planet-200302.osm.pbf file:
https://github.com/osm-without-borders/cosmogony/releases/download/v0.7.3/cosmogony-2020-03-02-regions.jsonl.gz

This file contains all extracted regions with type "country", "country_region", "state" or "state_district". This classification is mostly built for geocoding purposes, and it may or may not fit your needs.

The file is a .jsonl file (with one JSON object per line, representing a zone).
For each zone you'll notably find:

  • id (integer)
  • osm_id
  • zone_type (among "country", "country_region", "state", "state_district")
  • geometry (GeoJSON, directly extracted from OSM, with NO simplification)
  • tags (key-value from OSM, including wikidata and ISO3166-2 if present)
  • parent (the id of the parent zone in the hierarchy)
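The fields listed above could be read roughly as follows. This is a sketch under assumptions, not the project's own tooling: it assumes every line is a JSON object with at least `id`, `zone_type` and `parent`, and the helper names (`open_zone_lines`, `parse_zones`, `first_level_below_countries`) and the sample lines are made up for illustration. Geometries are dropped by default because they are unsimplified and can be large.

```python
import gzip
import json
from collections import defaultdict


def open_zone_lines(path):
    """Open a gzipped .jsonl file as a text stream, one zone per line."""
    return gzip.open(path, "rt", encoding="utf-8")


def parse_zones(lines, keep_geometry=False):
    """Parse JSON Lines into a dict keyed by zone id."""
    zones = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        zone = json.loads(line)
        if not keep_geometry:
            # Unsimplified geometries dominate memory use; drop them here.
            zone.pop("geometry", None)
        zones[zone["id"]] = zone
    return zones


def first_level_below_countries(zones):
    """Map each country id to the ids of its direct children, whatever their type."""
    children = defaultdict(list)
    for zone in zones.values():
        parent = zone.get("parent")
        if parent is not None:
            children[parent].append(zone["id"])
    return {
        zone_id: children.get(zone_id, [])
        for zone_id, zone in zones.items()
        if zone.get("zone_type") == "country"
    }


# Made-up sample lines standing in for the real file.
sample_lines = [
    '{"id": 1, "zone_type": "country", "parent": null, "geometry": null}',
    '{"id": 2, "zone_type": "state", "parent": 1, "geometry": null}',
    '{"id": 3, "zone_type": "state_district", "parent": 2, "geometry": null}',
    '{"id": 4, "zone_type": "state", "parent": 1, "geometry": null}',
]

zones = parse_zones(sample_lines)
top_level = first_level_below_countries(zones)
```

With the downloaded file, the same functions would be used as `with open_zone_lines("cosmogony-2020-03-02-regions.jsonl.gz") as lines: zones = parse_zones(lines)`. Taking the direct children of each country, rather than a fixed zone_type, covers countries that have no "country_region" level.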

@hyperknot
Author

Thanks so much for the help! I was able to process and simplify the dataset, so I can view it properly.

My biggest questions:

  • How is it possible to make the country borders not include sea? Like on wambachers, there is the land/sea switch.
  • Is there any way to substitute in missing countries? Like in Africa for example?

@amatissart
Member

> How is it possible to make the country borders not include sea? Like on wambachers, there is the land/sea switch.

Country relations in OSM typically include maritime boundaries. As far as I know, there is no simple way to extract country land boundaries directly from OSM. A solution would be to clip the polygons at the end of the process, for example by using global land or global water polygons available on https://osmdata.openstreetmap.de/data/

> Is there any way to substitute in missing countries? Like in Africa for example?

Some countries may be missing in the dataset if their polygon was broken in OSM at the time of the extract. Unfortunately it happens from time to time, and that's indeed the case with the planet file we used (dated 2020-03-02). An updated dataset processed with more recent OSM data would hopefully solve that problem.

@hyperknot
Author

I see. Is there any way to get the OSM IDs without the polygons? I found out I can download the polygons from Wambachers with the water cut out, so I'd just need to have the IDs listed.

@amatissart
Member

Do you mean the OSM IDs of all regions, including those with invalid polygons in OSM? I fear that is out of the scope of the current implementation: Cosmogony uses the exact geometry of each region to build the hierarchy of zones, and determines a zone_type from this hierarchy.
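For the zones that are present in an extract, the IDs can still be listed without keeping the polygons. The sketch below streams the lines and writes only `osm_id`, `zone_type` and a name to CSV. It assumes the name is available as the OSM `name` key inside `tags`, and the function names and sample `osm_id` values are made up for illustration. Zones whose polygon was broken in OSM will not appear, for the reason given above.

```python
import csv
import io
import json


def id_rows(lines):
    """Yield one small dict per zone; the parsed geometry is discarded right away."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        zone = json.loads(line)
        tags = zone.get("tags") or {}
        yield {
            "osm_id": zone.get("osm_id"),
            "zone_type": zone.get("zone_type"),
            "name": tags.get("name", ""),
        }


def write_id_csv(lines, out):
    """Write the ID listing to a file-like object and return the row count."""
    writer = csv.DictWriter(out, fieldnames=["osm_id", "zone_type", "name"])
    writer.writeheader()
    count = 0
    for row in id_rows(lines):
        writer.writerow(row)
        count += 1
    return count


# Made-up sample lines; the osm_id values are placeholders.
sample_lines = [
    '{"id": 1, "osm_id": "example:1", "zone_type": "country", '
    '"tags": {"name": "Exampleland"}, "geometry": null}',
    '{"id": 2, "osm_id": "example:2", "zone_type": "state", '
    '"tags": {}, "geometry": null}',
]

buffer = io.StringIO()
row_count = write_id_csv(sample_lines, buffer)
csv_text = buffer.getvalue()
```

The resulting list could then be used to look up the matching polygons in another source.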

@hyperknot
Author

I see. Maybe I can use the valid polygons from Cosmogony and fill in the rest from Wambacher. I'll try.

@amatissart
Member

amatissart commented Mar 25, 2020

FYI, I have added updated datasets to the latest release (generated with planet-200316.osm.pbf).

All zones:
https://github.com/osm-without-borders/cosmogony/releases/download/v0.7.3/cosmogony-2020-03-16.jsonl.gz

Only "country", "country_region", "state", "state_district":
https://github.com/osm-without-borders/cosmogony/releases/download/v0.7.3/cosmogony-2020-03-02-regions.jsonl.gz

(The "international_labels" field includes only the English version, hence the slightly smaller file size.)


3 participants