Improved ontologies for statoids - Nigami


Improved ontologies for statoids

This article presents some extensions to the data models used in geocoding databases. Three areas of change are discussed: the statoid levels, localization/multi-lingual support, and diachronic support.


Several geocoding services on the web allow you to map a string representing a location to a longitude/latitude pair (a point) or to a set of such pairs defining a boundary (a region); the Yahoo Geocoding API is one example. I discuss several ways in which the data model supported by these systems can be improved.

Multi-level ontology

Different countries have different levels of subdivisions. Most web services support a subset of these subdivisions:

country > state/province > municipality > town

I will use the term 'statoid', as proposed by Gwillim Law, to refer to administrative divisions of the world at any level. Many countries have exceptions and extensions to this simple model of statoids.
United Kingdom
The United Kingdom is an entity that is subdivided into four countries, although it makes sense to call the UK itself a country as well. It also has several types of municipalities.
Belgium
Belgium comprises three regions: the Brussels Capital Region, the Flemish Region and Wallonia. The Flemish Region and Wallonia are further subdivided into provinces, but the Brussels Capital Region can itself be thought of as a province, so it functions at two levels simultaneously.
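
One way to accommodate such exceptions is to stop tying a statoid to a single fixed level and instead let each statoid carry a set of roles. The following is only a minimal sketch under that assumption; the class `Statoid`, the field `roles` and the helper `at_level` are hypothetical names chosen for illustration and do not correspond to any existing geocoding API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)
class Statoid:
    """An administrative division that may fill several levels at once."""
    name: str
    roles: Set[str]                      # e.g. {"region", "province"}
    parent: Optional["Statoid"] = None
    children: List["Statoid"] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Register this statoid with its parent so the tree can be walked.
        if self.parent is not None:
            self.parent.children.append(self)


def at_level(root: Statoid, role: str) -> List[Statoid]:
    """Return every statoid in the tree under root (inclusive) with the role."""
    found = [root] if role in root.roles else []
    for child in root.children:
        found.extend(at_level(child, role))
    return found


# Belgium: the Brussels Capital Region acts as a region and as a province.
belgium = Statoid("Belgium", {"country"})
flanders = Statoid("Flemish Region", {"region"}, belgium)
wallonia = Statoid("Wallonia", {"region"}, belgium)
brussels = Statoid("Brussels Capital Region", {"region", "province"}, belgium)
antwerp = Statoid("Antwerp", {"province"}, flanders)

# United Kingdom: a country that is itself subdivided into countries.
uk = Statoid("United Kingdom", {"country"})
scotland = Statoid("Scotland", {"country"}, uk)

provinces = [s.name for s in at_level(belgium, "province")]
countries = [s.name for s in at_level(uk, "country")]
```

With this representation a query for Belgian provinces returns the Brussels Capital Region alongside the ordinary provinces, and a query for countries under the UK returns both the UK and its constituent countries, without special-casing either situation.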

Multi-language support

Locations have different names in different languages:

Munich/München
Aachen/Aken/Aix-la-Chapelle/Aquisgrán/Aquisgrana
Den Haag/The Hague/La Haya
Brussel/Bruxelles/Brussels
Ciudad de México/Mexico City/Mexico-stad

Note that a location does not necessarily have one 'official' name, especially if the country it belongs to is officially multi-lingual (e.g. Brussels in Belgium).
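
As a rough sketch of how a data model could reflect this, each place can store a name per language together with a (possibly multi-valued) list of official languages, instead of a single canonical name. The names `Place`, `name_in` and `build_index` are hypothetical and chosen only for illustration; the language codes are ISO 639-1 codes.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass
class Place:
    key: str                    # hypothetical internal identifier
    names: Dict[str, str]       # language code -> name in that language
    official: Tuple[str, ...]   # zero, one or several official languages

    def name_in(self, lang: str) -> str:
        """Name in the requested language, else in the first official one."""
        if lang in self.names:
            return self.names[lang]
        for code in self.official:
            if code in self.names:
                return self.names[code]
        raise KeyError(f"no name available for {self.key!r} in {lang!r}")


def build_index(places: Iterable[Place]) -> Dict[str, Set[str]]:
    """Map every known name, in any language, to the keys of matching places."""
    index: Dict[str, Set[str]] = {}
    for place in places:
        for name in place.names.values():
            index.setdefault(name.casefold(), set()).add(place.key)
    return index


brussels = Place(
    key="brussels",
    names={"nl": "Brussel", "fr": "Bruxelles", "en": "Brussels"},
    official=("nl", "fr"),      # two official languages, no single official name
)
aachen = Place(
    key="aachen",
    names={"de": "Aachen", "nl": "Aken", "fr": "Aix-la-Chapelle",
           "es": "Aquisgrán", "it": "Aquisgrana"},
    official=("de",),
)

index = build_index([brussels, aachen])
```

A lookup such as `index["aix-la-chapelle"]` then resolves to the same place as `index["aachen"]`, and `brussels.name_in("fr")` returns the French name while an unsupported language falls back to one of the official names.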

Diachronic support

The division of the world into statoids has changed over time, and it continues to change. Most web services attempt to provide the most up-to-date data. But this is a problem if you want to find information on statoids that no longer exist, e.g. for enriching or presenting historical/genealogical data. It also means that these services cannot give information about changes to a statoid over time.

There are several sources available that have data about changes to statoids, e.g. provinciale herindelingen (changes to provinces) for the Netherlands. It would be desirable to have such information and the subdivision data itself in a unified format/system. This would allow you to get 'snapshots' of the statoid ontology for different times, and to track changes to statoids over time.

For genealogy it would provide a good model for associating records with statoids, by using statoids as they existed at the time of the record. For instance, many records concern old countries such as Prussia or New Holland and their respective subdivisions.
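
The snapshot idea can be sketched by attaching a validity interval to every statoid record. The code below is only an illustration under that assumption: `StatoidVersion`, `snapshot` and `resolve` are hypothetical names, years are used instead of full dates for brevity, and the small Dutch data set (Holland split into Noord-Holland and Zuid-Holland in 1840, Flevoland created in 1986) is there purely as sample input.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class StatoidVersion:
    name: str
    parent: str
    valid_from: Optional[int]   # first year of existence; None = not recorded
    valid_to: Optional[int]     # first year it no longer exists; None = current

    def exists_in(self, year: int) -> bool:
        started = self.valid_from is None or self.valid_from <= year
        not_ended = self.valid_to is None or year < self.valid_to
        return started and not_ended


def snapshot(versions: Iterable[StatoidVersion], year: int) -> List[str]:
    """Names of all statoids that existed in the given year, sorted."""
    return sorted(v.name for v in versions if v.exists_in(year))


def resolve(versions: Iterable[StatoidVersion], name: str,
            year: int) -> Optional[StatoidVersion]:
    """Find the statoid a record refers to, as it existed in the record's year."""
    for v in versions:
        if v.name == name and v.exists_in(year):
            return v
    return None


# Illustrative sample data only; a real system would use full dates and sources.
VERSIONS = [
    StatoidVersion("Holland", "Netherlands", None, 1840),
    StatoidVersion("Noord-Holland", "Netherlands", 1840, None),
    StatoidVersion("Zuid-Holland", "Netherlands", 1840, None),
    StatoidVersion("Flevoland", "Netherlands", 1986, None),
]

in_1830 = snapshot(VERSIONS, 1830)
in_1900 = snapshot(VERSIONS, 1900)
in_2000 = snapshot(VERSIONS, 2000)
```

A genealogical record dated 1835 that mentions "Holland" can then be linked with `resolve(VERSIONS, "Holland", 1835)`, whereas the same lookup for 1900 returns `None` because the statoid no longer existed under that name.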