China/GNS Place names data

From OpenStreetMap Wiki

A large number of place name nodes were added to China by user 'dkt'. They are tagged created_by=dkt_GNS-import-1 and were imported from the GEOnet Names Server (GNS).

Rendering of Chinese place names now seems to work in both Mapnik (after font issues were resolved) and Osmarender (provided suitable fonts are installed on the client that renders that part of the map).

The GNS data was extracted using some PHP code...

The China data file (ch.txt) was used as the source. Only a small subset of the data was extracted, as much of it had not yet been updated with Chinese names. Records with a GNS Feature Classification (FC) of A were used, as at the time of the extract this data appeared the most complete as far as names are concerned.

Records with an FC of A were further filtered to include only those with a Name Type (NT) of Conventional name (C), BGN Standard name (N), BGN Standard name in non-Roman script (NS) or Variant name in non-Roman script (VS). VS records were kept only if their Language Code (LC) was not blank, English (eng) or Chinese (zho).

From the remaining data, all records for a specific 'feature', identified by its Unique Feature Identifier (UFI), were read to gather all available name data.
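The filter above could be sketched in Python roughly as follows. This is not the original PHP code, only an illustration. It assumes each GNS record has already been read into a dict keyed by the field abbreviations used on this page (FC, NT, LC), and the function name keep_record is made up for this example. The condition on VS records can be read in more than one way; the sketch takes the literal reading (drop VS records whose LC is blank, eng or zho) and exposes it as a parameter so the other reading can be tried.

```python
# Name Types kept by the filter described above.
KEPT_NAME_TYPES = {"C", "N", "NS", "VS"}


def keep_record(rec, vs_excluded_lcs=frozenset({"", "eng", "zho"})):
    """Return True if a GNS record (a dict) passes the filter.

    Assumption: VS records are dropped when their LC is blank, eng or zho.
    Pass a different vs_excluded_lcs to model another reading of the text.
    """
    if rec.get("FC") != "A":
        return False
    name_type = rec.get("NT")
    if name_type not in KEPT_NAME_TYPES:
        return False
    if name_type == "VS":
        return rec.get("LC", "") not in vs_excluded_lcs
    return True
```

For example, keep_record({"FC": "A", "NT": "N", "LC": ""}) is accepted, while a record with any other FC is rejected regardless of its name type.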

For records with an NT of C, the English name was extracted if the LC was eng. For records with an NT of N, the Pinyin name was extracted. For records with an NT of NS, the Chinese name was extracted if the LC was zho, along with the descriptive portion of the name (GENERIC).

In addition, the UFI, the Unique Name Identifier (UNI), the Latitude (LAT), the Longitude (LON), the Feature Designation Code (DSG) and the First-order administrative division code (ADM1) were extracted.

Once all data had been extracted for a particular UFI, the OSM place type was worked out for the feature based on the GENERIC if available, otherwise on the last word of the Pinyin name... The mapping was as follows...
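As an illustration of the per-feature name gathering described above, here is a minimal Python sketch. It is not the original PHP code. It assumes the records are dicts that have already passed the filter, and that the name text lives in a column called FULL_NAME; that column name and the function name gather_names are assumptions made for this example.

```python
from collections import defaultdict


def gather_names(records):
    """Group records by UFI and collect the names for each feature."""
    features = defaultdict(dict)
    for rec in records:
        names = features[rec["UFI"]]
        name_type = rec["NT"]
        lang = rec.get("LC", "")
        if name_type == "C" and lang == "eng":
            # Conventional name in English
            names["en"] = rec["FULL_NAME"]
        elif name_type == "N":
            # BGN Standard name, used here as the Pinyin name
            names["pinyin"] = rec["FULL_NAME"]
        elif name_type == "NS" and lang == "zho":
            # Standard name in non-Roman script, plus its descriptive part
            names["zh"] = rec["FULL_NAME"]
            names["generic"] = rec.get("GENERIC", "")
    return dict(features)
```

The result maps each UFI to a small dict with whichever of the keys en, pinyin, zh and generic could be found for that feature.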


With all data together, nodes were written as follows...

 <node id='<unique negative id>' timestamp='2008-01-30T03:16:47+01:00' action='create' lat='<LAT>' lon='<LON>'>
   <tag k='created_by' v='dkt_GNS-import-1' />
   <tag k='place' v='<mapped type>' />
   <tag k='gns:UFI' v='<UFI>' />
   <tag k='gns:UNI' v='<UNI>' />
   <tag k='gns:DSG' v='<DSG>' />
   <tag k='gns:ADM1' v='<ADM1>' />
   <tag k='name' v='<Chinese name if available, or Pinyin name if available, otherwise no tag>' />
   <tag k='name:zh' v='<Chinese name if available, otherwise no tag>' />
   <tag k='name:zh_pinyin' v='<Pinyin name if available, always with tones, otherwise no tag>' />
   <tag k='name:en' v='<English name if available, otherwise no tag>' />
 </node>
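A node like the template above could be produced with Python's standard xml.etree.ElementTree module, roughly as in the sketch below. This is only an illustration, not the original PHP code. The function name build_node and the shapes of the gns and names arguments are assumptions made for this example, and the place type is simply passed in by the caller. Tags whose value is missing are left out, matching the "otherwise no tag" notes in the template.

```python
import xml.etree.ElementTree as ET


def build_node(node_id, lat, lon, place, gns, names, timestamp):
    """Build one OSM <node> element following the template above.

    gns   -- dict with the keys UFI, UNI, DSG, ADM1 (hypothetical shape)
    names -- dict with optional keys zh, pinyin, en (hypothetical shape)
    """
    node = ET.Element("node", {
        "id": str(node_id),  # unique negative id for a new object
        "timestamp": timestamp,
        "action": "create",
        "lat": str(lat),
        "lon": str(lon),
    })

    def add_tag(key, value):
        # Write no tag at all when the value is missing or empty.
        if value:
            ET.SubElement(node, "tag", {"k": key, "v": value})

    add_tag("created_by", "dkt_GNS-import-1")
    add_tag("place", place)
    for field in ("UFI", "UNI", "DSG", "ADM1"):
        add_tag("gns:" + field, gns.get(field))
    # name: Chinese name if available, otherwise the Pinyin name
    add_tag("name", names.get("zh") or names.get("pinyin"))
    add_tag("name:zh", names.get("zh"))
    add_tag("name:zh_pinyin", names.get("pinyin"))
    add_tag("name:en", names.get("en"))
    return node
```

Calling ET.tostring(node, encoding="unicode") on the result gives the XML text for that node; ElementTree takes care of escaping special characters in names.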

The final OSM XML file was uploaded using the bulk upload tool...

The code used for this extract is fairly basic and quite specific to the job of importing Chinese name data, but if you want a copy, it is available on request... Dtucny 05:33, 9 May 2008 (UTC)