User:Miyabi s/ksj2osm-coastline.pl

Reference
For an overview of the work, see also here.


This Perl script converts a KSJ2 (National-Land Numerical Information) coastline data XML file to an OSM XML file.

Preparation

  • Install JOSM (a Java program; it needs the J2SE SDK or JRE to run).
  • Install Perl. ActivePerl can be downloaded free of charge from the ActiveState site.
  • Install the XML::LibXML module with the Perl package manager.
  • Save the code below as a file with a suitable name (e.g. ksj2osm-coastline.pl).
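
Before running the converter it can help to confirm that the XML::LibXML module is really available. The following is only a minimal sketch; the helper name module_version is hypothetical and is not part of the script below.

```perl
use strict;
use warnings;

# Hypothetical helper: return the installed version of a module,
# or undef if the module cannot be loaded.
sub module_version {
    my ($module) = @_;
    (my $path = "$module.pm") =~ s{::}{/}g;   # XML::LibXML -> XML/LibXML.pm
    return eval { require $path; 1 } ? ($module->VERSION // 'unknown') : undef;
}

my $version = module_version('XML::LibXML');
print defined $version
    ? "XML::LibXML $version is installed\n"
    : "XML::LibXML is missing; install it with the Perl package manager\n";
```

If the second message is printed, repeat the module installation step before going on.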

How to convert the data

  • Download the KSJ2 code list data and extract only AdminAreaCd_090101.xml from the zip file.
  • Download a KSJ2 coastline data file (choose the one prefecture you want to import) and extract the XML file from the zip file.
  • Put the XML files in the same directory as the script.
  • Open the Perl script in an editor, edit the initial values of the variables (see the table below), and save the script.
  • Open a command prompt window and run the script in the directory where you put the script and the XML files. The output files (ksj2osm-coastline-*.osm) are created in the same directory.
  • Open the output files (ksj2osm-coastline-*.osm) in JOSM, check and edit the data (see the notes below), and then upload it to the server.


variable    description of initial value
$max_ways    number: upper limit on the number of ways written to each "ksj2osm-coastline-*.osm" file. Adjust this number to keep each output file small; as a rough guide, aim for half the file size that JOSM can open on your PC.
$max_nodes_per_file    number: upper limit on the number of nodes written to each "ksj2osm-coastline-*.osm" file. Adjust this number for the same purpose and by the same rough guide as $max_ways.
$file_in    file name: the name of the input file without its extension (i.e. without ".xml").
$file_name    no need to edit; this is the prefix of the output file names.
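
As an illustration of the table, the edited variable block might look like the sketch below. The values are hypothetical: it assumes a coastline file named C23-06_13.xml (13 being the prefecture code of Tokyo) that follows the same naming as the default, and smaller limits for a PC with little memory. The missing_inputs helper is not part of the script; it only reports input files that are not yet in the current directory.

```perl
use strict;
use warnings;

# Hypothetical settings; adjust them to your own download and PC.
our $file_in            = "C23-06_13";          # reads C23-06_13.xml; no ".xml" here
our $max_ways           = 20;                   # fewer ways per output file
our $max_nodes_per_file = 500;                  # fewer nodes per output file
our $file_name          = "ksj2osm-coastline";  # output prefix, leave as is

# Hypothetical helper: list the input files that are not in the current directory.
sub missing_inputs {
    my @needed = ("$file_in.xml", "AdminAreaCd_090101.xml");
    return grep { !-e $_ } @needed;
}

print "missing input file: $_\n" for missing_inputs();
```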

Notes

  • KSJ2 coastline data does not contain the information below, so you need to enter it manually wherever the output has an empty "name:ja_rm=" tag. See also Japan tagging#Names.
    • Place name in romanized Japanese: "name:ja_rm=".
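
To find the places that still need a romanized name, the output files can be scanned for empty "name:ja_rm" tags. This is a rough sketch only: ways_missing_romaji is a hypothetical helper, and it relies on plain pattern matching against the tag layout that the script below writes, not on a real XML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: return [way id, Japanese name] pairs for every way
# whose name:ja_rm tag is still empty.
sub ways_missing_romaji {
    my ($osm_text) = @_;
    my @todo;
    while ($osm_text =~ m{<way id="(-?\d+)"[^>]*>(.*?)</way>}gs) {
        my ($id, $body) = ($1, $2);
        next unless $body =~ m{<tag k="name:ja_rm" v=""/>};
        my ($name_ja) = $body =~ m{<tag k="name:ja" v="([^"]*)"/>};
        push @todo, [$id, defined $name_ja ? $name_ja : ''];
    }
    return @todo;
}

# Usage (the script file name is an assumption):
#   perl check-romaji.pl ksj2osm-coastline-0.osm
if (@ARGV) {
    local $/;  # read the whole file at once
    open(my $fh, '<:encoding(UTF-8)', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    my $text = <$fh>;
    close $fh;
    binmode(STDOUT, ':encoding(UTF-8)');
    printf "way %s: %s\n", @$_ for ways_missing_romaji($text);
}
```

Each printed line names one way whose "name:ja_rm=" value, and the part of "name=" in parentheses, is still to be filled in with JOSM.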

Code

As of 2009-11-20.

#! /usr/bin/perl

use strict;
use warnings;
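# The "encoding" pragma used by the 2009 version is no longer supported by Perl 5.26 and later.
use utf8;  # the string literals in this file are UTF-8
binmode(STDOUT, ":encoding(shiftjis)");  # for the Windows console; use ":encoding(UTF-8)" elsewhere
binmode(STDERR, ":encoding(shiftjis)");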
use Encode;
use open IO => "utf8";
use XML::LibXML;

#####
#
# KSJ2 Coastline Data
#
# National-Land Numerical Information (Coastline) 2006, MLIT Japan
# 国土数値情報(海岸線データ)平成18年 国土交通省
#
# Files
#   Input
#     XML file : C23-06_*.xml
#   Output
#     Osm file : ksj2osm-coastline-*.osm
#
#####

our $file_in = "C23-06_26";  # target file
our $max_ways = 30;  # for splitting output files.
our $max_nodes_per_file = 750;  # max nodes per a file


our $file_name = "ksj2osm-coastline";
our $max_nodes_per_way = 250;  # max nodes per a way


sub open_log() {

  my $time = localtime(time);
#  open(LOG, '>', "$file_name.log");
#  print LOG "***** KSJ2 Coastline Data 2006 : Start $time\n";
  print "***** KSJ2 Coastline Data 2006 : Start $time\n";

}  # end sub open_log()


sub close_log() {

  my $time = localtime(time);
#  print LOG "***** Done!: End $time\n";
#  close LOG;
  print "***** Done!: End $time\n";

}  # end sub close_log()

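sub open_osm_file {  # no "()" prototype: this sub takes an argument

  my ($count_files) = @_;

  open(OSM, '>', "$file_name-$count_files.osm")
    or die "Cannot create $file_name-$count_files.osm: $!";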
  print OSM "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
  print OSM "<osm version=\"0.5\" generator=\"KSJ2OSM\">\n";

}  # end sub open_osm_file()


sub close_osm_file(){

  print OSM "</osm>";
  close OSM;

}  # end sub close_osm_file()


# get AAC label in AdminAreaCd.xml from AAC code
sub get_aac_label {

  my $aac_code = shift;

  my $aac_label;

  my $xml_aac = XML::LibXML->new;
  my $aac_doc = $xml_aac->parse_file("AdminAreaCd_090101.xml");
  my @aac_list = $aac_doc->getElementsByTagName('codelabel');

  foreach my $aac_list (@aac_list){

    my $aac_list_code = $aac_list->getAttribute('code');
    if($aac_code eq $aac_list_code){
      $aac_label = $aac_list->getAttribute('label');
      last;
    }

  }
  return $aac_label;

}  # end sub get_aac_label()


# get ACA label from ACA code
sub get_aca_label {  # no "()" prototype: this sub takes an argument

  my %hash = (
    1 => "都道府県知事",
    2 => "市町村長",
    3 => "一般事務組合",
    4 => "港務局",
    9 => "管理者不明",
    0 => "その他"
  );

  return $hash{$_[0]};

}  # end sub get_aca_label()


# get ASL label from ASL code
sub get_asl_label {  # no "()" prototype: this sub takes an argument

  my %hash = (
    1 => "国土交通省河川局",
    2 => "国土交通省港湾局",
    3 => "農林水産省農村振興局",
    4 => "農林水産省水産庁",
    5 => "農振河川共管",
    6 => "所管官庁不明(原典資料入手不可)",
    7 => "所管官庁不明(原典資料データ化不可)",
    0 => "その他"
  );

  return $hash{$_[0]};

}  # end sub get_asl_label()


# set tag data for writing OSM file
sub write_osm_constant_tag(){

  my $tags;

  my %hash = (
    "created_by" => "National-Land-Numerical-Information_MLIT_Japan",
    "note" => "National-Land Numerical Information (Coastline) 2006, MLIT Japan",
    "note:ja" => "国土数値情報(海岸線データ)平成18年 国土交通省",
    "source" => "KSJ2",
    "source_ref" => "http://nlftp.mlit.go.jp/ksj/jpgis/datalist/KsjTmplt-C23.html",
    "KSJ2:filename" => $file_in . ".xml",
  );

  while ( my ($key, $value) = each( %hash ) ){

    $tags .= "<tag k=\"$key\" v=\"$value\"/>";

  }

  return $tags;

}  # end sub write_osm_constant_tag()


# create nd-tag data for writing a way to the OSM file
sub create_nd_tag {

  my ($nd_tag_ref) = @_;
  my $node_ref = join("", @$nd_tag_ref);

  return $node_ref;

}


# write way data to OSM file
sub write_osm_way_tag {  # no "()" prototype: this sub takes arguments

  my ($tag_id, $tag_data, $nd_data) = @_;

  my $tags = "<way id=\"$tag_id\" action=\"modify\" visible=\"true\">";

  $tags .= create_nd_tag($nd_data);

  $tags .= write_osm_constant_tag();
  $tags .= "<tag k=\"natural\" v=\"coastline\"/>";

    while ( my ($key, $value) = each( %$tag_data ) ){

      unless( $key eq "KSJ2:coordinate"
        || $key eq "KSJ2:lat"
        || $key eq "KSJ2:long" ){

        $tags .= "<tag k=\"$key\" v=\"$value\"/>";
      
        if( $key eq "KSJ2:NCA" && $value ne "名称不明" ){
          $tags .= "<tag k=\"name\" v=\"$value ()\"/>";
          $tags .= "<tag k=\"name:en\" v=\"\"/>";
          $tags .= "<tag k=\"name:ja\" v=\"$value\"/>";
          $tags .= "<tag k=\"name:ja_rm\" v=\"\"/>";
        }

      }

    }

  $tags .= "</way>";

  print OSM $tags;

}  # end sub write_osm_way_tag()


# write node data to OSM file
sub write_osm_node_tag {  # no "()" prototype: this sub takes arguments

  my ($tag_id, $hash, $coordinate) = @_;

  # if $coordinate has data, store it and split it into latitude and longitude
  if( defined( $coordinate ) ){
    $hash->{"KSJ2:coordinate"} = $coordinate;
    ($hash->{"KSJ2:lat"}, $hash->{"KSJ2:long"}) = split(/\s/, $hash->{"KSJ2:coordinate"});
    printf "pos = lat: %s / long: %s\n", $hash->{"KSJ2:lat"}, $hash->{"KSJ2:long"};

  }

  my $tags = sprintf("<node id=\"%d\" visible=\"true\" lat=\"%0.6f\" lon=\"%0.6f\">", $tag_id, $hash->{"KSJ2:lat"}, $hash->{"KSJ2:long"});

  $tags .= write_osm_constant_tag();

  while ( my ($key, $value) = each( %$hash ) ){
  
    if ($key eq "KSJ2:coastline_id"
        || $key eq "KSJ2:curve_id"
        || $key eq "KSJ2:NCA"
        || $key eq "KSJ2:lat"
        || $key eq "KSJ2:long"
        || $key eq "KSJ2:coordinate" ){

      $tags .= "<tag k=\"$key\" v=\"$value\"/>";

    }

  }
  
  $tags .= "</node>";

  print OSM $tags;
  
  return %$hash;

}  # end sub write_osm_node_tag()


# get the data of Attribute
sub get_attr_data {  # no "()" prototype: this sub takes an argument

  my $ksj_attr = shift;
  my $tmp;
  my %hash;

=pod
  my %hash = (
    "KSJ2:coastline_id" => $CP01_id,
    "KSJ2:LOC" => $LOC_idref,
    "KSJ2:AAC" => $aac_code,
    "KSJ2:AAC_label"  => $aac_label,
    "KSJ2:ASL" => $asl_code,
    "KSJ2:ASL_label" => $asl_label,
    "KSJ2:RIM" => $rim,

    "KSJ2:ACA" => $aca_code,
    "KSJ2:ACA_label" => $aca_label,
    "KSJ2:CAN" => $can,
    "KSJ2:NCA" => $nca,
    "KSJ2:NAC" => $nac
  );
=cut

  # get the Data of CP-id & LOC-idref
  $hash{"KSJ2:coastline_id"} = $ksj_attr->getAttribute('id');
  $hash{"KSJ2:LOC"} = $ksj_attr->getChildrenByTagName('ksj:LOC')->get_node(0)->getAttribute('idref');

  # get code & label of AdminAreaCd
  $hash{"KSJ2:AAC"} = $ksj_attr->getChildrenByTagName('ksj:AAC')->get_node(0)->textContent;
  $hash{"KSJ2:AAC_label"} = &get_aac_label( $hash{"KSJ2:AAC"} );

  # get classification of the mouth of the river
  my $rim_bool = $ksj_attr->getChildrenByTagName('ksj:RIM')->get_node(0)->textContent;

  if ($rim_bool eq "false"){
    $hash{"KSJ2:RIM"} = "河口部以外"
  }elsif ($rim_bool eq "true"){
    $hash{"KSJ2:RIM"} = "河口部"
  }

  # get Name of coast area
  if($ksj_attr->find('ksj:NCA')){
    $hash{"KSJ2:NCA"} = $ksj_attr->getChildrenByTagName('ksj:NCA')->get_node(0)->textContent;
    if( $hash{"KSJ2:NCA"} eq "不明" ){
      $hash{"KSJ2:NCA"} = "名称不明";
    }
  }

  # get code & label of AdminSeaLine
  $hash{"KSJ2:ASL"} = $ksj_attr->getChildrenByTagName('ksj:ASL')->get_node(0)->textContent;
  $hash{"KSJ2:ASL_label"} = get_asl_label( $hash{"KSJ2:ASL"} );

  # get code of AdminConArea
  if($ksj_attr->find('ksj:ACA')){
    $hash{"KSJ2:ACA"} = $ksj_attr->getChildrenByTagName('ksj:ACA')->get_node(0)->textContent;
    $hash{"KSJ2:ACA_label"} = get_aca_label( $hash{"KSJ2:ACA"} );
  }

  # get Number of Coast Area
  if($ksj_attr->find('ksj:CAN')){
    $hash{"KSJ2:CAN"} = $ksj_attr->getChildrenByTagName('ksj:CAN')->get_node(0)->textContent;
  }

  # get name of AdminCon
  if($ksj_attr->find('ksj:NAC')){
    $hash{"KSJ2:NAC"} = $ksj_attr->getChildrenByTagName('ksj:NAC')->get_node(0)->textContent;
  }

  return %hash;

}  # sub get_attr_data()


# get point data from KSJ file
sub get_ksj_point_data {  # no "()" prototype: this sub takes an argument

  my ($ksj_node) = @_ ;
  
  my $coordinate = $ksj_node->getElementsByTagName('DirectPosition.coordinate')->get_node(0)->textContent;
  return $coordinate;

}  #end sub get_ksj_point_data()



sub main (){

  # for creating Element ID
  my $negative_id = 0;
  my $node_id;
  my %node_ref_id;

  my $node_per_file = 0;  # nodes per a file

  my $xml = XML::LibXML->new;
  my $ksj_doc = $xml->parse_file("$file_in.xml");

  # for separate files
  my $count_files = 0;
  my $count_way = 0;
  my $file_start_id = 0;

  open_log();
  open_osm_file($count_files);

  my @ksj_attr = $ksj_doc->getElementsByTagName('ksj:CP01')->get_nodelist;
  my @ksj_way = $ksj_doc->getElementsByTagName('jps:GM_Curve')->get_nodelist;
  my @ksj_node = $ksj_doc->getElementsByTagName('jps:GM_Point')->get_nodelist;

  # get the data of Attribute
  foreach my $ksj_attr (@ksj_attr){

    # get the data of Attribute
    my %ksj_tag = get_attr_data($ksj_attr);

    # %ksj_tag holds the keys listed in the comment block inside get_attr_data()

    # get the data of Line
    foreach my $ksj_way (@ksj_way){
    
      # for create "<nd>" tag
      my $node_ref = "";  
      my @nd_tag;
      my $node_count = 0;

      my %temp_node_data;

      my $gm_curve_id = $ksj_way->getAttribute('id');

      # Process this curve only when its id matches the LOC reference of the current attribute record.
      if($gm_curve_id eq $ksj_tag{"KSJ2:LOC"}){
      
        $ksj_tag{"KSJ2:curve_id"} = $gm_curve_id;

        # write way data to OSM file
#        write_osm_way_tag($negative_id, \%ksj_tag);

        # get the data of Point
        my @ksj_way_node = $ksj_way->getElementsByTagName('GM_PointArray.column')->get_nodelist;

        foreach my $ksj_way_node (@ksj_way_node){

          $node_count++;

          # start a new output file when the way limit or the node limit is reached
          if( $count_way >= $max_ways || $node_per_file >= $max_nodes_per_file ){

            close_osm_file();
            $count_way = 0;
            $node_per_file = 0;

            $count_files++;
            open_osm_file($count_files);
            $file_start_id = $negative_id;

            $negative_id--;

            # set the nd-tag that connects ways across separate files
            if( %temp_node_data ){
              write_osm_node_tag($negative_id, \%temp_node_data);
              @nd_tag = sprintf("<nd ref=\"%d\" />", $negative_id );
            }

          }

          $negative_id--;
          $node_id = $negative_id;

          # The point is given by reference (idref) to a GM_Point.
          if($ksj_way_node->find('jps:GM_Position.indirect')){

            my $ksj_point_ref = $ksj_way_node->getElementsByTagName('GM_PointRef.point')->get_node(0)->getAttribute('idref');

            foreach my $ksj_node (@ksj_node){

              my $ksj_point_id = $ksj_node->getAttribute('id');

              # This GM_Point is the one referenced by the way.
              if ( $ksj_point_id eq $ksj_point_ref ){

                if( !exists($node_ref_id{$ksj_point_id}) || $node_ref_id{$ksj_point_id} > $file_start_id ){
                  # get Point data & write node to OSM file
                  my $pos_coodinate = get_ksj_point_data($ksj_node);
                  %temp_node_data = write_osm_node_tag($node_id, \%ksj_tag, $pos_coodinate);

                  $node_ref_id{$ksj_point_id} = $node_id;

                }else{
                  $node_id = $node_ref_id{$ksj_point_id};
                }

                last;
              }  # end if ($ksj_point_id eq $ksj_point_ref)


            }  # end foreach my $ksj_node (@ksj_node)

          }  # end if($ksj_way_node->find('jps:GM_Position.indirect'))

          # The point is given directly by its coordinates.
          elsif($ksj_way_node->find('jps:GM_Position.direct')){

             # get Point data & write node to OSM file
             my $direct_pos_coodinate = get_ksj_point_data($ksj_way_node);
             %temp_node_data = write_osm_node_tag($node_id, \%ksj_tag, $direct_pos_coodinate);

          }  # end elsif($ksj_way_node->find('jps:GM_Position.direct'))

          # for write way data to OSM file
          unshift( @nd_tag, "<nd ref=\"$node_id\" />" );

          if( $node_count >= $max_nodes_per_way ){

            $negative_id--;


            # write way data to OSM file
            write_osm_way_tag($negative_id, \%ksj_tag, \@nd_tag);
            $node_ref = "";

            # set the nd-tag that connects this way to the next one
            @nd_tag = ( "<nd ref=\"$node_id\" />" );

            $count_way++;  # count ways for separating OSM file
            $node_per_file += $node_count;

            $node_count = 0;

          }

        }  # end foreach my $ksj_way_node (@ksj_way_node)

        $negative_id--;

        # write way data to OSM file
        write_osm_way_tag($negative_id, \%ksj_tag, \@nd_tag);
        $node_ref = "";

        $count_way++;  # count ways for separating OSM file
        $node_per_file += $node_count;

      }  # end if($gm_curve_id eq $ksj_tag{"KSJ2:LOC"})

    }  # end foreach my $ksj_way (@ksj_way)

  }  # end foreach my $ksj_attr (@ksj_attr)

  close_osm_file();
  close_log();

}  # end main()

main();
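
The script writes the attribute values into the OSM file exactly as they appear in the KSJ2 data. If a value ever contains a character such as & or a double quote, the output would no longer be well-formed XML. The following is a minimal sketch of an escaping helper, not part of the script as published; the names xml_escape_attr and osm_tag are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special inside a
# double-quoted XML attribute value.
sub xml_escape_attr {
    my ($value) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $value =~ s/([&<>"])/$entity{$1}/g;
    return $value;
}

# Hypothetical helper: build one <tag> element in the same layout the script
# uses, but with escaped key and value.
sub osm_tag {
    my ($key, $value) = @_;
    return sprintf('<tag k="%s" v="%s"/>', xml_escape_attr($key), xml_escape_attr($value));
}

print osm_tag('KSJ2:NAC', 'A & B "Port"'), "\n";
# prints: <tag k="KSJ2:NAC" v="A &amp; B &quot;Port&quot;"/>
```

One way to use it would be to replace the interpolated "<tag k=... v=.../>" strings in write_osm_way_tag and write_osm_node_tag with calls to such a helper.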