User:Miyabi s/ksj2osm-forest.pl

From OpenStreetMap Wiki

This Perl script converts a KSJ2 (National-Land Numerical Information) forest data xml file into an osm xml file.

Preparation

  • Install JOSM (a Java program; it needs the J2SE SDK or a JRE to run).
  • Install Perl. ActivePerl can be downloaded free of charge from the ActiveState site.
  • Install the XML::LibXML module with the Perl package manager.
  • Save the code below to a file with a suitable name (e.g. ksj2osm-forest.pl).
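
Before running the converter, it can help to confirm that the required module loads in your Perl installation. The following is a minimal sketch; the helper name module_available is hypothetical and is not part of the script below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the named module can be loaded
# in this Perl installation, 0 otherwise.
sub module_available {
    my ($module) = @_;
    return eval "require $module; 1" ? 1 : 0;
}

# XML::LibXML is the only non-core module the converter needs.
for my $module ('XML::LibXML') {
    printf "%-12s %s\n", $module, module_available($module) ? "OK" : "missing";
}
```

If the module is reported as missing, install it with the Perl package manager before continuing.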

How to convert the data

  • Download the KSJ2 codelist data and extract only PrefCd.xml and SubprefectureNameCd.xml from the zip file.
  • Download the KSJ2 forest data file for the one prefecture you want to import and extract the xml file from the zip file.
  • Put the xml files in the same directory as the script.
  • Open the Perl script in a text editor, edit the initial values of the variables (see the variables described below), and save the script.
  • Open a command prompt and run the script in the directory that contains the script and the xml files. The output files (ksj2osm-forest-*.osm) are created in the same directory.
  • Open the output files (ksj2osm-forest-*.osm) in JOSM, check and edit the data (see the notes below), and then upload it to the server.
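
The steps above rely on three xml files being present next to the script. A small pre-flight check along these lines could catch a missing file before the conversion starts. This is only a sketch: the helper name missing_inputs is hypothetical, and the value of $file_in is the script's default.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the names from the list that do not exist
# as regular files in the given directory.
sub missing_inputs {
    my ($dir, @names) = @_;
    return grep { !-f "$dir/$_" } @names;
}

my $file_in = "A13-06_26";  # same default as in the script; adjust to your prefecture

my @missing = missing_inputs(".", "PrefCd.xml", "SubprefectureNameCd.xml", "$file_in.xml");
print @missing ? "Missing: @missing\n" : "All input files found.\n";
```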


Variable: description of initial value
  • $max_ways_per_file: number. Upper limit on the number of ways written to each "ksj2osm-forest-*.osm" file. Adjust this number to reduce the size of each output file; as a rough guide, aim for about half the largest file size that JOSM can open on your PC.
  • $max_nodes_per_way: number. Upper limit on the number of nodes written to each way. Keep this below 2000, because the OpenStreetMap specification limits the number of nodes in a way to 2000.
  • $max_nodes_per_file: number. Upper limit on the number of nodes written to each "ksj2osm-forest-*.osm" file. Adjust this number to reduce the size of each output file; as a rough guide, aim for about half the largest file size that JOSM can open on your PC.
  • $file_in: filename. The name of the input file without its extension (i.e. without ".xml").
  • $file_name: no need to edit; this is the prefix of the output file names.
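
To illustrate what $max_nodes_per_way controls, here is a simplified sketch of splitting one long ring into several ways that share an end node, so the pieces stay connected. It shows the idea only and is not a copy of the script's loop; the helper name split_nodes is hypothetical, and it assumes a limit of at least 2.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a list of node IDs into ways of at most $max
# nodes. Each new way starts with the last node of the previous way.
sub split_nodes {
    my ($max, @nodes) = @_;
    my @ways;
    my @current;
    for my $node (@nodes) {
        push @current, $node;
        if (@current >= $max) {
            push @ways, [@current];
            @current = ($node);  # shared node that connects the next way
        }
    }
    push @ways, [@current] if @current > 1;  # remaining nodes, if any
    return @ways;
}

# Seven nodes with a limit of three nodes per way.
print "@$_\n" for split_nodes(3, 1 .. 7);
# prints:
# 1 2 3
# 3 4 5
# 5 6 7
```

A smaller limit produces more ways per area, and any area split this way has to be held together by a multipolygon relation.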

Notes

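  • The National-Land Numerical Information (forest data) xml file divides forest areas into the following four categories:
    • Forest area (森林地域)
    • National forest (国有林)
    • Private forest subject to a regional forest plan (地域森林計画対象民有林)
    • Protection forest (保安林)
  • Of these, the forest area category completely encloses the areas of the other three categories.
  • The other three categories relate to one another as follows:
    • National forest and private forest subject to a regional forest plan are mutually exclusive.
    • Protection forest either overlaps or lies inside national forest and private forest subject to a regional forest plan.
  • This script leaves the extent of each category unchanged and tags the areas as follows:
    • Forest area: natural=wood
    • National forest, private forest subject to a regional forest plan, and protection forest: landuse=forest
  • Large areas (areas enclosed by a way with at least as many nodes as the $max_nodes_per_way variable above) and areas with inner holes are output as multipolygon relations. Note the following about ways output as members of a multipolygon relation (this follows from the OpenStreetMap specification, because rendering problems occur otherwise; see the multipolygon page for details):
    • Ways output as members of a multipolygon relation do not carry the tags above.
    • The tags are placed on the relation instead of the ways.

With the above in mind, edit the data as needed.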
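
The tagging rule and the multipolygon layout described in these notes can be sketched as follows. The helper names forest_tag and multipolygon_xml and the element IDs are hypothetical; the real script also writes its source and KSJ2:* tags, which are left out here for brevity.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map a KSJ2 forest category code (DFC) to the OSM tag described in the notes.
# Code 1 is the overall forest area; codes 2 to 4 are the other categories.
sub forest_tag {
    my ($dfc) = @_;
    return $dfc eq "1" ? ("natural", "wood") : ("landuse", "forest");
}

# Build a multipolygon relation: the area tag goes on the relation, and the
# member ways are only referenced with an outer or inner role.
sub multipolygon_xml {
    my ($relation_id, $dfc, $outer_ids, $inner_ids) = @_;
    my ($key, $value) = forest_tag($dfc);

    my $xml = qq{<relation id="$relation_id" visible="true">\n};
    $xml .= qq{  <member type="way" ref="$_" role="outer" />\n} for @$outer_ids;
    $xml .= qq{  <member type="way" ref="$_" role="inner" />\n} for @$inner_ids;
    $xml .= qq{  <tag k="type" v="multipolygon"/>\n};
    $xml .= qq{  <tag k="$key" v="$value"/>\n};
    $xml .= qq{</relation>\n};
    return $xml;
}

# Hypothetical IDs: one outer ring (-10) with one hole (-20), national forest (code 2).
print multipolygon_xml(-30, 2, [-10], [-20]);
```

When checking the output in JOSM, the member ways of such a relation should therefore show no natural or landuse tag of their own.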

Code

As of 2010-04-13.

#! /usr/bin/perl

use strict;
use warnings;
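use utf8;  # this source file contains UTF-8 string literals
# binmode(STDOUT, ":encoding(UTF-8)"); binmode(STDERR, ":encoding(UTF-8)");  # for UTF-8 terminals
binmode(STDOUT, ":encoding(shiftjis)");  # for Windows
binmode(STDERR, ":encoding(shiftjis)");  # for Windows
use Encode;
use open IO => ":encoding(UTF-8)";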
use XML::LibXML;

#####
#
# KSJ2 Forest Data
#
# National-Land Numerical Information (Forest) 2006, MLIT Japan
# 国土数値情報(森林データ)平成18年 国土交通省
#
# Files
#   Input
#     XML file : A13-06_*.xml
#   Output
#     Osm file : ksj2osm-forest-*.osm
#
#####

our $file_in = "A13-06_26";  # target file

our $file_name = "ksj2osm-forest";

our $max_ways_per_file = 100;  # maximum number of ways per output file (for splitting output files)
our $max_nodes_per_way = 1500;  # maximum number of nodes per way
our $max_nodes_per_file = 999999;  # maximum number of nodes per output file


sub open_log() {

  my $time = localtime(time);
  open(LOG, '>', "$file_name.log") or die "Cannot open $file_name.log: $!";
  print LOG "***** KSJ2 Forest Data 2006 : Start $time\n";
  print "***** KSJ2 Forest Data 2006 : Start $time\n";

}  # end sub open_log()


sub close_log() {

  my $time = localtime(time);
  print LOG "***** Done!: End $time\n";
  close LOG;
  print "***** Done!: End $time\n";

}  # end sub close_log()


sub open_osm_file {

  my ($count_files) = @_;

  open(OSM, '>', "$file_name-$count_files.osm") or die "Cannot open $file_name-$count_files.osm: $!";
  print OSM "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
  print OSM "<osm version=\"0.5\" generator=\"KSJ2OSM\">\n";

}  # end sub open_osm_file()


sub close_osm_file(){

  print OSM "</osm>";
  close OSM;

}  # end sub close_osm_file()


# get PRC label in PrefCd.xml from PRC code
sub get_prc_label {


  my $prc_code = shift;

  my $prc_label;

  my $xml_prc = XML::LibXML->new;
  my $prc_doc = $xml_prc->parse_file("PrefCd.xml");
  my @prc_list = $prc_doc->getElementsByTagName('codelabel');

  foreach my $prc_list (@prc_list){

    my $prc_list_code = $prc_list->getAttribute('code');
    if($prc_code eq $prc_list_code){
      $prc_label = $prc_list->getAttribute('label');
      last;
    }

  }
  return $prc_label;

}  # end sub get_prc_label()

# get BDC label in SubprefectureNameCd.xml from BDC code
sub get_bdc_label {


  my $bdc_code = shift;

  my $bdc_label;

  my $xml_bdc = XML::LibXML->new;
  my $bdc_doc = $xml_bdc->parse_file("SubprefectureNameCd.xml");
  my @bdc_list = $bdc_doc->getElementsByTagName('codelabel');

  foreach my $bdc_list (@bdc_list){

    my $bdc_list_code = $bdc_list->getAttribute('code');
    if($bdc_code eq $bdc_list_code){
      $bdc_label = $bdc_list->getAttribute('label');
      last;
    }

  }
  return $bdc_label;

}  # end sub get_bdc_label()

# get DFC label from DFC code
sub get_dfc_label {

  my %hash = (
    1 => "森林地域",
    2 => "国有林",
    3 => "地域森林計画対象民有林",
    4 => "保安林"
  );

  return $hash{$_[0]};

}  # end sub get_dfc_label()

# set tag data for writing OSM file
sub write_osm_constant_tag(){

  my $tags;

  my %hash = (
    "created_by" => "National-Land-Numerical-Information_MLIT_Japan",
    "source:en" => "National-Land Numerical Information (Forest) 2006, MLIT Japan",
    "source:ja" => "国土数値情報(森林地域データ)平成18年 国土交通省",
    "source" => "KSJ2",
    "source_ref" => "http://nlftp.mlit.go.jp/ksj/jpgis/datalist/KsjTmplt-A13.html",
    "KSJ2:filename" => $file_in . ".xml",
  );

  while ( my ($key, $value) = each( %hash ) ){

    $tags .= "<tag k=\"$key\" v=\"$value\"/>\n";

  }

  return $tags;

}  # end sub write_osm_constant_tag()


# create nd-tag data for writing a way to the OSM file
sub create_nd_tag {

  my ($nd_tag_ref) = @_;
  my $node_ref = join("", @$nd_tag_ref);

  return $node_ref;

}


# write way data to OSM file
sub write_osm_way_tag {

  my ($tag_id, $tag_data, $nd_data, $relation) = @_;

  my $tags = "<way id=\"$tag_id\" action=\"modify\" visible=\"true\">\n";

  $tags .= create_nd_tag($nd_data);

  $tags .= write_osm_constant_tag();

  # tagging way is NOT Relation member
  if( !$relation ){

    # Check forest area code for tagging ( code 1 is "森林地域" )
    if( $tag_data->{"KSJ2:DFC"} eq "1" ){
      $tags .= "<tag k=\"natural\" v=\"wood\"/>\n";
    } else {
      $tags .= "<tag k=\"landuse\" v=\"forest\"/>\n";
    }

  }

    while ( my ($key, $value) = each( %$tag_data ) ){

      unless( $key eq "KSJ2:coordinate"
        || $key eq "KSJ2:lat"
        || $key eq "KSJ2:long" ){

        $tags .= "<tag k=\"$key\" v=\"$value\"/>\n";

      }

    }

  $tags .= "</way>\n";

  print OSM $tags;

}  # end sub write_osm_way_tag()


# write node data to OSM file
sub write_osm_node_tag {

  my ($tag_id, $hash, $coordinate) = @_;

  # if $coordinate is given, store it and split it into latitude and longitude
  if( defined( $coordinate ) ){
    $hash->{"KSJ2:coordinate"} = $coordinate;
    ($hash->{"KSJ2:lat"}, $hash->{"KSJ2:long"}) = split(/\s/, $hash->{"KSJ2:coordinate"});
  }

  my $tags
   = sprintf("<node id=\"%d\" visible=\"true\" lat=\"%f\" lon=\"%f\" />\n", $tag_id, $hash->{"KSJ2:lat"}, $hash->{"KSJ2:long"});
  print OSM $tags;
  
  return %$hash;

}  # end sub write_osm_node_tag()


# set member-tag for writing relation data to the OSM file
sub set_relation_member_data {

  my ($relation, $way_id, $outer_inner_chk, $member_data) = @_;
  
  # for write relation data to OSM file
  if( !$relation ){ return; }

    # check current member is outer / inner ( if $outer_inner_chk is "1", member is "outer" )
    if( $outer_inner_chk ){
      unshift( @$member_data, "<member type=\"way\" ref=\"$way_id\" role=\"outer\" />\n" );
    }else{
      unshift( @$member_data, "<member type=\"way\" ref=\"$way_id\" role=\"inner\" />\n" );
    }

}  # end set_relation_member_data()


# create member-tag data for write relation to OSM file
sub create_member_tag {

  my ($member_tag_ref) = @_;
  my $member_ref = join("", @$member_tag_ref);

  return $member_ref;

}  # end create_member_tag()


# write relation data to OSM file
sub write_osm_relation_tag {

  my ($tag_id, $tag_data, $member_data) = @_;

  my $tags = "<relation id=\"$tag_id\" visible=\"true\">\n";

  $tags .= create_member_tag($member_data);

  if( $tag_data->{"KSJ2:DFC_label"} eq "森林地域" ){
    $tags .= "<tag k=\"natural\" v=\"wood\"/>\n";
  } else {
    $tags .= "<tag k=\"landuse\" v=\"forest\"/>\n";
  }

  $tags .= "<tag k=\"type\" v=\"multipolygon\"/>\n";

  $tags .= write_osm_constant_tag();

    while ( my ($key, $value) = each( %$tag_data ) ){

      unless( $key eq "KSJ2:coordinate"
        || $key eq "KSJ2:lat"
        || $key eq "KSJ2:long"
        || $key eq "KSJ2:curve_id" ){

        $tags .= "<tag k=\"$key\" v=\"$value\"/>\n";
      }

    }

  $tags .= "</relation>\n";

  print OSM $tags;

}  # end sub write_osm_relation_tag()


# get the data of Attribute
sub get_attr_data {

  my $ksj_attr = shift;
  my $tmp;
  my %hash;

  # get the Data of CP-id & LOC-idref
  $hash{"KSJ2:forest_id"} = $ksj_attr->getAttribute('id');
  $hash{"KSJ2:ARE"} = $ksj_attr->getChildrenByTagName('ksj:ARE')->get_node(0)->getAttribute('idref');

  # get code & label of AdminAreaCd
  $hash{"KSJ2:PRC"} = $ksj_attr->getChildrenByTagName('ksj:PRC')->get_node(0)->textContent;
  $hash{"KSJ2:PRC_label"} = get_prc_label( $hash{"KSJ2:PRC"} );

  # get code & label of SubprefectureNameCd
  $hash{"KSJ2:BDC"} = $ksj_attr->getChildrenByTagName('ksj:BDC')->get_node(0)->textContent;
  unless( $hash{"KSJ2:BDC"} eq "00" ){
    $hash{"KSJ2:BDC_label"} = get_bdc_label( $hash{"KSJ2:BDC"} );
  }

  # get code of ForestAreaCode
  if($ksj_attr->find('ksj:DFC')){
    $hash{"KSJ2:DFC"} = $ksj_attr->getChildrenByTagName('ksj:DFC')->get_node(0)->textContent;
    $hash{"KSJ2:DFC_label"} = $hash{"note"} = get_dfc_label( $hash{"KSJ2:DFC"} );
  }

  return %hash;

}  # sub get_attr_data()


# get point data from KSJ file
sub get_ksj_point_data {

  my ($ksj_node) = @_ ;
  
  my $coordinate = $ksj_node->getElementsByTagName('DirectPosition.coordinate')->get_node(0)->textContent;
  return $coordinate;

}  #end sub get_ksj_point_data()



sub main (){

  # for creating Element ID
  my $negative_id = 0;
  my $node_id;
  my %node_ref_id;

  my $node_per_file = 0;  # nodes per a file

  my $xml = XML::LibXML->new;
  my $ksj_doc = $xml->parse_file("$file_in.xml");

  # for separating output files
  my $count_files = 0;
  my $count_way = 0;
  my $file_start_id = 0;

  open_log();
  open_osm_file($count_files);

  my @ksj_attr = $ksj_doc->getElementsByTagName('ksj:BF01')->get_nodelist;
  my @ksj_area = $ksj_doc->getElementsByTagName('jps:GM_Surface')->get_nodelist;
  my @ksj_way = $ksj_doc->getElementsByTagName('jps:GM_Curve')->get_nodelist;
  my @ksj_node = $ksj_doc->getElementsByTagName('jps:GM_Point')->get_nodelist;

  # get the data of Attribute
  foreach my $ksj_attr (@ksj_attr){

    # get the data of Attribute
    my %ksj_tag = get_attr_data($ksj_attr);

    # find the surface (area) that this attribute record refers to
    foreach my $ksj_area (@ksj_area){

      # for create "<node>" tag
      my $node_ref = "";
      my $node_count = 0;

      # for relation YES/NO
      my $relation = 0;

      # for create "<member>" tag
      my @member_tag;

      my %temp_node_data;

      my $gm_surface_id = $ksj_area->getAttribute('id');

      # Process this surface only if it is the one the attribute record refers to.
      if($gm_surface_id eq $ksj_tag{"KSJ2:ARE"}){

        # get the data of way per area
        my @ksj_area_way = $ksj_area->getElementsByTagName('jps:GM_Ring')->get_nodelist;

        my $ways_per_area = $ksj_area->getElementsByTagName('jps:GM_Ring')->size;

        foreach my $ksj_area_way (@ksj_area_way){

            # for inner or outer to set relation tags
            my $outer_inner_chk = 0;
            my $curve_type = $ksj_area_way->parentNode->nodeName;

            # check current curve is outer / inner ( if "outer", $outer_inner_chk is "1" )
            if($curve_type eq "jps:GM_SurfaceBoundary.exterior"){
              $ksj_tag{"KSJ2:curve_type"} = "exterior";
              $outer_inner_chk = 1;
            }elsif($curve_type eq "jps:GM_SurfaceBoundary.interior"){
              $ksj_tag{"KSJ2:curve_type"} = "interior";
              $outer_inner_chk = 0;
            }

            $ksj_tag{"KSJ2:curve_id"}
             = $ksj_area_way->getElementsByTagName('jps:GM_CompositeCurve.generator')->get_node(0)->getAttribute('idref');

            foreach my $ksj_way (@ksj_way){

              # for create "<nd>" tag
              my @nd_tag;

              my $gm_curve_id = $ksj_way->getAttribute('id');

            if($gm_curve_id eq $ksj_tag{"KSJ2:curve_id"}){

              # get the data of Point per way
              my @ksj_way_node = $ksj_way->getElementsByTagName('GM_PointArray.column')->get_nodelist;

              # get number of nodes per way
              my $nodes_per_way = $ksj_way->getElementsByTagName('GM_PointArray.column')->size;

              # number of ways this area needs: extra ways from splitting a long curve, plus one per ring
              my $ways_included_relation = int($nodes_per_way /$max_nodes_per_way) + $ways_per_area;

              # relation check ( does the area need more than one way? )
              if( $ways_included_relation > 1 ){

                $relation = 1;

              }  # end if( $ways_included_relation > 1 )

              # if this relation's ways would exceed $max_ways_per_file, start a separate output file.
              if( $count_way > $max_ways_per_file - $ways_included_relation && $outer_inner_chk ){

                close_osm_file();
                $count_way = 0;
                $node_per_file = 0;

                $count_files++;
                open_osm_file($count_files);

              }  # end if( $count_way > $max_ways_per_file - $ways_included_relation && $outer_inner_chk )

              foreach my $ksj_way_node (@ksj_way_node){

                $node_count++;

                # start a separate file when the node limit per file is reached
                if( $node_per_file >=$max_nodes_per_file ){

                  close_osm_file();
                  $count_way = 0;
                  $node_per_file = 0;

                  $count_files++;
                  open_osm_file($count_files);
                  $file_start_id = $negative_id;

                  $negative_id--;

                  # set nd-tag to connect ways between separate files
                  if( %temp_node_data ){
                    write_osm_node_tag($negative_id, \%temp_node_data);
                    @nd_tag = "<nd ref=\"$negative_id\" />\n";
                  }

                }  # end if( $node_per_file >=$max_nodes_per_file )

                $negative_id--;
                $node_id = $negative_id;

                # Case: the point is given as a reference to a GM_Point.
                if($ksj_way_node->find('jps:GM_Position.indirect')){

                  my $ksj_point_ref = $ksj_way_node->getElementsByTagName('GM_PointRef.point')->get_node(0)->getAttribute('idref');

                  foreach my $ksj_node (@ksj_node){

                    my $ksj_point_id = $ksj_node->getAttribute('id');

                    # Case: this GM_Point is the one the way's point reference targets.
                    if ( $ksj_point_id eq $ksj_point_ref ){

                        if( !exists($node_ref_id{$ksj_point_id}) || $node_ref_id{$ksj_point_id} > $file_start_id ){
                          # get Point data & write node to OSM file
                          my $pos_coodinate = get_ksj_point_data($ksj_node);
                          %temp_node_data = write_osm_node_tag($node_id, \%ksj_tag, $pos_coodinate);

                          $node_ref_id{$ksj_point_id} = $node_id;

                        }else{

                          $node_id = $node_ref_id{$ksj_point_id};

                        }  #end if( !exists($node_ref_id{$ksj_point_id}) || $node_ref_id{$ksj_point_id} > $file_start_id )

                        last;
                    }  # end if ($ksj_point_id eq $ksj_point_ref)

                  }  # end foreach my $ksj_node (@ksj_node)

                }  # end if($ksj_way_node->find('jps:GM_Position.indirect'))

                # Case: the point is given directly as coordinates.
                elsif($ksj_way_node->find('jps:GM_Position.direct')){

                  # get Point data & write node to OSM file
                  my $direct_pos_coodinate = get_ksj_point_data($ksj_way_node);
                  %temp_node_data = write_osm_node_tag($node_id, \%ksj_tag, $direct_pos_coodinate);

                }  # end elsif($ksj_way_node->find('jps:GM_Position.direct'))

                # for write way data to OSM file
                unshift( @nd_tag, "<nd ref=\"$node_id\" />\n" );

                if( $node_count >= $max_nodes_per_way ){

                  $negative_id--;

                  # write way data to OSM file
                  write_osm_way_tag($negative_id, \%ksj_tag, \@nd_tag, $relation);
                  $node_ref = "";

                  # set nd-tag to connect ways
                  @nd_tag = "<nd ref=\"$node_id\" />\n";

                  # set member-tag for writing relation data to the OSM file
                  set_relation_member_data( $relation, $negative_id, $outer_inner_chk, \@member_tag );

                  $count_way++;  # count ways for separating OSM file
                  $node_per_file += $node_count;

                  $node_count = 0;

                }  # end if( $node_count >= $max_nodes_per_way )

              }  # end foreach my $ksj_way_node (@ksj_way_node)

              $negative_id--;

              # write way data to OSM file
              write_osm_way_tag($negative_id, \%ksj_tag, \@nd_tag, $relation);
              $node_ref = "";

              # set member-tag for writing relation data to the OSM file
              set_relation_member_data( $relation, $negative_id, $outer_inner_chk, \@member_tag );

              $count_way++;  # count ways for separating OSM file
              $node_per_file += $node_count;

            }  # end if($gm_curve_id eq $ksj_tag{"KSJ2:curve_id"})

          }  # end foreach my $ksj_way (@ksj_way)

        }  # end foreach my $ksj_area_way (@ksj_area_way)

        # for write relation data to OSM file
        if( $relation ){

            $negative_id--;

            write_osm_relation_tag($negative_id, \%ksj_tag, \@member_tag);

        }

      }  # end if($gm_surface_id eq $ksj_tag{"KSJ2:ARE"})

    }  # end foreach my $ksj_area (@ksj_area)

  }  # end foreach my $ksj_attr (@ksj_attr)

  close_osm_file();
  close_log();

}  # end main()

main();
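
The script above writes tag values into XML attributes without escaping them. That is fine for the labels it currently handles, but a value containing &, <, > or " would make the output file ill-formed. A minimal sketch of an escaping helper follows; the name xml_attr_escape is hypothetical and the helper is not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are not allowed
# literally inside a double-quoted XML attribute value.
sub xml_attr_escape {
    my ($value) = @_;
    $value =~ s/&/&amp;/g;   # must run first so later entities are not re-escaped
    $value =~ s/</&lt;/g;
    $value =~ s/>/&gt;/g;
    $value =~ s/"/&quot;/g;
    return $value;
}

print '<tag k="note" v="' . xml_attr_escape('A & B "test"') . "\"/>\n";
# prints: <tag k="note" v="A &amp; B &quot;test&quot;"/>
```

Such a helper could be applied to $value wherever the script builds a tag element.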