User:Tatata/shp2osm-railway.pl

From OpenStreetMap Wiki

This Perl script converts the shapefiles of the KSJ2 (National-Land Numerical Information) railway data into an OSM XML file.

Preparation

How to convert the data

  • Download the KSJ2 railway data and decompress the zip file to obtain the xml file.
  • Convert the xml file to shapefiles with KsjTool (the National-Land Numerical Information data conversion tool).
  • Open this Perl script in a text editor, set the initial values of $target_opc and $target_lin to the railway operator name and railway line name in UTF-8 characters (you can copy and paste them from the list), and save the script.
  • Open a command prompt and run the script in the directory where you put the shapefiles. The output files (N02-07_railway.osm, N02-07_railway.log) are created in the same directory.
  • Open N02-07_railway.osm in JOSM, check and edit the data (see the notes below), and then upload it to the server.
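
The operator and line filter set in the third step can be sketched as follows. This is a minimal illustration of the comparison the script performs on each record, not a replacement for it; the helper names matches_target and record_selected are hypothetical and do not appear in the script.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper: true when a record value passes one filter.
# "*" and "" both mean "accept everything", as in the script's comparisons.
sub matches_target {
    my ($target, $value) = @_;
    return $target eq $value || $target eq "*" || $target eq "";
}

# A record is converted only when both the operator (OPC)
# and the line (LIN) pass their filters.
sub record_selected {
    my ($target_opc, $target_lin, $value_opc, $value_lin) = @_;
    return matches_target($target_opc, $value_opc)
        && matches_target($target_lin, $value_lin);
}

# Select every line of one operator by leaving the line filter open.
print "selected\n"
    if record_selected("東京都", "*", "東京都", "上野懸垂線");
```

Setting both values to "*" (or to empty strings) selects every record in the shapefile.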

Notes

  • The KSJ2 railway data does not distinguish between "rail" and "subway", and some subway lines are classified as RAC (RailwayClassCd) = 16 (automated guideway transit) or RAC = 21 (tram). The script does not handle these cases automatically, so check the tags in the output file and correct them before you upload to the server.
  • A single railway line in the KSJ2 data (e.g. Tōkaidō Shinkansen) is divided into many sections, and two or more nodes with the same coordinates (the same latitude and longitude pair) are stacked at each connection point between adjoining sections. The script merges nodes with the same coordinates but does not combine ways automatically, so it is better to join the ways manually before you upload to the server.
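
The node merging described in the second note can be sketched like this. It is a simplified illustration under stated assumptions: the helper node_id_for and the sample coordinates are hypothetical, and unlike the script the counter here is used for nodes only.

```perl
use strict;
use warnings;

my %nodes;            # "lat long" => node id already assigned
my $negative_id = 0;  # new objects get negative ids, counting downwards

# Hypothetical helper: return the id for a coordinate pair,
# allocating a new negative id only the first time it is seen.
sub node_id_for {
    my ($lat, $long) = @_;
    my $key = "$lat $long";
    $nodes{$key} = --$negative_id unless exists $nodes{$key};
    return $nodes{$key};
}

# Two sections of one line that share an end point (made-up coordinates).
my @section_a = ([35.0, 139.0], [35.1, 139.1]);
my @section_b = ([35.1, 139.1], [35.2, 139.2]);

my @refs_a = map { node_id_for(@$_) } @section_a;
my @refs_b = map { node_id_for(@$_) } @section_b;

print "way A: @refs_a\n";   # way A: -1 -2
print "way B: @refs_b\n";   # way B: -2 -3
```

Both ways end up referencing node -2 at the shared point, but they remain two separate ways, which is why joining them in JOSM is still a manual step.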

Source code

As of 2008-04-07.



#!/usr/bin/perl

use strict;
use warnings;
# use encoding "utf8"; 
use encoding "utf8", STDOUT => "shiftjis", STDERR => "shiftjis"; # for Windows
use Encode;
use open IO => "utf8";
use Geo::ShapeFile;

#####
#
# KSJ2 Railway Data
#
# National-Land Numerical Information (Railway) 2007, MLIT Japan
# 国土数値情報(鉄道データ)平成19年 国土交通省
#
# Files
#   Input
#     Shape file (Railway Line) 
#       : N02-07_EB02.dbf N02-07_EB02.shp N02-07_EB02.shx
#     Shape file (Railway Station) 
#       : N02-07_EB03.dbf N02-07_EB03.shp N02-07_EB03.shx
#   Output
#     Osm file : N02-07_railway.osm
#     Log file : N02-07_railway.log
#
#####

our $file_eb02 = "N02-07_EB02";
our $file_eb03 = "N02-07_EB03";
our $file_name = "N02-07_railway";

our $target_opc = "東京都"; # railway operator name (UTF-8) "九州旅客鉄道", "*", ""
our $target_lin = "上野懸垂線"; # railway line name (UTF-8) "鹿児島線", "*", ""


sub main() {

  createOsm(0, 99999);

  my $time = localtime(time);
  print "***** Done!: End $time\n";

}


sub createOsm {

  my ($from, $to) = @_;
  my $negative_id = 0;
  my $node_ref = "";
  my %nodes = ();
  my $num_nodes = 0;      # counters start at 0 so the summary prints cleanly
  my $num_stations = 0;
  my $num_ways = 0;
  my $time = localtime(time);
  
  open(LOG, ">$file_name.log") or die "Cannot open $file_name.log: $!";
  print LOG "***** KSJ2 Railway Data 2007 : Start $time\n";
  print "***** KSJ2 Railway Data 2007 : Start $time\n";
  print LOG "***** Target : $target_opc / $target_lin\n";
  print "***** Target : $target_opc / $target_lin\n";

  
  open(OSM, ">$file_name.osm") or die "Cannot open $file_name.osm: $!";
  print OSM "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
  print OSM "<osm version=\"0.5\" generator=\"SHP2OSM\">\n";
  
  my %int_code = getIntcode();
  my %rac_code = getRaccode();
  my %stations = getStations();
  
  my $shapefile = new Geo::ShapeFile($file_eb02);
  my $num_shapes = $shapefile->shapes();
  printf LOG "***** Railway Lines (EB02) : %d shapes\n", $num_shapes;
  printf "***** Railway Lines (EB02) : %d shapes\n", $num_shapes;
  print LOG 
    "***** Railway Lines (EB02) : process Shape ID from $from to $to\n";
  print "***** Railway Lines (EB02) : process Shape ID from $from to $to\n";

  for (1 .. $num_shapes) {
    
    my $shape = $shapefile->get_shp_record($_);
    my $shape_id = $shape->shape_id();
    printf LOG 
      "***** Railway Lines (EB02) : Shape ID %d : Parts %d : Points %d \n"
      , $shape_id, $shape->num_parts(), $shape->num_points();
    
    my $in_range = ($shape_id >= $from && $shape_id <= $to);
    
    if($in_range){
      
      my %dbf_record = $shapefile->get_dbf_record($shape_id);
      my $value_rac = $dbf_record{"RAC"};
      my $value_int = $dbf_record{"INT"};
      my $value_opc = Encode::decode("shift-jis",$dbf_record{"OPC"});
      my $value_lin = Encode::decode("shift-jis",$dbf_record{"LIN"});

      if ($target_opc eq $value_opc || $target_opc eq "*" || $target_opc eq "") {

        if ($target_lin eq $value_lin || $target_lin eq "*" || $target_lin eq "") {

          my $value_stn = "";
          my $i = 0;
      
          foreach my $point($shape->points()) {
            my $long = $point->X();
            my $lat = $point->Y();
            $i++;
          
            ($negative_id, $node_ref, $num_nodes, $num_stations) 
              = writeNode($negative_id, $node_ref, $num_nodes, $num_stations
              , $lat, $long, $value_rac, $value_int, $value_opc, $value_lin
              , \%rac_code, \%int_code, \%stations, \%nodes);

          }
     
          $negative_id 
            = writeWay($negative_id, $node_ref, $value_rac, $value_int, $value_opc
            , $value_lin, \%rac_code, \%int_code);
          $num_ways++;
          printf LOG 
            "Way %d: %d nodes RAC=%s INT=%s OPC=%s LIN=%s \n"
            , $negative_id, $i, $value_rac, $value_int, $value_opc, $value_lin;
          $node_ref = "";
        }

      }

    }
    
  }
  
  print OSM "</osm>";
  close OSM;
  printf LOG 
    "***** Railway Lines (EB02) : %d nodes on %d ways\n"
    , $num_nodes, $num_ways;
  printf LOG 
    "***** Railway Lines (EB02) : %d nodes tagged as station\n"
    , $num_stations;
  $time = localtime(time);
  print LOG "***** Done!: End $time\n";
  close LOG;
  printf 
   "***** Railway Lines (EB02) : %d nodes on %d ways\n"
   , $num_nodes, $num_ways;
  printf 
   "***** Railway Lines (EB02) : %d nodes tagged as station\n"
   , $num_stations;
  
}


sub writeWay {

  my ($negative_id, $node_ref, $value_rac, $value_int, $value_opc, $value_lin
    , $rac_code, $int_code) = @_;
  $negative_id--;
  my $tmp_lin;
  my $tags = "<tag k=\"created_by\" v=\"National-Land-Numerical-Information_MLIT_Japan\"/>";
  $tags .= "<tag k=\"source\" v=\"KSJ2\"/>";
  $tags .= "<tag k=\"source_ref\" v=\"http://nlftp.mlit.go.jp/ksj/jpgis/datalist/KsjTmplt-N02-v1_1.html\"/>";
  $tags .= "<tag k=\"note\" v=\"National-Land Numerical Information (Railway) 2007, MLIT Japan\"/>";
  $tags .= "<tag k=\"note:ja\" v=\"国土数値情報(鉄道データ)平成19年 国土交通省\"/>";
  $tags .= "<tag k=\"KSJ2:RAC\" v=\"$value_rac\"/>";
  my $label_rac = $$rac_code{$value_rac};
  $tags .= "<tag k=\"KSJ2:RAC_label\" v=\"$label_rac\"/>";
  $tags .= "<tag k=\"KSJ2:INT\" v=\"$value_int\"/>";
  my $label_int = $$int_code{$value_int};
  $tags .= "<tag k=\"KSJ2:INT_label\" v=\"$label_int\"/>";
  $tags .= "<tag k=\"KSJ2:OPC\" v=\"$value_opc\"/>";
  $tags .= "<tag k=\"KSJ2:LIN\" v=\"$value_lin\"/>";

  if ($value_int == 1) {$tmp_lin = "$value_lin";}
  elsif ($value_int == 2) {$tmp_lin = "JR$value_lin";}
  elsif ($value_int == 3) {
        $tmp_lin = $value_opc;
        $tmp_lin .= "営";
        $tmp_lin .= $value_lin;
        if ($value_opc eq "東京都" && $value_lin eq "10号線新宿線") 
          {$tmp_lin = "都営新宿線";}
        if ($value_opc eq "福岡市" && $value_lin eq "1号線(空港線)") 
          {$tmp_lin = "福岡市地下鉄空港線 (Fukuoka City Subway Kūkō Line)";}
        if ($value_opc eq "福岡市" && $value_lin eq "2号線(箱崎線)") 
          {$tmp_lin = "福岡市地下鉄箱崎線 (Fukuoka City Subway Hakozaki Line)";}
        if ($value_opc eq "福岡市" && $value_lin eq "3号線(七隈線)") 
          {$tmp_lin = "福岡市地下鉄七隈線 (Fukuoka City Subway Nanakuma Line)";}

  }
  elsif ($value_int == 4 || $value_int == 5) {
        $tmp_lin = $value_opc;
        $tmp_lin .= "線";

        if ($value_lin eq $tmp_lin) {$tmp_lin = "$value_opc";}
        else {$tmp_lin = "$value_opc$value_lin";}

        if ($value_opc eq "名古屋ガイドウェイバス") 
          {$tmp_lin = "ゆとりーとライン・名古屋ガイドウェイバス志段味線 (Yutorito Line, Guideway Bus Shidami Line)";}
        if ($value_opc eq "沖縄都市モノレール") 
          {$tmp_lin = "ゆいレール・沖縄都市モノレール (Yui rail, Okinawa City Monorail Line)";}
        if ($value_opc eq "いすみ鉄道") 
          {$tmp_lin = "いすみ鉄道";}
        if ($value_opc eq "北総鉄道") 
          {$tmp_lin = "北総鉄道";}
        if ($value_opc eq "千葉都市モノレール" && $value_lin eq "1号線") 
          {$tmp_lin = "タウンライナー・千葉都市モノレール1号線 (Townliner, Chiba Urban Monorail Line 1)";}
        if ($value_opc eq "千葉都市モノレール" && $value_lin eq "2号線") 
          {$tmp_lin = "タウンライナー・千葉都市モノレール2号線 (Townliner, Chiba Urban Monorail Line 2)";}
        if ($value_opc eq "東葉高速鉄道") 
          {$tmp_lin = "東葉高速鉄道";}
        if ($value_opc eq "東京地下鉄" && $value_lin eq "5号線東西線") 
          {$tmp_lin = "東京メトロ東西線";}
        if ($value_opc eq "東武鉄道" && $value_lin eq "野田線") 
          {$tmp_lin = "東武野田線";}
        if ($value_opc eq "総武流山電鉄") 
          {$tmp_lin = "総武流山電鉄";}
        if ($value_opc eq "肥薩おれんじ鉄道") 
          {$tmp_lin = "肥薩おれんじ鉄道";}
        if ($value_opc eq "舞浜リゾートライン") 
          {$tmp_lin = "ディズニーリゾートライン";}
        if ($value_opc eq "首都圏新都市鉄道" && $value_lin eq "常磐新線") 
          {$tmp_lin = "つくばエクスプレス";}

  }
  else  {
    printf "Railway Class Code Error (INT): ID=%d INT=%s\n"
      , $negative_id, $value_int;
    printf LOG 
      "Railway Class Code Error (INT): ID=%d RAC=%s INT=%s OPC=%s LIN=%s \n"
      , $negative_id, $value_rac, $value_int, $value_opc, $value_lin;
    $tmp_lin = "$value_opc$value_lin";
  }

  $tags .= "<tag k=\"name\" v=\"$tmp_lin\"/>";
  $tags .= "<tag k=\"name:ja\" v=\"$tmp_lin\"/>";
  $tags .= "<tag k=\"operator\" v=\"$value_opc\"/>";
  $tags .= "<tag k=\"operator:ja\" v=\"$value_opc\"/>";

  if ($value_rac == 11 || $value_rac == 12 || $value_rac == 13) {
    if (($value_opc eq "東京地下鉄")
      || ($value_opc eq "東京都" && $value_lin eq "10号線新宿線")
      || ($value_opc eq "福岡市"))
      {$tags .= "<tag k=\"railway\" v=\"subway\"/>";}
    else {$tags .= "<tag k=\"railway\" v=\"rail\"/>";}
  }
  elsif ($value_rac == 14 || $value_rac == 15 || $value_rac == 22 || $value_rac == 23)
    {$tags .= "<tag k=\"railway\" v=\"monorail\"/>";}
  elsif ($value_rac == 16 || $value_rac == 25)
    {$tags .= "<tag k=\"railway\" v=\"light_rail\"/>";}
  elsif ($value_rac == 17 || $value_rac == 21)
    {$tags .= "<tag k=\"railway\" v=\"tram\"/>";}
  elsif ($value_rac == 24) {
    if ($value_opc eq "名古屋ガイドウェイバス") 
      {$tags .= "<tag k=\"highway\" v=\"bus_guideway\"/>";}
    else  {$tags .= "<tag k=\"railway\" v=\"light_rail\"/>";}
  }
  else  {
    printf "Railway Class Code Error (RAC): ID=%d RAC=%s\n"
      , $negative_id, $value_rac;
    printf LOG 
      "Railway Class Code Error (RAC): ID=%d RAC=%s INT=%s OPC=%s LIN=%s \n"
      , $negative_id, $value_rac, $value_int, $value_opc, $value_lin;
    $tags .= "<tag k=\"railway\" v=\"rail\"/>";
  }

  my $way 
    = sprintf("<way id=\"%d\" action=\"modify\" visible=\"true\">$node_ref$tags</way>"
    , $negative_id);
  print OSM "$way\n";
  
  return $negative_id;

}


sub writeNode {

  my ($negative_id, $node_ref, $num_nodes, $num_stations, $lat, $long, $value_rac, $value_int, $value_opc
    , $value_lin, $rac_code, $int_code, $stations, $nodes) = @_;
  my $value_stn = "";
  my $node_id = 0;

  my $tags = "<tag k=\"created_by\" v=\"National-Land-Numerical-Information_MLIT_Japan\"/>";
  $tags .= "<tag k=\"source\" v=\"KSJ2\"/>";
  $tags .= "<tag k=\"source_ref\" v=\"http://nlftp.mlit.go.jp/ksj/jpgis/datalist/KsjTmplt-N02-v1_1.html\"/>";
  $tags .= "<tag k=\"note\" v=\"National-Land Numerical Information (Railway) 2007, MLIT Japan\"/>";
  $tags .= "<tag k=\"note:ja\" v=\"国土数値情報(鉄道データ)平成19年 国土交通省\"/>";
  $tags .= "<tag k=\"KSJ2:coordinate\" v=\"$lat $long\"/>";
  $tags .= "<tag k=\"KSJ2:lat\" v=\"$lat\"/>";
  $tags .= "<tag k=\"KSJ2:long\" v=\"$long\"/>";
  $tags .= "<tag k=\"KSJ2:RAC\" v=\"$value_rac\"/>";
  my $label_rac = $$rac_code{$value_rac};
  $tags .= "<tag k=\"KSJ2:RAC_label\" v=\"$label_rac\"/>";
  $tags .= "<tag k=\"KSJ2:INT\" v=\"$value_int\"/>";
  my $label_int = $$int_code{$value_int};
  $tags .= "<tag k=\"KSJ2:INT_label\" v=\"$label_int\"/>";
  $tags .= "<tag k=\"KSJ2:OPC\" v=\"$value_opc\"/>";
  $tags .= "<tag k=\"KSJ2:LIN\" v=\"$value_lin\"/>";

  if (exists($$stations{"$lat $long"})) {
    $value_stn = $$stations{"$lat $long"};
    $tags .= "<tag k=\"KSJ2:STN\" v=\"$value_stn\"/>";
    $tags .= "<tag k=\"name\" v=\"$value_stn\"/>";
    $tags .= "<tag k=\"name:ja\" v=\"$value_stn\"/>";

    if ($value_rac == 17 || $value_rac == 21)
      {$tags .= "<tag k=\"railway\" v=\"tram_stop\"/>";}
    elsif  ($value_opc eq "名古屋ガイドウェイバス")
      {$tags .= "<tag k=\"highway\" v=\"bus_stop\"/>";}
    else {$tags .= "<tag k=\"railway\" v=\"station\"/>";}
    
  }

  if (exists($$nodes{"$lat $long"})) {
    $node_id = $$nodes{"$lat $long"};
  }
  else {
    $negative_id--;
    $node_id = $negative_id;
    $$nodes{"$lat $long"} = $node_id;
    my $node 
      = sprintf("<node id=\"%d\" visible=\"true\" lat=\"%s\" lon=\"%s\">$tags</node>"
      , $node_id, $lat, $long);
    print OSM "$node\n";
    $num_nodes++;

    unless ($value_stn eq "") {
      $num_stations++;
    }

  }

  $node_ref .= sprintf("<nd ref=\"%d\" />", $node_id);
  printf LOG "Node %d: %s, %s, %s\n", $node_id, $lat, $long, $value_stn;

  return ($negative_id, $node_ref, $num_nodes, $num_stations) ;

}


sub getIntcode {

  # "InstitutionTypeCd.xml"
  my %hash = (
    "1" => "新幹線",
    "2" => "JR在来線",
    "3" => "公営鉄道",
    "4" => "民営鉄道",
    "5" => "第三セクター"
  );

  return %hash;

}


sub getRaccode {

  # "RailwayClassCd.xml"
  my %hash = (
    "11" => "普通鉄道JR",
    "12" => "普通鉄道",
    "13" => "鋼索鉄道",
    "14" => "懸垂式鉄道",
    "15" => "跨座式鉄道",
    "16" => "案内軌上式鉄道",
    "17" => "無軌上式鉄道",
    "21" => "軌道",
    "22" => "懸垂式モノレール",
    "23" => "跨座式モノレール",
    "24" => "案内軌上式",
    "25" => "浮上式"
  );

  return %hash;

}


sub getStations {

  my $num_nodes = 0;
  my $num_ways = 0;
  my %hash = ();

  my $shapefile = new Geo::ShapeFile($file_eb03);
  my $num_shapes = $shapefile->shapes();

  printf LOG "***** Railway Stations (EB03) : %d shapes\n", $num_shapes;
  printf "***** Railway Stations (EB03) : %d shapes\n", $num_shapes;

  for(1 .. $num_shapes) {
    
    my $i = 0;
    my $shape = $shapefile->get_shp_record($_);
    my $shape_id = $shape->shape_id();
    my $num_points = $shape->num_points();
    printf LOG 
      "***** Railway Stations (EB03) : Shape ID %d : Parts %d : Points %d \n"
      , $shape_id, $shape->num_parts, $num_points;

    my %dbf_record = $shapefile->get_dbf_record($shape_id);
    my $value_rac = $dbf_record{"RAC"};
    my $value_int = $dbf_record{"INT"};
    my $value_opc = Encode::decode("shift-jis",$dbf_record{"OPC"});
    my $value_lin = Encode::decode("shift-jis",$dbf_record{"LIN"});

    if ($target_opc eq $value_opc || $target_opc eq "*" || $target_opc eq "") {

      if ($target_lin eq $value_lin || $target_lin eq "*" || $target_lin eq "") {

        my $value_stn = Encode::decode("shift-jis",$dbf_record{"STN"});
        my $center = int(($num_points - 1) / 2);

        foreach my $point($shape->points()) {
          my $long = $point->X();
          my $lat = $point->Y();
      
          if ($i == $center) {
            $hash{"$lat $long"} = $value_stn;
            printf LOG "Node %d: %s, %s, %s <<< Selected\n"
              , $i, $lat, $long, $value_stn;
          }
          else {
            printf LOG "Node %d: %s, %s, %s \n", $i, $lat, $long, $value_stn;
          }

          $num_nodes++;
          $i++;

        }

        $num_ways++;
        printf LOG "Way %d: %d nodes RAC=%s INT=%s OPC=%s LIN=%s STN=%s\n"
          , $num_ways, $i, $value_rac, $value_int, $value_opc, $value_lin, $value_stn;

      }

    }

  }

  my $n = keys(%hash);
  printf LOG "***** Railway Stations (EB03) : %d nodes on %d ways processed\n"
    , $num_nodes, $num_ways;
  printf LOG "***** Railway Stations (EB03) : %d nodes found as station\n"
    , $n;
  printf "***** Railway Stations (EB03) : %d nodes on %d ways processed\n"
    , $num_nodes, $num_ways;
  printf "***** Railway Stations (EB03) : %d nodes found as station\n"
    , $n;

  return %hash;

}


# run this script.

main();


# end of script
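
When checking tags before upload, it can help to read the RAC branching in writeWay as a lookup table. The sketch below restates only the default railway=* values from those branches; the names %railway_for_rac and default_railway_tag are hypothetical, and the operator-specific exceptions in the script (subway for certain operators, highway=bus_guideway for 名古屋ガイドウェイバス) are deliberately left out.

```perl
use strict;
use warnings;

# Default railway=* value per RAC code, following the branches in writeWay.
my %railway_for_rac = (
    (map { ($_ => "rail") }       11, 12, 13),
    (map { ($_ => "monorail") }   14, 15, 22, 23),
    (map { ($_ => "light_rail") } 16, 24, 25),
    (map { ($_ => "tram") }       17, 21),
);

# Hypothetical helper: unknown codes fall back to "rail",
# which matches the script's final else branch.
sub default_railway_tag {
    my ($rac) = @_;
    return $railway_for_rac{$rac} // "rail";
}

printf "RAC=%d -> railway=%s\n", $_, default_railway_tag($_) for 12, 16, 21;
```

Subway lines that carry RAC=16 or RAC=21 come out as light_rail or tram under this mapping, which is the case the notes ask you to correct by hand.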