Import of the Brazilian National Register of Health Facilities

Goal

Import health facility data from the Brazilian Ministry of Health into OpenStreetMap

The primary goal is to adapt the National Register of Health Facilities data source (CNES, from the Portuguese Cadastro Nacional de Estabelecimentos de Saúde) to the OpenStreetMap standard and to validate whether the data can be imported into OSM.

The secondary goal is to import the health facility records that are not yet known to OSM and that have coordinates, following the Import Guidelines and using dedicated import user IDs, so that Brazilian health data becomes available in OSM.

To accomplish these goals we plan to follow two approaches:

Approach 1 - MapRoulette:

The first step is to add or improve the known health facility records in OSM from the available directories, with the help of individual local users.

Approach 2 - Import of small batches of records:

In the next step, the unknown health facility records that have coordinates will be added to OSM through small imports organized by district/block.

Schedule

  • Planning: Late 2020 to early 2021
  • Import: Second quarter of 2021
  • QA: Post-import
  • Announce Import


Import Data

Background

Several information systems are available in Brazil. Most are publicly accessible and administered by the Ministry of Health through the Department of Informatics of the Unified Health System (DATASUS), and their data has guided studies that analyze epidemiological and health parameters as well as the structure and infrastructure of service provision. One example is the National Registry of Health Facilities - CNES, a database that contains data on all Brazilian health facilities.

A health facility is included in the National Registry of Health Facilities - CNES by filling in specific forms with data on physical area, human resources, equipment, and the outpatient and hospital services in operation, regardless of whether or not the facility provides care to public health users. Once a facility is registered, the Ministry of Health generates a numerical code for it, the CNES code. CNES data are important for health planning, control and evaluation, and should reflect the real situation of the health system.

As part of the efforts to tackle COVID-19, the Brazilian Department of Informatics of the Unified Health System (DATASUS) released a geolocated version of the National Registry of Health Facilities - CNES comprising all 400,000 Brazilian health care facilities. The data was released under an open data Creative Commons license, creating the conditions to support the OpenStreetMap and Healthsites.io initiatives.

OSM Data Files

The OSM files derived from the raw National Registry of Health Facilities - CNES data source can be found here: brazil_OSM_updated.xlsx. This file is the result of the pre-processing steps described below. The spreadsheet follows the OSM standard in terms of columns and labels.

Import Type

This is a one-time import based in the Brazilian OSM community. There are currently no plans for taking in or processing subsequent updates that DATASUS might provide. This would be a nice capability, but it is outside the scope of this immediate effort.

Method of import: All imports will be done using JOSM with import-specific OSM accounts.

Data Preparation

Data reduction & simplification

The original government datasets are in .csv format, and each directory contains several attributes. To cover the basic variables needed for the OSM standard, we selected the information characterizing each health facility: its location, address, type of facility, operator, municipality, and number of beds. The raw source can be checked here: ftp://ftp.datasus.gov.br/cnes/ . This FTP link shares all raw data from the National Register of Health Facilities. On the information from this source we performed the cleaning steps described below, to get the most accurate result possible for the registered Brazilian facilities.

We ran descriptive analyses to validate the attributes of each facility. The URL column of the spreadsheet (brazil_OSM_updated.xlsx) refers to the official government webpage associated with each facility. The page at http://cnes.datasus.gov.br/pages/estabelecimentos/consulta.jsp lets any user enter the CNES code of a facility to check its information. If you enter the CNES code in the red box, the website returns the register that mirrors the facility's row in the brazil_OSM_updated.xlsx file. The CNES code is the last 7 digits of each URL in the "URL" column of the brazil_OSM_updated.xlsx file.
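The rule that the CNES code is the last 7 digits of the URL can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the URL used in the usage note is a made-up placeholder, not a real facility record.

```python
import re
from typing import Optional


def cnes_code_from_url(url: str) -> Optional[str]:
    """Return the last 7 digits of the URL, or None if it does not end in 7 digits."""
    # Anchor at the end of the string so only the trailing digits are taken.
    match = re.search(r"(\d{7})\s*$", url)
    return match.group(1) if match else None
```

For a placeholder value such as "http://example.org/facility/1234567", the function returns "1234567", which can then be typed into the lookup page to cross-check the spreadsheet row.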

All data obtained from the DATASUS warehouse was pre-processed to clean inconsistent attributes and to retain the attributes that are useful for health care facilities. The attributes retained from the Open Government Data for the import are listed below:

Attributes Description
OBJECTID ID of each facility
osm_id Field to register a future OSM id
amenity Type of amenity considering the match between Portuguese tags and the OSM standard
healthcare Type of facility according to the National Register of Health Facilities - CNES codebook (Codebook.pdf )
name Name of the facility according to the National Register of Health Facilities - CNES
operator Operator responsible for the management of the facility
source Link for the open data source of the facilities analyzed.
speciality Health care specialty of each facility
operator_t Operator type of the current facility (public/private or government)
contact_nu Phone number of each facility
operationa Operational status of the facility
opening_ho Opening hours
beds Number of beds available
staff_doct Number of physicians in the facility
staff_nurs Number of nurses in the facility
health_ame Type of equipment available at facility
dispensing Existence of dispensing pharmacy in the facility
wheelchair Facility suitable for wheelchair use
emergency Existence of emergency services.
insurance Type of health insurance accepted by the facility
water_sour Source of water
electricit Source of power generated
is_in_heal Health area comprising the facility
is_in_he_1 Health zone comprising the facility
URL URL describing the facility register
addr_house Address - house
addr_stree Address - street
addr_postc Address - postcode
addr_city Address - city
changeset_ Versioning control
changeset1 Versioning control
changeset2 Versioning control
changeset3 Versioning control
latitude Latitude of the facility
longitude Longitude of the facility
CNTRY_TERR Country- territory
SOVEREIGN Country- territory
ISO_3_CODE Country code 3 digits- ISO
ISO_2_CODE Country code 2 digits- ISO
UN_CODE United Nations country code
WHO_CODE World health organization country code
WHO_STATUS Status regarding WHO alliance

The CNES dataset comprises hundreds of variables. The data source has information on physical structure, professionals, beds, emergency capacity, list of equipment, and dozens of other data categories. All this information is published consistently on a monthly basis. Our initial approach is to import the beds and geolocation information into OSM; future efforts can incorporate the other available information.

Input Data Cleaning:

The input data has a few quality issues, which are addressed by cleaning the values before the import. The steps performed during the cleaning phase are:

  • Removal of invalid values (N/A, 0, /N, etc.),
  • Removal of duplicate records,
  • Modification of the source data into OSM-compatible values (separating multiple values with ";", changing the opening hours syntax, etc.),
  • Removal of facilities without geolocation coordinates,
  • Addition to each facility of a link to the complete data, so anyone can help add new information in the future.
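A minimal sketch of some of these cleaning steps, using only the Python standard library, is shown below. It is not the script used for the import. It assumes rows are dictionaries keyed by the column names listed above, that multiple values in the source are comma-separated, and that exact duplicates are rows with identical cleaned content; the function names are hypothetical. The opening hours conversion and the link step are not covered.

```python
from typing import Dict, Iterable, List, Optional

# Placeholder values treated as "no data" (from the list of invalid values).
INVALID_VALUES = {"", "N/A", "0", "/N"}


def clean_value(value: Optional[str]) -> Optional[str]:
    """Strip whitespace and map invalid placeholders to None."""
    if value is None:
        return None
    text = value.strip()
    return None if text in INVALID_VALUES else text


def join_multi(value: str, sep: str = ",") -> str:
    """Rewrite a multi-valued field to use the OSM ';' separator."""
    return ";".join(part.strip() for part in value.split(sep) if part.strip())


def clean_records(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    seen = set()
    cleaned = []
    for row in rows:
        # Keep only fields that carry a valid value.
        record = {}
        for key, value in row.items():
            valid = clean_value(value)
            if valid is not None:
                record[key] = valid
        # Drop facilities without geolocation coordinates.
        if "latitude" not in record or "longitude" not in record:
            continue
        if "speciality" in record:
            record["speciality"] = join_multi(record["speciality"])
        # Drop exact duplicates of an already accepted record.
        fingerprint = tuple(sorted(record.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(record)
    return cleaned
```

Note that treating "0" as invalid also removes a bed count of zero, which matches the list above but may need review for fields where zero is meaningful.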

After cleaning: the final data, with all the issues removed from the raw data, can be found here: brazil_OSM_updated.xlsx

Tagging plans

Here we list all the original tags with their corresponding translation into the OSM tagging schema, plus additional tags for all segments. The conversion dictionary from the facility type in Portuguese to the corresponding label under the OSM key standard is described below:

Health Facility Related tags converted
Government Facility Type OSM Key
CENTRAL DE GESTAO EM SAUDE amenity = government
CENTRAL DE NOTIFICACAO,CAPTACAO E DISTRIB DE ORGAOS ESTADUAL amenity = government
CENTRAL DE REGULACAO DE SERVICOS DE SAUDE amenity = government
CENTRAL DE REGULACAO DO ACESSO amenity = government
CENTRAL DE REGULACAO MEDICA DAS URGENCIAS amenity = government
CENTRO DE APOIO A SAUDE DA FAMILIA amenity = clinic
CENTRO DE ATENCAO HEMOTERAPIA E OU HEMATOLOGICA amenity = clinic
CENTRO DE ATENCAO PSICOSSOCIAL amenity = social_facility
CENTRO DE PARTO NORMAL - ISOLADO amenity = hospital
CENTRO DE SAUDE/UNIDADE BASICA amenity = clinic
CLINICA/CENTRO DE ESPECIALIDADE amenity = clinic
CONSULTORIO ISOLADO amenity = clinic
COOPERATIVA OU EMPRESA DE CESSAO DE TRABALHADORES NA SAUDE amenity = social_facility
FARMACIA amenity = pharmacy
HOSPITAL ESPECIALIZADO amenity = hospital
HOSPITAL GERAL amenity = hospital
HOSPITAL/DIA - ISOLADO amenity = hospital
LABORATORIO CENTRAL DE SAUDE PUBLICA LACEN amenity = clinic
LABORATORIO DE SAUDE PUBLICA amenity = government
OFICINA ORTOPEDICA amenity = clinic
POLICLINICA amenity = clinic
POLO ACADEMIA DA SAUDE amenity = social_facility
POLO DE PREVENCAO DE DOENCAS E AGRAVOS E PROMOCAO DA SAUDE amenity = social_facility
POSTO DE SAUDE amenity = clinic
PRONTO ATENDIMENTO amenity = hospital
PRONTO SOCORRO ESPECIALIZADO amenity = hospital
PRONTO SOCORRO GERAL amenity = hospital
SERVICO DE ATENCAO DOMICILIAR ISOLADO(HOME CARE) amenity = clinic
TELESSAUDE amenity = clinic
UNIDADE DE APOIO DIAGNOSE E TERAPIA (SADT ISOLADO) amenity = clinic
UNIDADE DE ATENCAO A SAUDE INDIGENA amenity = clinic
UNIDADE DE ATENCAO EM REGIME RESIDENCIAL amenity = clinic
UNIDADE DE VIGILANCIA EM SAUDE amenity = clinic
UNIDADE MISTA amenity = clinic
UNIDADE MOVEL DE NIVEL PRE-HOSPITALAR NA AREA DE URGENCIA amenity = clinic
UNIDADE MOVEL FLUVIAL amenity = clinic
UNIDADE MOVEL TERRESTRE amenity = clinic
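The conversion table above can be applied as a simple dictionary lookup. The sketch below copies only a few rows from the table for brevity; the function name and the whitespace and case normalization are assumptions, not part of the original workflow.

```python
from typing import Optional

# Subset of the conversion table above; the remaining rows follow the same pattern.
AMENITY_BY_FACILITY_TYPE = {
    "CENTRAL DE GESTAO EM SAUDE": "government",
    "CENTRO DE ATENCAO PSICOSSOCIAL": "social_facility",
    "FARMACIA": "pharmacy",
    "HOSPITAL ESPECIALIZADO": "hospital",
    "HOSPITAL GERAL": "hospital",
    "POLICLINICA": "clinic",
    "POSTO DE SAUDE": "clinic",
}


def amenity_for(facility_type: str) -> Optional[str]:
    """Return the amenity value for a CNES facility type, or None if unmapped."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(facility_type.upper().split())
    return AMENITY_BY_FACILITY_TYPE.get(key)
```

Returning None for an unmapped type, rather than a default value, leaves it to the importer to decide how to tag facility types that the table does not cover.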

Other General attributes:

The values for addressing and contact details are tagged based on the global OSM keys, using the columns Key:addr_house, Key:addr_stree, Key:addr_postc, Key:addr_city, Key:URL, and Key:contact_nu. These general attributes give additional details on the address of each facility, its contact phone number, and the URL with detailed information characterizing the facility.
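The column names in the spreadsheet are truncated, so they need to be translated to full OSM keys before upload. The sketch below shows one possible translation; the pairing of each column with an OSM key (for example addr_house with addr:housenumber) is an assumption to be confirmed by the import team, and the function name is hypothetical.

```python
from typing import Dict

# Assumed mapping from truncated spreadsheet columns to OSM keys.
COLUMN_TO_OSM_KEY = {
    "addr_house": "addr:housenumber",
    "addr_stree": "addr:street",
    "addr_postc": "addr:postcode",
    "addr_city": "addr:city",
    "contact_nu": "contact:phone",
    "URL": "url",
}


def address_tags(row: Dict[str, str]) -> Dict[str, str]:
    """Build address and contact tags from one spreadsheet row, skipping empty cells."""
    tags = {}
    for column, osm_key in COLUMN_TO_OSM_KEY.items():
        value = (row.get(column) or "").strip()
        if value:
            tags[osm_key] = value
    return tags
```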

Data transformation

A total of 354,805 health facilities were processed. The result of the processing step is saved here: https://drive.google.com/file/d/1V70Fjhg3sza_z3kgg2uQ4fdR7NzSldOv/view?usp=sharing

Data transformation results

Changeset tags

We will use the following changeset tags:

  • comment=Brazilian National Registry of Health Facilities - CNES
  • source=datasus.gov.br
  • source:date=YYYY[-MM[-DD]]..YYYY[-MM[-DD]]
  • import=yes
  • url=https://wiki.openstreetmap.org/wiki/Import_of_the_Brazilian_National_Register_of_Health_Facilities
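As an illustration, the changeset tags listed above can be assembled by a small helper that also checks the source:date pattern. The helper name is hypothetical, and the dates passed to it are up to the importer; nothing here is prescribed by the import plan beyond the tag values themselves.

```python
import re
from typing import Dict

# One date in the form YYYY, YYYY-MM or YYYY-MM-DD.
_DATE_RE = re.compile(r"\d{4}(?:-\d{2}(?:-\d{2})?)?")


def changeset_tags(start: str, end: str) -> Dict[str, str]:
    """Return the changeset tags, with source:date built as 'start..end'."""
    for value in (start, end):
        if not _DATE_RE.fullmatch(value):
            raise ValueError(f"expected YYYY[-MM[-DD]], got {value!r}")
    return {
        "comment": "Brazilian National Registry of Health Facilities - CNES",
        "source": "datasus.gov.br",
        "source:date": f"{start}..{end}",
        "import": "yes",
        "url": "https://wiki.openstreetmap.org/wiki/"
               "Import_of_the_Brazilian_National_Register_of_Health_Facilities",
    }
```

The check only validates the shape of each date, not whether the month or day is a real calendar value.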

Element Tags

  • source=OpenGovernmentData

Data merge workflow

Team Approach

The import will be done manually by experienced JOSM users with dedicated usernames.

References

Each user will consider the following information when importing the data:

MapRoulette:

  • Local knowledge
  • Ground Truth Verification
  • Contacting the facilities
  • Satellite Imagery
  • Existing OSM data
  • Building outline
  • Health Facility data from MR challenge

Imports:

  • Satellite Imagery
  • Existing OSM data
  • Building outline
  • Health Facility data Json file

Workflow

We will start with the Manaus data set, splitting it at city or district level before assigning it to local OSM users.

After this data import we will repeat the process with priority cities before moving on to the Brazilian states.

The cleaned data sets will be divided among the 27 federative units of Brazil (26 states and the Federal District) and assigned to users based on region.
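Splitting the cleaned records into per-region batches could look like the sketch below. The attribute list above does not include a state column, so the "state" field used here is hypothetical; any available regional field, such as addr_city, could be passed as the key instead.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def split_by_region(rows: Iterable[Dict[str, str]],
                    key: str = "state") -> Dict[str, List[Dict[str, str]]]:
    """Group records into batches by a regional field; rows without it go to 'UNKNOWN'."""
    batches = defaultdict(list)
    for row in rows:
        batches[row.get(key) or "UNKNOWN"].append(row)
    return dict(batches)
```

Each resulting batch can then be handed to the users responsible for that region, with the "UNKNOWN" batch reviewed separately.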

Import of the Brazilian National Register of Health Facilities - data merge process flow