Channel: Addresses – Digital Geography

Geocoding Addresses in ArcGIS: the other approach

Today I stumbled upon a post on gisIQ, the blog of the German Esri office, with a little tutorial (English translation) on how to geocode addresses in the ArcGIS platform. I wondered whether it is possible to use other geocoders as well, so I tried to build my own solution that needs no credits and lets me choose the geocoder. Fortunately, the Python world offers some nice little packages for this. So let’s use geopy!

Prerequisites for Geocoding

We will start with a very basic tab-delimited txt file which has one attribute “name” and two fields for the address, “str_num” and “code_place”. You can download the file, which contains one “wrong” address (a common problem!), here. My goal is to create a shapefile with all rows and some nice points on my map:
tab delimited text file with addresses of universities in Baden-Württemberg (a state in Germany)
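
To make the file layout concrete, here is a minimal sketch that parses a few tab-delimited rows from an in-memory string instead of the real file. The two sample rows and the helper names `build_query` and `read_rows` are made up for illustration; only the three column names come from the file described above. It is written for Python 3, whereas the ArcGIS Python window used in this post runs Python 2.

```python
import csv
import io

# Hypothetical sample rows; the real file has the same three columns.
SAMPLE = (
    "name\tstr_num\tcode_place\n"
    "Example University A\tHauptstrasse 1\t12345 Musterstadt\n"
    "Example University B\tSchlossplatz 2\t54321 Beispielstadt\n"
)

def build_query(row):
    """Join street and postcode/place into one address string."""
    return "{0}, {1}".format(row["str_num"].strip(), row["code_place"].strip())

def read_rows(text):
    """Parse tab-delimited text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

rows = read_rows(SAMPLE)
for row in rows:
    print(row["name"], "->", build_query(row))
```

Joining the two address fields with an explicit separator keeps the house number and the postcode from running together in the query string.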

Furthermore, we will need an empty point shapefile to hold the data. Create a shapefile and add fields according to your text file. In my case I just needed to add the field “name” with type “Text” to hold the name.
As we don’t use the built-in geocoder but something from the Python world, you need to download the geopy package here, extract it with archive software such as 7-Zip or WinZip, and place the geopy folder in the site-packages folder of your ArcGIS Python installation. The geopy folder in the downloaded package:
package with geopy folder

The geopy-folder in my Python installation folder:
site-packages folder of my Python 2.7 installation for ArcGIS
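
Before moving on, it can be worth checking that the interpreter actually finds the copied folder. The following is a minimal sketch of such a check, not part of the original workflow; `find_package` is a hypothetical helper, and `importlib.util.find_spec` requires Python 3 (in the Python 2 window of ArcGIS, `import geopy` followed by `print geopy.__file__` gives the same information).

```python
import importlib.util

def find_package(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

location = find_package("geopy")
if location is None:
    print("geopy not found on this interpreter's path")
else:
    print("geopy will be imported from", location)
```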

Now we are ready to come to the scripting part.

The Geocoding/Shapefile Creation

First of all, we need to know how to read the csv file. Python comes with native csv support, and you can read and print the contents of a csv file like this in the Python window in ArcGIS:
import csv
with open('C:\\Users\\ricckli\\Documents\\wd_gis\\geocoding\\adresses.txt', 'rb') as csvfile:
	content = csv.DictReader(csvfile, delimiter='\t')
	for row in content:
		print row
The variable row now holds three attributes: the first is the name, and the second and third belong to the address. Now we need to find out how to geocode an address using geopy. We can simply take the example from the geopy GitHub page and apply it to our case:
from geopy.geocoders import Nominatim
geolocator = Nominatim() #newer geopy releases require a user_agent argument here
result = geolocator.geocode("1600 Pennsylvania Avenue Washington") #white house address
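Because `geocode` returns `None` when nothing is found, it helps to call it once per address and check the result before reading the coordinates. A minimal sketch of that pattern follows; `locate` is a hypothetical helper, and `fake_geocode` merely stands in for `geolocator.geocode` so the example runs without network access (its address and coordinates are made up).

```python
from collections import namedtuple

# Stand-in for the result object; geopy results expose .latitude and .longitude.
Location = namedtuple("Location", ["latitude", "longitude"])

def fake_geocode(query):
    """Pretend geocoder: knows one made-up address, returns None otherwise."""
    known = {"Hauptstrasse 1, 12345 Musterstadt": Location(48.5, 9.0)}
    return known.get(query)

def locate(query, geocode):
    """Call the geocoder once and return (x, y) = (lon, lat), or None on failure."""
    result = geocode(query)
    if result is None:
        return None
    return (result.longitude, result.latitude)

print(locate("Hauptstrasse 1, 12345 Musterstadt", fake_geocode))
print(locate("Nowhere 0, 00000 Nirgendwo", fake_geocode))
```

With geopy installed, passing `geolocator.geocode` as the second argument should work the same way, and switching to another geopy geocoder would only change that argument.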
Once we know how to iterate over the file and how to geocode an address, we can copy each row into the shapefile we created (thanks to perrygeo):
import csv
import arcpy
from geopy.geocoders import Nominatim
geolocator = Nominatim()
cursor = arcpy.InsertCursor("Universities") #the shapefile is already part of my ArcGIS project
with open('C:\\Users\\ricckli\\Documents\\wd_gis\\geocoding\\adresses.txt', 'rb') as csvfile:
	content = csv.DictReader(csvfile, delimiter='\t')
	for row in content:
		feature = cursor.newRow() #this is the current object we work on.
		vertex = arcpy.CreateObject("Point")
		#I am using .decode('utf-8') just to handle the umlaut problem in German names 😉
		address = row["str_num"].decode('utf-8') + " " + row["code_place"].decode('utf-8')
		location = geolocator.geocode(address) #geocode once and reuse the result for X and Y
		vertex.X = location.longitude
		vertex.Y = location.latitude
		feature.shape = vertex
		feature.Name = row["name"]
		cursor.insertRow(feature)
del cursor
In the end the shapefile should hold all the information from the csv we used. Now let’s zoom to the created shapefile:
dataframe = arcpy.mapping.ListDataFrames(arcpy.mapping.MapDocument('current'))[0]  
geocodelayer = arcpy.mapping.ListLayers(arcpy.mapping.MapDocument('current'), 'universities', dataframe)[0] 
layer_extent = geocodelayer.getExtent()
dataframe.extent = layer_extent
Hint: Sometimes the geocoder fails. Let’s write the rows that could not be geocoded to a separate file by using this enhanced script:
import csv
import arcpy
from geopy.geocoders import Nominatim
geolocator = Nominatim()
cursor = arcpy.InsertCursor("Universities")
failed_text = ""
numbers_failed = 0
with open('C:\\Users\\ricckli\\Documents\\wd_gis\\geocoding\\failed_codes.txt', 'w') as failed_file:
	with open('C:\\Users\\ricckli\\Documents\\wd_gis\\geocoding\\adresses.txt', 'rb') as csvfile:
		content = csv.DictReader(csvfile, delimiter='\t')
		for row in content:
			address = row["str_num"].decode('utf-8') + " " + row["code_place"].decode('utf-8')
			coord = geolocator.geocode(address) #returns None if nothing was found
			if coord is None:
				#one tab-delimited line per failed row
				failed_text += row["name"] + "\t" + row["str_num"] + "\t" + row["code_place"] + "\n"
				numbers_failed += 1
			else:
				feature = cursor.newRow()
				vertex = arcpy.CreateObject("Point")
				vertex.X = coord.longitude
				vertex.Y = coord.latitude
				feature.shape = vertex
				feature.Name = row["name"]
				cursor.insertRow(feature)
	failed_file.write(failed_text)
del cursor
print "failed geocodes: " + str(numbers_failed) + "!!! check the file C:/Users/ricckli/Documents/wd_gis/geocoding/failed_codes.txt"

dataframe = arcpy.mapping.ListDataFrames(arcpy.mapping.MapDocument('current'))[0]  
geocodelayer = arcpy.mapping.ListLayers(arcpy.mapping.MapDocument('current'), 'universities', dataframe)[0] 
layer_extent = geocodelayer.getExtent()
dataframe.extent = layer_extent
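
If the failed rows are meant to be corrected and run through the script again, it may help to write them in the same tab-delimited layout as the input. The sketch below shows one way to do that with `csv.DictWriter`; `write_failures` is a hypothetical helper, the failed row is made up, and an in-memory buffer replaces the real output file so the example runs anywhere (Python 3).

```python
import csv
import io

def write_failures(rows, handle):
    """Write failed rows as tab-delimited lines with a header; return the count."""
    writer = csv.DictWriter(handle, fieldnames=["name", "str_num", "code_place"],
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return len(rows)

# Made-up row standing in for an address the geocoder could not resolve.
failed = [{"name": "Example University C",
           "str_num": "Unbekannte Strasse 99",
           "code_place": "00000 Nirgendwo"}]

buffer = io.StringIO()  # replace with open(path, "w", newline="") for a real file
count = write_failures(failed, buffer)
print(count, "failed row(s)")
print(buffer.getvalue())
```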
The whole script can be downloaded. Please adjust the filenames, and drop us a comment if you know another solution or enhancement! Please also note the usage policies of the geocoders, like this one for the Nominatim geocoder used here.
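
The Nominatim usage policy asks for no more than one request per second, so a run over a longer address list should pause between calls (please check the current policy text for details). A minimal sketch of such a throttle follows; `throttled` is a hypothetical wrapper, and the stand-in geocoder exists only so the example runs offline. The short interval of 0.2 seconds is used just to keep the demonstration quick.

```python
import time

def throttled(func, min_interval=1.0):
    """Wrap func so that successive calls start at least min_interval seconds apart."""
    last_call = [None]  # list so the inner function can update it

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            wait = min_interval - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)
        last_call[0] = time.monotonic()
        return func(*args, **kwargs)

    return wrapper

def fake_geocode(query):
    """Stand-in for geolocator.geocode; just echoes the query in upper case."""
    return query.upper()

slow_geocode = throttled(fake_geocode, min_interval=0.2)
start = time.monotonic()
results = [slow_geocode(q) for q in ["a", "b", "c"]]
elapsed = time.monotonic() - start
print(results, round(elapsed, 2))
```

For real use against Nominatim, the wrapper would be applied to `geolocator.geocode` with the default interval of one second.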

The post Geocoding Addresses in ArcGIS: the other approach appeared first on Digital Geography.

