Easily change coordinate projection systems in Python with pyproj

Python is an easy-to-use programming language which, thanks to a growing number of cool extension modules, is really taking off in the world of scientific data handling.  The Proj4 libraries are a set of programs for performing coordinate system transformations.  Both are open source, so you are free to install them on as many computers as you want and to share them with your friends.  I had been using both for a while but only recently discovered the pyproj module that performs coordinate transformations inside Python itself.  It makes life very easy.

The easiest way to get pyproj is as part of the Matplotlib Basemap package.  On Ubuntu Linux, this can be installed with the command: sudo apt-get install python-mpltoolkits.basemap

1.) Setting up coordinate systems

The first step is to define pyproj ‘objects’ to represent the coordinate systems that you want to use.  These can be defined using the Proj notation (see Proj4 documentation for details) but it is easier to set up commonly-used projections by referring to their standard code numbers.  These are called EPSG codes and can be looked up on spatialreference.org.

>>> import mpl_toolkits.basemap.pyproj as pyproj # Import the pyproj module

>>> # Define a projection with Proj4 notation, in this case an Icelandic grid
>>> isn2004=pyproj.Proj("+proj=lcc +lat_1=64.25 +lat_2=65.75 +lat_0=65 +lon_0=-19 +x_0=1700000 +y_0=300000 +no_defs +a=6378137 +rf=298.257222101 +to_meter=1")

>>> # Define some common projections using EPSG codes
>>> wgs84=pyproj.Proj("+init=EPSG:4326") # LatLon with WGS84 datum used by GPS units and Google Earth
>>> osgb36=pyproj.Proj("+init=EPSG:27700") # UK Ordnance Survey, 1936 datum
>>> UTM26N=pyproj.Proj("+init=EPSG:32626") # UTM coords, zone 26N, WGS84 datum
>>> UTM27N=pyproj.Proj("+init=EPSG:32627") # UTM coords, zone 27N, WGS84 datum
>>> UTM28N=pyproj.Proj("+init=EPSG:32628") # ... you get the picture

2.) Forward and inverse transformations

Now you can convert geographic coordinates into your chosen projected coordinate system.

>>> # Do the projection
>>> isn2004(-19.700,63.983)
(1665725.2429655411, 186813.38847515779)

Obviously, you want to capture the output:

>>> # Assign output to variables x and y
>>> x, y = isn2004(-19.700,63.983)
>>> x
1665725.2429655411
>>> y
186813.38847515779

You can also do the inverse transform:

>>> isn2004(x, y, inverse=True)
(-19.699999999999999, 63.982999999999564)

3.) Changing between different coordinate systems

In most cases, you will want to change between coordinate systems.  This is even the case with GPS or GoogleEarth data, which use the specific WGS84 datum.  Coordinate system changes are done with the transform function.

>>> # Convert x, y from isn2004 to UTM27N
>>> pyproj.transform(isn2004, UTM27N, x, y)
(563621.09135458851, 7095768.4648448965)

And when you have lots of data, transformations can be done with lists/tuples/arrays:

>>> lon = [-19.5, -19.7, -19.9]
>>> lat = [63.183, 63.583, 63.983]
>>> xx, yy = pyproj.transform(wgs84, isn2004, lon, lat)
>>> xx
[1674812.314071126, 1665231.4455360402, 1655933.043340466]
>>> yy
[97526.592642124306, 142220.30636996412, 186937.31348842022]

It’s a simple as that.

Pyproj can be installed with a single command on recent Linux systems (sudo apt-get install python-pyproj).  For other systems, check out the pyproj website.  If you are playing with coordinate transforms, then it is likely that at some point you are going to want plot stuff on a map.  Python can do maps, too;  check out the rest of the Matplotlib Basemap module.

British National Grid and the OSGB36 datum.

People working in the UK may have to go through another step to convert their data into the British National Grid format (BNG), which uses two-letter codes to define 100km-wide square regions instead of presenting the full 13-figure coordinates.  Thus Arthur’s Seat, an extinct volcano in the centre of Edinburgh, has a BNG grid reference of NT2755072950.  Below is a Python module that carries out the second step for you. Save the code as ‘BNG.py’ in a directory where Python can find it.

Note:  The script below was updated to fix a rounding error on 4 October 2013.  If you downloaded the code before then, you should replace it with the updated version below.

The function converts between a BNG grid reference as a string, and a tuple of OSGB36 (x,y) coordinates. When converting to BNG coordinates, there is an opportunity to specify how many figures to use.

BNG coordinates can be converted to GPS coordinates as follows:

>>> import BNG # import the BNG module
>>> BNG.to_osgb36('NT2755072950')
(327550, 672950)

Thus, you can get the GPS coordinates for Arthur’s Seat as follows:

>>> x, y = BNG.to_osgb36('NT2755072950')
>>> pyproj.transform(osgb36, wgs84, x, y)
(-3.1615548588213667, 55.944109545140932)

Use Python’s zip function handle multiple values:

>>> gridrefs = ['HU431392', 'SJ637560', 'TV374354']
>>> xy = BNG.to_osgb36(gridrefs)
>>> x, y = zip(*xy)
>>> x
(443100, 363700, 537400)
>>> y
(1139200, 356000, 35400)

You can convert OSGB36 coordinates to BNG coordinates like this:

>>> BNG.from_osgb36((327550, 672950))
'NT276730'

Again, use the zip function for multiple values. You can also specify the number of figures:

>>> x = [443143, 363723, 537395]
>>> y = [1139158, 356004, 35394]
>>> xy = zip(x, y)
>>> BNG.from_osgb36(xy, nDigits=4)
['HU4339', 'SJ6456', 'TV3735']

Note:  the coordinate transform between WGS84 and OSGB36 is complicated by some distortions in the British National Grid.  This is mainly important for high-precision work such as surveying or construction, and is described by the National Grid Transformation, OSTN02. To get the most accurate results, give Proj a datum shift grid file that describes the offsets.

Happy mapping!

Python module: BNG.py

#!/usr/bin/env python
# Filename: BNG.py

############################################################################
#
#  COPYRIGHT:  (C) 2012 John A Stevenson / @volcan01010
#			Magnus Hagdorn
#  WEBSITE: http://all-geo.org/volcan01010
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 3 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  http://www.gnu.org/licenses/gpl-3.0.html
#
#############################################################################/

__all__ = ['to_osgb36', 'from_osgb36']

try:
    import numpy as np
except ImportError:
    print "Numpy not installed.  Numpy comes with most scientific python packages."

import re

# Region codes for 100 km grid squares.
_regions=[['HL','HM','HN','HO','HP','JL','JM'],
	  ['HQ','HR','HS','HT','HU','JQ','JR'],
	  ['HV','HW','HX','HY','HZ','JV','JW'],
	  ['NA','NB','NC','ND','NE','OA','OB'],
	  ['NF','NG','NH','NJ','NK','OF','OG'],
	  ['NL','NM','NN','NO','NP','OL','OM'],
	  ['NQ','NR','NS','NT','NU','OQ','OR'],
	  ['NV','NW','NX','NY','NZ','OV','OW'],
	  ['SA','SB','SC','SD','SE','TA','TB'],
	  ['SF','SG','SH','SJ','SK','TF','TG'],
	  ['SL','SM','SN','SO','SP','TL','TM'],
	  ['SQ','SR','SS','ST','SU','TQ','TR'],
	  ['SV','SW','SX','SY','SZ','TV','TW']]
# Reshuffle so indices correspond to offsets
_regions=np.array( [ _regions[x] for x in range(12,-1,-1) ] )
_regions=_regions.transpose()

#-------------------------------------------------------------------------------
def to_osgb36(coords):
    """Reformat British National Grid references to OSGB36 numeric coordinates.
    Grid references can be 4, 6, 8 or 10 figures.  Returns a tuple of x, y.

    Examples:

    Single value
    >>> to_osgb36('NT2755072950')
    (327550, 672950)

    For multiple values, use the zip function
    >>> gridrefs = ['HU431392', 'SJ637560', 'TV374354']
    >>> xy=to_osgb36(gridrefs)
    >>> x, y = zip(*xy)
    >>> x
    (443100, 363700, 537400)
    >>> y
    (1139200, 356000, 35400)
    """
    #
    # Check for individual coord, or list, tuple or array of coords
    #
    if type(coords)==list:
        return [to_osgb36(c) for c in coords]
    elif type(coords)==tuple:
        return tuple([to_osgb36(c) for c in coords])
    elif type(coords)==type(np.array('string')):
        return np.array([ to_osgb36(str(c))  for c in list(coords) ])
    #
    # Input is grid reference...
    #
    elif type(coords)==str and re.match(r'^[A-Za-z]{2}(\d{6}|\d{8}|\d{10})$', coords):
        region=coords[0:2].upper()
        x_box, y_box = np.where(_regions==region)
        try: # Catch bad region codes
            x_offset = 100000 * x_box[0] # Convert index in 'regions' to offset
            y_offset = 100000 * y_box[0]
        except IndexError:
            raise ValueError('Invalid 100km grid square code')
        nDigits = (len(coords)-2)/2
        factor = 10**(5-nDigits)
        x,y = (int(coords[2:2+nDigits])*factor + x_offset,
               int(coords[2+nDigits:2+2*nDigits])*factor + y_offset)
        return x, y
    #
    # Catch invalid input
    #
    else:
        raise TypeError('Valid inputs are 6,8 or 10-fig references as strings e.g. "NN123321", or lists/tuples/arrays of strings.')

#-------------------------------------------------------------------------------
def from_osgb36(coords, nDigits=6):
    """Reformat OSGB36 numeric coordinates to British National Grid references.
    Grid references can be 4, 6, 8 or 10 fig, specified by the nDigits keyword.

    Examples:

    Single value
    >>> from_osgb36((327550, 672950))
    'NT276730'

    For multiple values, use the zip function
    >>> x = [443143, 363723, 537395]
    >>> y = [1139158, 356004, 35394]
    >>> xy = zip(x, y)
    >>> from_osgb36(xy, nDigits=4)
    ['HU4339', 'SJ6456', 'TV3735']
    """
    if (type(coords)==list):
        return [from_osgb36(c, nDigits=nDigits) for c in coords]
    #
    # Input is a tuple of numeric coordinates...
    #
    elif type(coords)==tuple:
        x, y = coords
        x_box=np.floor(x/100000.0)  # Convert offset to index in 'regions'
        y_box=np.floor(y/100000.0)
        x_offset=100000*x_box
        y_offset=100000*y_box
        try: # Catch coordinates outside the region
            region=_regions[x_box, y_box]
        except IndexError:
            raise ValueError('Coordinate location outside UK region')
    #
    # Format the output based on nDigits
    #
        formats={4:'%s%02i%02i', 6:'%s%03i%03i', 8:'%s%04i%04i', 10:'%s%05i%05i'}
        factors={4:1000.0, 6:100.0, 8:10.0, 10:1.0}
        try: # Catch bad number of figures
            coords=formats[nDigits] % (region, np.floor((x-x_offset)/factors[nDigits]), np.floor((y-y_offset)/factors[nDigits]))
        except KeyError:
            raise ValueError('Valid inputs for nDigits are 4, 6, 8 or 10')
        return coords
    #
    # Catch invalid input
    #
    else:
        raise TypeError('Valid inputs are x, y tuple e.g. (651409, 313177)')
Categories: Uncategorized

5 Comments

  1. Ria says:

    Hi,
    Recently I got stuck while transforming an everest transverse mercator Projected Coordinate System which has GCS_Everest as GCS, into WGS84(a vector file direcly digitized and obtained from Google Earth). The problem seemed really an
    easy one to solve. I undefined one of the projections and redefined it and
    later reprojected as the desired one. However, got no results. I have tried
    numerous other ways to do it.
    The problem here is I have two shapefiles in two different projections. If
    I had raster images, it was not a problem.

    I have been doing some reading on EPSG codes and trying to find an
    application to run on Python.Any help would be appreciated

    • I found these projections on Spatial Reference.org.

      ESRI:37006: GCS Everest Modified 1969
      ESRI:37202: GCS Everest Bangladesh
      ESRI:37203: GCS Everest India Nepal

      You can click each to get the proj4 details, e.g.
      +proj=longlat +a=6377276.345 +b=6356075.41314024 +no_defs

      You can use GDAL (www.gdal.org) to reproject different data types. It contains the OGR package which is used on vector data. You can also control GDAL via Python, although I haven’t tried myself, yet.

      https://pypi.python.org/pypi/GDAL/

      Hope that helps

  2. Dan Harasty says:

    I’d like to propose you have a bug in your code (and a fix for it). I really think that np.round() is the wrong operation at line 141 in the code. My justification is twofold:

    1) See the link provided, and watch the video. The BNG notation does not really identify points. Rather, it identifies squares. So you shouldn’t “round” a point to the “next nearest point”; rather, you should “floor” the value to find the grid square that contains the point — since that grid square is identified by it’s southwest (lower left) corner.

    2) The code (as written as of this posting) produced non-sensical results for certain inputs, such as from_osgb36((529900, 199900), nDigits=4). This results in TQ30100. The “100″ portion of “TQ30100″ is clearly wrong for nDigits = 4. I suggest the correct answer — using floor() — is TQ2999. However, if you insist on rounding, note that “bumps” you to the next tile north, an the correct BNG would be TL3000 (note “TL” instead of “TQ”).

Leave a comment:

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>