Basic Google Maps Manager with Python and OpenCV

All of the source for this project can be found here.

Lately I’ve been working to finalize the blimp project.  The idea is that rather than manually controlling a drone by manipulating every control, you can click on points on a map and the blimp will automatically guide itself to them.  For aerial photography in particular, as I’d initially intended, this frees the operator to manipulate the camera instead of worrying about flight controls.

However, this has some other benefits.  An operator could simply want to be lazy and not have to worry about piloting.  Or the flight could be safer, assuming all of the autopilot code works properly.  Or perhaps you’re trying to fly a perfectly straight route over something, in which case GPS data is more reliable than eyeballing the drone’s position.

Either way, I needed to interface with some map API and accomplish some basic tasks:

  • Load a map from the internets so that I could load any location in the world dynamically
  • Get GPS data, presumably from the blimp, and plot it as an x, y coordinate on the map
  • Convert mouse clicks into latitude, longitude coordinates that I could feed to the blimp

The results turned out pretty well. Check out this screen capture of the resulting program:

Interestingly, it was easy to validate that my x, y clicking was plotting the correct point because of the round trip of conversions going on.  As you click on the screen, the window coordinates (i.e. a range from (0, 0) to (512, 512)) were converted to grid coordinates (i.e. (-256, -256) to (256, 256)).  The offset of the mouse click from the center was converted to degrees.  The degrees were added to the center lat, lon of the map.  Then to display the point, the lat, lon coordinates were converted back to window x and y coordinates using the same process in reverse.

We can step through the code below:

class MapManager(object):

    BASE_URL = "http://maps.googleapis.com/maps/api/staticmap"

    def __init__(self, map_height, zoom, lat, lon):
        self.map_height = map_height
        self.zoom = zoom
        self.static_map = self.make_map_request(lat, lon)
        self.center_lat = lat
        self.center_lon = lon
        self.plotted_points = []

Nothing too crazy yet.  I just wanted to first define the class that we’re going to be working with.  You’ll notice that I’ve already defined the URL we’ll point to in order to make our requests (the Google Maps static map API).  The map is a square, so all we need is either the width or the height in pixels of the intended map.  The zoom is passed straight through to the Google Maps API.  A zoom of 0 is a view of the entire earth.  Each increment of its value doubles the scale of the map, so the same window shows half as much ground in each dimension.  Notice that we go ahead and make a call to “make_map_request” which brings us to the next piece of code:

    def make_map_request(self, lat, lon):
        # %s formatting stringifies the coordinates, so no separate conversion is needed
        params = (self.BASE_URL, lat, lon, self.zoom, self.map_height, self.map_height)
        full_url = "%s?center=%s,%s&zoom=%s&size=%sx%s&sensor=false&maptype=satellite" % params
        response = urllib2.urlopen(full_url)
        # view the raw PNG bytes as a uint8 array instead of converting one character at a time
        png_bytes = np.frombuffer(response.read(), dtype=np.uint8)
        # decode the in-memory PNG into an OpenCV image
        cv_array = cv2.imdecode(png_bytes, cv2.CV_LOAD_IMAGE_UNCHANGED)
        return cv_array

This method will make a call to the Google Maps API which will return a PNG image.  Here, I take the raw bytes and load them into a NumPy array.  The NumPy array can then be decoded into an OpenCV image.  Take note that in order to make this happen, the start of my file had these imports:

import urllib2
import cv2
import numpy as np
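
As a side note, the query string built by hand in make_map_request can also be assembled with the standard library, which takes care of escaping.  This is only a sketch and not what the project does: it is written for Python 3 (urllib.parse rather than urllib2), the function name build_static_map_url is made up for illustration, the zoom of 18 is an arbitrary example value, and the optional api_key argument is an assumption for setups where the service expects a key parameter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://maps.googleapis.com/maps/api/staticmap"


def build_static_map_url(lat, lon, zoom, map_height, api_key=None):
    """Build the same request URL as make_map_request, letting urlencode do the escaping."""
    params = [
        ("center", "%s,%s" % (lat, lon)),
        ("zoom", zoom),
        ("size", "%sx%s" % (map_height, map_height)),  # the map is square
        ("sensor", "false"),
        ("maptype", "satellite"),
    ]
    if api_key is not None:
        # hypothetical extra argument: only sent when the caller supplies a key
        params.append(("key", api_key))
    return "%s?%s" % (BASE_URL, urlencode(params))


url = build_static_map_url(37.79417, -122.412606667, 18, 512)
# parse the URL back apart to confirm the parameters survived the escaping
query = parse_qs(urlsplit(url).query)
print(url)
print(query["center"], query["size"])
```

The comma in the center parameter comes out percent-encoded, which parses back to the same value on the other end.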

Thus, when initializing this object, a map image will be loaded.  Now let’s back out and see where we are with some code that we’ll run:

# initialize the map
starting_coords = (37.79417, -122.412606667)  # my apartment...
manager = MapManager(MAP_HEIGHT, ZOOM, starting_coords[0], starting_coords[1])

# BGR Red
UNIVERSAL_COLOR = cv2.cv.Scalar(0, 0, 255)
cv2.namedWindow(WINDOW_NAME, cv2.CV_WINDOW_AUTOSIZE)
cv2.cv.SetMouseCallback(WINDOW_NAME, manager.mouse_callback)

So, line by line:

The initial coordinates of my apartment were grabbed from a GPS sensor that I was working with.  I was playing around with it today in the city, but I couldn’t get an actual reading until I went to my roof where there was a clear view of the sky (FYI, GPS satellites are not geostationary; they circle in medium Earth orbit and are spread around the sky, so what the receiver needs is an unobstructed view of as much sky as possible, which is hard to come by between buildings).

Consistent with the class defined above, I pass the appropriate parameters, and an implicit Google Maps call is made.  Below that, I start delving into some OpenCV code.  OpenCV uses BGR channel order by default, so the value of red is (0, 0, 255).  I initialize a window that automatically sizes to the image we pass it.  Then I set a mouse callback, which brings us to the next piece of code:

    def mouse_callback(self, event, x, y, flag=0, param=None):
        if event == cv2.EVENT_LBUTTONDOWN:
            lat, lon = self.x_y_to_lat_lon(x, y)
            print lat, lon
            self.plot_point(lat, lon)

Note that I’m back inside of my MapManager class.  So to step through this code:

We define the mouse callback with the signature that OpenCV expects.  I don’t use the flag or param arguments in this case, but they need to be in the function signature in order for it to be registered as a callback.  There are multiple events that can be handled, such as mouse moves or right button clicks, but we only want to respond to a standard left click (I respond on button down and didn’t pair it with button up).  Here we diverge into two function calls, but what’s basically happening is that I’m taking mouse input and converting it to lat, lon points that are fed into a list.  Let’s dive into the x, y to lat, lon code:

    def _window_x_y_to_grid(self, x, y):
        '''
        converts graphical x, y coordinates to grid coordinates
        where (0, 0) is the very center of the window
        '''
        center_x = center_y = self.map_height / 2
        new_x = x - center_x
        new_y = -1 * (y - center_y)
        return new_x, new_y

    def x_y_to_lat_lon(self, x, y):
        grid_x, grid_y = self._window_x_y_to_grid(x, y)
        offset_x_degrees = (float(grid_x) / self.map_height) * self.degrees_in_map
        offset_y_degrees = (float(grid_y) / self.map_height) * self.degrees_in_map
        # lat = y, lon = x
        return self.center_lat + offset_y_degrees, self.center_lon + offset_x_degrees

X, Y points are fed into this function as typical graphical x, y coordinates.  That is, there are no negative values, and (0, 0) is at the top left of the window.  Inside of my helper function, I convert those into grid coordinates, where we can imagine an X and a Y axis crossing at the center of the window, so my coordinates can now have negative values.  Now I can convert the deltas in X and Y relative to the center of the window into deltas in degrees.  In order to do so, we need to know how many decimal degrees are in the window.  We can do that like so:

    @property
    def degrees_in_map(self):
        '''
        This logic is based on the idea that zoom=0 returns 360 degrees
        '''
        return (self.map_height / 256.0) * (360.0 / pow(2, self.zoom))

As the docstring says, the starting point is that the full 360 degrees of longitude around the earth are visible at zoom 0, where we can see the entire map.  Each time we increase the zoom by one, we halve the amount of visible space in the map.  Now simply apply that number to the ratio of the map size relative to 256 pixels, which is the width of the whole world at zoom 0 in Google Maps (its standard tile size).  This will return a value in degrees, which humans realistically care nothing about.  I wanted to know what this information meant in meters.  We can do that too:

    def degrees_to_meters(self, degrees):
        equator_length_km = 40075  # Earth's circumference around the equator
        km_per_degree = equator_length_km / 360.0
        m_per_degree = km_per_degree * 1000
        return degrees * m_per_degree

    @property
    def linear_meters_in_map(self):
        meters_in_map = self.degrees_to_meters(self.degrees_in_map)
        return meters_in_map

Here we combine two logical steps: convert degrees to meters, and then use that to convert the degrees in the map to total meters in the map.  Although this logic isn’t 100% spot on, it certainly meets my needs.  It treats a degree as the same length everywhere, whereas a degree of longitude shrinks with the cosine of the latitude as you move away from the equator.  The logic is simple and in this case doesn’t need much explanation.  Take the total circumference of the earth and the total degrees it covers (360) to get the distance per degree, and multiply the degrees in the map by that ratio.
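
To put some numbers on this, here is a small standalone sketch of the same arithmetic, written for Python 3 and kept outside the MapManager class.  The function names mirror the methods above but are free functions made up for this example, the equatorial circumference of roughly 40,075 km and the optional cos(latitude) scaling for east-west distance are my own assumptions layered on top of the original logic, and the zoom levels are arbitrary.

```python
import math

TILE_SIZE = 256.0            # pixel width of the whole world at zoom 0
EQUATOR_LENGTH_KM = 40075.0  # approximate equatorial circumference (assumption)


def degrees_in_map(map_height, zoom):
    """Degrees of longitude spanned by a square map of map_height pixels."""
    return (map_height / TILE_SIZE) * (360.0 / 2 ** zoom)


def meters_in_map(map_height, zoom, lat=0.0):
    """East-west ground distance across the map.

    lat=0.0 reproduces the plain degrees-to-meters ratio; any other latitude
    scales the result by cos(lat), since meridians converge toward the poles.
    """
    m_per_degree = EQUATOR_LENGTH_KM * 1000 / 360.0
    return degrees_in_map(map_height, zoom) * m_per_degree * math.cos(math.radians(lat))


for zoom in (1, 10, 18):
    print(zoom,
          degrees_in_map(512, zoom),
          round(meters_in_map(512, zoom)),              # as if on the equator
          round(meters_in_map(512, zoom, lat=37.79417)))  # at the apartment's latitude
```

A 512 pixel window at zoom 1 spans exactly 360 degrees, which is a handy sanity check on the formula.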

Now let’s step back out to our code that’s calling the class:

while True:
    img = manager.static_map
    for (x, y) in manager.get_plotted_points_as_x_y_list():
        cv2.circle(img, center=(x, y), radius=5, color=UNIVERSAL_COLOR, thickness=-1)
    cv2.imshow(WINDOW_NAME, img)
    cv2.waitKey(1)

The map itself is already saved inside the instance of the MapManager class and therefore we can draw it with OpenCV.  But I also want to draw mouse clicks and validate that the clicks are read as lat and lon coordinates.  As you recall, we set up a callback function for the mouse that saves lat, lon coordinates to a list.  That list can be read from get_plotted_points_as_x_y_list() (I won’t step through that code; if you want to see it you can look in GitHub).  On a mouse click, lat, lon points are saved inside the object.  Now, outside of the object, in the display loop, I can get those coordinates back as window coordinates (thus validating that the conversions in both directions agree with each other) and draw them appropriately.
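
For anyone who doesn’t want to dig through the repository, the reverse direction can be sketched roughly as follows.  This is my reconstruction for illustration, not the project’s actual code: the GridConverter class and its lat_lon_to_x_y method are hypothetical names, it is written for Python 3, and the zoom of 18 is an arbitrary example value.  It only carries the conversion half of MapManager so that it runs without OpenCV or a network connection.

```python
TILE_SIZE = 256.0  # pixel width of the whole world at zoom 0


class GridConverter(object):
    """Hypothetical stand-in for the coordinate conversions in MapManager."""

    def __init__(self, map_height, zoom, center_lat, center_lon):
        self.map_height = map_height
        self.zoom = zoom
        self.center_lat = center_lat
        self.center_lon = center_lon

    @property
    def degrees_in_map(self):
        return (self.map_height / TILE_SIZE) * (360.0 / 2 ** self.zoom)

    def x_y_to_lat_lon(self, x, y):
        # window coordinates -> grid coordinates centered on the map -> degrees
        grid_x = x - self.map_height / 2.0
        grid_y = -1 * (y - self.map_height / 2.0)
        offset_x_degrees = (grid_x / self.map_height) * self.degrees_in_map
        offset_y_degrees = (grid_y / self.map_height) * self.degrees_in_map
        return self.center_lat + offset_y_degrees, self.center_lon + offset_x_degrees

    def lat_lon_to_x_y(self, lat, lon):
        # the same steps in reverse: degrees -> grid coordinates -> window coordinates
        grid_x = (lon - self.center_lon) / self.degrees_in_map * self.map_height
        grid_y = (lat - self.center_lat) / self.degrees_in_map * self.map_height
        x = int(round(grid_x + self.map_height / 2.0))
        y = int(round(self.map_height / 2.0 - grid_y))  # window y grows downward
        return x, y


converter = GridConverter(512, 18, 37.79417, -122.412606667)
for click in [(0, 0), (100, 400), (256, 256), (511, 511)]:
    lat, lon = converter.x_y_to_lat_lon(*click)
    print(click, (lat, lon), converter.lat_lon_to_x_y(lat, lon))
```

Each click should come back as the same pixel, which is the round trip that the on-screen dots are checking.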

I stepped through about half the code since the other half is simply working the same things in reverse.  If you want to do any sort of playing around with Google Maps, this should give you a pretty good start!
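
One refinement worth sketching if you build on this: Google Maps draws its tiles in the Web Mercator projection, so pixels map linearly to longitude but not to latitude.  The round trip in the post can’t reveal that, because both directions share the same linear assumption.  Below is a rough Python 3 sketch of a projection-aware conversion using the standard spherical Mercator formulas.  The function names are made up for this example and none of this is in the project’s code; treat it as a starting point under those assumptions.

```python
import math

TILE_SIZE = 256.0  # pixel width of the whole world at zoom 0


def lat_lon_to_world_px(lat, lon, zoom):
    """Project lat, lon to pixel coordinates on the full world map at this zoom."""
    scale = TILE_SIZE * 2 ** zoom
    x = (lon + 180.0) / 360.0 * scale
    sin_lat = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * scale
    return x, y


def world_px_to_lat_lon(x, y, zoom):
    """Inverse of lat_lon_to_world_px."""
    scale = TILE_SIZE * 2 ** zoom
    lon = x / scale * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / scale))))
    return lat, lon


def window_x_y_to_lat_lon(x, y, map_height, zoom, center_lat, center_lon):
    """Convert a click in a square window centered on center_lat, center_lon."""
    center_x, center_y = lat_lon_to_world_px(center_lat, center_lon, zoom)
    half = map_height / 2.0
    # window y and world pixel y both grow downward, so no sign flip is needed
    return world_px_to_lat_lon(center_x + (x - half), center_y + (y - half), zoom)


center = (37.79417, -122.412606667)
north = window_x_y_to_lat_lon(256, 156, 512, 18, *center)  # 100 pixels above center
linear_step = 100 * 360.0 / (TILE_SIZE * 2 ** 18)           # what the linear method adds
print(north[0] - center[0], linear_step)
```

At this latitude the projected latitude step comes out at roughly cos(latitude) times the linear one, so the difference matters if the points are going to steer something.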