GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS

Written by thoughtbot

How To Efficiently Display Large Amounts of Data on iOS Maps

This tutorial will demonstrate how to handle and display thousands of points of data on an iOS map in a way people understand and enjoy.

We are going to make an iOS app which ships with 87,000 hotels, each with a coordinate, a name and a phone number. This app will never ask the user to "redo search in area"; it will update the map as the user pans and zooms, allowing the user to freely explore the data.

This will require us to come up with an ultra quick data structure built for the task. We will need to build it in C for it to be performant. Once we have constructed our data structure we will come up with a clustering system, as to not overwhelm the user. Finally, we will give it the professional polish that is required for apps to compete in today's market.

By the end of this tutorial you will have all the components needed to make this app:

final app

The data structure

To come up with a good data structure for this problem, we first need to look at our data and determine we want to do with it. Our data is represented by a set of coordinates (latitude, longitude) that we need to plot onto a map. The map is able to zoom and pan, meaning that we won't need to have all the points up on screen. At the core this is a search problem. We need to find all coordinates which lie in a certain range, i.e within a rectangle.

A simple solution is to iterate through all our points and ask: (xMin <= x <= xMax and yMin <= y <= yMax). Unfortunately, this is a O(N) solution and won't work for our scenario.

A good approach to search problems, is to look for exploitable symmetries which reduce the search space as we search. So how can we reduce our search space after every iteration of a search? If instead of indexing by points, we index by region, we reduce our search space. This type of spatial indexing is achieved by a quad tree, which queries at O(H) (where H is the tree's height for a particular point).

Quad Tree

A quad tree is a data structure comprising nodes which store a bucket of points and a bounding box. Any point which is contained within the node's bounding box is added to its bucket. Once the bucket gets filled up, the node splits itself into four nodes, each with a bounding box corresponding to a quadrant of its parents bounding box. All points which would have gone into the parent's bucket now go into one of its children's buckets.

So how do we code this quad tree?

Lets define the base structures:

typedef struct TBQuadTreeNodeData {
    double x;
    double y;
    void* data;
} TBQuadTreeNodeData;
TBQuadTreeNodeData TBQuadTreeNodeDataMake(double x, double y, void* data);

typedef struct TBBoundingBox {
    double x0; double y0;
    double xf; double yf;
} TBBoundingBox;
TBBoundingBox TBBoundingBoxMake(double x0, double y0, double xf, double yf);

typedef struct quadTreeNode {
    struct quadTreeNode* northWest;
    struct quadTreeNode* northEast;
    struct quadTreeNode* southWest;
    struct quadTreeNode* southEast;
    TBBoundingBox boundingBox;
    int bucketCapacity;
    TBQuadTreeNodeData *points;
    int count;
} TBQuadTreeNode;
TBQuadTreeNode* TBQuadTreeNodeMake(TBBoundingBox boundary, int bucketCapacity);

TBQuadTreeNodeData struct will hold our coordinate (latitude and longitude).

void* data is a generic pointer and should be used to store whatever extra info we may need, i.e hotel name and phone number.

TBBoundingBox struct represents a rectangle defined like a range query i.e (xMin <= x <= xMax && yMin <= y <= yMax) where your top left corner is (xMin, yMin) and the bottom right is (xMax, yMax).

Finally, we see TBQuadTreeNode, which holds four pointers – one for each of its quadrants. It has a boundary and an array (bucket).

The spatial indexing happens as we build the quad tree. Here is an animation of a quad tree being built:

quad tree build

The following code is exactly what happens in the above animation:

void TBQuadTreeNodeSubdivide(TBQuadTreeNode* node)
{
    TBBoundingBox box = node->boundingBox;

    double xMid = (box.xf + box.x0) / 2.0;
    double yMid = (box.yf + box.y0) / 2.0;

    TBBoundingBox northWest = TBBoundingBoxMake(box.x0, box.y0, xMid, yMid);
    node->northWest = TBQuadTreeNodeMake(northWest, node->bucketCapacity);

    TBBoundingBox northEast = TBBoundingBoxMake(xMid, box.y0, box.xf, yMid);
    node->northEast = TBQuadTreeNodeMake(northEast, node->bucketCapacity);

    TBBoundingBox southWest = TBBoundingBoxMake(box.x0, yMid, xMid, box.yf);
    node->southWest = TBQuadTreeNodeMake(southWest, node->bucketCapacity);

    TBBoundingBox southEast = TBBoundingBoxMake(xMid, yMid, box.xf, box.yf);
    node->southEast = TBQuadTreeNodeMake(southEast, node->bucketCapacity);
}

bool TBQuadTreeNodeInsertData(TBQuadTreeNode* node, TBQuadTreeNodeData data)
{
    // Bail if our coordinate is not in the boundingBox
    if (!TBBoundingBoxContainsData(node->boundingBox, data)) {
        return false;
    }

    // Add the coordinate to the points array
    if (node->count < node->bucketCapacity) {
        node->points[node->count++] = data;
        return true;
    }

    // Check to see if the current node is a leaf, if it is, split
    if (node->northWest == NULL) {
        TBQuadTreeNodeSubdivide(node);
    }

    // Traverse the tree
    if (TBQuadTreeNodeInsertData(node->northWest, data)) return true;
    if (TBQuadTreeNodeInsertData(node->northEast, data)) return true;
    if (TBQuadTreeNodeInsertData(node->southWest, data)) return true;
    if (TBQuadTreeNodeInsertData(node->southEast, data)) return true;

    return false;
}

Now that we have a quad tree built we need to be able to perform bounding box queries on it and have it return our TBQuadTreeNodeData structs. Here is an animation of a bounding box query for all annotations in the light blue region. When an annotation is found it turns green.

quad tree query

And here is the query code:

typedef void(^TBDataReturnBlock)(TBQuadTreeNodeData data);

void TBQuadTreeGatherDataInRange(TBQuadTreeNode* node, TBBoundingBox range, TBDataReturnBlock block)
{
    // If range is not contained in the node's boundingBox then bail
    if (!TBBoundingBoxIntersectsBoundingBox(node->boundingBox, range)) {
        return;
    }

    for (int i = 0; i < node->count; i++) {
        // Gather points contained in range
        if (TBBoundingBoxContainsData(range, node->points[i])) {
            block(node->points[i]);
        }
    }
    
    // Bail if node is leaf
    if (node->northWest == NULL) {
        return;
    }

    // Otherwise traverse down the tree
    TBQuadTreeGatherDataInRange(node->northWest, range, block);
    TBQuadTreeGatherDataInRange(node->northEast, range, block);
    TBQuadTreeGatherDataInRange(node->southWest, range, block);
    TBQuadTreeGatherDataInRange(node->southEast, range, block);
}

This data structure can perform bounding box queries at lightning speed. It can easily handle hundreds of queries, in a database containing hundreds of thousands of points, at 60 fps.

Creating the quadtree using the hotel data

The hotel data comes from POIplaza and comes formatted in CSV. We are going to read from the disk, parse the data, and build a quad tree from it.

The code to build the quad tree will reside in TBCoordinateQuadTreeController

typedef struct TBHotelInfo {
    char* hotelName;
    char* hotelPhoneNumber;
} TBHotelInfo;

TBQuadTreeNodeData TBDataFromLine(NSString *line)
{
    // Example line:
    // -80.26262, 25.81015, Everglades Motel, USA-United States, +1 305-888-8797
    
    NSArray *components = [line componentsSeparatedByString:@","];
    double latitude = [components[1] doubleValue];
    double longitude = [components[0] doubleValue];

    TBHotelInfo* hotelInfo = malloc(sizeof(TBHotelInfo));

    NSString *hotelName = [components[2] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    hotelInfo->hotelName = malloc(sizeof(char) * hotelName.length + 1);
    strncpy(hotelInfo->hotelName, [hotelName UTF8String], hotelName.length + 1);

    NSString *hotelPhoneNumber = [[components lastObject] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    hotelInfo->hotelPhoneNumber = malloc(sizeof(char) * hotelPhoneNumber.length + 1);
    strncpy(hotelInfo->hotelPhoneNumber, [hotelPhoneNumber UTF8String], hotelPhoneNumber.length + 1);

    return TBQuadTreeNodeDataMake(latitude, longitude, hotelInfo);
}

- (void)buildTree
{
    NSString *data = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"USA-HotelMotel" ofType:@"csv"] encoding:NSASCIIStringEncoding error:nil];
    NSArray *lines = [data componentsSeparatedByString:@"\n"];

    NSInteger count = lines.count - 1;

    TBQuadTreeNodeData *dataArray = malloc(sizeof(TBQuadTreeNodeData) * count);
    for (NSInteger i = 0; i < count; i++) {
        dataArray[i] = TBDataFromLine(lines[i]);
    }

    TBBoundingBox world = TBBoundingBoxMake(19, -166, 72, -53);
    _root = TBQuadTreeBuildWithData(dataArray, count, world, 4);
}

With this we now have a full quad tree built using data preloaded on the iPhone. This will allow us to tackle the next part of our app: clustering.

Clustering

Now that we have a quad tree loaded up with our hotel data, we can solve the clustering problem. First, let's explore the reasoning behind clustering. We cluster because we want to avoid overwhelming the user with data. There are many ways to do this; Google Maps does it by just showing a portion of their overall data depending on your zoom level. The more you zoom in, the finer the detail gets until you see all the available annotations. We are going to take the clustering approach because it allows us to show off the amount of data we have while reducing the number of points on the map.

The final annotations will be represented as a circle with the number of hotels in the middle. Our approach will be the same as a simplistic image downsizing operation. We will draw a grid over the map. Then for each cell in the grid we will derive a clustered annotation from the hotels contained within that cell. We determine the coordinate of the cluster by taking the average over the coordinates of each hotel.

Here is an animation of the process:

clustering

Now for some code. Add a method to the TBCoordinateQuadTree:

- (NSArray *)clusteredAnnotationsWithinMapRect:(MKMapRect)rect withZoomScale:(double)zoomScale
{
    double TBCellSize = TBCellSizeForZoomScale(zoomScale);
    double scaleFactor = zoomScale / TBCellSize;

    NSInteger minX = floor(MKMapRectGetMinX(rect) * scaleFactor);
    NSInteger maxX = floor(MKMapRectGetMaxX(rect) * scaleFactor);
    NSInteger minY = floor(MKMapRectGetMinY(rect) * scaleFactor);
    NSInteger maxY = floor(MKMapRectGetMaxY(rect) * scaleFactor);

    NSMutableArray *clusteredAnnotations = [[NSMutableArray alloc] init];

    for (NSInteger x = minX; x <= maxX; x++) {
        for (NSInteger y = minY; y <= maxY; y++) {

            MKMapRect mapRect = MKMapRectMake(x / scaleFactor, y / scaleFactor, 1.0 / scaleFactor, 1.0 / scaleFactor);
            
            __block double totalX = 0;
            __block double totalY = 0;
            __block int count = 0;

            TBQuadTreeGatherDataInRange(self.root, TBBoundingBoxForMapRect(mapRect), ^(TBQuadTreeNodeData data) {
                totalX += data.x;
                totalY += data.y;
                count++;
        	});

            if (count >= 1) {
                CLLocationCoordinate2D coordinate = CLLocationCoordinate2DMake(totalX / count, totalY / count);
                TBClusterAnnotation *annotation = [[TBClusterAnnotation alloc] initWithCoordinate:coordinate count:count];
                [clusteredAnnotations addObject:annotation];
            }
        }
    }

    return [NSArray arrayWithArray:clusteredAnnotations];
}

The above method will make clustered annotations for us for a defined cell size. Now all we need to do is add these to an MKMapView. To do so we make a UIViewController subclass with an MKMapView as its view. We want the annotations to update whenever the visible region changes so we implement the mapView:regionDidChangeAnimated: protocol method.

- (void)mapView:(MKMapView *)mapView regionDidChangeAnimated:(BOOL)animated
{
    [[NSOperationQueue new] addOperationWithBlock:^{
        double zoomScale = self.mapView.bounds.size.width / self.mapView.visibleMapRect.size.width;
        NSArray *annotations = [self.coordinateQuadTree clusteredAnnotationsWithinMapRect:mapView.visibleMapRect withZoomScale:zoomScale];

        [self updateMapViewAnnotationsWithAnnotations:annotations];
    }];
}

Adding only the annotations we need

We want to spend as little time operating on the main thread as possible, which means we place everything inside a background operation queue. To reduce the amount of time spent on the main thread we need to make sure to only draw what is necessary. This will avoid unnecessary blocks when scrolling and ensure a smooth, uninterrupted experience.

To start, let's take a look at the following image: adding annotations

The leftmost screenshot is a snapshot of the map immediately before the region has been changed. The annotations in this snapshot are exactly the mapView's current annotations. We will call this the before set.

The rightmost screenshot is a snapshot of the map immediately after the region changed. This snapshot is exactly what we get back from clusteredAnnotationsWithinMapRect:withZoomScale:. We will call this the after set.

We want to keep all the annotations which both sets have in common (set intersection). Get rid of the ones which are not in the after set and add the ones which are left.

- (void)updateMapViewAnnotationsWithAnnotations:(NSArray *)annotations
{
    NSMutableSet *before = [NSMutableSet setWithArray:self.mapView.annotations];
    NSSet *after = [NSSet setWithArray:annotations];

    // Annotations circled in blue shared by both sets
    NSMutableSet *toKeep = [NSMutableSet setWithSet:before];
    [toKeep intersectSet:after];

    // Annotations circled in green
    NSMutableSet *toAdd = [NSMutableSet setWithSet:after];
    [toAdd minusSet:toKeep];

    // Annotations circled in red
    NSMutableSet *toRemove = [NSMutableSet setWithSet:before];
    [toRemove minusSet:after];

    // These two methods must be called on the main thread
    [[NSOperationQueue mainQueue] addOperationWithBlock:^{
        [self.mapView addAnnotations:[toAdd allObjects]];
        [self.mapView removeAnnotations:[toRemove allObjects]];
    }];
}

With this we are sure to be doing as little work as possible on the main thread, which will improve scrolling performance.

Next we are going to look at how to draw the annotations to make them reflect their different counts. Then we will add the final touches which will allow the app to stand toe-to-toe with the best out there.

Drawing the annotations

Because we are reducing the number of data points a user can see, we need to make each of those points reflect how much data we really have.

We are going to make a circular annotation with the cluster count in the middle. The circle's size will reflect the number of hotels in that cluster.

To do this we need to find an equation which will allow us to scale values on the order of 1 to 500+. We will use this equation:

scale equation, alpha = 0.3, beta = 0.4

Here lower values of x will grow quickly and tail off as x gets larger. Beta will control how fast our values go to 1, while alpha affects the lower bounds (in our case we want the smallest cluster (count of 1) to be 60% of our largest size).

static CGFloat const TBScaleFactorAlpha = 0.3;
static CGFloat const TBScaleFactorBeta = 0.4;

CGFloat TBScaledValueForValue(CGFloat value)
{
    return 1.0 / (1.0 + expf(-1 * TBScaleFactorAlpha * powf(value, TBScaleFactorBeta)));
}

- (void)setCount:(NSUInteger)count
{
    _count = count;

    // Our max size is (44,44)
    CGRect newBounds = CGRectMake(0, 0, roundf(44 * TBScaledValueForValue(count)), roundf(44 * TBScaledValueForValue(count)));
    self.frame = TBCenterRect(newBounds, self.center);

    CGRect newLabelBounds = CGRectMake(0, 0, newBounds.size.width / 1.3, newBounds.size.height / 1.3);
    self.countLabel.frame = TBCenterRect(newLabelBounds, TBRectCenter(newBounds));
    self.countLabel.text = [@(_count) stringValue];

    [self setNeedsDisplay];
}

This takes care of scaling our annotations; now lets make them look nice.

- (void)setupLabel
{
    _countLabel = [[UILabel alloc] initWithFrame:self.frame];
    _countLabel.backgroundColor = [UIColor clearColor];
    _countLabel.textColor = [UIColor whiteColor];
    _countLabel.textAlignment = NSTextAlignmentCenter;
    _countLabel.shadowColor = [UIColor colorWithWhite:0.0 alpha:0.75];
    _countLabel.shadowOffset = CGSizeMake(0, -1);
    _countLabel.adjustsFontSizeToFitWidth = YES;
    _countLabel.numberOfLines = 1;
    _countLabel.font = [UIFont boldSystemFontOfSize:12];
    _countLabel.baselineAdjustment = UIBaselineAdjustmentAlignCenters;
    [self addSubview:_countLabel];
}

- (void)drawRect:(CGRect)rect
{
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetAllowsAntialiasing(context, true);

    UIColor *outerCircleStrokeColor = [UIColor colorWithWhite:0 alpha:0.25];
    UIColor *innerCircleStrokeColor = [UIColor whiteColor];
    UIColor *innerCircleFillColor = [UIColor colorWithRed:(255.0 / 255.0) green:(95 / 255.0) blue:(42 / 255.0) alpha:1.0];

    CGRect circleFrame = CGRectInset(rect, 4, 4);

    [outerCircleStrokeColor setStroke];
    CGContextSetLineWidth(context, 5.0);
    CGContextStrokeEllipseInRect(context, circleFrame);

    [innerCircleStrokeColor setStroke];
    CGContextSetLineWidth(context, 4);
    CGContextStrokeEllipseInRect(context, circleFrame);

    [innerCircleFillColor setFill];
    CGContextFillEllipseInRect(context, circleFrame);
}

Final touches

Now that we have annotations which do a good job at representing our data, let's add a few final touches to make the app fun to use.

First we need to make an animation for the addition of new annotations. Without an animation, new annotations just appear out of nowhere and the whole experience is a bit jarring.

- (void)addBounceAnnimationToView:(UIView *)view
{
    CAKeyframeAnimation *bounceAnimation = [CAKeyframeAnimation animationWithKeyPath:@"transform.scale"];

    bounceAnimation.values = @[@(0.05), @(1.1), @(0.9), @(1)];

    bounceAnimation.duration = 0.6;
    NSMutableArray *timingFunctions = [[NSMutableArray alloc] init];
    for (NSInteger i = 0; i < 4; i++) {
        [timingFunctions addObject:[CAMediaTimingFunction functionWithName:kCAMediaTimingFunctionEaseInEaseOut]];
    }
    [bounceAnimation setTimingFunctions:timingFunctions.copy];
    bounceAnimation.removedOnCompletion = NO;

    [view.layer addAnimation:bounceAnimation forKey:@"bounce"];
}

- (void)mapView:(MKMapView *)mapView didAddAnnotationViews:(NSArray *)views
{
    for (UIView *view in views) {
        [self addBounceAnnimationToView:view];
    }
}

Next, we want to change our clustering grid based on how zoomed in we are on the map. We want to cluster smaller cells the closer we are. To do this we define the current zoom level of our map, where scale = mapView.bounds.size.width / mapView.visibleMapRect.size.width:

NSInteger TBZoomScaleToZoomLevel(MKZoomScale scale)
{
    double totalTilesAtMaxZoom = MKMapSizeWorld.width / 256.0;
    NSInteger zoomLevelAtMaxZoom = log2(totalTilesAtMaxZoom);
    NSInteger zoomLevel = MAX(0, zoomLevelAtMaxZoom + floor(log2f(scale) + 0.5));

    return zoomLevel;
}

Now that we have integer values for each zoomLevel:

float TBCellSizeForZoomScale(MKZoomScale zoomScale)
{
    NSInteger zoomLevel = TBZoomScaleToZoomLevel(zoomScale);

    switch (zoomLevel) {
        case 13:
        case 14:
        case 15:
            return 64;
        case 16:
        case 17:
        case 18:
            return 32;
        case 19:
            return 16;

        default:
            return 88;
    }
}

Now as we start zooming in you should see more, smaller clusters, until finally we see every hotel as a single count cluster.

Checkout out the complete working version of the app on GitHub.