Data September 28, 2016 3 min read

What Pokemon Go's outages teach about location data at scale

Pokemon Go is not a business application, but its infrastructure story in the summer of 2016 is a real lesson in what location data at scale actually costs.

Pokemon Go launched in July 2016 and immediately became one of the most discussed infrastructure failures of the year. Niantic was running a real-time location product at a scale almost no company had tried before, and the system collapsed repeatedly in the first weeks. Server errors became a running joke, but the underlying problem was serious: Niantic had a data model built for a certain load and got ten times that on day one.

I am not writing this to critique Niantic's engineering. I am writing it because several people I spoke to that summer said something along the lines of "well, that's a consumer app, that doesn't happen to us." I think that conclusion is drawn too quickly.

What went wrong technically

Pokemon Go is fundamentally a real-time location application. Every player is a moving coordinate. Every action - catching a creature, spinning a stop, entering a gym - requires a server round-trip with a precise geolocation. At peak, tens of millions of people were doing this every day.

The core challenge is that location data has three properties that most business data does not: it is high-frequency, it arrives continuously, and its value decays quickly. Yesterday's location of a delivery driver is almost meaningless; the location from one second ago is what matters. You cannot batch this the way you batch sales figures.
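To make the "value decays quickly" point concrete, here is a minimal sketch of a last-known-position store. It is not how Niantic built anything; the class name, the field names and the 60-second freshness window are my own assumptions for illustration. It keeps only the newest reading per device, discards updates that arrive late, and refuses to hand out a position that has gone stale.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Position:
    device_id: str
    lat: float
    lon: float
    recorded_at: float  # Unix timestamp in seconds, set by the device


class LastKnownPositions:
    """Hypothetical helper: keeps only the newest position per device."""

    def __init__(self, max_age_seconds: float = 60.0) -> None:
        # Assumed freshness window; pick it from your own latency requirement.
        self.max_age_seconds = max_age_seconds
        self._latest: Dict[str, Position] = {}

    def update(self, pos: Position) -> bool:
        """Store pos unless an equally new or newer reading is already held."""
        current = self._latest.get(pos.device_id)
        if current is not None and current.recorded_at >= pos.recorded_at:
            return False  # late or duplicate update: discard it
        self._latest[pos.device_id] = pos
        return True

    def get_fresh(self, device_id: str, now: float) -> Optional[Position]:
        """Return the position only if it is no older than max_age_seconds."""
        pos = self._latest.get(device_id)
        if pos is None or now - pos.recorded_at > self.max_age_seconds:
            return None
        return pos
```

A store like this answers "where is the driver now?" without touching history at all, which is often the only question the live screen needs answered.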

Niantic used Google Cloud infrastructure and worked with Google to scale up, but the initial model had not anticipated this data velocity. The architecture had to be rebuilt under live fire.

Why this matters for non-consumer applications

Several real business domains have the same pattern: field service management, logistics and fleet tracking, industrial sensor monitoring, and asset management in warehouses or ports. These are not toy problems.

If you have 200 field technicians reporting location every 30 seconds and you are storing all of that in a transactional relational database designed for order management, you will eventually hit the same wall - just more slowly and less visibly than Niantic did.
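The arithmetic behind that example is worth doing once. The sketch below uses the 200 technicians and the 30-second interval from the paragraph above; the 100 bytes per row (including index overhead) is purely my assumption, so substitute a figure measured on your own schema.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_day(devices: int, interval_seconds: int) -> int:
    """Number of location rows written per day if every device reports on schedule."""
    return devices * (SECONDS_PER_DAY // interval_seconds)


def storage_bytes(rows: int, bytes_per_row: int = 100) -> int:
    """Rough storage estimate; bytes_per_row is an assumed figure, not a measurement."""
    return rows * bytes_per_row


daily = rows_per_day(devices=200, interval_seconds=30)
yearly = daily * 365

print(daily)   # 576000
print(yearly)  # 210240000
print(round(storage_bytes(yearly) / 1e9, 1), "GB per year at the assumed row size")
```

More than half a million rows a day is not a big-data problem, but it is very likely far more than the order tables sitting next to it will ever receive, and it never stops.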

The questions to ask before building

If your application has a location or sensor data component, the design questions are:

What is the update frequency per device, and how many devices are expected at peak?
What is the actual retention requirement - do you need six months of historical tracks, or only the last known position?
What queries will you run against this data - nearest-neighbour lookups, history replays, geofence triggers?
Is the read latency requirement strict, or is a few seconds of lag acceptable?

The answers to those questions determine whether a standard relational database is adequate, or whether you need a time-series store, a spatial database extension, or a streaming pipeline in front of the database.
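Two of the query types above, geofence triggers and nearest-neighbour lookups, can be sketched in a few lines with the haversine formula. The function names and the in-memory dictionary of positions are assumptions for illustration. The point of the sketch is the last function: it scans every device on every call, which is fine for 200 technicians and is exactly the part a spatial index or spatial database extension replaces at larger scale.

```python
import math
from typing import Dict, Optional, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_geofence(lat: float, lon: float,
                    center: Tuple[float, float], radius_m: float) -> bool:
    """True if the point lies within radius_m of the circular fence's centre."""
    return haversine_m(lat, lon, center[0], center[1]) <= radius_m


def nearest_device(lat: float, lon: float,
                   positions: Dict[str, Tuple[float, float]]) -> Optional[str]:
    """Linear scan over all devices; returns the id of the closest one, or None."""
    if not positions:
        return None
    return min(positions, key=lambda d: haversine_m(lat, lon, *positions[d]))
```

If the answer to "how many devices at peak" is in the thousands and the lookup runs on every request, the linear scan is the first thing to go.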

A practical recommendation

Separate your location data from your transactional data early. Even if you store both in PostgreSQL, keep them in different tables with different index strategies, different retention policies, and a clear plan for what happens when the location table grows ten times faster than the rest of the database.
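As a rough sketch of what that separation looks like, the example below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere. The table names, column names and retention window are hypothetical. In PostgreSQL you would more likely partition the location table by time and drop old partitions rather than delete rows, but the shape of the idea is the same: a narrow append-only table with its own index and its own retention rule, kept apart from the transactional tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Transactional data: few rows, updated in place, kept indefinitely.
CREATE TABLE work_orders (
    id            INTEGER PRIMARY KEY,
    technician_id INTEGER NOT NULL,
    status        TEXT    NOT NULL
);

-- Location data: append-only, high volume, limited retention.
CREATE TABLE location_pings (
    technician_id INTEGER NOT NULL,
    recorded_at   INTEGER NOT NULL,  -- Unix timestamp in seconds
    lat           REAL    NOT NULL,
    lon           REAL    NOT NULL
);

-- Index chosen for "latest position" and "history for one technician" queries.
CREATE INDEX idx_pings_tech_time ON location_pings (technician_id, recorded_at);
""")


def purge_old_pings(conn: sqlite3.Connection, now: int, retention_seconds: int) -> int:
    """Apply the retention policy to location data only; returns rows deleted."""
    cur = conn.execute(
        "DELETE FROM location_pings WHERE recorded_at < ?",
        (now - retention_seconds,),
    )
    conn.commit()
    return cur.rowcount


def last_known_position(conn: sqlite3.Connection, technician_id: int):
    """Newest (lat, lon, recorded_at) row for one technician, or None."""
    return conn.execute(
        "SELECT lat, lon, recorded_at FROM location_pings "
        "WHERE technician_id = ? ORDER BY recorded_at DESC LIMIT 1",
        (technician_id,),
    ).fetchone()
```

Because the retention job only ever touches location_pings, you can tighten or loosen it later without going anywhere near the order data.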

The Pokemon Go story is an extreme case. But the engineering failure it illustrates - underestimating the volume and velocity of continuous data - is common. The lesson is not "don't build location features." It is "price the infrastructure honestly before you commit to the data model."

