It’s no secret that Google is a data hoarder. It wants to know everything about you, including your location, so it can provide relevant experiences, from driving directions to targeted ads. To be less creepy and more transparent about the data it has collected on you, Google provides a way to export your data. In this blog post, I will export my own location data and show you how to put it to use.

Typically, when working with GIS data, you have to manage a lot of GPS points. Nowadays, working with a dataset of over a million data points is a walk in the park. Elasticsearch is a powerful search database that makes querying these large datasets fast and simple. Combined with its support for the geo_point data type and built-in geo functions, it is well positioned as a tool for visualizing geospatial information. Kibana is the visualization tool for Elasticsearch.

So, let’s dive into how you can put “your” (let’s be honest, it’s Google’s) location history data to use. 

Export Your Location Data

Head over to Google’s Takeout service at https://takeout.google.com/ to get started. Here you can export data for all of Google’s products; download the whole kitchen sink if you want. For this article, we will only be discussing data from the “Location History” service. Make sure that box is checked, then scroll down to the bottom of the page and select next. Choose the download format and click “Create Export”. Google will send you an email with a link to download your data. Depending on the amount of data you export, it may take minutes or hours to receive your download. I got an email within a few minutes. The total size of my download was around 30MB.

Inspect Your Data

Inside the extracted data I found a huge file named ‘Location History.json’ and an ‘archive_browser.html’ file. I opened the latter in my browser and found some very helpful information about the structure of the JSON data. Here is a sample of a single data point:

{
 "timestampMs": "1583870424946",
 "latitudeE7": 392699588,
 "longitudeE7": -778482117,
 "accuracy": 22,
 "velocity": 0,
 "heading": 0,
 "altitude": 120,
 "verticalAccuracy": 2,
 "activity": [
   {
     "timestampMs": "1583870364378",
     "activity": [
       {
         "type": "STILL",
         "confidence": 99
       },
       {
         "type": "UNKNOWN",
         "confidence": 1
       }
     ]
   },
   {
     "timestampMs": "1583870425031",
     "activity": [
       {
         "type": "STILL",
         "confidence": 99
       },
       {
         "type": "UNKNOWN",
         "confidence": 1
       }
     ]
   },
   {
     "timestampMs": "1583870485297",
     "activity": [
       {
         "type": "STILL",
         "confidence": 99
       },
       {
         "type": "UNKNOWN",
         "confidence": 1
       }
     ]
   }
 ]
}

There is a lot of interesting data here. For the visualizations, I am going to look at three fields: velocity, accuracy, and altitude. I want to show how these values change over time and visualize the data on a map.
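A minimal sketch of pulling just those three fields out of a record might look like this. The helper name `pickMetrics` is hypothetical, and the sketch assumes a field may simply be absent from a record, in which case it is reported as null.

```javascript
// Hypothetical helper: keep only the fields charted in this post.
function pickMetrics(record) {
  return {
    // timestampMs is a string of epoch milliseconds in the export.
    time: new Date(Number(record.timestampMs)),
    // A field that is absent from the record becomes null.
    accuracy: record.accuracy ?? null,
    velocity: record.velocity ?? null,
    altitude: record.altitude ?? null,
  };
}

// Values taken from the sample data point above.
const sample = { timestampMs: '1583870424946', accuracy: 22, velocity: 0, altitude: 120 };
console.log(pickMetrics(sample));
```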

Import Your Data into Elasticsearch

Before I can begin visualizing the data in Kibana, I need to get it into an Elasticsearch cluster. For my purposes, I installed Elasticsearch and Kibana on my laptop. You can do the same, use the official Docker images, or use Elastic Cloud to spin up a cluster. I wrote a quick Node.js script to import the data. Elasticsearch is a document-based database that stores data in a JSON-like syntax, which makes going from the JSON file to Elasticsearch very easy.

There is only one main data transformation that I need to make before I can upload the data. If you looked closely at the sample data point, you would have noticed that the latitude and longitude values are labeled “latitudeE7” and “longitudeE7” and hold 9-digit integers. In the archive_browser.html document I learned that Google turns each coordinate into an integer by multiplying it by 10^7. Elasticsearch expects degrees, so I need to divide the latitude and longitude by 10,000,000 to move the decimal point back to the proper spot. For example, 392699588 becomes 39.2699588.

The last thing I need to do before uploading is to create an index in Elasticsearch with the proper field type mappings. Here is the mapping that I used. The most important things to note are that timestampMs is a date and location is a geo_point.

mappings: {
 properties: {
   timestampMs: { type: 'date' },
   location: { type: 'geo_point' },
   accuracy: { type: 'integer' },
   velocity: { type: 'integer' },
   heading: { type: 'integer' },
   altitude: { type: 'integer' },
   verticalAccuracy: { type: 'integer' },
   activity: {
     properties: {
       timestampMs: { type: 'date' },
       type: { type: 'text'},
       confidence: { type: 'integer' }
     }
   }
 }
} 
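The coordinate conversion described above can be sketched in a few lines of JavaScript. This is an illustration rather than the actual import script: the function names are hypothetical, and the default index name `locations` is an assumption borrowed from the Timelion expression later in this post. The second function only builds the newline-delimited body that Elasticsearch’s `_bulk` endpoint accepts; sending it is left out.

```javascript
// Sketch only: function names are hypothetical, not from the original script.
const E7 = 1e7; // Google stores degrees multiplied by 10^7

// Replace latitudeE7/longitudeE7 with a geo_point-style `location` object.
// All other fields are passed through unchanged.
function toDocument(record) {
  const { latitudeE7, longitudeE7, ...rest } = record;
  return {
    ...rest,
    location: { lat: latitudeE7 / E7, lon: longitudeE7 / E7 },
  };
}

// Build a bulk request body: one action line plus one document line per
// record, each terminated by a newline.
function toBulkBody(records, index = 'locations') {
  const lines = [];
  for (const record of records) {
    lines.push(JSON.stringify({ index: { _index: index } }));
    lines.push(JSON.stringify(toDocument(record)));
  }
  return lines.join('\n') + '\n';
}

const body = toBulkBody([
  { timestampMs: '1583870424946', latitudeE7: 392699588, longitudeE7: -778482117, accuracy: 22 },
]);
console.log(body);
```

With over a million records, a body like this would most likely need to be sent in batches rather than as a single request.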

Visualizing Your Data in Kibana

Kibana is a really cool and powerful visualization tool. You can explore your data with ad hoc queries, create interesting dashboards, create reports, create a map display, manage your Elasticsearch cluster, and much more. There is so much that at times it can feel a little daunting. I will focus on just a couple of basic visualizations.

To start, here is a screenshot of a dashboard that I created.

My first stop when creating this dashboard was to get a count of the data points I was working with. The extracted JSON file that I uploaded was 502MB. I used a metric visualization to display a simple count in the upper left-hand corner. Google has been collecting location data on me since 2012. For a couple of years, from 2013 to 2014, I owned an iPhone instead of an Android. But from mid-2014 until today, I have owned an Android with Google Maps installed, which has produced just over a million data points to work with. (Did I mention Google is a data hoarder?)

Next, I wanted to map the GPS coordinates to visualize the places I have been over the last eight years. Kibana, by default, uses OpenStreetMap tiles. There is an option to use a different WMS map server if you want to. For this exploration, I used the default. It also provides several marker options for the points. I played with circles and a geohash grid, but ultimately decided that the heat map was the coolest.

Next, I wanted to look at how accuracy, velocity, and altitude have changed over time. I used the Timelion visualization in Kibana, which was built specifically for time series data. I created three Timelion expressions, one per field; the Timelion documentation covers the syntax in more detail. Each of my expressions was very simple. Here is an example: `.es(index=locations, metric=avg:velocity,timefield=timestampMs)`. For the other two, I replaced the metric value with the aggregation:field that I wanted displayed. This resulted in a visualization that looked like this:

What is noteworthy is that Google does not seem to have started gathering velocity and altitude data until roughly November 2017.
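Since the three Timelion expressions differ only in the field name, they can be generated from one template. The helper below is a hypothetical illustration, assuming the `locations` index and the `timestampMs` time field from the example expression.

```javascript
// Hypothetical helper: build the Timelion expression for one field.
function timelionExpression(field, aggregation = 'avg') {
  return `.es(index=locations, metric=${aggregation}:${field}, timefield=timestampMs)`;
}

// One expression per charted field.
for (const field of ['accuracy', 'velocity', 'altitude']) {
  console.log(timelionExpression(field));
}
```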

Two other interesting points in my location history over the last year were a spike in accuracy and a spike in altitude. I was curious to find out where I was when each happened. Let’s dig into it!

First, the accuracy spike: I narrowed my date range filter to just the time during the spike. It turns out it was while I was on a company off-site down in the Caribbean. Now I know I can get a break from life and Google in the Caribbean! Piña Coladas, anyone?
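Outside Kibana, the same kind of narrowing can be expressed as an Elasticsearch range query. The sketch below only builds the request body; the function name is hypothetical and the dates are placeholders, not the actual dates of the trip.

```javascript
// Hypothetical helper: request body for the records inside a time window,
// largest accuracy values first.
function spikeQuery(from, to, size = 10) {
  return {
    size,
    query: {
      bool: {
        // Filter context: restrict to the window without scoring.
        filter: [{ range: { timestampMs: { gte: from, lte: to } } }],
      },
    },
    sort: [{ accuracy: { order: 'desc' } }],
    _source: ['timestampMs', 'location', 'accuracy'],
  };
}

// Placeholder dates for illustration only.
console.log(JSON.stringify(spikeQuery('2019-06-01', '2019-06-08'), null, 2));
```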

Next, I looked at the spike in altitude. This one was less surprising. I have family that lives in Southern Utah, which has a higher elevation than the east coast, so our 2-week trip out there caused a bump in my average altitude.

Conclusion

Hopefully, you found this exploration as interesting as I did. I learned even more about the Elastic Stack, which I know will benefit our clients. It was also eye-opening to learn that Google is storing way more data about me than I realized. Maybe it is time to go back to Apple?


