I am an avid Google user. I think most of their products are great. I love Google Maps, Calendar, Drive, etc. I used to share my location through Google Latitude. With Latitude, you could ping your friends and they could share their location with you. That product has been retired for a couple of years now, but thanks to that and never turning off my GPS sharing to Google on my smartphone, I have an extensive record of my location through Google.
I don't know why I never turned it off (curiosity, perhaps), but in case you don't know, Google Location History can be turned on or off depending on your preferences (see screenshot above). If you use Google and are signed in, the link to your location history (timeline) is here: https://www.google.com/maps/timeline. I am not quite sure what Google is doing with my location data, but I do like how accessible and transparent they make it to download my own data. My friend tried downloading his location data from Apple and was unsuccessful. I also like the fact that I could tell Google to stop collecting location data, or I could even erase my location history (see screenshot below). To be honest though, I'm not sure whether they'd keep a copy of it on their end (would they delete my historical data permanently?). Being the data fan that I am, I opt for both parties having the location data. Not sure why, but I trust you, Google.
On to the data post: On November 5th, I decided to download a copy of all my location data. The location history came in a zipped JSON file with about 7.3 million rows, which included data on latitude, longitude, time, accuracy, velocity, heading, altitude, and vertical accuracy. I'm not sure what those last 5 variables are, so I ignored them. I parsed the JSON file using Stata and kept one latitude and longitude set per timestamp, leaving me with 765,250 rows of data. The earliest observation is from December of 2010! That's almost 5 years' worth of data.
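For anyone curious what "one latitude and longitude set per timestamp" can look like in practice, here is a minimal sketch on made-up data. It is not necessarily what the code at the end of this post does (there, the filtering comes from how the file is split into lines). The variable names timestampMs, latitudeE7, and longitudeE7 are assumptions about what the parsed fields are called, and row is a purely illustrative helper.

```stata
* Made-up rows standing in for parsed location data (values are invented)
clear
input double timestampMs long latitudeE7 long longitudeE7
1446710400000 377749000 -1224194000
1446710400000 377749100 -1224194100
1446710460000 340522000 -1182437000
end

* Remember the original order so that "first row" is well defined
gen long row = _n

* Within each timestamp, keep only the first row that appeared
bysort timestampMs (row): keep if _n == 1

* Checks: timestampMs now uniquely identifies rows, and two rows remain
isid timestampMs
assert _N == 2
list timestampMs latitudeE7 longitudeE7
```

Storing the timestamp as a double matters here: millisecond timestamps are too large for Stata's long type and would lose precision as a float.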
Below, I used Stata to map the Google location points that fall within the contiguous United States. As expected, it looks very similar to the map above that Google gives me in my timeline, and it includes points where I'm traveling. As you can see, I drove from Florida to California once, and many times between Southern California and Northern California. What interests me most is seeing the timestamps that go along with all the location data. From this data, one could deduce:
You get the picture. With all this, Google can make suggestions and say:
Well, it never did tell me the last one, but the first two suggestions were definitely made. If Google were to create or purchase an activity tracker such as Fitbit, it would certainly know when you're sleeping (given that you have a Charge HR) and how well you sleep. When you take a walk or a run, do you take your phone? I mean, the possibilities are endless. Google would know absolutely everything:

Email: what you write and what you buy.
Search engine: what you search for, think about, and talk about.
Calendar: what you do.
Maps: where you are.
Fitness tracker: how active you are.

Downloading my location history was a blast from the past, and I plan on seeing what else I uncover. I'll keep you posted on any additional discoveries I make in my own data. I've included the Stata code I used below:

/**************************************************************************
Author: Belen Chavez
Purpose: Parse JSON file and clean Google Location data
***************************************************************************/
clear
version 14.1
cd "c:\users\bchavez\desktop"
global fname LocationHistory.json
tempfile f1 f2 f3

* Split records onto separate lines, then join each record's fields onto one line
filefilter $fname `f1', f("}, {") t("\n")
filefilter `f1' `f2', f(",\n") t(",")
filefilter `f2' `f3', f(\n\n) t(\n)

import delimited `f3', delimiter(",")
drop in 1/2
forv i = 1/8{
    replace v`i' = trim(v`i')
}

* Keep only rows that hold a timestamp, a latitude, and a longitude
drop if v1==""
keep if regexm(v1,"timestamp")==1
keep if regexm(v2,"lat")
assert regexm(v3,"lon")
drop v4-v8

* Strip the quotes, rename each variable after its JSON key, and drop the key from the values
forv i = 1/3{
    replace v`i' = subinstr(v`i', `"""', "", .)
    local name = substr(v`i', 1, strpos(v`i', " :")-1)
    cap ren v`i' `name'
    replace `name' = subinstr(`name', "`name' :","",.)
}
destring *, replace

* Coordinates are stored as degrees times 10^7
replace latit = latit/10000000
replace longi = longi/10000000

* Timestamps are milliseconds since 1970; Stata clocks count from 1960. Shift by 8 hours for Pacific time
replace times = (clock("1970", "Y")-clock("1960", "Y"))+times-8*60*60*1000
format times %tc
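To make the unit conversions at the end of the code above a bit more concrete, here is a small worked example on a single made-up record, meant to be run as a do-file. The variable names are assumed field names, and the coordinate and timestamp values are invented. Note that the fixed 8-hour shift matches Pacific Standard Time and does not adjust for daylight saving time.

```stata
* One made-up record in the raw units of the export
clear
set obs 1
gen double timestampMs = 1446681600000   // ms since 01jan1970 00:00:00 UTC
gen long latitudeE7  = 377749000         // degrees times 10^7
gen long longitudeE7 = -1224194000

* Degrees = E7 value / 10^7
gen double lat = latitudeE7 / 10000000
gen double lon = longitudeE7 / 10000000

* Stata's %tc clock counts ms since 01jan1960, so add the 1960-to-1970 gap
gen double utc = timestampMs + (clock("1970", "Y") - clock("1960", "Y"))

* Fixed 8-hour shift, as in the code above (no daylight saving adjustment)
gen double local_time = utc - 8*60*60*1000
format utc local_time %tc

* Sanity checks on the made-up record
assert utc == clock("05nov2015 00:00:00", "DMYhms")
assert local_time == clock("04nov2015 16:00:00", "DMYhms")
list lat lon utc local_time
```

The gap between the two epochs is 3,653 days, or 315,619,200,000 milliseconds, which is why the time variables need to be stored as doubles.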
Author: My name is Belen. I like to play with data using Stata during work hours and in my free time. I like blogging about my Fitbit, Stata, and random musings.