At the Children’s Hospital of Pittsburgh’s Primary Care Center in Oakland, many patients rely on public transit to get to and from their doctor appointments. To support physician-researcher Dr. Ana Malinow in investigating the extent to which long travel times negatively impact appointment keeping, I created an ArcMap accessibility utility that estimates the transit time to the Oakland clinic from any location in Allegheny County. Compared with using Google Maps or another travel planning service, the ArcMap accessibility utility 1) can estimate thousands of travel times… Continue Reading “Transit Time to Clinic”

Having previously established that Pittsburgh’s buses do bunch, and having set up a technical platform for archiving and accessing historical bus service data, I sought to extend this inquiry by quantifying the frequency, timing, and location of bus bunching. The resulting project is available online at: http://bunching.github.io Dataset: Vehicle locations for all buses servicing routes 61, 71, P1, and G2 were recorded once every sixty seconds throughout March 2016, resulting in a dataset of some 1.5 million records of timestamped bus locations. Each location data… Continue Reading “The When and Where of Pittsburgh’s Bunched Buses”

I came across a quote yesterday in Cathy O’Neil and Rachel Schutt’s Doing Data Science that really resonates: “The best minds of my generation are thinking about how to make people click ads… That sucks.” ~Jeff Hammerbacher. One surprise about data science is that most data science jobs exist within the marketing departments of large corporations. Marketing departments have “big data” on their potential customers, a clear business case for hiring smart people to mine those insights, and budgets with which to pay those smart people. But I can’t… Continue Reading “Data Science Careers”

It’s a bit messy (time constraints!), but I recently put together a simple web page that displays the average time between bus arrivals for any PAAC Route 61A/61B/61C/61D stop. It also shows the average wait time and the excess wait time caused by variance in the spacing between arrivals. The website can be accessed at the following link: http://pittsburgh-bus-wait-times.appspot.com/
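To make the idea of excess wait concrete, here is a minimal sketch using the standard textbook result for passengers who show up at a stop at random: the expected wait is E[H²] / (2·E[H]), where H is the time between consecutive buses. That equals half the average headway plus an extra term, Var(H) / (2·E[H]), which is the part caused by uneven spacing. I am assuming this formula for illustration; the function name `wait_time_stats` and the input format (arrival times in minutes) are hypothetical and not necessarily what the web page uses.

```python
from statistics import mean, pvariance


def wait_time_stats(arrival_minutes):
    """Return (mean headway, average wait, excess wait), all in minutes.

    Assumes passengers arrive at the stop uniformly at random, so the
    expected wait is E[H^2] / (2 * E[H]) for headway H.
    """
    arrivals = sorted(arrival_minutes)
    # Headway = gap between each bus and the one before it.
    headways = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
    if not headways:
        raise ValueError("need at least two arrivals to compute a headway")

    mean_headway = mean(headways)
    # Extra wait attributable to uneven spacing: Var(H) / (2 * E[H]).
    excess_wait = pvariance(headways) / (2 * mean_headway)
    # Perfectly even service would give a wait of half the headway.
    average_wait = mean_headway / 2 + excess_wait
    return mean_headway, average_wait, excess_wait


if __name__ == "__main__":
    evenly_spaced = [0, 10, 20, 30, 40]
    bunched = [0, 2, 20, 22, 40]  # same number of buses, arriving in pairs
    print(wait_time_stats(evenly_spaced))
    print(wait_time_stats(bunched))
```

Both made-up schedules average one bus every ten minutes, but the bunched one produces a noticeably longer average wait, which is exactly the excess the page reports.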

As a fun exercise in my Data Science Pipeline class, I used my smartphone to collect location data for approximately three weeks. A built-in algorithm also attempted to determine my activity (e.g. riding in a vehicle, walking, etc.). By combining my location data with my timestamped activity data, I was able to produce maps of my travels and modes of transportation: Around Home / Squirrel Hill, Around Pittsburgh, and Around Pennsylvania. For more, view the live version of the project here: http://mark-where-and-what.appspot.com/
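For the curious, combining the two streams can be sketched as a nearest-timestamp join: each location fix gets the activity label recorded closest to it in time. This is only an illustration under my own assumptions, not the project's actual code. The tuple layouts, the function name `label_locations`, the `max_gap_s` cutoff, and the label strings are all hypothetical.

```python
from bisect import bisect_left


def label_locations(locations, activities, max_gap_s=120):
    """Attach the nearest-in-time activity label to each location fix.

    locations:  list of (timestamp_s, lat, lon)   -- assumed layout
    activities: list of (timestamp_s, label)      -- assumed layout
    Fixes with no activity record within max_gap_s seconds get None.
    """
    activities = sorted(activities)
    times = [t for t, _ in activities]
    labeled = []
    for ts, lat, lon in locations:
        i = bisect_left(times, ts)
        # Only the records just before and just after ts can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        best = min(candidates, key=lambda j: abs(times[j] - ts), default=None)
        if best is not None and abs(times[best] - ts) <= max_gap_s:
            label = activities[best][1]
        else:
            label = None
        labeled.append((ts, lat, lon, label))
    return labeled


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    fixes = [(100, 40.44, -79.92), (400, 40.45, -79.95), (1000, 40.00, -80.00)]
    acts = [(90, "walking"), (390, "in_vehicle")]
    for row in label_locations(fixes, acts):
        print(row)
```

The cutoff matters because the phone does not report an activity for every fix; leaving those points unlabeled is safer than stretching a stale label across a long gap.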

What began as a casual observation that Pittsburgh’s buses, when running late, often arrive in pairs turned into a data warehouse and an empirical investigation. Fellow students Bhavna, Ranjana, Rohita, Enbo, and I built a data warehouse to capture the real-time bus location data published by the Port Authority of Allegheny County. Our analysis of the data revealed that, indeed, buses do tend to “bunch.” I’m excited to share that this project and its results are currently featured on the CMU Students for Urban Data Analytics website. Continue Reading “Pittsburgh Bus Bunching Project Features on CMU Students for Urban Data Analytics Website”