Three years ago, I made a real-time map of Rutgers buses as a side project. It was a messy PHP script that grabbed data from the NextBus API and displayed it on a Leaflet map. The map worked nicely, but the code was an embarrassment, so I decided to try again and rewrote it.
There's an instance for Rutgers buses at rutge.rs, but this version supports every NextBus-tracked bus agency. Just set a config parameter!
The goal was to take data from the NextBus API feed, store it in a database, and display it on a map. I wanted the database cache for two reasons:
- I can record historical data for future data science projects.
- The NextBus data has tons of quirks. Normalizing it server-side and storing the result allows for multiple front-ends (a mobile app?) without re-writing the normalization code each time.
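To make the normalization idea concrete, here is a minimal sketch of turning a raw vehicle feed into typed records. It is not the code PyBusMap uses: the function name `normalize_vehicles`, the output field names, and the sample values are all made up for illustration, and treating a negative heading as "unknown" is an assumption about the feed.

```python
import xml.etree.ElementTree as ET

# Sample payload shaped like a NextBus vehicle location response.
# All values here are invented for illustration.
SAMPLE_XML = """<body>
  <vehicle id="4021" routeTag="a" dirTag="a_out" lat="40.5231" lon="-74.4582"
           secsSinceReport="12" predictable="true" heading="218"/>
  <vehicle id="4107" routeTag="lx" lat="40.5008" lon="-74.4474"
           secsSinceReport="3" predictable="false" heading="-4"/>
</body>"""


def normalize_vehicles(xml_text, fetched_at):
    """Turn raw <vehicle> elements into typed dicts (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    vehicles = []
    for el in root.iter("vehicle"):
        heading = int(el.get("heading", "-1"))
        vehicles.append({
            "id": el.get("id"),
            "route": el.get("routeTag"),
            # None when the feed omits the direction attribute.
            "direction": el.get("dirTag"),
            "lat": float(el.get("lat")),
            "lon": float(el.get("lon")),
            # Convert "seconds since report" into an absolute timestamp.
            "reported_at": fetched_at - int(el.get("secsSinceReport", "0")),
            "predictable": el.get("predictable") == "true",
            # Assumption: a negative heading means the heading is unknown.
            "heading": heading if heading >= 0 else None,
        })
    return vehicles


vehicles = normalize_vehicles(SAMPLE_XML, fetched_at=1400000000)
```

Doing this once on the server means every front-end receives numbers, booleans, and absolute timestamps instead of re-parsing strings.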
Being familiar with SQLAlchemy, I was excited to try it out on a clean-slate project. I used Flask because it provides URL routing, templating, and a slew of other features without getting in my way.
Celery is used here to sync new data from NextBus on a schedule. It kicks off a data update job every few seconds for vehicle locations and predictions, and once a day for route and stop information.
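A schedule like this is typically expressed as a Celery beat schedule. The sketch below is only an illustration: the task names and the 10-second interval are assumptions, not the values PyBusMap actually uses.

```python
from datetime import timedelta

# Hypothetical task names and intervals, for illustration only.
BEAT_SCHEDULE = {
    "update-vehicle-locations": {
        "task": "celerytasks.update_vehicle_locations",
        "schedule": timedelta(seconds=10),
    },
    "update-predictions": {
        "task": "celerytasks.update_predictions",
        "schedule": timedelta(seconds=10),
    },
    "update-routes-and-stops": {
        "task": "celerytasks.update_routes_and_stops",
        "schedule": timedelta(days=1),
    },
}
```

The dictionary would be assigned to the Celery app's `beat_schedule` setting (`CELERYBEAT_SCHEDULE` in older Celery versions), and the beat process then queues each task at its interval.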
Some data can only be fetched with large batches of requests. These are sent asynchronously using requests-futures.
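The batching pattern looks roughly like the sketch below. requests-futures runs requests on a thread pool, so to keep the example runnable without a network connection I've used the standard library's `concurrent.futures` directly with a stand-in fetch function. The function names and stop tags are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_predictions(stop_tag):
    """Stand-in for one HTTP request; a real version would call the API."""
    return {"stop": stop_tag, "predictions": []}


def fetch_batch(stop_tags, max_workers=8):
    """Run one fetch per stop concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_predictions, stop_tags))


# Hypothetical stop tags.
results = fetch_batch(["scott", "hillw", "libofsci"])
```

With real requests, the time for a batch is bounded by the slowest responses in flight rather than the sum of all of them, which matters when a single update needs dozens of calls.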
- app.py uses Flask to serve the user-facing webpage with the map, as well as the AJAX endpoint providing real-time data.
- celerytasks.py is the entry point for the Celery tasks, which use the NextBus logic to populate the database.
- In addition to these, manage.py has a few functions that can be run from the command line to initially populate the database, watch NextBus API quota usage, and more.
I have a few things I'd like to add to PyBusMap, such as routing (building the quickest route to a destination and providing directions).