New York City's Office of Technology Innovation provides a collection of useful APIs that let you access City data. For the past few weeks I have been playing with the APIs looking for useful application ideas and I've been using Google's colab product for that exploration. These APIs are free to access and you can sign up here.
If you are new to Colab, Google provides an introduction notebook that covers the basics. If you've used jupyter with Python you should be good to go. While Colab is frequently used for data science and AI, I think this is a great platform for building internal tools. For one specific type of user, users with lots of domain specific knowledge who may not know an API or tool, Colab is useful as a way of bundling instructions and code in a way that allows them to be productive.
Minimal code example for connecting to an end point
This is a minimal test script. It connects to an API, uses a secure way of holding the API keys and allows the user to visually set the parameters sent to the API. After execution, the JSON output is displayed. While this is fine, an a step above using CURL from the command line, this isn't what I would send to a user.
import requestsimport jsonimport datetimeimport urllib.parsefrom google.colab import userdatafrom_date = "2025-05-27" # @param {type: "date"}to_date = "2025-06-06" # @param {type: "date"}url = "https://api.nyc.gov/public/api/GetCalendar?fromdate={}&todate={}"api_url = url.format(urllib.parse.quote(from_date), urllib.parse.quote(to_date))subkey = userdata.get('Ocp-Apim-Subscription-Key')headers = {"Ocp-Apim-Subscription-Key": subkey,}response = requests.get(api_url, headers=headers)print(response.text)
This is what it looks like on screen. The left side has the python code while the UI for setting dates is on the right.
Storing secrets outside of notebooks
Colab lets you store API keys outside of the code and use a class for accessing the keys. I highly recommend doing this rather than pasting the keys in code. Sharing a Colab notebook does not share the keys so you will need to work with users to have them install API keys but once you install a key, it is usable in any of the notebooks the user has.
Adding exception handling and extracting out the API call
First step is to extract out a function to call the API and to create a short circuit test in case there is no API key set in secrets. After that, we include error handling and the libraries we will be using below. Normally, this is code you would hide and simply execute to set up.
import requestsimport jsonimport datetimeimport urllib.parsefrom google.colab import userdata
# fail quick if no API Key assert not not userdata.get('Ocp-Apim-Subscription-Key')
def call_api(from_date, to_date):
try:
url = "https://api.nyc.gov/public/api/GetCalendar?fromdate={}&todate={}" api_url = url.format(urllib.parse.quote(from_date), urllib.parse.quote(to_date)) subkey = userdata.get('Ocp-Apim-Subscription-Key') headers = { "Ocp-Apim-Subscription-Key": subkey, }
response = requests.get(api_url, headers=headers) response.raise_for_status() # Raise HTTPError for responses 4xx or 5xx
return response
except requests.exceptions.RequestException as e: print(f"Error fetching API: {e}") except Exception as e: print(f"Error while fetching API: {e}")
def extract_response(response):
try:
json_data = json.loads(response.text) return json_data
except json.JSONDecodeError as e: print(f"Error decoding JSON: {e}")
Using a form for data entry
In the same way you want to avoid putting any API keys in code you should avoid having users edit code to set variables. Colab has a really nice form system. When combined with the hide code option, you can build a tool where a domain expert can populate a request and execute it without any knowledge of the internals. Avoid having users put API keys in a form. Users might email screenshots or paste them in a bug tracker and they may not be careful about leaking data.
Below is the block that includes the selection of dates and executes the function call_api.
from_date = "2025-05-27" # @param {type: "date"}to_date = "2025-06-03" # @param {type: "date"}
# These are the two critical callsresponse = call_api(from_date, to_date)json_data = extract_response(response)
Now that we have a json response, we can figure out what we want to do with it.
Format output data
Colab has a nice data formatter for printing out tables with interactive features. Data can be sorted, filtered and exported via markdown. Overall, it's a pretty good feature set for basic reporting.
At this point the user has a way of executing the set up and then alternating between picking dates and using an interactive table. This is fine assuming all is well.
## Build interactive table##from pandas import DataFrame as df from google.colab import data_table
data_table.disable_dataframe_formatter()
table = []
for day in json_data["days"]: for item in day["items"]: item["day"] = day["today_id"] table.append(item)
df = df(table)order = ['day', 'type', 'status', 'details', 'exceptionName']df = df.reindex(columns=order)
df
You end up with a table like this. While not a full featured spreadsheet, it allows for a fair amount of manipulation of the data.
Building tests
It's nice when things work nicely but that isn't always the case and in almost all cases, the time it
Pytest is a great tool for testing and I use it for most the of the Python scripts I write. And while it is possible to run pytest in Colab, I normally wouldn't bother. The process for using it exceeds the value I get so I normally just use exception handlers for critical code where I need to know the cause of a failure and simple boolean evaluations to "test" the results of my code.
In this notebook I have loud errors for the three blockers that a user might encounter, lack of a security key, the API not responding and the API not returning JSON. In cases where everything runs but the chart can not be populated, I section called Test Results Dump.
Testing and reporting errors
The general form for testing software is Given, When, Expect. I don't remember where I learned that from but whenever I am writing tests, I keep those three words in that order in my head.
If I am given a URL, when I call the URL with these values I expect the following outcomes. In order to follow that best practice I need to rewrite the code to extract out some functions and break the code down into sections that can be executed individually. The section that calls the API and then runs the tests should be able to executed multiple times with different inputs. And it should be possible to export the results in a way that could be attached to a bug tracking system.
Normally I would follow the Right Bicep rules but I am only going to test the three things that would break my use case, which is a script to automatically add exceptions to my calendar.
- That there is a collection of items under "days" that is equal to the date range between from and to.
- That for any day, there are three items-Alternate Side Parking, Collections and Schools.
- If schools are closed, or trash pick up is delayed or suspended or alternate sides parking is suspended, that there is a non empty exception for the item.
Dumping test output
Executing the code in the Test Results Dump results in executing some test code that inspect the JSON that is returned to see if it conforms with the expectations. If a field is renamed or some other data is added then one or more tests will fail and hopefully lead to information that is useful in mitigating any issues.
This results in a blob of test that can be opened in an editor. While it isn't pretty, it can be read.
The bonus round
This is a list of items that I may not be needed but I think are worth mentioning.
Charts
Working with Google Sheets
Using custom runtimes
If you use Google Cloud, you can set up a Compute Engine instance as your back end. This is useful in cases where you have a secure cloud environment and you would like to a access the databases, APIs and other resources while using the same security.
Running a local copy
Colab allows you to export a notebooks to, and import a notebook from, GitHub. You can grab a copy of this example from here.
Both Visual Studio Code and IntelliJ have Python plugins that will execute Python notebooks. You'll have to adjust the way you store keys and remove the Google specific libraries but the notebook would run as expected.
Hope this is useful.
If you found this useful and want to encourage me to write more, feel free to comment either here or where ever you found this.
Comments
Post a Comment