Blog by Sumana Harihareswara, Changeset founder
The Programmer Experience: Redundancy Edition
Hi, reader. I wrote this in 2017 and it's now more than five years old. So it may be very out of date; the world, and I, have changed a lot since I wrote it! I'm keeping this up for historical archive purposes, but the me of today may 100% disagree with what I said then. I rarely edit posts after publishing them, but if I do, I usually leave a note in italics to mark the edit and the reason. If this post is particularly offensive or breaches someone's privacy, please contact me.
Sunday morning, one thought led to another, and I realized I'd like to read a US nonprofit's Form 990 (annual report on assets, revenue, etc.). I started looking for 990s, and stumbled upon a big set of machine-readable 990s from 2011 through sometime in 2016. Is the one I seek in there? How recently was the dataset updated?
So I broke out bpython, the json module, and wget and started figuring it out.
Some things I learned/realized/remembered along the way:
Use this database to view summaries of 3 million tax returns from tax-exempt organizations and see financial details such as their executive compensation and revenue and expenses. You can browse IRS data released since 2013 and access over 9.6 million tax filing documents going back as far as 2001.
For example, attractively presented data, and full 990s, for the Organization for Transformative Works.
So, uh, turns out I didn't need to actually get all "make a new GitLab repo! how does Arrow work? gotta refactor this!" and so on, but ah well. Now I will have a fresh new project to use for testing as I delve into Python packaging and test Warehouse, the new Python Package Index.
* Update: Mike Linksvayer, on Identi.ca, edifies: "The archived gitorious floss-foundations filings repo is now actively maintained by Martin Michlmayr at https://gitlab.com/floss-foundations/npo-public-filings".