Redmond Diary

By Andrew J. Brust

New York City Gets Data Happy

This past Monday, June 21st, the New York City Council Committee on Technology in Government held a hearing on its proposed legislation, known as Introduction 029-2010, that would require all City agencies to publish their data online, in "raw" form. The data would be available to private citizens who wished to analyze it, hobbyist developers who wished to work with it, and commercial entities looking to utilize it internally or create products that use and add value to it.

Such initiatives have already taken root in other jurisdictions, including the U.S. Federal Government. Its open data portal, at www.data.gov, serves as a very good example to state and local governments that wish to implement the same sort of good government transparency through technology.

At first blush, the legislation would appear difficult for anyone to oppose. Donn Morrill, chairman of the New York Technology Council (NYTECH), eloquently expressed this point of view in his testimony at the hearing:

"This should not be a contentious bill. None of you will lose a vote. None of you will lose an endorsement. None of you will lose a dollar in financing by supporting this bill. What you will gain is recognition from the community that your affirmative vote will open doors for enterprising companies to develop new and exciting ways to experience New York City."

In my opinion, Morrill’s quite right; enactment of the legislation should be a "no-brainer." The fact remains, however, that there has been resistance to it. A year ago I was present, and testified, at another City Council hearing on last year’s version of the same legislation. At that meeting, representatives from the Mayor’s Office, in their own testimony, discussed the then recently announced NYC Big Apps competition and the data feeds that were published to facilitate it. They proclaimed the Mayor’s "customer-centric" position that only select data should be published, because only select data would be of interest to City residents.

The Mayor’s thinking, at the time, was that investment of City resources in producing feeds of all non-privacy-protected City data would be impractical, given that only a relatively small subset of it would be used. Committee Chairperson Gale Brewer disagreed, explaining that determining which data was valuable prior to its publication was virtually impossible. Andrew Hoppin, the New York State Senate’s CIO (follow him on Twitter at @ahoppin), made a similar point in his testimony this week, appealing to the City to "Resist the temptation to adjudicate what [data] is of value and what is not of value."

I agree, and said so in my own testimony at this week’s hearing. I think the whole point of publishing government data is that seemingly mundane data can form the raw material of extremely useful information, be it related to health, economics, commerce or even potholes.

The good news is that at this year’s hearing, the Administration’s attitude seems to have changed. The recently appointed Commissioner of the City’s Department of Information Technology and Telecommunications (DoITT), Carole Post, essentially the City’s CIO, expressed her agreement in principle with the legislation’s aim of making all agency data available. Where Post took exception to the legislation was on the feasibility, in cost and effort, of making such a wide-ranging set of data available in a relatively short timeframe. This is a tough call. As I said in one of my live tweets during the hearing, the "ask" of DoITT is big and scary. But if they express that, it looks like they're stonewalling. For what it’s worth, I think Commissioner Post is not stonewalling. I think she’s dedicated to this effort, and wants to avoid over-promising and under-delivering.

Getting the data out can be difficult, both for reasons of bureaucracy and technology. If the data to be published is managed by a legacy mainframe application, getting it out there in digestible XML or CSV form may not be so easy. And even if it is, publishing static data is only the first step. Eventually the data should have an API (application programming interface) around it so that developers can query it interactively and, in some cases, create and submit data as well. Think about it: we don’t just want to get a list of crimes that took place. We want to be able to query that list by neighborhood or police precinct, median income level of the area, or severity of the crime, and also determine whether the rate of crime in that narrowed category is increasing or decreasing. Insurance companies, real estate agents, and makers of security products may want to know the same things. And the City itself should want to know too, so resources in the next fiscal year can be allocated appropriately.
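To make that kind of query concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the records are invented, and the field names (precinct, severity, year) and the helper functions are hypothetical rather than part of any actual City feed or API. It narrows a list of incidents to one precinct and severity, then compares two years to see which way the count is moving.

```python
from collections import Counter

# Invented sample records; the field names are assumptions, not a real City schema.
CRIMES = [
    {"precinct": 19, "severity": "felony", "year": 2008},
    {"precinct": 19, "severity": "felony", "year": 2009},
    {"precinct": 19, "severity": "felony", "year": 2009},
    {"precinct": 19, "severity": "misdemeanor", "year": 2009},
    {"precinct": 75, "severity": "felony", "year": 2008},
]


def yearly_counts(records, precinct, severity):
    """Count incidents per year for a single precinct and severity."""
    return Counter(
        r["year"]
        for r in records
        if r["precinct"] == precinct and r["severity"] == severity
    )


def trend(counts, earlier, later):
    """Say whether the count rose, fell, or held steady between two years."""
    diff = counts[later] - counts[earlier]
    if diff > 0:
        return "increasing"
    if diff < 0:
        return "decreasing"
    return "flat"


counts = yearly_counts(CRIMES, precinct=19, severity="felony")
print(trend(counts, 2008, 2009))
```

A real API would presumably run the filtering on the server and work with rates rather than raw counts, but the shape of the question is the same.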

Beyond such analytical inquiries, there should also be an API for reporting crimes, which would require the creation of data rather than its consumption. This could allow citizens to report crime, in real time, from their mobile phones, with the tap of a button in an app. The GPS in the phone could alert authorities to the precise location of the incident, and the phone's camera could even submit a photo. Maybe such technology could address the "bystander effect" allegedly exhibited in the infamous Kitty Genovese stabbing case of the 1960s. Imagine if New York City government used technology to obsolesce a phenomenon made prominent in its own jurisdiction. Removing that blemish would be a proud moment.
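As a rough illustration of the "create data" side, the sketch below assembles the kind of message a phone app might submit. The function name, the JSON field names and the idea of sending the photo as base64 text are all assumptions of mine; no such City endpoint or format exists in the legislation or the testimony.

```python
import base64
import json


def build_report(latitude, longitude, description, photo_bytes=None):
    """Build a JSON report body from a GPS fix, a note, and an optional photo.

    The field names here are hypothetical, chosen only for illustration.
    """
    if not -90 <= latitude <= 90 or not -180 <= longitude <= 180:
        raise ValueError("coordinates out of range")
    report = {
        "location": {"lat": latitude, "lon": longitude},
        "description": description,
    }
    if photo_bytes is not None:
        # JSON cannot carry raw bytes, so the photo is encoded as base64 text.
        report["photo_base64"] = base64.b64encode(photo_bytes).decode("ascii")
    return json.dumps(report)


body = build_report(40.7128, -74.0060, "Break-in in progress", b"abc")
print(body)
```

The app would then POST this body to whatever reporting endpoint the City chose to expose; authentication and abuse prevention are left out of the sketch entirely.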

I mentioned in my testimony that Microsoft's Open Government Data Initiative (OGDI) should be considered by the City as one platform for serving the City’s Open Data. OGDI is based on the Open Data Protocol (OData - itself based on numerous open Web standards) and Microsoft’s cloud platform, Windows Azure. OGDI is itself an Open Source starter kit that provides not just for the publication of data, but also for a programming API and the ability both to read and write data.
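For readers who haven't seen OData, a query is just a URL with options such as $filter and $top appended to it. The Python sketch below builds one. The service root, the "Crimes" entity set and the "Precinct" property are placeholders I made up; an actual OGDI deployment would define its own names.

```python
from urllib.parse import quote, urlencode


def odata_url(service_root, entity_set, filter_expr=None, top=None):
    """Compose an OData query URL using the $filter and $top query options."""
    options = {}
    if filter_expr:
        options["$filter"] = filter_expr
    if top is not None:
        options["$top"] = str(top)
    url = f"{service_root.rstrip('/')}/{entity_set}"
    if options:
        # Leave "$" unescaped in option names and encode spaces as %20.
        url += "?" + urlencode(options, quote_via=quote, safe="$")
    return url


# Placeholder host and names, for illustration only.
print(odata_url("https://example.invalid/v1", "Crimes", "Precinct eq 19", top=10))
```

Because the query is an ordinary URL, the same request can be made from a browser, a script or a mobile app, which is a large part of the protocol's appeal for open government data.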

Regardless of the protocol used, Open Data for this nation’s largest city is immensely important, and it's very heartening to see the Council and the Administration in relative agreement on this point. Should the legislation be enacted, and implemented, the opportunities for entrepreneurs and the potential benefit to the public, to business and to the City itself will likely be big, exciting, and an inspiration for municipalities across the world.

Posted by Andrew J. Brust on 06/24/2010

