Redmond Diary

By Andrew J. Brust

New York City Gets Data Happy

This past Monday, June 21st, the New York City Council Committee on Technology in Government held a hearing on its proposed legislation, known as Introduction 029-2010, that would require all City agencies to publish their data online, in "raw" form. The data would be available to private citizens who wished to analyze it, hobbyist developers who wished to work with it, and commercial entities looking to utilize it internally or create products that use and add value to it.

Such initiatives have already taken root in other jurisdictions, including the U.S. Federal Government. Its open data portal, at www.data.gov, serves as a very good example to state and local governments that wish to implement the same sort of good government transparency through technology.

At first blush, the legislation would appear difficult for anyone to oppose. Donn Morrill, chairman of the New York Technology Council (NYTECH), eloquently expressed this point of view in his testimony at the hearing:

"This should not be a contentious bill. None of you will lose a vote. None of you will lose an endorsement. None of you will lose a dollar in financing by supporting this bill. What you will gain is recognition from the community that your affirmative vote will open doors for enterprising companies to develop new and exciting ways to experience New York City."

In my opinion, Morrill's quite right; enactment of the legislation should be a "no-brainer." The fact remains, however, that there has been resistance to it. A year ago I was present, and testified, at another City Council hearing on last year's version of the same legislation. At that meeting, representatives from the Mayor's Office, in their own testimony, discussed the then recently announced NYC Big Apps competition and the data feeds that were published to facilitate it. They proclaimed the Mayor's "customer-centric" position that only select data should be published, because only select data would be of interest to City residents.

The Mayor’s thinking, at the time, was that investment of City resources in producing feeds of all non-privacy-protected City data would be impractical, given that only a relatively small subset of it would be used. Committee Chairperson Gale Brewer disagreed, explaining that determining which data was valuable prior to its publication was virtually impossible. Andrew Hoppin, the New York State Senate’s CIO (follow him on Twitter at @ahoppin) made a similar point in his testimony this week, appealing to the City to "Resist the temptation to adjudicate what [data] is of value and what is not of value."

I agree, and said so in my own testimony at this week’s hearing. I think the whole point of publishing government data is that seemingly mundane data can form the raw material of extremely useful information, be it related to health, economics, commerce or even potholes.

The good news is that at this year's hearing, the Administration's attitude seems to have changed. The recently appointed Commissioner of the City's Department of Information Technology and Telecommunications (DoITT), Carole Post, essentially the City's CIO, expressed her agreement in principle with the legislation's aim of making all agency data available. Where Post took exception to the legislation is in the cost and effort feasibility of making such a wide-ranging set of data available in a relatively short timeframe. This is a tough call. As I said in one of my live tweets during the hearing, the "ask" of DoITT is big and scary. But if they express that, it looks like they're stonewalling. For what it's worth, I think Commissioner Post is not stonewalling. I think she's dedicated to this effort, and wants to avoid over-promising and under-delivering.

Getting the data out can be difficult, both for reasons of bureaucracy and technology. If the data to be published is managed by a legacy mainframe application, getting it out there in digestible XML or CSV form may not be so easy. And even if it is, publishing static data is only the first step. Eventually the data should have an API (application programming interface) around it so that developers can query it interactively and, in some cases, create/submit data as well. Think about it: we don't just want to get a list of crimes that took place. We want to be able to query that list by neighborhood or police precinct, median income level of the area, or severity of the crime, and determine also whether the rate of crime in that narrowed category is increasing or decreasing. Insurance companies, real estate agents, and makers of security products would likely want that information as well. And the City itself should want to know too, so resources in the next fiscal year can be allocated appropriately.
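To make the idea of interactive querying concrete, here is a minimal sketch in Python of how a client might build a filtered query against such a feed. The endpoint URL and field names (`Precinct`, `Severity`) are hypothetical, but the `$filter`/`$top` query options follow the OData convention discussed later in this post:

```python
from urllib.parse import urlencode

def build_crime_query(base_url, precinct=None, severity=None, top=100):
    """Build an OData-style query URL against a hypothetical crime-data feed.

    Each supplied filter becomes a clause; clauses are joined with 'and',
    per the OData $filter convention.
    """
    clauses = []
    if precinct is not None:
        clauses.append(f"Precinct eq {precinct}")
    if severity is not None:
        clauses.append(f"Severity eq '{severity}'")
    params = {"$top": top}
    if clauses:
        params["$filter"] = " and ".join(clauses)
    return f"{base_url}?{urlencode(params)}"

# e.g. all felony records for the 19th Precinct:
url = build_crime_query("https://data.example.nyc/Crimes",
                        precinct=19, severity="felony")
```

The point of the sketch is that once data is behind a query API rather than a static file, narrowing by precinct or severity is a one-line change on the client side, not a new data extract from the agency.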

Beyond such analytical inquiries, there should also be an API for reporting crimes (which would require creation of data, not just consumption of it). This could allow citizens to report crime, in real time, from their mobile phones, with the tap of a button in an app. The GPS in the phone could alert authorities to the precise location of the incident, and the phone's camera could even submit a photo. Maybe such technology could address the "bystander effect" allegedly exhibited in the infamous Kitty Genovese stabbing case of the 1960s. Imagine if New York City government used technology to obsolesce a phenomenon made prominent in its own jurisdiction. Removing that blemish would be a proud moment.
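A write API of that kind might accept something as simple as a JSON payload carrying the phone's GPS fix and an optional photo reference. The field names below are hypothetical, purely to illustrate the shape such a submission could take:

```python
import json

def make_incident_report(lat, lon, description, photo_id=None):
    """Assemble a JSON payload for a hypothetical incident-reporting API.

    lat/lon would come from the phone's GPS; photo_id would reference an
    image uploaded separately from the phone's camera.
    """
    report = {
        "location": {"lat": lat, "lon": lon},
        "description": description,
    }
    if photo_id is not None:
        report["photo_id"] = photo_id
    return json.dumps(report)

payload = make_incident_report(40.7128, -74.0060, "disturbance in progress")
```

The client would then POST that payload to the reporting endpoint; the server side, of course, is where the real work of validation, routing, and dispatch would live.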

I mentioned in my testimony that Microsoft's Open Government Data Initiative (OGDI) should be considered by the City as one platform for serving the City's Open Data. OGDI is based on the Open Data Protocol (OData - itself based on numerous open Web standards) and Microsoft's cloud platform, Windows Azure. OGDI is itself an Open Source starter kit that provides not just publication of data, but also a programming API and the ability both to read and write data.

Regardless of the protocol used, Open Data for this nation's largest city is immensely important, and it's very heartening to see the Council and the Administration in relative agreement on this point. Should the legislation be enacted, and implemented, the opportunities for entrepreneurs and the potential benefit to the public, to business and to the City itself will likely be big, exciting, and an inspiration for municipalities across the world.

Posted by Andrew J. Brust on 06/24/2010

