Open Data in China: Gov's thoughts
The Chinese government's interest in data is something of a mystery. In my last post, I wrote that the Chinese government has started working on Big Data and that two cities, Shanghai and Beijing, have established data portals. However, it is not clear whether the government just wants to turn data, the asset it holds, into tradable goods for business, or whether it actually wants to open the data for civic innovation.
Last Sunday, I came across an announcement of an international e-government forum, posted on Weibo (the Chinese Twitter) by Associate Prof. Zheng Lei of Fudan University. The open data/open government track caught my eye and I decided to attend. It turned out to be well worth my time.
The track included six talks, given by two professors, two students and, interestingly, two staff members from government projects. One speaker was Mr Tang DingChun, the deputy head of the Open Government Information department (i.e. the Freedom of Information department) of the Shanghai Government Office. He talked about the story behind Shanghai’s data portal and the local government’s plans.
According to Mr Tang, the Shanghai government has been interested in open data since 2011. At that time, its top leaders thought opening data could be interesting, but they needed expert advice on whether it was feasible and how to do it. So a local expert wrote a long report for the government discussing why opening data is important, how it can improve local people’s quality of life and how the government can achieve it. The leaders were convinced, and the then deputy mayor (now the mayor), Mr Yang Xiong, decided to run an open data pilot program involving nine departments.
It is never easy to run such a program. There was a lot of debate about what kind of data could be opened and how. In particular, officers did not want to risk releasing data that might reveal personal information or business secrets. So they finally decided to open relatively low-risk data, namely location data, in the pilot program. They then built the data portal to showcase the data and released it at the end of 2012 for a public beta test.
Mr Tang said that data management inside the government has long been quite messy. First, not every file, document or dataset is managed by computer, so the Chinese government does not have a strong data infrastructure. Second, for a long time there has been no cooperation between departments on data management. Each department considers itself the owner of its data and does not want to share it, even within the government. Some departments might not even know what kind of data another department holds, so there is a lot of redundant data and data management is quite inefficient.
Therefore, to truly open data, the Shanghai government first wants to get the big picture of what kinds of data it really holds. An internal data resource registry was established and rolled out to several departments last year. Those departments were required to register what data they hold and describe each dataset in detail (e.g. the detailed schema of the dataset). The Shanghai government plans to ask all its departments to register their data resources on this platform and hopes to have a complete picture of its datasets by the end of 2014. After that, they will think about how to change their data management strategy and how to really open the data.
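To make the registry idea a bit more concrete, here is a minimal Python sketch of what a registered entry and a simple redundancy check might look like. Everything in it is my own assumption for illustration: the talk did not describe the real platform's design, and the names `Column`, `DatasetEntry`, `Registry` and `possible_duplicates`, as well as the sample departments and datasets, are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One field in a dataset's schema (hypothetical structure)."""
    name: str
    dtype: str


@dataclass
class DatasetEntry:
    """What a department might register for each dataset it holds."""
    department: str
    title: str
    columns: list[Column]
    update_frequency: str = "unknown"


class Registry:
    """In-memory stand-in for an internal data resource registry."""

    def __init__(self) -> None:
        self._entries: list[DatasetEntry] = []

    def register(self, entry: DatasetEntry) -> None:
        # Require a schema, since entries are meant to be described in detail.
        if not entry.columns:
            raise ValueError("a dataset entry needs at least one column")
        self._entries.append(entry)

    def possible_duplicates(self) -> list[list[DatasetEntry]]:
        """Group entries from different departments with identical column names."""
        by_schema: dict[frozenset[str], list[DatasetEntry]] = defaultdict(list)
        for entry in self._entries:
            by_schema[frozenset(c.name for c in entry.columns)].append(entry)
        return [
            group
            for group in by_schema.values()
            if len({e.department for e in group}) > 1
        ]


# Made-up sample data, only to show the check in action.
registry = Registry()
school_columns = [Column("name", "text"), Column("address", "text")]
registry.register(DatasetEntry("Education", "School locations", school_columns))
registry.register(DatasetEntry("Planning", "Schools by district", school_columns))
registry.register(DatasetEntry("Health", "Hospital locations",
                               [Column("name", "text"), Column("beds", "integer")]))

for group in registry.possible_duplicates():
    print([f"{e.department}: {e.title}" for e in group])
```

Comparing column names is a deliberately crude heuristic. It only flags candidates that a person would still need to review, but it shows why a shared registry with schemas makes redundant holdings visible at all.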
The wonderful talk sadly came to an end due to time limits, but Mr Tang did have time to answer three interesting questions. The first was about how the Shanghai government can make open data sustainable. Mr Tang revealed that the government had thought about the problem, and one idea it came up with is to allow departments to charge fees for some datasets in the early stage of the open data initiative. With the possible financial benefits, departments will be willing to “open” data, and later, once the open data ecosystem in Shanghai becomes more mature, all datasets will finally be made free of charge.
The second question was about how to ensure that data released by the government is accurate and up to date. Mr Tang said that when they started the open data pilot program, they negotiated with each department about how frequently it should update its data. In future, they will take into account both citizens’ opinions on update frequency and the capacity of each department when making that decision.
The last question raised an important issue that is somewhat related to the first one: are there any restrictions on using the open data? Mr Tang gave his personal opinion that every dataset held by the government should be opened (excluding, of course, any datasets containing business secrets or personal information) and should be free of charge, with no restrictions on who can use the data or how. But will they use an explicit license to guarantee it? That was not clear.
After Mr Tang’s talk, the chief engineer of Shanghai’s Real-time Air Quality Reporting System, Mrs Fu QingYan, talked about the story behind the system. The talk was not directly related to open data, but it discussed one important issue: how public bodies can present data in a way that the general public can understand. Mrs Fu introduced the idea of Air Baby, a lovely little baby whose different facial expressions represent the six levels of air quality. (For instance, if the baby cries, the air quality is poor.) A real-time snapshot of the view of the Bund is another smart tool they use to communicate efficiently with non-professional citizens. Citizens do not need to understand the exact PM2.5 concentration; they just need to look at the snapshot to quickly understand how good or bad the air quality is.
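The Air Baby idea boils down to a lookup from a number to one of six faces. Below is a rough Python sketch of that mapping. It assumes the six level boundaries of China's national AQI scheme (50, 100, 150, 200 and 300); the talk did not say which thresholds the Shanghai system actually uses. The expression names are also my own invention, apart from the fact that a crying baby stands for poor air.

```python
from bisect import bisect_left

# Assumed upper bounds of levels 1 to 5 (national AQI scheme);
# any value above 300 falls into level 6.
AQI_UPPER_BOUNDS = [50, 100, 150, 200, 300]

# Hypothetical expressions, ordered from best air (level 1) to worst (level 6).
EXPRESSIONS = ["laughing", "smiling", "neutral", "frowning", "sad", "crying"]


def air_quality_level(aqi: int) -> int:
    """Return the air quality level (1 = best, 6 = worst) for an AQI value."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    # bisect_left counts how many upper bounds lie strictly below the value,
    # which is exactly the number of levels the value has already passed.
    return bisect_left(AQI_UPPER_BOUNDS, aqi) + 1


def air_baby_expression(aqi: int) -> str:
    """Return the (hypothetical) Air Baby expression for an AQI value."""
    return EXPRESSIONS[air_quality_level(aqi) - 1]


for sample in (35, 120, 320):
    print(sample, air_quality_level(sample), air_baby_expression(sample))
```

The point of such a design is that citizens only ever see the face, while the thresholds stay in one small table that the agency can adjust without changing anything else.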
In the Q&A, I asked when they plan to release the air quality data as open data; both CSV and an API would be welcome. She said that the data can in fact be downloaded, but you need to do a lot of searching to find it. (Unfortunately, I still have not found it. Help needed!) Regarding the API, she said they have no plans for one yet, since they have strong concerns about how third parties might interpret and present the data. She said many developers present the data wrongly and sometimes exaggerate the difference between the Chinese government’s air quality data and the US embassy’s data. Her worry is reasonable, since many people in China believe the government underreports air pollution. However, the opposite might be true according to Bu Shujian’s master’s thesis project, which surprisingly found that the Chinese air quality data is actually more accurate (see more in this post). So she thinks that until the open data ecosystem becomes more mature and people can interpret the data more rationally, they will not take action to open it.
So these are the two stories shared by government staff, and I really appreciated this wonderful opportunity to hear their thoughts on open data. However, I do think the Shanghai government should announce these plans publicly, with appropriate media coverage, so that it can engage citizens in actually using the data and in working with the government to grow the open data ecosystem together. More importantly, the Shanghai government must explicitly make policy to back up open data so that everyone can have confidence in both publishing and using data. There is a lot of work to be done both inside and outside the government. Will the initiative succeed? Time will tell.
P.S. As a member of the open data community, I hope this post at least helps you understand the current status of the open data initiative in China. Your thoughts and advice are very welcome and appreciated. Please either comment below or drop me a line at firstname.lastname@example.org.
Let's Share and Discuss
Thanks for reading this post. I hope you find it interesting and useful.
Please note that this post is published under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 license.