
Feng Gao

Open Data advocate and Open Knowledge's ambassador for China. Researching, designing and coding for change. Available for freelance work.


Notes on translating the Open Data Handbook

The Open Data Handbook is a crowdsourced document, coordinated by the Open Knowledge Foundation, that guides the practice of Open Data. Last week, I spent two days translating the document into Simplified Chinese as part of my OKFNCN work. Hopefully the new Chinese version of the handbook will help interested people and organizations in China understand Open Data much better.

This post records my notes on that translation work.

Problems with the Translation Tools

OKF uses Sphinx to extract the text from the original handbook and crowdsources the translation on Transifex. However, there are three existing problems:

  1. Some strings are missing from the extracted file produced by Sphinx. For instance, the three bullet items in the “make-data-available” section are missing. This issue has already been reported on GitHub (#86), but it seems no action has been taken to fix it.
    In addition, the glossary is not actually included in the file, so if you do translate its terms, that causes problems when you build the translation.

  2. Translating strings that contain Sphinx markup such as “:term:”, “:doc:” and “:ref:” causes problems. When you build the translation, the engine cannot render the markup correctly and treats it as plain text.

  3. Transifex is not a great platform for translating documents. I, at least, am more comfortable translating documents paragraph by paragraph. Transifex, however, shows strings in a seemingly random order, so you lose the broader context of the words you are translating.
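For the second problem, one cheap safeguard is to check that each translated string still carries the same Sphinx roles as its source before building. The sketch below is only an illustration and not part of the handbook's tooling: the function names and the sample strings are hypothetical, and it compares role names only, so it will not notice a role whose target was changed.

```python
import re

# Matches Sphinx inline roles such as :term:`Open Data` or :doc:`index`.
ROLE_PATTERN = re.compile(r":(\w+):`([^`]*)`")


def role_names(text):
    """Return the sorted role names (term, doc, ref, ...) found in text."""
    return sorted(name for name, _ in ROLE_PATTERN.findall(text))


def roles_preserved(source, translation):
    """True if the translation uses exactly the same roles as the source."""
    return role_names(source) == role_names(translation)


# Hypothetical sample strings, not taken from the handbook.
source = "See :doc:`index` and :term:`Open Data`."
good = "参见 :doc:`index` 和 :term:`Open Data`。"
# A stray space after :doc: and full-width colons around "term" break both roles.
broken = "参见 :doc: `index` 和 ：term：`Open Data`。"

print(roles_preserved(source, good))    # True
print(roles_preserved(source, broken))  # False
```

Stray spaces and full-width punctuation typed by a Chinese input method are easy to miss by eye, which is why a mechanical check like this can be useful.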

Notes on Building the Translation

Detailed instructions on how to build the HTML pages of your translation are now available in the repository. The process is pretty straightforward if you have basic git knowledge and know how to work with the command line.

If you want to create a PDF version, I suggest trying the LaTeX approach first (see here), then the rst2pdf approach (tutorial here).

The problems I ran into are mostly related to the Chinese language. The first is that rst2pdf does not support Chinese out of the box, so you need to create a special Chinese.style file to tell the engine which Chinese fonts to use in place of the default standard fonts. A tutorial is available here in Chinese.
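As a rough idea of what such a style file can contain, here is a minimal sketch that generates one. It rests on assumptions: I am treating the rst2pdf stylesheet as JSON with the keys "embeddedFonts", "fontsAlias" and "styles" and a "wordWrap": "CJK" setting, and the font file and font name are hypothetical placeholders. Check all of this against the rst2pdf manual and the fonts installed on your machine.

```python
import json


def build_cjk_stylesheet(font_file, font_name):
    """Build a stylesheet dict that swaps the standard fonts for one CJK font.

    The key names follow my understanding of the rst2pdf stylesheet format
    and should be treated as an assumption, not a reference.
    """
    return {
        # Font file to embed in the PDF (hypothetical placeholder).
        "embeddedFonts": [font_file],
        # Point every standard font alias at the Chinese font.
        "fontsAlias": {
            "stdFont": font_name,
            "stdBold": font_name,
            "stdItalic": font_name,
            "stdBoldItalic": font_name,
            "stdMono": font_name,
        },
        "styles": [
            # Wrap lines using CJK rules instead of splitting on spaces.
            ["base", {"wordWrap": "CJK"}],
            ["literal", {"wordWrap": "None"}],
        ],
    }


# Hypothetical font; replace it with one that exists on your system.
stylesheet = build_cjk_stylesheet("MyChineseFont.ttf", "MyChineseFont")
print(json.dumps(stylesheet, indent=2, ensure_ascii=False))
```

Saving the printed output as Chinese.style and passing that file to rst2pdf through its stylesheet option should then be enough for a first test build.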

The second problem is the way rst2pdf renders the text. This is partly related to the template the engine uses, and it raises a new TODO item for the handbook: we need to design a good-looking template to produce nicely formatted PDFs. For Chinese it is more complicated, since the language differs from English and needs additional care. Unfortunately, I am not an expert on this topic, so I have given up on it for now.