Early insight 2: Definitions of open data

To examine the use and possible impact of open data in South African higher education governance, the project is currently conducting interviews with university planners, higher education studies researchers, the Department of Higher Education and Training, and other stakeholders.

As this process unfolds, early glimpses of possible findings and insights are revealed -- some expected, others surprising. Some of these insights will be shared on the ODDC website, and this is the first of these early glimpses. What is shared here should not be taken as definitive or final.

In collecting data for the OpenUCT ODDC team's most recent paper, Viscous open data: The flow of data in a public university governance ecosystem, we confronted what appears to be a common problem: how to define open data. In our case, the definition was important because how we defined open data would determine which datasets would be classified as open and which ones as closed in our open data ecosystem. And at a recent ODDC regional meeting in Cape Town, the issue of definitions also reared its head (and not for the first time at such meetings).

At an open session at the ICTD 2013 conference in Cape Town, one of the items up for discussion is how to define open data. Based on our research, we offer the following: at its most basic level, the definition of open data is that of the Open Definition: "A piece of data or content is open if anyone is free to use, reuse, and redistribute it – subject only, at most, to the requirement to attribute and/or share-alike." It is important that there is at least a basic common definition across all contexts to ensure a shared understanding and a clear differentiation between open data and other forms of data.

However, the broader definitions offered by the likes of the ODDC, the Open Government Partnership (OGP) and others with an interest in the impact of open data indicate, we believe, that perceived impact and context shape what is included in the definition. For example, because the OGP is concerned with the potential of open data to improve transparency and accountability, criteria such as granularity and timeliness feature in its definition. In the case of research done by McKinsey on the economic impact of open data, granularity and timeliness are jettisoned because only four criteria are regarded as important when evaluating the impact of an open data set on innovation and performance. The ODDC criteria reveal the importance of context: the most contested criteria are those that are hardest to meet in a developing country context. These include criteria such as machine-readable data and data that is interoperable. (See the table below for the variances across three broader definitions of open data.)

Our proposed broader definition of open data is therefore one that takes both the impact and context of the open data into consideration:


Open data = Open definition + (impact/context)
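
One way to read this formula is as a two-part check: a fixed baseline taken from the Open Definition, plus a set of extra criteria that varies with the impact and context of interest. The Python sketch below is only an illustration of that reading, not a method used in the research. The `Dataset` class, the `evaluate` function, and the criterion labels such as "timely" and "granular" are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

# Baseline freedoms named in the Open Definition quoted in the text.
OPEN_DEFINITION = ("use", "reuse", "redistribute")


@dataclass
class Dataset:
    """Hypothetical record describing one dataset (illustrative only)."""
    name: str
    freedoms: set = field(default_factory=set)    # freedoms granted to anyone
    properties: set = field(default_factory=set)  # e.g. {"timely", "granular"}


def meets_open_definition(ds):
    """True if the dataset grants every baseline freedom."""
    return all(f in ds.freedoms for f in OPEN_DEFINITION)


def evaluate(ds, context_criteria):
    """Return (meets baseline, sorted list of unmet context criteria)."""
    unmet = sorted(set(context_criteria) - ds.properties)
    return meets_open_definition(ds), unmet


# Assumed example: a transparency-focused context that adds two criteria.
transparency = {"timely", "granular"}
enrolments = Dataset("enrolments", {"use", "reuse", "redistribute"}, {"timely"})
is_open, unmet = evaluate(enrolments, transparency)
print(is_open, unmet)
```

Under these assumptions, the same dataset can satisfy the basic definition while still falling short of the broader, context-specific one, which is the distinction the formula is meant to capture.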


3 Definitions of Open Data

| ODDC Open Data 10-point Evaluation | 8 principles of Open Government Data | McKinsey open data criteria |
|---|---|---|
| Does the data exist? | Data must be complete | Data must be accessible |
| Is it available online in digital form? | Data must be primary | Data must be machine readable |
| Is the data machine readable? | Data must be timely | Data accessible free or at negligible cost |
| Is the data available in bulk? | Data must be accessible | Data must be license-free |
| Is the dataset available free of charge? | Data must be machine readable | |
| Is the data openly licensed? | Access must be non-discriminatory | |
| Is the dataset up-to-date? | Data formats must be non-proprietary | |
| Is the publication of the dataset sustainable? | Data must be license-free | |
| Was it easy to find information on the dataset? | | |
| Are linked data URIs provided? | | |
| Context: Developing countries | Context: Multiple national | Context: Seven domains at national level (education, transportation, healthcare, etc.) |

Key: cells in the original table are shaded as Contested or Unique.