5 ★ Open Data

Tim Berners-Lee, the inventor of the Web and initiator of Linked Data, suggested a 5-star deployment scheme for Open Data. Here, we give an example for each star level and explain the costs and benefits that come with it.

5-star steps by example

By Example …

Below, we provide examples for each level of Tim’s 5-star Open Data plan. The example data used throughout is ‘the temperature forecast for Galway, Ireland for the next 3 days’:

  • ★
    • make your stuff available on the Web (whatever format) under an open license
    • example …
  • ★★
    • make it available as structured data (e.g., Excel instead of an image scan of a table)
    • example …
  • ★★★
    • make it available in a non-proprietary open format (e.g., CSV instead of Excel)
    • example …
  • ★★★★
    • use URIs to denote things, so that people can point at your stuff
    • example …
  • ★★★★★
    • link your data to other data to provide context
    • example …

Costs & Benefits …

What are the costs & benefits of ★ Web data?

As a consumer …

  • ✔ You can look at it.
  • ✔ You can print it.
  • ✔ You can store it locally (on your hard drive or on a USB stick).
  • ✔ You can enter the data into any other system.
  • ✔ You can change the data as you wish.
  • ✔ You can share the data with anyone you like.

As a publisher …

  • ✔ It’s simple to publish.
  • ✔ You do not have to explain repeatedly to others that they can use your data.

“It’s great to have the data accessible on the Web under an open license (such as PDDL, ODC-by or CC0), however, the data is locked-up in a document. Other than writing a custom scraper, it’s hard to get the data out of the document.”

What are the costs & benefits of ★★ Web data?

As a consumer, you can do everything you can do with ★ Web data and additionally:

  • ✔ You can directly process it with proprietary software to aggregate it, perform calculations, visualise it, etc.
  • ✔ You can export it into another (structured) format.

As a publisher …

  • ✔ It’s still simple to publish.

“Splendid! The data is accessible on the Web in a structured way (that is, machine-readable), however, the data is still locked-up in a document. To get the data out of the document you depend on proprietary software.”

What are the costs & benefits of ★★★ Web data?

As a consumer, you can do everything you can do with ★★ Web data and additionally:

  • ✔ You can manipulate the data in any way you like, without the need to own any proprietary software package.

As a publisher …

  • ⚠ You might need converters or plug-ins to export the data from the proprietary format.
  • ✔ It’s still rather simple to publish.

“Excellent! The data is not only available via the Web but now everyone can use the data easily. On the other hand, it’s still data on the Web and not data in the Web.”
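
To illustrate the consumer benefit above, here is a minimal sketch of reading ★★★ data with nothing but a standard library. The column names and the temperature values are assumptions made up for this illustration; they are not the actual Galway forecast.

```python
import csv
import io

# Hypothetical three-day forecast in CSV form; all values are invented.
FORECAST_CSV = """day,min_celsius,max_celsius
1,3,8
2,4,9
3,2,7
"""

def load_forecast(text):
    """Parse CSV text into a list of dicts with numeric temperatures."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "day": int(row["day"]),
            "min_celsius": float(row["min_celsius"]),
            "max_celsius": float(row["max_celsius"]),
        })
    return rows

def warmest_day(rows):
    """Return the row with the highest maximum temperature."""
    return max(rows, key=lambda r: r["max_celsius"])

forecast = load_forecast(FORECAST_CSV)
print(warmest_day(forecast)["day"])  # prints 2
```

No spreadsheet application is involved at any point, which is the practical difference between ★★ and ★★★ data.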

What are the costs & benefits of ★★★★ Web data?

As a consumer, you can do everything you can do with ★★★ Web data and additionally:

  • ✔ You can link to it from any other place (on the Web or locally).
  • ✔ You can bookmark it.
  • ✔ You can reuse parts of the data.
  • ✔ You may be able to reuse existing tools and libraries, even if they only understand parts of the pattern the publisher used.
  • ⚠ Understanding the structure of an RDF “Graph” of data can take more effort than tabular (Excel/CSV) or tree (XML/JSON) data.
  • ✔ You can combine the data safely with other data. URIs are a global scheme, so if two things have the same URI then it’s intentional, and if so that’s well on its way to being 5-star data!

As a publisher …

  • ✔ You have fine-grained control over the data items and can optimise their access (load balancing, caching, etc.).
  • ✔ Other data publishers can now link into your data, promoting it to 5 star!
  • ⚠ You typically invest some time slicing and dicing your data.
  • ⚠ You’ll need to assign URIs to data items and think about how to represent the data.
  • ⚠ You need to either find existing patterns to reuse or create your own.

“Wonderful! Now it’s data in the Web. The (most important) data items have a URI and can be shared on the Web. A native way to represent the data is using RDF, however other formats such as Atom can be converted/mapped, if required.”
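
As a rough sketch of what assigning URIs to data items can look like, the snippet below mints one URI per forecast day and serialises the values as N-Triples lines. The `example.org` namespace, the vocabulary terms and the temperature values are all hypothetical; a real publisher would pick its own domain and preferably reuse an existing vocabulary.

```python
# Hypothetical namespaces, chosen only for this illustration.
BASE = "http://example.org/forecast/galway/day/"
VOCAB = "http://example.org/vocab#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# Invented sample data: (day, min, max) in degrees Celsius.
ROWS = [(1, 3, 8), (2, 4, 9), (3, 2, 7)]

def item_uri(day):
    """Mint a stable URI for one forecast day."""
    return f"{BASE}{day}"

def to_triples(rows):
    """Turn each row into (subject, predicate, object) statements."""
    triples = []
    for day, low, high in rows:
        subject = item_uri(day)
        triples.append((subject, VOCAB + "minCelsius", low))
        triples.append((subject, VOCAB + "maxCelsius", high))
    return triples

def to_ntriples(triples):
    """Serialise the statements as N-Triples with typed literals."""
    return "\n".join(
        f'<{s}> <{p}> "{o}"^^<{XSD_DECIMAL}> .' for s, p, o in triples
    )

print(to_ntriples(to_triples(ROWS)))
```

Because every day has its own URI, a consumer can bookmark or link to a single data item instead of the whole file.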

What are the costs & benefits of ★★★★★ Web data?

As a consumer, you can do everything you can do with ★★★★ Web data and additionally:

  • ✔ You can discover more (related) data while consuming the data.
  • ✔ You can directly learn about the data schema.
  • ⚠ You now have to deal with broken data links, just like 404 errors in web pages.
  • ⚠ Presenting data from an arbitrary link as fact is as risky as letting people include content from any website in your pages. Caution, trust and common sense are all still necessary.

As a publisher …

  • ✔ You make your data discoverable.
  • ✔ You increase the value of your data.
  • ✔ Your own organisation will gain the same benefits from the links as the consumers.
  • ⚠ You’ll need to invest resources to link your data to other data on the Web.
  • ⚠ You may need to repair broken or incorrect links.

“Brilliant! Now it’s data in the Web, linked to other data. Both the consumer and the publisher benefit from the network effect.”
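
The link-maintenance cost mentioned above can be sketched as follows: collect the outbound links in a set of statements, then check which of them still resolve. The namespaces, the `place` and `source` terms and the retired-station URI are hypothetical, and the DBpedia URI for Galway is assumed as a plausible link target. The resolver is a stub; in practice it might issue an HTTP request.

```python
# Hypothetical namespaces and link targets for this illustration.
BASE = "http://example.org/forecast/galway/day/"
VOCAB = "http://example.org/vocab#"
GALWAY = "http://dbpedia.org/resource/Galway"        # assumed link target
RETIRED = "http://example.net/retired-station"       # invented dead link

TRIPLES = [
    (BASE + "1", VOCAB + "maxCelsius", 8),
    (BASE + "1", VOCAB + "place", GALWAY),
    (BASE + "1", VOCAB + "source", RETIRED),
]

def external_links(triples, own_prefix):
    """Return object URIs that point outside the publisher's namespace."""
    return [
        o for _s, _p, o in triples
        if isinstance(o, str)
        and o.startswith("http")
        and not o.startswith(own_prefix)
    ]

def broken_links(links, resolves):
    """Return the links for which the supplied checker reports failure."""
    return [uri for uri in links if not resolves(uri)]

# Stub checker: pretend only the Galway resource still resolves.
known_good = {GALWAY}
links = external_links(TRIPLES, BASE)
print(broken_links(links, lambda uri: uri in known_good))
```

Passing the checker in as a function keeps the sketch runnable offline and makes the link-repair step easy to test.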

See Also

Kudos to Andy Seaborne for pointing out the CSV bug, to Kerstin Forsberg for suggesting the ‘data highlighting’ in the 4/5-star examples, as well as to Vassilios Peristeras for proposing to explain not only the ‘what’ but also the ‘why’. Thanks to Egon Willighagen for providing more details about benefits of one-star data. Additional contributions from Christopher Gutteridge. The background picture of Tim Berners-Lee was taken by Paul Clarke and licensed under the Creative Commons Attribution-Share Alike 4.0 International license. This site was originally brought to you by the EC FP7 Support Action LOD-Around-The-Clock (LATC), and now brought to you independently by James G. Kim and Michael Hausenblas.