This post is an extension of the conversation on the issues raised in the post "Map data sets as CC 4.0". I am creating a separate thread, since the question is more general than the specific issue that post is about.
Data represent facts, and facts cannot be copyrighted. A compilation of data, however, can be an expression, and an expression can be copyrighted. These are some take-home points that one can gather from any 101 course on copyright.
This raises serious philosophical questions about what a fact is and how to represent one. When I say “Tom is a cat.”, it is certainly an expression of a fact. It is not in a spreadsheet or a table. But it is possible to make a generic table for it with three columns: subject, predicate and object. I can then add more rows to represent more facts, e.g. “Jerry is a rat”, as well as “Tom chases Jerry”. By now I am already telling a story through a table. Now, is this compilation of facts sufficient to say that it is an expression of a story and therefore copyrightable? Apart from this, there are multiple ways in which the same data can be represented: as an SQL dump, a CSV file, JSON, RDF or XML, or embedded in an expressive programming language like LISP or PROLOG, where the distinction between data and function begins to blur.
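To make the "same facts, many representations" point concrete, here is a minimal sketch in Python. It takes the three facts above and writes them out as CSV and as JSON, then reads them back. The function names (`to_csv`, `to_json`, `from_csv`, `from_json`) and the column names are my own choices for illustration, not any standard.

```python
import csv
import io
import json

# The three facts from the paragraph above, as (subject, predicate, object) rows.
TRIPLES = [
    ("Tom", "is a", "cat"),
    ("Jerry", "is a", "rat"),
    ("Tom", "chases", "Jerry"),
]

# Column names are an illustrative choice, not a standard.
COLUMNS = ["subject", "predicate", "object"]


def to_csv(triples):
    """Serialize the triples as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(triples)
    return buf.getvalue()


def to_json(triples):
    """Serialize the triples as a JSON array of objects."""
    records = [dict(zip(COLUMNS, triple)) for triple in triples]
    return json.dumps(records, indent=2)


def from_csv(text):
    """Read the triples back from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [tuple(row[c] for c in COLUMNS) for row in reader]


def from_json(text):
    """Read the triples back from JSON text."""
    return [tuple(rec[c] for c in COLUMNS) for rec in json.loads(text)]


if __name__ == "__main__":
    print(to_csv(TRIPLES))
    print(to_json(TRIPLES))
```

Both formats round-trip to the same list of triples, so whatever "story" is in one is equally in the other; only the surface form differs.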
Give me any data set, and I can convert it into a triple store (a triple store is a generic way of representing data in the form of subject, predicate and object). Or give me a triple store, and I can convert it into a paragraph of sentences, each representing a fact. The paragraph may appear boring, but who cares. I just want to make the point that there seems to be not much difference, after all, between data/facts and what we can express as sentences.
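Both conversions can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each row of the data set is a dictionary, one column (here `name`, a hypothetical choice) identifies the subject, and every other column name reads naturally as a predicate. The helper names `rows_to_triples` and `triples_to_paragraph` are mine.

```python
def rows_to_triples(rows, key):
    """Turn every non-key cell of every row into one (subject, predicate, object) triple."""
    triples = []
    for row in rows:
        subject = row[key]
        for column, value in row.items():
            if column != key:
                triples.append((subject, column, value))
    return triples


def triples_to_paragraph(triples):
    """Write each triple as one plain sentence and join them into a paragraph."""
    return " ".join(f"{s} {p} {o}." for s, p, o in triples)


# A tiny "data set" holding the same facts as before; "name" is the assumed key column.
ROWS = [
    {"name": "Tom", "is a": "cat", "chases": "Jerry"},
    {"name": "Jerry", "is a": "rat"},
]

if __name__ == "__main__":
    triples = rows_to_triples(ROWS, key="name")
    print(triples)
    print(triples_to_paragraph(triples))
    # prints: Tom is a cat. Tom chases Jerry. Jerry is a rat.
```

Real data sets need more care (column names that are not verbs, missing values, units), but the direction of the argument stays the same: the table and the paragraph carry the same facts.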
When data is presented in the form of a table, it tells us a story much better than when each of the facts is told as a sentence. The table is then a creative expression of the data, helping us gather the meaning more effectively. Why, then, we consider only charts and graphs of data to be a compilation/expression, and not a table or a spreadsheet, is not clear to me. One may say that a chart or a graph is an image, and therefore an expression, but it is an image only because it is represented as a table (at least in a digital computer). I am interpreting the 2D representation of an image as a table of pixel values as well.
Can I tell a story in the form of a triple store? It is challenging but possible. Who would like to read a story in that format? Who cares, when I am doing that for the sake of an argument.
So, the moral of this story: the assumption that data/facts, when represented in a table format or a database, are not expressions may stand in a court, but it does not appeal to @G_N.
One question that I did not address in this post is what makes an expression creative. We can discuss that some other time.