The prospect of a Web of information and knowledge supporting systems capable of automating reasoning, inference, and decision making has been challenging Web researchers, developers, and “real-world” Web users since the publication of the Semantic Web vision. The main challenge for businesses is to learn how to leverage the Web’s full potential, which is expected to be exponentially higher in “semantically enabled” environments. Businesses are warming up to the idea of semantics and are becoming impatient to leverage the “magic,” but few understand what semantics really means. There is a bit of a joke in the IT community that any old idea presented with a “semantic” twist (semantic enterprise, semantic search and retrieval, etc.) is going to benefit from the buzz and contribute to the mushrooming of “Semantic Web ware,” which includes software, toolkits, platforms, environments, standards, protocols, and a great deal of talk, sometimes with limited appreciation of the core issues. Semantics in relation to information technology and data models is nothing new. It is even, fundamentally, something rather simple, but when it comes to semantics and the Web, with everything that the Web represents, the challenges inevitably become significant. In this article, I discuss and clarify some assumptions about the notion of the semantic enterprise and the firewall, and I place “semantics” in the context of an evolving Web.
What Do You Mean, Er ... “Semantic”?
In relation to the Internet and the Web, the notion of semantics remains very open to interpretation, probably because the overall concept of semantics at the W3C is deliberately broad [read: a bit vague]. The distinction between syntactic, semantic, and pragmatic “dimensions” in linguistic and communication theory is credited to American semiotician and philosopher Charles Morris, but in common everyday IT language, semantics refers to the study of the meaning of words, as opposed to syntax, which refers to the structure of words and languages, such as grammars (be they natural language or computer grammars). It is worth remembering that in the logic of discourse, the distinction between semantics and syntax is sometimes fuzzy. In the context of organizational information, semantics is represented by the relationships that information objects, in their various degrees of granularity (such as data and knowledge), have to other information: about the objects themselves, about other objects, or about relations. Such relations, when properly elicited and structured, contain the “intelligence” that we seek from our systems. Cognitive systems (knowledge systems, information systems, intelligent systems, etc.) and the IT infrastructures that are designed to support them should be developed along all three dimensions (syntactic, semantic, and pragmatic) to be able to communicate intelligently. Today, in relation to the Web, “semantics” is generally defined in terms of what has been established by the W3C (or, by those who may disagree, in contrast to it).
The W3C glossary states that the Semantic Web is “the Web of data with meaning.” It refers to a vision of advanced knowledge capabilities, such as automatic querying, retrieval, and reasoning, that can be carried out on the open Internet by agents (i.e., software that has been programmed to do something) thanks to a pervasive “web of content.” For example, a semantic search for a term would not return an unsorted list of relevant and less relevant results that a human must sift through in order to select the most appropriate response. Instead, the results would be presented in the context of a knowledge schema the user has defined. The notion of “semantics” itself, however, predates the Web: it started to trickle into data modeling in the 1970s, when IT researchers and scientists started to understand the limitations of the relational data model (which was then becoming dominant) and the practical implications of constraining the data view of the world to tuples and rows.
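To make the idea of meaning as relations between information objects a little more concrete, here is a minimal sketch in Python. It is an illustration only and does not use any W3C technology: the `Triple` type, the predicate names (`is_a`, `subclass_of`, `issued_to`), and the sample facts are all hypothetical. Each fact is stored as a subject, predicate, object statement, and a small function follows the relations to infer what kind of thing an object is.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement relating a subject to an object (hypothetical model)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical organizational facts, invented for this illustration.
FACTS = [
    Triple("Invoice-17", "is_a", "Invoice"),
    Triple("Invoice-17", "issued_to", "Acme Ltd"),
    Triple("Acme Ltd", "is_a", "Customer"),
    Triple("Invoice", "subclass_of", "FinancialDocument"),
]


def match(facts: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the facts that agree with every field that is not None."""
    return [
        t for t in facts
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def types_of(facts: List[Triple], entity: str) -> List[str]:
    """Direct types of an entity plus every superclass reachable from them."""
    found: List[str] = []
    frontier = [t.obj for t in match(facts, subject=entity, predicate="is_a")]
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.append(cls)
            # Follow subclass_of links to pick up more general classes.
            frontier.extend(
                t.obj for t in match(facts, subject=cls, predicate="subclass_of")
            )
    return found


if __name__ == "__main__":
    print(types_of(FACTS, "Invoice-17"))
    print(match(FACTS, predicate="issued_to"))
```

Nothing in the facts states directly that Invoice-17 is a financial document; that conclusion comes from following the relations, which is the kind of “intelligence” the text attributes to properly structured relations.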
Drawing the Line
The “semantic” dimension of information, which was initially of interest mainly to knowledge architects, is now also acquiring relevance in terms of organizational planning, as real-time information exchange is starting to affect the nature of operations. The first step toward developing a semantic enterprise is to ensure that the terminology used in all existing organizational documentation (from E/R diagrams to data flow charts to key policy documents) is consistent and optimized. It must also be adopted correctly in the metadata and in all critical information structures, such as rule catalogs. Before even thinking of becoming a semantic enterprise, an organization must have good vocabularies (keyword catalogs) and schemas to represent the organizational knowledge and processes, with some of them mapping the “operational logic” in a “normalized” natural language form, and with the schema elements and their values clearly labeled. In the future, a policy change published on some regulatory authority’s Web site could well prompt a modification in a regulated organization’s process flow, but this can only happen without disastrous consequences if the terminology and information schemas are valid, harmonized, and consistently implemented, with due support and provision for security measures. When it comes to planning for the semantic enterprise, CIOs and CEOs are going to have to rethink where to draw the lines between public and private. It is how these lines are drawn that will determine the shape of things to come. The ability to balance opportunities and risks is going to be reflected in how technologies are configured and used, which in turn is going to open up new organizational perspectives.
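The terminology check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary entries, the label names, and the `audit_labels` helper are hypothetical, and a real keyword catalog would be far larger and would be maintained by the organization itself. The sketch compares the labels found in a schema or document against a controlled vocabulary and reports which ones are already consistent, which should be renamed to a preferred term, and which are not in the vocabulary at all.

```python
from typing import Dict, Iterable, List, Union

# Hypothetical controlled vocabulary: each known variant maps to the
# preferred term the organization has agreed on.
PREFERRED_TERMS: Dict[str, str] = {
    "customer": "customer",
    "client": "customer",
    "purchase order": "purchase_order",
    "po": "purchase_order",
}

Report = Dict[str, Union[List[str], Dict[str, str]]]


def normalize(label: str) -> str:
    """Lower-case a label and treat '_', '-' and repeated spaces alike."""
    return " ".join(label.replace("_", " ").replace("-", " ").lower().split())


def audit_labels(labels: Iterable[str],
                 vocabulary: Dict[str, str] = PREFERRED_TERMS) -> Report:
    """Sort labels into consistent, to-be-renamed, and unknown."""
    ok: List[str] = []
    rename: Dict[str, str] = {}
    unknown: List[str] = []
    for label in labels:
        preferred = vocabulary.get(normalize(label))
        if preferred is None:
            unknown.append(label)        # not in the catalog at all
        elif preferred == label:
            ok.append(label)             # already the preferred term
        else:
            rename[label] = preferred    # known variant of a preferred term
    return {"ok": ok, "rename": rename, "unknown": unknown}


if __name__ == "__main__":
    # Labels as they might appear in an E/R diagram (invented examples).
    print(audit_labels(["customer", "Client", "PO", "shipment"]))
```

A report of this kind only flags inconsistencies; deciding which term is preferred, and whether an unknown label belongs in the vocabulary, remains an organizational decision.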
Excerpt from: The Rise of the Semantic Enterprise (with articles by Paola Di Maio, John Kuriakose, Shamod Lacoul, Hyoung-Gon (Ken) Lee, San Murugesan, Edmund W. Schuster, Bhuvan Unhelkar)