David Janes, CEO of BlogMatrix, has been writing about Google Base quite a bit.
He looks at the technology from every possible angle, confronts obvious faults, suggests improvements and applies distinct creativity in his thinking.
Under the scrutiny of David Janes, an apparently straightforward technology like
Google Base acquires an entirely new dimension.
Just recently he has come up with the idea that the Google Base data model could be used as a Semantic Web "language".
We asked him to explain it in simpler terms for us.
"As a gross simplification, you can think of Google Base as a collection of giant spreadsheets running somewhere inside of Google. Each spreadsheet defines an "Information Type", each column has a name and a type the data has to fall into, and users are free to add new rows or columns to the spreadsheet whenever they like.
Whenever you look at a Google Base entry, you're really just looking at an HTML representation of a row in one of those "spreadsheets".
Google lets you directly edit rows of the
"spreadsheet" from their site; or you can upload a CSV file exported from your own
spreadsheet (say Excel), or you can generate RSS/XML extended with name/value pairs in a Google-defined manner and upload that into Google.
The main problem right now is that, in terms of formal structured data types, Google Base is a Roach Motel. The data goes in, but it doesn't come out
-- except as metadata-poor HTML.
BlogMatrix is interested in extending our tools so that we can directly populate Google Base on your behalf, without our users having to worry
about RSS, Atom, or CSV.
And if we have well-specified data for Google Base, then, the thinking goes, can we not use it elsewhere in other
applications?"
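The spreadsheet analogy above can be sketched in a few lines of Python. This is only an illustration of the idea, not Google's implementation: the `InformationType` class, its method names, and the "Events" example are all hypothetical. It shows typed columns, users adding rows and columns freely, and a CSV export of the kind one might upload.

```python
import csv
import io


class InformationType:
    """Hypothetical stand-in for one 'spreadsheet' in the analogy."""

    def __init__(self, name, columns):
        self.name = name
        self.columns = dict(columns)  # column name -> required Python type
        self.rows = []

    def add_column(self, name, type_):
        # Users are free to add new columns at any time.
        self.columns[name] = type_

    def add_row(self, **values):
        # Every value must belong to a known column and match its type.
        for key, value in values.items():
            if key not in self.columns:
                raise KeyError(f"unknown column: {key}")
            if not isinstance(value, self.columns[key]):
                raise TypeError(f"{key} must be {self.columns[key].__name__}")
        self.rows.append(values)

    def to_csv(self):
        # Cells missing from a row are written as empty strings.
        buf = io.StringIO()
        writer = csv.DictWriter(
            buf, fieldnames=list(self.columns), lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(self.rows)
        return buf.getvalue()


events = InformationType("Events", {"title": str, "price": float})
events.add_row(title="Concert", price=25.0)
events.add_column("city", str)
events.add_row(title="Fair", city="Toronto")
print(events.to_csv())
```

Rows added before a column existed simply have an empty cell for it, which is how a shared spreadsheet would behave as well.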
What exactly do you mean by 'semantic web languages'?
I'm using the term "semantic web language" to mean a syntax plus a way of modeling data.
RDF/XML and N3 are two different syntaxes built on the RDF triple model. The RDF triple model has the virtue of being simple and very extensible but suffers from the flaw that almost no one thinks of their data as being a big pile of triples.
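To make the "pile of triples" point concrete, here is a minimal sketch, in plain Python rather than any RDF library, of how a single record flattens into (subject, predicate, object) triples. The `row_to_triples` helper and the example.org URI are made up for illustration.

```python
def row_to_triples(subject, row):
    """Flatten one name/value record into (subject, predicate, object) triples."""
    return [(subject, name, value) for name, value in row.items()]


row = {"title": "Concert", "city": "Toronto"}
triples = row_to_triples("http://example.org/events/1", row)
for triple in triples:
    print(triple)
```

The record that reads naturally as one row becomes a list of separate statements about the same subject, which is the shift in thinking the triple model asks for.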
Microformats are another semantic web language, built on the (X)HTML format and a number of well-specified modules which can be composited together.
I call the Google Base XML upload formats a (potential) semantic web language because they're expressive enough for many applications and likely to be widely deployed.
The syntax is RSS or Atom and the model is a single database entry or a spreadsheet row: name/value pairs with values having a type. The beauty of this is everyone understands the data model; the downside is that there's only so much you can expressively model with a set of name/value tuples.
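As a rough sketch of that model, the snippet below builds a single RSS-style `item` carrying typed name/value pairs as namespaced child elements. The namespace URI, the `ex` prefix, the `type` attribute, and the `build_item` helper are placeholders chosen for this example; they are not the element names or namespace that Google defines.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace for illustration only, not Google's.
EX_NS = "http://example.org/base-attrs"
ET.register_namespace("ex", EX_NS)


def build_item(title, attributes):
    """Build an <item> whose extra children are typed name/value pairs."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    for name, (type_name, value) in attributes.items():
        child = ET.SubElement(item, f"{{{EX_NS}}}{name}", {"type": type_name})
        child.text = str(value)
    return item


item = build_item(
    "Concert",
    {"price": ("float", 25.0), "city": ("string", "Toronto")},
)
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

Everything in the entry is a flat pair with a type, which is why the model is easy to understand and also why nested or relational data is hard to express in it.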
In our series of posts on Google Base, we've made a number of suggestions to increase the
expressiveness without removing the simplicity.
BlogMatrix's particular interest is in using structured blogging to produce data, in addition to HTML, from blogging tools. Please explain.
BlogMatrix is trying to create a new type of CMS -- one where adding new content to a site is as "easy as blogging"; where it is easy for users to add data /ad-hoc/ (not just HTML) to a site and still have that data formally defined; and where it is easy to export that data in many
different formats -- RDF, N3, Google Base/RSS/Atom, as microformats, and so forth -- as well as HTML.
Typical data types one might add to a post would be addresses, events, maps, contact information and so forth. The user can select what he or she wants to add as they construct a post. Additionally, we can define "single purpose" structured data extensions to provide workflow or
specialized interfaces. The beautiful thing about blogging is that one doesn't have to think about the end product "as a website" -- it just
falls together. We want to do the same thing with data.
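The "define the data once, export it many ways" idea might look something like the following sketch. It is not BlogMatrix code: the `EXPORTERS` registry, the function names, and the sample record are assumptions made for illustration. The HTML exporter uses hCard-style microformat class names (`vcard`, `fn`, `adr`, `locality`).

```python
from html import escape


def to_hcard(record):
    """Render a contact record as HTML with hCard-style class names."""
    return (
        '<div class="vcard">'
        f'<span class="fn">{escape(record["name"])}</span> '
        '<span class="adr">'
        f'<span class="locality">{escape(record["city"])}</span>'
        "</span></div>"
    )


def to_pairs(record):
    """Render the same record as a sorted list of name/value pairs."""
    return sorted(record.items())


# Hypothetical registry: one structured record, several output formats.
EXPORTERS = {"hcard": to_hcard, "pairs": to_pairs}


def export(record, fmt):
    return EXPORTERS[fmt](record)


contact = {"name": "Jane & Co.", "city": "Toronto"}
print(export(contact, "hcard"))
print(export(contact, "pairs"))
```

Adding another output format then means registering one more function, while the person writing the post only ever fills in the record.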
The resultant site created with BlogMatrix doesn't have to look like a blog. For example, we created Get Going Canada (http://www.getgoingcanada.ca) using the BlogMatrix Platform.
On the front end, it's a website (with lots of cool mapping features, faceted searches and so forth); on the administrative side, it just looks like they're maintaining a blog.
David Janes is the Founder of www.BlogMatrix.com
