correlate

How many webs are there?

As you attend conferences, read industry literature or peruse blogs, you may come to the conclusion that there are a lot of “webs” out there. And if a scientist were reading this, they would probably agree with that premise. But most of us have come to know the web as one thing. So if it is one thing, why are there so many names for it?

Here are some of them: Hidden Web, Deep Web, Web 1.0, Web 2.0, Semantic Web, Implicit Web, Visible Web and Invisible Web. And I’m sure I am missing some commonly used ones. Let’s try to settle some of the mystery. Here is the way I define them at a high level:

Web 1.0 – The web as we came to know it in the late ’90s, in the lead-up to and through the bubble: the number of pages was growing at a tremendous rate, web businesses were blossoming and Geocities was considered innovative. Companies such as Amazon and eBay made names for themselves.

Web 2.0 – The next phase of the web where the power of people and data really started to unlock all of the content published during the 1.0 generation. I posted on this topic previously.

Visible Web – The visible web is closely related to the deep web… in fact, you can think of it as the exact opposite. The visible web is the part of the web that is accessible to web crawlers.

Deep Web – The deep web typically refers to web pages and content that are not easily accessible via typical crawling or retrieval methods. Often, this content is inaccessible because it sits in structured databases. Many refer to this type of content as not being part of the “surface web”. Other terms such as “hidden web” and “invisible web” are synonyms for this one (the two terms were probably coined at two different conferences). Wikipedia has a good reference on this topic. According to BrightPlanet, the “Deep Web” is 400 to 550 times larger than the traditional web; see their white paper on the topic. Cal Berkeley also has very good reference information on the topic.
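To make the visible/deep split a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy site, its paths, the records and the function names are all made up, and a real crawler is far more involved. It only shows why a crawler that follows links reaches the static pages but never the records that appear only when someone submits a query.

```python
from collections import deque

# Hypothetical toy site: static pages and the links each one contains.
STATIC_PAGES = {
    "/": ["/about", "/search"],
    "/about": ["/"],
    "/search": [],  # a search form; it links to no result pages
}

# Hypothetical records that live in a database behind the search form.
DATABASE = {
    "widget": "Record 17: widget specifications",
    "gadget": "Record 42: gadget specifications",
}


def crawl(start):
    """Follow links breadth-first and return the set of pages reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in STATIC_PAGES.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


def search(query):
    """Simulate submitting the form: a record is returned only for a query."""
    return DATABASE.get(query)


visible = crawl("/")  # the "visible web" of this toy site
print(sorted(visible))
print(search("widget"))  # "deep web" content, reachable only by asking
```

The crawler does find the search page itself, but since it never types anything into the form, the records behind it stay out of reach.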

Implicit Web – The implicit web is a term commonly used with much more of a user focus. In today’s web, much of the value for a user is self-directed or derived through explicit activities. There is a fantastic post on this titled “‘Implicit’ Web: A Case for Self-Determination?”. Many sites are now trying to add further value for users by “tailoring of services and content to users based upon historical analysis of user actions and interests”.

Semantic Web – The “semantic web” has commonly (and recently) been referred to as Web 3.0, where the web extends beyond pages portrayed at the document level and where sites (and content pages) are structured so that all of the entities within a page are machine readable. A few weeks ago I wrote a post, “Semantic Web, the next wave of the web?”, that talks more about this. If you need an analogy, think of RSS extending to web page construction: a universal standard through which a web site can be machine readable.
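As a rough illustration of “document level” versus “machine readable entities”, here is a small Python sketch. It is not any official Semantic Web format: the predicate names (“author”, “mentions”) and the helper function are assumptions made up for this example. It simply describes this very post as (subject, predicate, object) statements that a program can query without parsing prose.

```python
# Document-level view: to a program, the page is just a blob of text.
page_text = "How many webs are there? ... Deep Web ... Semantic Web ..."

# Entity-level view: the same page described as (subject, predicate, object)
# statements. The predicate names are hypothetical, chosen for illustration.
POST = "How many webs are there?"
triples = [
    (POST, "author", "Lou Paglia"),
    (POST, "mentions", "Deep Web"),
    (POST, "mentions", "Semantic Web"),
    (POST, "mentions", "Web 2.0"),
]


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects(POST, "author"))
print(objects(POST, "mentions"))
```

With only the text, a program would have to guess who wrote the post; with the statements, it can just ask.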

There are probably other webs missing from this list; feel free to leave me comments. Perhaps we should combine all of these under one term called “tangled web”. Remember, if it catches on, you heard it here first…

  • http://therehearsalstudio.blogspot.com/ Stephen Smoliar

    Many years ago Drew McDermott wrote a wonderful little essay entitled “Artificial Intelligence Meets Natural Stupidity.” His basic argument was that most of the misunderstandings of what had been done in the name of “artificial intelligence” or “expert systems” could be attributed to the deceptive nature of those two sobriquets. Notwithstanding McDermott’s cautionary lessons, the entire IT culture has been infected with this practice of deceptive naming (not to mention what McDermott called “wishful mnemonics”); and, as far as I am concerned, “Semantic Web” is the latest move in what I like to call “cheating at the language game.”
    Much as it pains me to praise Google, I think that one of the better reality checks on Semantic Web thinking came about a year ago in an exchange between Tim Berners-Lee (he who made the language game move) and Peter Norvig (Google Director of Search). This was reported at the time on CNET News.com. I wrote up my own comments in my Yahoo 360° blog (http://blog.360.yahoo.com/blog-Mff23hgidqmHGqbcv.lfskakEtS6qLVHUEMFUG4-?cq=1&p=49). Those comments are still there (hence the hyperlink). I just checked the link I installed there to the CNET site; and it is still “alive.” Regarding Lou’s “thumbnail” description, I prefer to think of the Semantic Web as a vision for getting more mileage out of XML (which could then impact the effectiveness of RSS); and my blog post illustrates this with the “standard diagram” of the underlying architecture for the Semantic Web.
    My “bottom line” on this matter is that, as Habermas has observed, we all live in three “worlds,” an objective world, a subjective world, and a social world. Norvig basically took Berners-Lee to task for living only in the objective world and assuming that everyone on the World Wide Web does the same, and I continue to support Norvig on this. However, in McDermott’s spirit of coming up with more “honest names,” I would say that the Semantic Web is just the latest attempt to succeed at the “ontological engineering” that was attempted (with limited success) by the Cyc project. This can keep you very busy but will not take you very far, simply because, once you get into the subjective world, you cannot <em>have</em> an ontology unless it is supported by a phenomenology! (I also believe that the social world has a phenomenology, but that is another debate.)
    This probably all amounts to using a sledgehammer to drive home Lou’s main point. It is not that the Web is “tangled” but that we keep finding ourselves victimized by deceptive nomenclature. What we need is more plain speaking. Unfortunately, plain speaking does not get academic papers published; nor does it make for particularly effective marketing strategies!

  • http://correlate.wordpress.com Lou Paglia

    First off, let me say congratulations for being my most active reader and commenter! Even though you may be winning a competition of one or two, I do appreciate your readership…
    As for your ending quote: “Unfortunately, plain speaking does not get academic papers published; nor does it make for particularly effective marketing strategies!” It is a great point. Plain speaking would be much easier to understand, but new “terms” are what get papers published, sell conference registrations and gain marketing traction.
    But there may also be a side benefit of the “tangled web”: once you understand all the terms, it is much easier to convey your meaning with one or two words. I can’t imagine talking about the “semantic web” or “web 2.0” if each weren’t captured in two words; it would take sentences each time to get the idea across.
