correlate

Google

Search Engine 2007 Meeting

Still in Boston, as we just completed the Search Engine 2007 meeting. The last two days have been a jam-packed information session and dialogue around some of the most interesting innovations taking place in the search industry. The two days were structured around two different lenses. The first was product demos. The second, and more interesting, was study presentations in which industry experts presented their findings on technology, strategic approach and overall philosophy. Many of the players were quite specialized, which shed light on some of the distinct thinking taking place in the industry.

The conference overall was quite technology focused, but there was a good mix of business and solutions focus as well. At first blush, part of me wants to say that a lot of the search players in attendance are trying to be “everything to everyone”, doing everything from analytics, mining and entity extraction to clustering and search. Even so, the level of advanced analytics and technological approach was quite impressive. There are certainly a number of companies now on my mental radar that I plan to follow over the coming months and year to see how they do. Several, I believe, will make their mark commercially on the industry in some fashion.

Here are some (not all) of the players that had representatives at the event, took part with speaking roles and contributed greatly to the collective intelligence of the event attendees:

There were simply too many to mention and the list above is not exhaustive. I did not exclude anyone intentionally. Also in attendance were Sue Feldman from IDC and Stephen Arnold from AIT, both of whom took part in a panel conversation to close the session. I will post again shortly with some summary notes that I found interesting from the conference.

Speaking Appearance at Search Engine 2007

I will be speaking at Search Engine 2007 in Boston next week. It should be a good conference, run by Infonortics, with a full list of speakers on a variety of search topics including search experience, collaborative search, business intelligence and where search is going generally. Speakers will come from a variety of organizations in the space, including Google, Fast and Endeca, to name a few.

My topic for discussion will be “Beyond Search: Visualizing Emerging Intelligence” and will cover:

This presentation discusses the current state of search, the advantages of text mining in extracting meaning from unstructured data, and the future of search, such as a move towards a role-based search environment, which will likely be one of the biggest technology trends to affect the enterprise. The concept of “role-based” search is about systems intelligent enough to understand the totality of what you do (your industry, your job and the daily tasks you undertake) and then help you accomplish those specific things more effectively. Effective role-based search applications will use technologies that uncover trends, comparisons, discoveries and sentiment, which will then feed into applications that present the information using visualization and analytics. The session will also address business searching and how search networks will realign themselves to help all types of professionals find better information, faster.

To be honest, I’m as interested in hearing the variety of search topics from others in the industry as I am in sharing my own experience by speaking at the event. The event should be very informative, and any opportunity to get a bowl of New England clam chowder at Union Oyster House is a plus as well.

Correlation versus search relevance?

Before getting into any of the nuances and complexities of some of the technologies we see leveraged today in the Web 2.0 world, it is very interesting to simply look at ‘search’, something that we’ve all been around for years. When you boil it down to the simplest, unscientific level, isn’t search relevance simply a high correlation between a document found and ‘meeting the user’s needs’? Thus, couldn’t one argue that the very “best” article/document found from a search query has a correlation of 1.0 with the user’s expectations?

Take a look at a study done by a faculty member at the University of Technology in Sydney comparing the correlation of results from a contextualising search engine with those from conventional engines. The premise of the hypothesis is that “an automated analysis not only of the content of candidate documents, but also the content of related (i.e. either directly or indirectly linked) documents and the structure of the relationships can be used to improve the effectiveness of search engines (Ellis 1996).” As the study points out:

Table 1: Comparison of contextualising search engine with conventional engines

Search query                                                          Conventional search engines   Contextualising search engine
Terms = Fish; Context = Recipe                                        p = 0.63                      p = 0.69
Terms = Java Programming; Context = Changes Recent Evolution          p = 0.74                      p = 0.76
Terms = Interest Rate; Context = Mortgage Loan House Purchase         p = 0.72                      p = 0.81
Terms = Beatles; Context = Music Lyrics                               p = 0.54                      p = 0.73

While the results are preliminary, the improved correlation across the different searches does suggest that contextual searching adds value over standard searching. Pretty interesting evidence of the power of context, something much more difficult to achieve off the web…
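To make the idea of “correlation with the user’s expectations” concrete, here is a minimal sketch of how a figure like those in Table 1 might be computed. It assumes a plain Pearson correlation between the relevance scores an engine assigns and the ratings a user gives to the same documents; the study’s actual method may well differ, and the sample numbers below are made up purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two lists around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per returned document.
user_ratings = [5, 4, 4, 2, 1]               # how useful the user found each document
engine_scores = [0.9, 0.7, 0.8, 0.4, 0.1]    # relevance score the engine assigned

print(round(pearson(engine_scores, user_ratings), 2))
```

An engine whose scores rise and fall exactly in step with the user’s ratings would come out at 1.0, which is the “perfect relevance” case described earlier in this post.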

The paper was written in 2000, which makes it interesting that Google is not even mentioned when the likes of Yahoo!, Magellan, AltaVista and Lycos are. Hey, for that matter, why not About.com? We were still considered a search engine back in ’00, even when we were using Inktomi as our core engine. Back to the point: Google, via PageRank, truly pioneered contextualizing search through the power of linking and basing relevancy on it. It seems so intuitive now, but you really have to give them credit for coming up with it then.
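For readers who have not seen how link-based relevancy works, here is a rough sketch of the textbook power-iteration form of PageRank. It is a simplified illustration and not Google’s production algorithm. The tiny `links` graph, the damping value of 0.85 and the iteration count are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share each round.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: a and b both link to c, and c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page c ends up ranked above page b simply because other pages point to it, which is the intuition behind basing relevancy on links.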