By Rado Kotorov, Jake Freivald
March 6, 2008 04:45 AM EST
This indexing process flow involves numerous steps: capturing the new incoming customer communication, creating dynamic joins with other tables and applications, running a procedure to aggregate the related case records, structuring and transforming the message into the indexing format required by the search engine, passing it to the search engine for re-indexing, and deleting the prior record.
Vendors have taken different approaches to transactional data indexing:
- Crawling databases: Web search engines
have adopted an approach to transactional indexing similar to document
indexing - they crawl tables in databases using SQL select statements.
Crawling is an acceptable choice for slowly changing tables, but not
for large volumes of frequently changing data that needs to be
available for search in near-real time. It is also not very effective
for applications and highly normalized operational data stores.
- Passing the search query to the application:
This solution relies on some intelligence to determine how to match
search terms with applications. It then relies on the application for
data extraction and aggregation. This approach works well for simple
queries, such as stock price information. Implementation becomes more
daunting if users can run multiple queries against the same
application. In those cases, a self-service application will likely
offer more robust querying capabilities and be less confusing to the
user.
- Pushing application data to the index:
Instead of letting the engine crawl the records, an application pushes
data into the index using a search engine-provided indexing API. The
application makes all connections into the underlying data store and
has complete control over scheduling, interfacing protocols, and data
structures. The scope of effort to configure and use this method
depends on the extraction and transformation complexity and the
available application tools for it.
- Integrating data through SOA and process flows: These same APIs can let integration tools broaden the scope of the index. This approach requires integration capabilities, including transformation tools, process flow capabilities, and adapters, to define and execute the process that captures and enriches transaction data in real time.
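As a rough illustration of the push approach, the sketch below walks one incoming message through the steps of the indexing flow: join, aggregate, transform, delete the prior record, and re-index. Everything named here is an assumption made for the example: `InMemoryIndex` is a hypothetical stand-in for a search engine's indexing API, and the field names (`customer_id`, `status`, `summary`) are invented, not taken from any vendor's product.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryIndex:
    """Hypothetical stand-in for a search engine's indexing API."""
    docs: dict = field(default_factory=dict)

    def delete(self, doc_id: str) -> None:
        # Remove the prior record if one exists; ignore unknown ids.
        self.docs.pop(doc_id, None)

    def add(self, doc_id: str, doc: dict) -> None:
        self.docs[doc_id] = doc


def build_index_document(message: dict, customers: dict, cases: list) -> dict:
    """Enrich one incoming message and shape it into an index document."""
    # "Join" the message with the customer table.
    customer = customers[message["customer_id"]]
    # Aggregate the case records related to the same customer.
    related = [c for c in cases if c["customer_id"] == message["customer_id"]]
    # Transform everything into the (assumed) format the engine expects.
    return {
        "id": f"customer-{message['customer_id']}",
        "customer_name": customer["name"],
        "open_cases": sum(1 for c in related if c["status"] == "open"),
        "text": " ".join([message["body"]] + [c["summary"] for c in related]),
    }


def push_to_index(index: InMemoryIndex, message: dict,
                  customers: dict, cases: list) -> dict:
    """Replace the customer's prior index record with a freshly built one."""
    doc = build_index_document(message, customers, cases)
    index.delete(doc["id"])    # delete the prior record
    index.add(doc["id"], doc)  # pass the new document in for re-indexing
    return doc
```

In a real deployment the application or integration tool would call the engine's own API in place of `InMemoryIndex`; the point of the sketch is only that the pushing side owns the joins, the aggregation, and the timing.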
User Interface Augmentation
With search
technologies, we're used to thinking that less is more. When a BI
search returns a large number of records, however, simple interfaces
displaying search hits ordered by relevancy aren't enough. Consider a
bell curve, for instance: even though the right-hand tail is small, it
may represent a large number of records in absolute terms. No one has
the time to page through hundreds of results, so BI search results must
enable interactivity to supplement relevancy. This helps users avoid
information overload and easily find the exact information they need.
Search Results Classification and Categorization
Two methods enhance the filtering of search results: classification and
categorization of the hits. Both methods appear the same to end users.
The underlying data is used to group the search results, and the groups
are then presented in ordinary tree controls that let the user select
parameters and narrow down the hits. This interaction is referred to as
guided navigation (see Figure 2).
Although they appear the same to users, categorization and classification create groups in fundamentally different ways.
Search companies, with roots in unstructured data, typically extract categories from the unstructured text using statistical methods. This automates the grouping process, but it doesn't give information architects any control over how records are grouped.
BI companies, with roots in structured data, dynamically classify records instead. Information architects define metadata about the structures they want to index; this metadata can precisely control how records are grouped.
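A minimal sketch of metadata-driven classification, assuming each search hit is a plain dictionary and that the information architect has named the fields to group on. The function names `build_facets` and `narrow` and the sample fields are hypothetical; they only illustrate how declared metadata, rather than statistics over the text, determines the groups shown in guided navigation.

```python
from collections import Counter


def build_facets(hits: list, facet_fields: list) -> dict:
    """Count hits per value for each metadata field chosen by the architect."""
    return {
        name: Counter(hit[name] for hit in hits if name in hit)
        for name in facet_fields
    }


def narrow(hits: list, **selected) -> list:
    """Keep only the hits that match every facet value the user selected."""
    return [
        hit for hit in hits
        if all(hit.get(name) == value for name, value in selected.items())
    ]
```

Calling `build_facets(hits, ["brand", "region"])` yields the counts a tree control would display, and `narrow(hits, brand="Sony")` returns the reduced hit list after the user picks a node.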
The two methods aren't mutually exclusive. Categorization offers definite advantages with parameterized searchable structured data as well as unstructured content that contains structured metatags (pre-categorized unstructured content). Given the trend of tagging every piece of structured or unstructured content, classification clustering appears to be more complementary to categorization. If the BI search solution provides both methods, the classification and categorization can be displayed simultaneously, providing the user with a robust overview of the data.
As search emerges as the primary information access point, robust metadata will become even more important as it is used to build custom, adaptable navigation interfaces to augment or replace many current application interfaces.
Search Results Analytics
Users need to do more
with search results than filter them. Search returns a data set -
potentially quite large - and users will benefit from the ability to
manipulate it. Expect vendors to differentiate based on this emerging
requirement.
The common capability to sort results by date or relevancy provides little value on large result sets, because the first result page only shows the top or bottom hits. Sorting on metadata categories, a capability some vendors provide, gives users more power to explore and organize large result sets (see Figure 3).
Some vendors have recently added the ability to convert the search results from the standard Google-like display with snippets to a tabular view (see Figure 4). This suits structured data but, as with all features, not all tabular views are equal: most tabular views provide static data and can only be sorted by date, relevancy, and other predefined categories. Also, server-based sorting operations regenerate the tabular view on each user interaction. In these cases, the user only benefits from a different display compared to the standard view.
Other vendors convert results into a dynamic tabular view that applies calculations, visualizations, charts, roll-ups, and pivot tables locally in the browser. This opens a whole new perspective on search, making the result set much more useful and enabling users to do reporting and ad hoc analyses; for example, comparing data along two or more dimensions, as they're accustomed to doing with pivot tables in Excel. A user's search for an HDTV might return hundreds of results, which the user could use to compare prices by brand and monitor size (see Figure 5).
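The HDTV comparison can be sketched as a small pivot over the result set. The example below is an illustration only: it is written in Python rather than browser code, and the function `pivot_mean` and the field names `brand`, `size`, and `price` are assumptions, not part of any vendor's product.

```python
from collections import defaultdict
from statistics import mean


def pivot_mean(rows: list, row_key: str, col_key: str, value_key: str) -> dict:
    """Average value_key for every (row_key, col_key) pair in the result set."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[row_key], row[col_key])].append(row[value_key])
    return {pair: mean(values) for pair, values in cells.items()}


# Hypothetical search results for "HDTV".
results = [
    {"brand": "Sony", "size": 42, "price": 1000},
    {"brand": "Sony", "size": 42, "price": 1200},
    {"brand": "Sony", "size": 50, "price": 1800},
    {"brand": "LG", "size": 42, "price": 900},
]

average_price = pivot_mean(results, "brand", "size", "price")
```

Each key of `average_price` is a (brand, size) pair, so the dictionary maps directly onto the cells of a brand-by-size pivot table.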
Since reporting and analysis of this type is often done using a data warehouse, it's not surprising that some vendors require the creation of an intelligent data warehouse at the time of indexing. However, some vendors provide the ability to manipulate the data directly in the browser without requiring any additional technology. Keeping the data and reports self-contained provides additional advantages, such as saving and sharing them via e-mail.
Ad hoc analytics on search results seems to be the most promising area for creating a true search-driven BI.
Search-Based Reporting
To provide BI search to the masses, you have to avoid re-creating all the complexities of traditional BI.
For example, if the chosen solution only indexes reports, how will you support a user whose needed information isn't in any indexed report? In this type of solution, the report usually acts as an entry point that takes the user to the BI world to refine her request. The user may find what she needs by drilling down from within the report; if not, however, she has to use the regular BI tools to modify the existing report or to create an ad hoc report. The user has dropped from a simple search paradigm into all the complexities of BI that search should eliminate.
A metadata-based approach provides a different user experience. The indexed records or transactions act as the entry points to BI, and dynamically constructed metadata-driven report links can take the user to any information resource. For example, a police record search application can provide, directly from each criminal offense record, links to the offense details, a summary report of all criminal records for the offender, another summary report on all criminal activities within date and geographic ranges, a crime analysis, and police activity structured ad hoc reports. Any metadata associated with the hit is passed to the report or to the structured ad hoc form. This BI search solution gives untrained users one-click access to all reporting capabilities without dropping them into any BI tool. Unless the reporting capabilities are as robust and simple as the search is, applications and tools will remain the preferred point of entry to BI.
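One way to picture metadata-driven report links is as a table of report definitions, each listing the metadata it needs from a hit. The sketch below builds a link for every report whose parameters the hit can supply. The report names, parameter names, and base URL are all hypothetical, chosen to echo the police record example.

```python
from urllib.parse import urlencode

# Hypothetical report catalog: report name -> metadata fields it requires.
REPORT_TEMPLATES = {
    "offense_details": ["offense_id"],
    "offender_summary": ["offender_id"],
    "area_activity": ["district", "date_from", "date_to"],
}


def report_links(hit: dict, base_url: str = "https://bi.example.com/reports") -> dict:
    """Build one link per report whose required metadata is present in the hit."""
    links = {}
    for report, params in REPORT_TEMPLATES.items():
        if all(name in hit for name in params):
            query = urlencode({name: hit[name] for name in params})
            links[report] = f"{base_url}/{report}?{query}"
    return links
```

A hit that carries only `offense_id` and `offender_id` gets two links, while one that also carries a district and date range gets the area report as well, so the links shown next to each result follow from its metadata rather than from hand-built navigation.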
Conclusion
Search and BI complement each other
through more than just access to data, reports, and related documents.
Together, they expose a rich set of information resources to ordinary
users. It remains to be seen whether combined search and BI will go
mainstream; however, there are many applications that could leverage
their symbiotic relationship, and if the right indexing methodology and
technologies are deployed, search may help bring BI to the masses.
Copyright © 2008 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Rado Kotorov
Dr. Rado Kotorov is a technical director of strategic product management at Information Builders Inc., responsible for emerging reporting, analytic and visualization technologies. Prior to joining Information Builders, he managed the implementation of BI solutions and decision-support systems, data warehouses, and custom applications. He has developed analytic models and applications for the pharmaceutical, retail, CPG, financial, and automotive industries. Rado Kotorov has a PhD in decision and game theory and economics from Bowling Green State University. He has publications on business processes, emerging technologies, CRM, KM, innovation, and entrepreneurship.
More Stories By Jake Freivald
Jake Freivald is the vice president of corporate marketing for Information Builders and iWay Software, an Information Builders company and leader in enterprise integration. In this position, he is responsible for developing and executing all of the solution marketing strategies. Jake joined Information Builders in 1999; prior to that, he held several managerial positions with Andersen Consulting and Prudential Life Insurance Company of America.