Some notes on database-driven information visualization
Information visualization metaphors have been around for many years,
yet they have not reached mass adoption. We argue that one possible
reason is that without proper content organization and structure,
there is not much room for visualization. A database provides the
necessary infrastructure for organizing and accessing data in a very
flexible way. In addition, database techniques for scalability and
performance can help to visualize massive data sets. Adding a
visualization layer on top of a database system, so that it can take
advantage of these database features, is thus a natural choice.
This page contains a few demos of some of these ideas. Everything is a work
in progress, so things may be down. Drop me a note if something is not working.
- Dynamic exploration of large data sets using tree maps. This live demo
(Java plug-in is required) illustrates a data exploration process that
consists of selecting a particular data source and using a tree map to
visualize its attributes and thereby gain insight. It starts with the
selection of a large email data set and the visualization of its topics
using clustering.
The tree map allows the user to select different attributes (score, size, etc.)
to explore the data set using the visualization. It is also possible to
select a cluster and present its content in the same
interface. Since the interface is driven by the data exploration views and
their metadata, different sources expose different attributes. By selecting
different sources or attributes, one can explore different data sets using
the same tree map visualization. There are currently three data sources
available: the Apache mailing list archive, Google News, and the Apache
CVS data set.
The demo shows a three-view panel that contains 1) the tree map
visualization, 2) filter options, and 3) the content area for presenting
the document. Click on the images below if you can't access the demo.
- How does it work? The prototype uses the Oracle10g database
as the back end, together with our generation package, which takes an XML
representation of the data to be visualized and generates the exploration views.
There are four main components in the prototype: (1) data exploration
generator, (2) cluster extractor, (3) transformer, and (4) applet
generator. Views are generated from templates, taking advantage of
the metadata available in the data
dictionary (table names, keys, etc.). The cluster extractor component
issues SQL queries to the data exploration view and returns results in
an intermediate XML representation. The transformer then modifies the
data for the particular tree map format. Finally, the applet generator
produces a Web page with all Java applet parameters. This applet
reads the query output stream and produces the visualization in the
browser.
Of course, you can select the content of a cluster and also click on a
particular item. I'm using an early copy of Lab Escape's SDK for the
tree map.
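
To make the pipeline described under "How does it work?" a bit more
concrete, here is a minimal sketch of the first three components. It is
not the prototype's code. Everything in it is an assumption made for
illustration: SQLite stands in for Oracle, PRAGMA table_info stands in for
a data dictionary lookup, and the table, column, view, and XML element
names, as well as the nested output format, are hypothetical (the output
is not Lab Escape's format). The applet generator step is left out.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical view template; the prototype's real templates are not shown here.
VIEW_TEMPLATE = (
    "CREATE VIEW {view} AS "
    "SELECT {cluster_col} AS cluster, COUNT(*) AS size, AVG({score_col}) AS score "
    "FROM {table} GROUP BY {cluster_col}"
)


def generate_view(conn, table, cluster_col, score_col):
    """Data exploration generator: fill the template after checking metadata."""
    # Assumption: PRAGMA table_info plays the role of the data dictionary.
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    missing = {cluster_col, score_col} - columns
    if missing:
        raise ValueError(f"unknown columns: {sorted(missing)}")
    view = f"{table}_explore"
    conn.execute(
        VIEW_TEMPLATE.format(
            view=view, table=table, cluster_col=cluster_col, score_col=score_col
        )
    )
    return view


def extract_clusters(conn, view):
    """Cluster extractor: query the view and return an intermediate XML tree."""
    root = ET.Element("clusters")
    query = f"SELECT cluster, size, score FROM {view} ORDER BY cluster"
    for cluster, size, score in conn.execute(query):
        ET.SubElement(
            root, "cluster", name=cluster, size=str(size), score=f"{score:.2f}"
        )
    return root


def to_treemap(root, size_attr="size"):
    """Transformer: reshape the XML into a nested dict for a tree map widget."""
    return {
        "name": "root",
        "children": [
            {"name": c.get("name"), "value": float(c.get(size_attr))}
            for c in root.findall("cluster")
        ],
    }


def demo():
    """Run the three steps on a tiny in-memory stand-in for the email data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mail (id INTEGER PRIMARY KEY, topic TEXT, score REAL)")
    conn.executemany(
        "INSERT INTO mail (topic, score) VALUES (?, ?)",
        [("build", 0.9), ("build", 0.7), ("release", 0.5)],
    )
    view = generate_view(conn, "mail", "topic", "score")
    return to_treemap(extract_clusters(conn, view))
```

Calling demo() yields one tree map node per topic, sized by message
count. Passing size_attr="score" to to_treemap sizes the nodes by average
score instead, which mirrors the attribute selection in the interface.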