Time has been a subject of study in many disciplines, particularly in philosophy,
physics, and art.
Time is an important dimension of any information space
and can be very useful in information retrieval.
A quick look at any of the current
search engines shows that the temporal aspect is restricted to sorting
the hit list by the date attribute. Can we do better? Is there room for
improving relevance? How about the presentation of search results?
In this project, we study different ways in which temporal information,
explicit or implicit in documents and document collections, can be used to
cluster hit-lists based on time, profile documents based on their time
properties, and explore search results using timelines.
Once again, information retrieval top illustrator (Mateo) exemplifies
Below is the list of current sub-projects, an early version of the demo, and
a few screenshots. As soon as new stuff is ready, I'll update this
page.
Hit-list clustering with temporal attributes
Clustering of search results is an important feature in many of today's
information retrieval applications. The notion of hit list clustering appears
in Web search engines and enterprise search engines as a mechanism that allows
users to further explore the coverage of a query. However, there has been
little work on exposing temporal attributes for the construction and presentation
of clusters.
Pacha: exploratory search using timelines
In search situations where the task requires browsing
and exploring search results, we believe that
temporal information can help significantly in accomplishing
those tasks. Presenting relevant information
along a well-defined and well-understood timeline is an
important step toward finding, for example, the most recent
document relevant to a query or the earliest point in time that a
document refers to (based on the temporal information contained in
the document).
Demos
The machine is a bit slow and the server may be down, but give it a try. Also,
please let me know what you think.
Subset of DBLP
demo with Timeline is here.
The TimeWall version is also available with a bit of a raw interface, but
you can get the idea.
TimeBank - coming soon
Visualizations
Using timelines to present search results looks like an
easy task. That said, a timeline is not necessarily a
straight arrow. After all, Minard's classic chart is also a timeline.
There's been interesting work on time-based visualizations, so we
try to leverage some of those components as much as we can.
Here are a couple of screenshots of the current prototype.
You can click on the images to enlarge.
Recently, Google added a timeline representation with view:timeline for
queries.
Inxight's TimeWall
The first screenshot is using the DBLP dataset for the query "Modula".
The cards on the wall represent
records that contain the DBLP entry (author, title, journal, year).
Next to it, we see the same visualization metaphor with the TimeBank data
set. The cards on
the wall now represent the different news sources (CNN, AP, WSJ, etc.).
If you click on the card, you can see the entire document on the right panel.
SIMILE's TimeLine
Using the same DBLP
dataset but now for the query "compiler".
The backend system retrieves
all journal articles that contain "compiler" in the title and
returns a hit list clustered by year. All the search results are
anchored in the timeline. If more than one article falls
within a year, the articles are ordered by their relevance to the
query.
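
The retrieval and clustering step described above can be sketched roughly as follows. This is only a minimal illustration in Python, not the actual backend: the Article record, its relevance field, and the cluster_by_year function are hypothetical names, the relevance scores are assumed to come from whatever search engine sits underneath, and the title match is a plain case-insensitive substring test.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    year: int
    relevance: float  # assumed to be supplied by the underlying search engine


def cluster_by_year(articles, query):
    """Group articles whose title contains the query, keyed by year.

    Years come back in chronological order; within a year, articles are
    sorted by descending relevance to the query.
    """
    clusters = defaultdict(list)
    needle = query.lower()
    for article in articles:
        if needle in article.title.lower():
            clusters[article.year].append(article)
    return {
        year: sorted(hits, key=lambda a: a.relevance, reverse=True)
        for year, hits in sorted(clusters.items())
    }


if __name__ == "__main__":
    # Made-up records for illustration only; these are not DBLP entries.
    sample = [
        Article("Compiler design notes", 1990, 0.4),
        Article("A fast compiler", 1990, 0.9),
        Article("Database systems", 1990, 0.8),
        Article("Compiler testing", 1985, 0.5),
    ]
    for year, hits in cluster_by_year(sample, "compiler").items():
        print(year, [a.title for a in hits])
```

Each year then becomes one anchor point on the timeline, and the sorted list under it decides the order in which the articles are shown at that point.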
The last screenshot shows the same visualization but with the TimeBank data set.
Our timeline generation shows the same document anchored in three different
time periods.
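
One way a single document can end up anchored at several points on a timeline is by mentioning several dates in its text. The sketch below illustrates that idea under strong simplifying assumptions: it uses a naive regular expression for four-digit years as a stand-in for real temporal expression extraction, and anchor_years is a hypothetical helper, not the timeline generation used in the prototype.

```python
import re

# Naive pattern for years between 1800 and 2099; real temporal expressions
# ("last Friday", "the third quarter") need a proper temporal tagger.
YEAR_PATTERN = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")


def anchor_years(text):
    """Return the distinct years mentioned in text, in chronological order."""
    return sorted({int(match) for match in YEAR_PATTERN.findall(text)})


if __name__ == "__main__":
    # Made-up sentence for illustration only.
    doc = "Talks began in 1996, stalled in 1997, and resumed in 1998."
    print(anchor_years(doc))
```

A document that yields three distinct years would be placed on the timeline three times, once per year.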