lucene-java-user mailing list archives

From Christian Brennsteiner <christ...@brennsteiner.at>
Subject stream of events never to know when it ends? how to index such things & search
Date Wed, 18 Feb 2009 15:20:35 GMT
dear lucene community,

i am playing around with lucene right now and have run into a serious problem.

given environment:

a signal source emits signals with eventids and eventdescriptions

for example EVENTID=1 and EVENTDESCRIPTION="STARTING EVENT"

those events can run for a very long time (e.g. one month). during this
period we will receive, for example:

EVENTID=1 and EVENTDESCRIPTION="EXECUTING XYZ"
10 minutes later
EVENTID=1 and EVENTDESCRIPTION="EXECUTING YZA"
10 minutes later
EVENTID=1 and EVENTDESCRIPTION="PASSED MILESTONE1"
10 minutes later
EVENTID=1 and EVENTDESCRIPTION="EXECUTING ZAB"

after e.g. 1 week we receive
EVENTID=1 and EVENTDESCRIPTION="STOPING EVENT"

what i want:
i want to be able to search e.g. which eventids are connected to "XYZ"
AND "ZAB" AND have already passed "MILESTONE1"

so my current attempt is to index every event message as its own document,
fully indexing (without storing) the eventdescriptions AND stemming terms
such as EXECUTING

then searching for "+XYZ +ZAB +MILESTONE1"
--> result: no document, since those are all separate documents
when i search
 "XYZ ZAB MILESTONE1"
i am getting EVENTID 1 three times
--> this is bad, since when i get 1000000 such events, how do i rank them?
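
to illustrate the behaviour described above, here is a minimal sketch in plain java. it does not use the lucene api at all; the class and method names are hypothetical, the "analysis" is just lowercasing and splitting on whitespace (no stemming), and the second event id is made up for the example. it shows why an AND query over per-line documents matches nothing, why an OR query returns the same event id several times, and what merging the per-term hits by event id would look like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// plain-java illustration only: no lucene classes are used,
// and all names here are hypothetical
class MergeHitsByEventId {

    // one incoming signal, treated as one separate "document"
    static final class Message {
        final int eventId;
        final Set<String> terms;

        Message(int eventId, String description) {
            this.eventId = eventId;
            // crude analysis: lowercase and split on whitespace (no stemming)
            this.terms = new HashSet<>(Arrays.asList(description.toLowerCase().split("\\s+")));
        }
    }

    // the messages from the example, plus one assumed message for a second event
    static List<Message> sample() {
        return Arrays.asList(
            new Message(1, "STARTING EVENT"),
            new Message(1, "EXECUTING XYZ"),
            new Message(1, "EXECUTING YZA"),
            new Message(1, "PASSED MILESTONE1"),
            new Message(1, "EXECUTING ZAB"),
            new Message(2, "EXECUTING XYZ"));
    }

    // "+a +b +c" against per-line documents: a single line must hold every term
    static List<Integer> andPerLine(List<Message> docs, String... required) {
        List<Integer> hits = new ArrayList<>();
        for (Message m : docs) {
            if (m.terms.containsAll(Arrays.asList(required))) hits.add(m.eventId);
        }
        return hits;
    }

    // "a b c" against per-line documents: one hit per matching line
    static List<Integer> orPerLine(List<Message> docs, String... any) {
        List<Integer> hits = new ArrayList<>();
        for (Message m : docs) {
            for (String t : any) {
                if (m.terms.contains(t)) { hits.add(m.eventId); break; }
            }
        }
        return hits;
    }

    // merge step: one single-term lookup per required term, then keep only
    // the event ids that show up in every per-term result
    static Set<Integer> andMergedByEventId(List<Message> docs, String... required) {
        Set<Integer> merged = null;
        for (String t : required) {
            Set<Integer> idsForTerm = new TreeSet<>();
            for (Message m : docs) {
                if (m.terms.contains(t)) idsForTerm.add(m.eventId);
            }
            if (merged == null) merged = idsForTerm; else merged.retainAll(idsForTerm);
        }
        return merged == null ? new TreeSet<Integer>() : merged;
    }

    public static void main(String[] args) {
        List<Message> docs = sample();
        System.out.println(andPerLine(docs, "xyz", "zab", "milestone1"));         // []
        System.out.println(orPerLine(docs, "xyz", "zab", "milestone1"));          // [1, 1, 1, 2]
        System.out.println(andMergedByEventId(docs, "xyz", "zab", "milestone1")); // [1]
    }
}
```

the merge here happens at query time, so nothing about the per-line index has to change; the cost is one lookup per required term and an intersection over the resulting event ids.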

CONCLUSION:
my biggest problem is that the lucene document i give to the index
is not yet in a final state, BUT i have to index and search it
while it is still in progress.
as a result, the ranking as i do it now has no real value, since
it is based only on a single "line" of a whole event.

QUESTION:
is there a solution within lucene to combine search results (e.g. merge them), OR
is there a better workaround for making such updates to the index
without storing the original document inside the index (since this
consumes so much space)? e.g. extracting the keywords that were stored
for the item?
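
as a rough picture of the second option, the sketch below keeps only the distinct keywords per event id instead of the original text, and merges each new message into that set as it arrives. it is plain java, not the lucene api; the class `EventKeywordStore` and its methods are hypothetical, and the string returned by `add` merely stands for the text that could be handed to a real index as the replacement document for that event.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// plain-java illustration only: no lucene classes are used,
// and all names here are hypothetical
class EventKeywordStore {

    // event id -> every distinct keyword seen so far for that event
    private final Map<Integer, Set<String>> keywordsByEvent = new HashMap<>();

    // called for each incoming signal; returns the merged keyword text that
    // could replace the event's previous document in a real index
    String add(int eventId, String description) {
        Set<String> keywords = keywordsByEvent.computeIfAbsent(eventId, id -> new TreeSet<>());
        // crude analysis: lowercase and split on whitespace (no stemming)
        keywords.addAll(Arrays.asList(description.toLowerCase().split("\\s+")));
        return String.join(" ", keywords);
    }

    // event ids whose accumulated keywords contain every required term
    Set<Integer> search(String... required) {
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<Integer, Set<String>> e : keywordsByEvent.entrySet()) {
            if (e.getValue().containsAll(Arrays.asList(required))) hits.add(e.getKey());
        }
        return hits;
    }

    public static void main(String[] args) {
        EventKeywordStore store = new EventKeywordStore();
        store.add(1, "STARTING EVENT");
        store.add(1, "EXECUTING XYZ");
        store.add(1, "EXECUTING ZAB");
        // the event is still in progress and has not passed the milestone yet
        System.out.println(store.search("xyz", "zab", "milestone1")); // []
        String merged = store.add(1, "PASSED MILESTONE1");
        System.out.println(merged); // event executing milestone1 passed starting xyz zab
        System.out.println(store.search("xyz", "zab", "milestone1")); // [1]
    }
}
```

repeated words such as EXECUTING are kept only once per event, so the side store grows with the vocabulary of an event rather than with the number of messages; the trade-off is that term frequencies and message order are lost.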

any hints appreciated.

regards chris


----------
Christian Brennsteiner
Salzburg / Austria / Europe
