lucene-java-user mailing list archives

From gudiseashok <>
Subject New to Apache Lucene: Need help in querying data - text with wildCards
Date Mon, 10 Feb 2014 17:06:59 GMT
I have a log-analyzer application that uses Apache Lucene to index its
data. I store only the message in the index (none of the other fields of my
object are stored). Since I am not using a database, I use the index as the
store for the message even though it is huge, but I delete this data weekly
and start indexing afresh.
I have created a domain object to make indexing and retrieving my data with
Lucene easier.
My object has these fields:
className (fully qualified class name including package, for example
com.domain.infrastructure.MyClass),
messageType (for example: xml, log message, exception),
logLevel,
timestamp (I am storing this as a Long),
and logMessage (text with special characters such as <, [, { etc.)
The main purpose is to retrieve logMessage based on a user request. A few
scenarios:

Case 1: The user requests a SOAP message (messageType: XML) at a
particular time (timestamp: longVariable).
Case 2: The user requests a particular message (messageType: logMessage) at
a particular time (timestamp: longVariable) from a particular className.
Case 3: The user requests a particular message (messageType: Exception) at a
given log level (logLevel: DEBUG) at a particular time
(timestamp: longVariable).
Currently I am indexing data like this:
// Only logMessage is stored; every other field is indexed but not stored.
document.add(new StringField("className", logsVO.getClassName(),
        Field.Store.NO));
document.add(new StringField("logLevel", logsVO.getLogLevel(),
        Field.Store.NO));
document.add(new TextField("logMessage", logsVO.getLogMessage(),
        Field.Store.YES));
document.add(new StringField("messageType",
        logsVO.getMessageType().toString(), Field.Store.NO));
document.add(new NumericDocValuesField("path", logsVO.hashCode()));
document.add(new LongField("timeStamp", logsVO.getTimeStamp().getTime(),
        Field.Store.NO));
An actual log line looks like this:
2013-12-19 15:53:42.379 [server.startup : 0]  DEBUG 
o.a.commons.digester3.Digester -
[ObjectCreateRule]{maplist/recvmap/recvfrag/recvfragoccurs/recvprop} Pop
Here 2013-12-19 15:53:42.379 is the timestamp,
[server.startup : 0] is a part I will ignore,
DEBUG is the logLevel,
'o.a.commons.digester3.Digester' is the className, and
'[ObjectCreateRule]{maplist/recvmap/recvfrag/recvfragoccurs/recvprop} Pop'
is my logMessage.
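
For illustration, the field breakdown above could be sketched with a
regular expression as follows. This is only a sketch, not the application's
actual parser: the names LogLineParser and ParsedLine are hypothetical, the
timestamp is assumed to be in UTC, and every line is assumed to follow the
same layout as the sample.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that splits one log line into the fields described above.
public class LogLineParser {

    // Hypothetical holder for the parsed parts (not the poster's domain object).
    public static final class ParsedLine {
        public final long timestampMillis;
        public final String logLevel;
        public final String className;
        public final String logMessage;

        ParsedLine(long timestampMillis, String logLevel,
                   String className, String logMessage) {
            this.timestampMillis = timestampMillis;
            this.logLevel = logLevel;
            this.className = className;
            this.logMessage = logMessage;
        }
    }

    // timestamp, [ignored part], level, class name, " - ", message
    private static final Pattern LINE = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3})\\s+"
            + "\\[[^\\]]*\\]\\s+"
            + "(\\w+)\\s+"
            + "(\\S+)\\s+-\\s+"
            + "(.*)$",
            Pattern.DOTALL);

    private static final DateTimeFormatter TS_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    public static ParsedLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized log line: " + line);
        }
        // Assumption: the log timestamps are in UTC.
        long millis = LocalDateTime.parse(m.group(1), TS_FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
        return new ParsedLine(millis, m.group(2), m.group(3), m.group(4));
    }
}
```

Calling LogLineParser.parse(...) on the sample line would give DEBUG as the
logLevel, o.a.commons.digester3.Digester as the className, and the text
starting at [ObjectCreateRule] as the logMessage.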

Now to my problem: I have tried PhraseQuery, BooleanQuery and
WildcardQuery, but the only time I get results is when I search for a small
string like "pop" (from the logMessage above). In all other cases, where the
search text contains special characters, I get no results. Can anyone
suggest what kind of query I should use to satisfy the three user-request
cases described above?

I appreciate your help in this regard.
