lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: [RegexQuery] how to check what words were founded in particulary Documents ?
Date Fri, 20 Jul 2007 20:49:21 GMT
Erick - I think you're mixing things up with WildcardQuery.   
RegexQuery does support all regex capabilities (depending on the  
underlying regex matcher used).

A couple of techniques you could use to achieve the goal:

	* Use RegexTermEnum, though that'll give you the terms across the  
entire index, so maybe in your use case you could index a single  
document into a RAMDirectory and RegexTermEnum on it.

	* Try out SpanRegexQuery and use getSpans() to get the exact matches.

Erik



On Jul 20, 2007, at 4:10 PM, Erick Erickson wrote:

> First, the period (.) isn't part of the syntax, so make sure you look
> more carefully at the Lucene syntax...
>
> Then, you might be able to use WildcardTermEnum to find
> the terms that match and TermDocs to find the documents
> that contain those terms.
>
> There's nothing built into Lucene to do this out of the box, you
> have to roll your own.
>
> Best
> Erick
>
> On 20 Jul 2007 21:27:40 +0200, mhzmark@interia.eu <mhzmark@interia.eu>
> wrote:
>>
>> Hello.
>>
>> Let assume that I have this code in my application:
>>
>>    (...)
>>    Query query = new RegexQuery(new Term("field", "C.T"));;
>>    // searching...
>>    (...)
>>
>> And now, I would like to know if my application founded "cat" or  
>> "cot" or
>> something else. How can I check what was founded by my application ?
>>
>> I would like to write application like this:
>>    INPUT -> regular expression
>>    OUTPUT -> file  ---> word
>>
>> example: INPUT = "C.T"
>>          OUTPUT =
>>                   a.txt --> CAT
>>                   a.txt --> COT
>>                   b.txt --> CAT
>>                   b.txt --> CAT
>>                   b.txt --> COT
>>                   (...)
>>
>> So, how to check what words were founded in particulary Documents  
>> after
>> searching? I see that Hits class contains only founded documents and
>> nothing more (I am new in this technology so I can be wrong...)
>>
>>
>>
>>
>>
>>
>>
>> --------------------------------------------------------------------- 
>> -
>> Dowiedz sie, co naprawde podnieca kobiety. Wiecej wiesz, latwiej je
>> oczarujesz
>>
>> >>>http://link.interia.pl/f1b17
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message