lucy-user mailing list archives

From Peter Karman <pe...@peknet.com>
Subject Re: [lucy-user] C Library: Regex query
Date Fri, 09 Jun 2017 03:12:48 GMT
Serkan Mulayim wrote on 6/8/17 7:12 PM:
> Hi guys,
> 
> I would like to ask whether it is possible to do regex queries (without adding new fields
> and tokenizing differently) in the C library. What I need is to be able to return documents
> based on file name suffix, so that a query such as (*.pdf) returns all documents that
> contain a PDF file type.
> 
> I can understand the complexity it creates for the searcher to do a suffix query. But
> in my use case there would not be many files associated with the documents, so attachment
> fields will exist for only a small number of documents.
> 
> If this is not possible, I will also index the documents with their file types in a new
> field (or reverse the attachment names).
> 

afaik there is no C implementation of the Regex query. I wrote the Perl version.

https://metacpan.org/release/LucyX-Search-WildcardQuery

You will be *much* happier with storing the file extension as a separate field 
and searching on that. Far far more efficient at search time than munging a regex.
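
As a rough illustration of the separate-field approach, here is a minimal C sketch of pulling a normalized extension out of an attachment name at index time. The helper name file_extension is hypothetical and not part of Lucy; the Lucy calls that add the value to a document field are deliberately left out. Lowercasing the value is an assumption, made so that "PDF" and "pdf" match the same term.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Lucy API): copy the lowercased extension of
 * `filename`, without the dot, into `out`. Returns the extension length,
 * or 0 if there is no extension or it does not fit in `out`. */
size_t file_extension(const char *filename, char *out, size_t out_size) {
    if (out_size == 0) return 0;
    out[0] = '\0';

    /* Only look at the final path component. */
    const char *base = strrchr(filename, '/');
    base = base ? base + 1 : filename;

    /* No dot, a leading dot (".bashrc"), or a trailing dot ("file.")
     * all count as "no extension". */
    const char *dot = strrchr(base, '.');
    if (dot == NULL || dot == base || dot[1] == '\0') return 0;

    size_t len = strlen(dot + 1);
    if (len >= out_size) return 0;

    for (size_t i = 0; i < len; i++) {
        out[i] = (char)tolower((unsigned char)dot[1 + i]);
    }
    out[len] = '\0';
    return len;
}
```

The resulting string would go into its own field on the document, so a search for PDF attachments becomes a plain term match on that field rather than a scan over every term in the attachment names.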


-- 
Peter Karman  .  https://karpet.github.io  .  https://keybase.io/peterkarman
