lucene-java-user mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: SpanQueries in Luke
Date Fri, 05 Mar 2010 11:26:37 GMT
>> The lack of standardized metadata is an issue, of course - we could
>> start experimenting with this in Luke, to see whether we can squeeze a
>> subset of Solr schema there.

Actually, an "AnalyzerFactory" interface in Luke might provide the abstraction which would
allow Solr, my proprietary metadata system, and any other config resource to hook into Luke.

That would be pretty cool 
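
To make the idea concrete, here is a minimal sketch of what such an abstraction might look like. Everything in it is an assumption for illustration: no "AnalyzerFactory" exists in Luke or Lucene, the method name analyzerFor is invented, and the type parameter A stands in for the analyzer type (in Luke it would presumably be org.apache.lucene.analysis.Analyzer) so that the sketch compiles without Lucene on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical plug-in point (not an existing Luke or Lucene API).
 * A stands in for the analyzer type, e.g. org.apache.lucene.analysis.Analyzer.
 */
interface AnalyzerFactory<A> {
    /** Returns the analyzer this source knows for the field, or empty if it has none. */
    Optional<A> analyzerFor(String fieldName);
}

/** One possible config source: an explicit field-to-analyzer table. */
final class MapAnalyzerFactory<A> implements AnalyzerFactory<A> {
    private final Map<String, A> byField = new LinkedHashMap<>();

    /** Records the analyzer for a field and returns this factory for chaining. */
    MapAnalyzerFactory<A> register(String fieldName, A analyzer) {
        byField.put(fieldName, analyzer);
        return this;
    }

    @Override
    public Optional<A> analyzerFor(String fieldName) {
        return Optional.ofNullable(byField.get(fieldName));
    }
}

/** Asks each configured source in order and returns the first answer. */
final class ChainedAnalyzerFactory<A> implements AnalyzerFactory<A> {
    private final List<? extends AnalyzerFactory<A>> sources;

    ChainedAnalyzerFactory(List<? extends AnalyzerFactory<A>> sources) {
        this.sources = sources;
    }

    @Override
    public Optional<A> analyzerFor(String fieldName) {
        for (AnalyzerFactory<A> source : sources) {
            Optional<A> found = source.analyzerFor(fieldName);
            if (found.isPresent()) {
                return found;
            }
        }
        return Optional.empty();
    }
}
```

A Solr-schema reader, a proprietary XML reader, or any other config resource would each be one implementation of the interface, and the UI would only ever talk to the interface. Returning an empty Optional for unknown fields leaves room for falling back to the manual "pick an analyzer" prompt.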



----- Original Message ----
From: Andrzej Bialecki <ab@getopt.org>
To: java-user@lucene.apache.org
Sent: Fri, 5 March, 2010 11:11:12
Subject: Re: SpanQueries in Luke

On 2010-03-05 11:22, mark harwood wrote:
>>> I'll commit the current mostly-working state today, you can take a look
> 
> OK. However I think this XMLQueryParser addition will only resurface a
> long-standing issue with Luke and Lucene in general.
> This query parser works best on multiple fields (e.g. free-text <UserQuery>
> tags and <TermsFilter> on structured fields). Each field typically requires
> different analyzers and there is currently no way of recording this
> information as metadata alongside an index.
> Without this metadata each user's Luke session starts with a game of
> "guess-which-analyzer-to-use?"

I guess ;) that generally speaking there is no good answer to this - the same token stream
could have been produced by varying analysis chains, even across indexing sessions that append
to the same index.

>
> I use my own proprietary system for storing such index metadata, via an
> XML file that contains a BeanEncoder-serialized PerFieldAnalyzerWrapper
> among other things.
> It would be nice to see some standardisation in how this information can
> be made available in *any* Lucene index but I guess this overlaps with
> things like Solr's config.

Yes.

Theoretically one could store such information in IndexCommit.getUserData(). The lack of standardized
metadata is an issue, of course - we could start experimenting with this in Luke, to see whether
we can squeeze a subset of Solr schema there.
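
As a rough illustration of the getUserData() idea: the commit user data is a plain Map<String,String>, so a per-field analyzer mapping could be flattened into it under a key prefix. The sketch below shows only that encoding step and deliberately avoids any Lucene calls. The class name AnalyzerMetadata, the "analyzer." prefix, and the choice to store analyzer class names are all assumptions, not an existing convention.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: packs field-to-analyzer-class names into a string map. */
final class AnalyzerMetadata {
    /** Assumed key convention; nothing in Lucene or Luke defines this prefix. */
    static final String KEY_PREFIX = "analyzer.";

    private AnalyzerMetadata() {}

    /** Turns {field -> analyzer class name} into entries suitable for commit user data. */
    static Map<String, String> encode(Map<String, String> analyzerClassByField) {
        Map<String, String> userData = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : analyzerClassByField.entrySet()) {
            userData.put(KEY_PREFIX + e.getKey(), e.getValue());
        }
        return userData;
    }

    /** Recovers {field -> analyzer class name}, ignoring unrelated user-data keys. */
    static Map<String, String> decode(Map<String, String> userData) {
        Map<String, String> analyzerClassByField = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : userData.entrySet()) {
            if (e.getKey().startsWith(KEY_PREFIX)) {
                String field = e.getKey().substring(KEY_PREFIX.length());
                analyzerClassByField.put(field, e.getValue());
            }
        }
        return analyzerClassByField;
    }
}
```

A class name alone does not capture constructor arguments such as stop-word lists, and it records only what the last writer claimed, so it would not resolve the case above where different sessions used different analysis chains on the same index.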

-- Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


      



