lucene-solr-dev mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Math Processing for Solr
Date Thu, 15 Apr 2010 13:58:22 GMT
Payloads can be used to set boosts on individual tokens.  Have a look at
PayloadTermQuery.  There is a patch for Solr support, but it isn't committed yet.
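
A minimal sketch of the per-token boost idea, in plain Java with no Lucene
dependency. Everything named here (PayloadBoostSketch, Token, analyze, score),
the digit-to-"N" generalization step, and the 1.0/0.5 weights are hypothetical
stand-ins chosen for illustration, not Lucene or Solr API. Roughly, in Lucene the
payload bytes would be attached to each token by a filter at index time and
factored into the score at query time.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types for illustration; none of this is Lucene API.
final class PayloadBoostSketch {

    /** A token paired with a payload: here, a float boost packed into 4 bytes. */
    static final class Token {
        final String term;
        final byte[] payload;

        Token(String term, float boost) {
            this.term = term;
            this.payload = encodeBoost(boost);
        }
    }

    /** Packs a boost into a 4-byte payload. */
    static byte[] encodeBoost(float boost) {
        return ByteBuffer.allocate(4).putFloat(boost).array();
    }

    /** Reads the boost back out of a 4-byte payload. */
    static float decodeBoost(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    /** Emits the input formula at full weight plus a generalized variant at a lower weight. */
    static List<Token> analyze(String formula) {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token(formula, 1.0f));
        // Assumed generalization step: replace every run of digits with "N".
        String generalized = formula.replaceAll("[0-9]+", "N");
        if (!generalized.equals(formula)) {
            tokens.add(new Token(generalized, 0.5f));
        }
        return tokens;
    }

    /** Query time: sums the decoded boosts of all tokens that match the query term. */
    static float score(List<Token> indexed, String queryTerm) {
        float total = 0f;
        for (Token t : indexed) {
            if (t.term.equals(queryTerm)) {
                total += decodeBoost(t.payload);
            }
        }
        return total;
    }
}
```

With this sketch, an exact match on the original formula contributes more to the
score than a match on its generalized form, which is the behaviour asked for below.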

-Grant

On Apr 15, 2010, at 8:46 AM, mato@gjgt.sk wrote:

> Yes, I considered creating my own analyzer with a set of filters. The trouble
> is that I wouldn't be able to set different boosts for the tokens created by
> the filters (each filter needs to emit an additional token alongside the input
> one and set a lower boost for it), which is kind of crucial functionality.
> Even the tokenizer at the beginning of the process needs to set different
> boosts on the different tokens it produces. As far as I know, though, boosts
> can only be set on Fields.
> This is now more of a discussion for the Lucene lists, I guess.
> 
> Thanks for the replies anyway.
> 
> Martin
> 
>> (perhaps more appropriate on solr-user@)
>> 
>> It sounds like you want to make a MathML filter?  Check out the
>> analyzer packages...
>> 
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
>> 
>> simple example:
>> https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/java/org/apache/solr/analysis/LengthFilterFactory.java
>> 
>> ryan
>> 
>> 
>> 2010/4/14  <mato@gjgt.sk>:
>>> Hello everybody,
>>> 
>>> I'm new to all this so I hope this isn't too noob a question and that it
>>> isn't very inappropriate here.
>>> 
>>> I'm currently working on an indexing/searching application based on
>>> Apache Lucene core that can process mathematical formulae in MathML
>>> format (an XML-based markup language) and store them in the index for
>>> searching. No trouble there, since I'm building everything on top of Lucene.
>>> 
>>> But I started to think it would be nice to write this mathematical
>>> extension so that it could be incorporated into Solr as easily as possible
>>> in the future. The thing is, I looked into Solr's sources and, to be
>>> honest, I'm confused and don't know which way to go about this.
>>> 
>>> The basic workflow of the whole math processing would be:
>>> 1. Check the input document for any math.
>>> 2. If any is found, a mathematical unit processes it and produces many
>>>    string-represented formulae with different boosts.
>>> 3. Put these into the index without any further tokenization.
>>> 
>>> That's about it.
>>> Any ideas? Any help will be appreciated.
>>> 
>>> Thank you
>>> 
>>> Martin
>>> 
>>> 
>> 
> 
> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search

