lucene-solr-dev mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: tweak to analysis.jsp for payload
Date Sun, 06 Apr 2008 09:18:04 GMT
I don't know just yet that the AnalysisRequestHandler (ARH) is going to
replace analysis.jsp.  The JSP page does things that the ARH doesn't,
specifically, showing the output after every token filter.  In my
mind, the ARH is useful as a token server for things like machine
learning (e.g. Mahout :-) ) and/or other applications that just
need the final tokens of a document.  I think the response would
get pretty ugly if it tried to serve up the intermediate
tokens.  In other words, I have no plans to work on that, but if
someone else comes up with a useful way of doing it, I wouldn't try
to stop it, either.

It might be useful to define a mechanism whereby one can plug a
Payload decoder into Solr that could be used by analysis.jsp.  This
would give applications a way to make sense of payloads and have
them displayed alongside their tokens.
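
A minimal sketch of what such a pluggable decoder might look like. None of
this exists in Solr or Lucene: the PayloadDecoder interface and the
FloatPayloadDecoder class are hypothetical names, and the example assumes
the application stored a single 4-byte big-endian float in each payload.

```java
import java.nio.ByteBuffer;

// Hypothetical plugin interface: turns raw payload bytes into display text.
interface PayloadDecoder {
    String decode(byte[] data, int offset, int length);
}

// Example decoder, assuming the payload is one 4-byte big-endian IEEE 754 float.
final class FloatPayloadDecoder implements PayloadDecoder {
    @Override
    public String decode(byte[] data, int offset, int length) {
        if (length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + length);
        }
        // ByteBuffer reads in big-endian order by default.
        return Float.toString(ByteBuffer.wrap(data, offset, length).getFloat());
    }
}
```

A display page could look up one decoder per field type and call decode()
for each token; the application still has to know what it put in the
payload in order to pick the right decoder.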

-Grant

On Apr 6, 2008, at 1:59 AM, Tricia Williams wrote:

> Replies to several comments in this thread inline:
>
> Grant Ingersoll wrote:
>> Yes, that is definitely the case, but I think Tricia was more  
>> getting at how to use them for display, i.e. deserializing them into  
>> a String or whatever.  I still have on my plate that I want to  
>> figure out how to incorporate payloads with SpanQuery as that is  
>> the logical means of getting at them query-wise.
>>
>> -Grant
>>
>
> Grant is right that my intention is to visualize the Payloads in the  
> same way that analysis.jsp allows users to visualize what  
> TokenFilters are doing to the position, term text, token type, and  
> start and end offsets.  This would be a crude way to debug or demo  
> what your payload-savvy TokenFilter/Tokenizer does to a given  
> TokenStream.
>
> I went through the JIRA issues trying to figure out what was being  
> done with Payloads to see if this would help clarify my display  
> problem.  I came across Grant's AnalysisRequestHandler which looks  
> like its intent is to replace analysis.jsp at some point.  It looks  
> like two short months ago the call on including Payloads was to  
> punt, "since Solr doesn't currently support payloads, not much point  
> in outputting them just yet."  I guess that is what he was trying to  
> tell me in this thread too.
>
> Grant Ingersoll wrote:
>> As the guy who wrote PayloadHelper, what I really imagined was  
>> using Lucene's vint, etc. stuff, but that would have required a bit  
>> more refactoring.  It can be handy for some payloads, but it is  
>> still on the app developer to know what was put in the payload.   
>> What this means in terms of Solr is still up in the air.  No one  
>> has worked through what adding payloads means yet.
>
> Would it be completely ignorant of me to suggest that an abstraction  
> of Payload contain a public decode() method with an Object as a  
> return type?  Or maybe Payload's toString should be overridden to  
> provide a string representation for display -- possibly doing  
> something like Hoss described?
>
> Chris Hostetter wrote:
>> I've never really looked at PayloadHelper, but if i were tasked  
>> with trying to find a way to display in HTML an arbitrary byte[]  
>> that may or may not be a String, i would start by attempting a  
>> String conversion, if that succeeds *and* all chars in the resulting  
>> String are "printable" (i.e. Character.isDefined(c) &&  
>> !Character.isISOControl(c)) then display the first N chars (where N  
>> is some reasonable max size to display) ... if not, then just  
>> display the first N characters of the hex string representing the  
>> byte[].
> Thanks for the feedback.  It is always appreciated!
>
> Tricia
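
The display heuristic Hoss describes above could be sketched roughly as
follows. This is an illustration only, not code from Solr: the
PayloadDisplay class and its render() method are made-up names, UTF-8 is
an assumed encoding (the thread does not name one), and
StandardCharsets needs a newer JDK than was current when this thread was
written. HTML escaping of the result is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: shows a payload as text if it looks printable, else as hex.
final class PayloadDisplay {
    private PayloadDisplay() {}

    // Returns at most maxChars characters of either the decoded text or the hex form.
    static String render(byte[] payload, int maxChars) {
        String text = tryDecodeUtf8(payload);
        if (text != null && isPrintable(text)) {
            return truncate(text, maxChars);
        }
        return truncate(toHex(payload), maxChars);
    }

    // Strict decode: returns null instead of substituting replacement characters.
    private static String tryDecodeUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    // The "printable" test from the message: defined and not an ISO control char.
    private static boolean isPrintable(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isDefined(c) || Character.isISOControl(c)) {
                return false;
            }
        }
        return true;
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    private static String truncate(String s, int max) {
        return s.length() <= max ? s : s.substring(0, max);
    }
}
```

For example, PayloadDisplay.render(bytes, 20) would show a short text
payload as-is and fall back to hex digits for binary data such as an
encoded float.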

