lucene-solr-user mailing list archives

From Mikhail Khludnev <mkhlud...@griddynamics.com>
Subject Re: Is any way to return the number of indexed tokens in a field?
Date Sun, 14 Apr 2013 13:38:40 GMT
Alex,

It's not clear what you need to count: pre-analyzed values, or the tokens
produced by analysis.
If the former, I suggest looking into something like
FieldLengthUpdateProcessorFactory; in the latter case you need to override
Similarity.computeNorm(String, FieldInvertState) / encode/decodeNorm.
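
For the pre-analysis route, one possible shape is to count the matches on the
client before the document is sent, and index that number in a separate
integer field that can be sorted on. This is only a rough sketch in plain
Java, not Solr or Lucene API: the class name TokenCounter, the method
countMatches and the sample pattern are all made up for illustration, and it
assumes the same regular expression is used on the client and in the analyzer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Solr/Lucene): counts how many times a
// regular expression matches in a raw, pre-analysis field value.
public class TokenCounter {

    // Returns the number of non-overlapping matches of pattern in value.
    // A null value is treated as having no matches.
    public static int countMatches(Pattern pattern, String value) {
        if (value == null) {
            return 0;
        }
        Matcher matcher = pattern.matcher(value);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample pattern, assumed for illustration only.
        Pattern pattern = Pattern.compile("[A-Z]{2}-\\d+");
        System.out.println(countMatches(pattern, "see AB-12 and CD-345"));
    }
}
```

The count would then go into its own field (the field name is up to you), so
that sorting on it in descending order brings the documents with the most
matches to the top.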



On Sun, Apr 14, 2013 at 8:29 AM, Alexandre Rafalovitch
<arafalov@gmail.com> wrote:

> Hello,
>
> We seem to have all sorts of functions around tokenized field content, but
> I am looking for simple count/length that can be returned as a
> pseudo-field. Does anyone know of one out of the box?
>
> The specific situation is that I am indexing a field for specific regular
> expressions that become tokens (in a copyField). Not every field has the
> same number of those.
>
> I now want to find the documents that have maximum number of tokens in that
> field (for testing and review). But I can't figure out how.  Any help would
> be appreciated.
>
> Regards,
>    Alex.
> Personal blog: http://blog.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all at
> once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
 <mkhludnev@griddynamics.com>
