lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2380) Add FieldCache.getTermBytes, to load term data as byte[]
Date Sat, 05 Jun 2010 16:07:56 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875913#action_12875913 ]

Yonik Seeley commented on LUCENE-2380:
--------------------------------------

FYI, while trying to implement an iterator over the FieldCache terms, I ran into a bug where
each term is written twice. This doubles the memory used for the term bytes (but causes no
functional bugs). I'll fix it shortly, and anyone who has done performance tests might want to
redo them (cache effects, GC differences, and longer entry build times may have skewed the results).

> Add FieldCache.getTermBytes, to load term data as byte[]
> --------------------------------------------------------
>
>                 Key: LUCENE-2380
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2380
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-2380.patch, LUCENE-2380.patch, LUCENE-2380.patch, LUCENE-2380.patch
>
>
> With flex, a term is now an opaque byte[] (typically a UTF-8 encoded Unicode string, but
> not necessarily), so we need to push this up the search stack.
> FieldCache now has getStrings and getStringIndex; we need corresponding methods to load
> terms as native byte[], since in general they may not be representable as String.  This should
> also be quite a bit more RAM efficient for US-ASCII content, since each character would then
> use 1 byte instead of 2.
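
The "1 byte not 2" point in the issue description can be illustrated with a minimal sketch. It assumes a String backed by a UTF-16 char[] (2 bytes per char, as on JVMs of that era) and counts only the character payload, ignoring object headers. The class and method names (TermBytesSketch, utf16PayloadBytes, utf8PayloadBytes) are hypothetical and are not part of the FieldCache API or the attached patches.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Lucene code: compares the payload size of a term
// held as UTF-16 chars versus UTF-8 bytes.
final class TermBytesSketch {

    // Payload size if the term is held as UTF-16 code units (2 bytes each).
    // Assumption: a char[]-backed String; object overhead is not counted.
    static int utf16PayloadBytes(String term) {
        return term.length() * 2;
    }

    // Payload size if the same term is held as a UTF-8 encoded byte[].
    static int utf8PayloadBytes(String term) {
        return term.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String term = "lucene"; // US-ASCII only
        System.out.println("UTF-16 payload: " + utf16PayloadBytes(term) + " bytes");
        System.out.println("UTF-8 payload:  " + utf8PayloadBytes(term) + " bytes");
    }
}
```

For a pure US-ASCII term the UTF-8 payload is half the UTF-16 payload. Terms containing non-ASCII characters need 2 or more bytes per character in UTF-8, so the saving shrinks or disappears for such content.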

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

