lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ryan McKinley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2649) FieldCache should include a BitSet for matching docs
Date Thu, 23 Sep 2010 16:20:33 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914082#action_12914082
] 

Ryan McKinley commented on LUCENE-2649:
---------------------------------------

bq. Q: why aren't the CachePopulator methods just directly on EntryConfig - was it easier
to share implementations that way or something?

Two reasons (but I can be talked out of it)
1. this approach separates what you are asking for (bits/values/etc) from how they are actually
generated (the "populator").  Something makes me uncomfortable about the caller asking for
Values needing to also know how they are generated.  Seems easy to mess up.  With this approach
the 'populator' is attached to the field cache, and defines how stuff is read, vs the 'EntryConfig'
that defines what the user is asking for (particulary since they may change what they are
asking for in subsequent calls)

2. The 'populator' is attached to the FieldCache so it has consistent behavior across subsequet
calls to getXxxxValues().  Note that with this approach, if you ask the field cache for just
the 'values' then later want the 'bits' it uses the same populator and adds the results to
the existing CachedArray value.


bq. It doesn't seem like we need two methods fillValidBits , fillByteValues

The 'fillValidBits' just fills up the valid bits w/o actually parsing (or caching) the values.
 This is useful when:
1. you only want the ValidBits, but not the values (Mike seems to want this)
2. you first ask for just values, then later want the bits.  

Thinking some more, I think the populator should look like this:
{code:java}
public abstract class CachePopulator 
  {
    public abstract ByteValues   createByteValues(   IndexReader reader, String field, EntryConfig
config ) throws IOException;
    public abstract ShortValues  createShortValues(  IndexReader reader, String field, EntryConfig
config ) throws IOException;
    public abstract IntValues    createIntValues(    IndexReader reader, String field, EntryConfig
config ) throws IOException;
    public abstract FloatValues  createFloatValues(  IndexReader reader, String field, EntryConfig
config ) throws IOException;
    public abstract DoubleValues createDoubleValues( IndexReader reader, String field, EntryConfig
config ) throws IOException;
    
    public abstract void fillByteValues(   ByteValues   vals, IndexReader reader, String field,
EntryConfig config ) throws IOException;
    public abstract void fillShortValues(  ShortValues  vals, IndexReader reader, String field,
EntryConfig config ) throws IOException;
    public abstract void fillIntValues(    IntValues    vals, IndexReader reader, String field,
EntryConfig config ) throws IOException;
    public abstract void fillFloatValues(  FloatValues  vals, IndexReader reader, String field,
EntryConfig config ) throws IOException;
    public abstract void fillDoubleValues( DoubleValues vals, IndexReader reader, String field,
EntryConfig config ) throws IOException;

    // This will only fill in the ValidBits w/o parsing any actual values
    public abstract void fillValidBits( CachedArray  vals, IndexReader reader, String field,
EntryConfig config ) throws IOException;
  }
{code}

The default 'create' implementation could look something like this:

{code:java}

    @Override
    public ShortValues createShortValues( IndexReader reader, String field, EntryConfig config
) throws IOException 
    {
      if( config == null ) {
        config = new SimpleEntryConfig();
      }
      ShortValues vals = new ShortValues();
      if( config.cacheValues() ) {
        this.fillShortValues(vals, reader, field, config);
      }
      else if( config.cacheValidBits() ) {
        this.fillValidBits(vals, reader, field, config);
      }
      else {
        throw new RuntimeException( "the config must cache values and/or bits" );
      }
      return vals;
    }
{code}

And the Cache 'createValue' would looks somethign like this:
{code:java}

  static final class ByteCache extends Cache {
    ByteCache(FieldCache wrapper) {
      super(wrapper);
    }
    
    @Override
    protected final ByteValues createValue(IndexReader reader, Entry entry, CachePopulator
populator) throws IOException {
      String field = entry.field;
      EntryConfig config = (EntryConfig)entry.custom;
      if (config == null) {
        return wrapper.getByteValues(reader, field, new SimpleEntryConfig() );
      }
      return populator.createByteValues(reader, field, config);
    }
  }
{code}

thoughts?  This would open up lots more of the field cache... so if we go this route, lets
make sure it addresses the other issues people have with FieldCache.  IIUC, the other big
request is to load the values from an external source -- that should be possible with this
approach.

> FieldCache should include a BitSet for matching docs
> ----------------------------------------------------
>
>                 Key: LUCENE-2649
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2649
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Ryan McKinley
>             Fix For: 4.0
>
>         Attachments: LUCENE-2649-FieldCacheWithBitSet.patch, LUCENE-2649-FieldCacheWithBitSet.patch,
LUCENE-2649-FieldCacheWithBitSet.patch, LUCENE-2649-FieldCacheWithBitSet.patch, LUCENE-2649-FieldCacheWithBitSet.patch
>
>
> The FieldCache returns an array representing the values for each doc.  However there
is no way to know if the doc actually has a value.
> This should be changed to return an object representing the values *and* a BitSet for
all valid docs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message