lucene-dev mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Empty Sink Tokenizer
Date Tue, 31 Mar 2009 16:26:38 GMT
Well, we don't make any guarantees about it in the docs, AFAICT, but we
have in the past advertised (via the mailing lists) that fields come
back in the order they were added.  The Tee/Sink stuff does rely on what
was the de facto way of doing things up until 2.3, it sounds like.  The
snippet of code I included can easily be converted to a test case if we
wish to enforce it going forward.

What's the benefit of collation?  I don't know if this is considered a
back-compatibility break or not (likely not), but this issue does come
up from time to time and there are people who have relied on our
answer.

In the end, we should document whichever it is going to be and then  
make sure the Tee/Sink stuff documents it as well.
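
To make the two orderings concrete, here is a minimal plain-Java sketch of
the collate-by-name behavior Mike describes in the quoted message below.  It
does not use Lucene and is not IndexWriter's actual code: the class
FieldCollationSketch and the method collatedOrder are hypothetical names, and
it assumes field names are compared with ordinary String ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of "collate by field name, then visit in sorted order".
public class FieldCollationSketch {

  // Each input entry is a {name, value} pair, listed in the order it was added.
  public static List<String> collatedOrder(List<String[]> fields) {
    // TreeMap iterates its keys in sorted order; each per-name list keeps
    // the values in the order they appeared in the document.
    Map<String, List<String>> byName = new TreeMap<String, List<String>>();
    for (String[] field : fields) {
      List<String> values = byName.get(field[0]);
      if (values == null) {
        values = new ArrayList<String>();
        byName.put(field[0], values);
      }
      values.add(field[1]);
    }
    List<String> visited = new ArrayList<String>();
    for (Map.Entry<String, List<String>> entry : byName.entrySet()) {
      for (String value : entry.getValue()) {
        visited.add(entry.getKey() + "=" + value);
      }
    }
    return visited;
  }

  public static void main(String[] args) {
    List<String[]> doc = new ArrayList<String[]>();
    doc.add(new String[] {"id", "one"});
    doc.add(new String[] {"z", "document z"});
    doc.add(new String[] {"a", "first a"});
    doc.add(new String[] {"e", "document e"});
    doc.add(new String[] {"a", "second a"});
    // Sorted by name, with the two "a" instances kept in added order.
    System.out.println(collatedOrder(doc));
  }
}
```

Under those assumptions, added order (id, z, a, e, a) becomes a, a, e, id, z,
with the two "a" instances still in the order they were added.  That is the
difference a test case would need to pin down one way or the other.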



On Mar 31, 2009, at 10:51 AM, Michael McCandless wrote:

> Uh-oh: I think this happened as part of LUCENE-843, which landed in  
> 2.3.
>
> IndexWriter now first collates each Field instance, by name, and then
> visits those fields in sorted order.  Multiple instances of the same
> field name are written in the order that they appeared in the
> document.
>
> StoredFieldsWriter taps in to the indexing chain after that
> per-field collation.
>
> But, if getting back to this is important, we should be able to move
> StoredFieldsWriter up in the chain so that it visits the original
> document, instead.  Offhand, I'm not sure if there are any tradeoffs
> in doing that.
>
> Mike
>
> On Tue, Mar 31, 2009 at 9:30 AM, Grant Ingersoll  
> <gsingers@apache.org> wrote:
>> Has the way fields get added changed recently?
>>  http://www.lucidimagination.com/search/document/954555c478002a3/empty_sinktokenizer
>>
>> See also:
>> http://www.lucidimagination.com/search/document/274ec8c1c56fdd54/order_of_field_objects_within_document#5ffce4509ed32511
>>
>> http://www.lucidimagination.com/search/document/d6b19ab1bd87e30a/order_of_fields_returned_by_document_getfields#d6b19ab1bd87e30a
>>
>> http://www.lucidimagination.com/search/document/deda4dd3f9041bee/the_order_of_fields_in_document_fields#bb26d84091aebcaa
>>
>>
>> The following little program confirms that they are indeed in alpha
>> order now and not in added order:
>> import java.util.Iterator;
>> import java.util.List;
>>
>> import org.apache.lucene.analysis.SimpleAnalyzer;
>> import org.apache.lucene.document.Document;
>> import org.apache.lucene.document.Field;
>> import org.apache.lucene.index.IndexReader;
>> import org.apache.lucene.index.IndexWriter;
>> import org.apache.lucene.store.RAMDirectory;
>> import org.apache.lucene.util.LuceneTestCase;
>>
>> public class TestFieldOrdering extends LuceneTestCase {
>>  protected RAMDirectory dir;
>>
>>  protected void setUp() throws Exception {
>>    super.setUp();
>>    dir = new RAMDirectory();
>>  }
>>
>>  public void testAddFields() throws Exception {
>>    IndexWriter writer = new IndexWriter(dir, new SimpleAnalyzer(), true,
>>        IndexWriter.MaxFieldLength.LIMITED);
>>
>>    // Add the fields in deliberately non-alphabetical order.
>>    Document doc = new Document();
>>    doc.add(new Field("id", "one", Field.Store.YES, Field.Index.NO));
>>    doc.add(new Field("z", "document z", Field.Store.YES,
>>        Field.Index.ANALYZED));
>>    doc.add(new Field("a", "document a", Field.Store.YES,
>>        Field.Index.ANALYZED));
>>    doc.add(new Field("e", "document e", Field.Store.YES,
>>        Field.Index.ANALYZED));
>>    doc.add(new Field("b", "document b", Field.Store.YES,
>>        Field.Index.ANALYZED));
>>    writer.addDocument(doc);
>>    writer.close();
>>
>>    IndexReader reader = IndexReader.open(dir);
>>    Document retrieved = reader.document(0);
>>    assertTrue("retrieved is null and it shouldn't be", retrieved != null);
>>    // Print the stored field names in the order the reader returns them.
>>    List fields = retrieved.getFields();
>>    for (Iterator iterator = fields.iterator(); iterator.hasNext();) {
>>      Field field = (Field) iterator.next();
>>      System.out.println("Name: " + field.name());
>>    }
>>    reader.close();
>>  }
>> }
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>

