lucenenet-user mailing list archives

From Matt Diehl <mdi...@lexprompt.com.INVALID>
Subject Re: Lucene 4.8 - Reusing Document during indexing
Date Sun, 11 Jun 2017 09:57:36 GMT
Thanks for the details Shad.

It's a little bit of a pain to use. Not as easy as what you showed, since
you have to typecast:
 ((TextField)luceneDoc.GetField("text")).SetStringValue( block.Text );

If you do not typecast, then SetStringValue is not available.

Also, strangely, it doesn't matter what I typecast it to. I can typecast to
Int32Field and I get SetStringValue and SetInt32Value. I can typecast to
TextField and still have SetInt32Value.
If it doesn't matter what it is cast to, can we get the method
definitions on IIndexableField, which is the return type of GetField()?
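
For anyone else hitting this, here is a minimal sketch of one way to avoid the
cast inside the loop: keep a typed reference to the field that changes and call
SetStringValue on it directly. It assumes the Lucene.Net 4.8 package is
referenced; the class name, method name, the "file" field, and the parameters
are hypothetical and only there for illustration. The casts behave the same
because TextField and Int32Field both derive from Field, which is where the
Set...Value methods are declared, so casting to Field should also work.

```csharp
using System.Collections.Generic;
using Lucene.Net.Documents;
using Lucene.Net.Index;

public static class ReuseDocumentSketch
{
    // Hypothetical helper: indexes one document per block while reusing
    // a single Document instance and a single TextField instance.
    public static void IndexBlocks(
        IndexWriter indexWriter, string fileName, IEnumerable<string> blockTexts)
    {
        // Build the document once with the fields shared by every block.
        var luceneDoc = new Document();
        luceneDoc.Add(new StringField("file", fileName, Field.Store.YES));

        // Keep a typed reference to the field whose value changes per block.
        // The initial value is a placeholder and is overwritten in the loop.
        var textField = new TextField("text", "", Field.Store.YES);
        luceneDoc.Add(textField);

        foreach (var text in blockTexts)
        {
            // No cast needed: textField is already a Field subclass.
            textField.SetStringValue(text);
            indexWriter.AddDocument(luceneDoc);
        }
    }
}
```

If the field can only be reached through GetField(), a single cast to the Field
base class, as in ((Field)luceneDoc.GetField("text")).SetStringValue(...),
avoids having to know the concrete field type.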

Thanks,
Matt


On Sun, Jun 11, 2017 at 2:00 AM, Shad Storhaug <shad@shadstorhaug.com>
wrote:

> Matt,
>
> Since a field needs to keep track of both the value and the type, the
> field values are set using methods that include type name.
>
> luceneDoc.GetField( "text" ).SetStringValue( block.Text );
>
> Setting the field value using a common SetValue function is something that
> was carefully considered, but it would mean you would have to be extremely
> explicit when setting the correct type. For example:
>
> float value1 = 5.00000001f;
> string value2 = value1.ToString();
>
> luceneDoc.GetField( "number" ).SetValue(value2);
>
> object value3 = luceneDoc.GetField( "number" ).GetNumericValue();
>
>
> The above code would produce an error because the field was originally set
> as a string, but a float was expected to be stored. This would produce a
> bug that might be hard to track down, whereas forcing the developer to think
> about what type they are trying to set (SetSingleValue) makes it more
> explicit and less likely to go wrong, since a mismatch would produce a
> compile-time error.
>
>
> That said, an overloaded SetValue is more .NET-like and in this particular
> case we don't have any duplicate types that would cause collisions so we
> could add an overloaded SetValue method and convert the existing methods
> into extension methods in the Support namespace. I would be interested in
> hearing any feedback on whether explicitly specifying the type in the
> method name or explicitly casting to the correct type (as was the case in
> 3.0.3) is preferable. In .NET, the overloaded methods don't normally all
> store the value in the same object variable under the covers, so making
> explicit methods seems like a better choice to me.
>
>
> On a side note, it looks like we should deprecate all of the
> FieldExtensions methods except IsStored to make sure people are aware that
> they will not be available after Lucene.Net 4.8, since the corresponding
> enumerations have been deprecated.
>
> Thanks,
> Shad Storhaug (NightOwl888)
>
>
> -----Original Message-----
> From: Matt Diehl [mailto:matt@gooddiehl.net.INVALID]
> Sent: Sunday, June 11, 2017 9:57 AM
> To: user@lucenenet.apache.org
> Subject: Lucene 4.8 - Reusing Document during indexing
>
> Hi,
>
> I am not understanding how to reuse Document like we could in 3.0.3 for
> indexing purposes.
>
> For instance, in 3.0.3, I could create and then set several common Field
> values, and then just iterate changing a single field in the Document, and
> add to index:
>
> Document luceneDoc = createDocumentAndSetFileSpecificFields( file );
>
> foreach ( var block in blocks )
> {
>     luceneDoc.GetField( "text" ).SetValue( block.Text );
>     indexWriter.AddDocument( luceneDoc );
> }
>
> In 4.8, SetValue is not a function anymore, and it seems like I have to
> recreate my 8-field Document every time I write to Index.
>
> foreach ( var block in blocks )
> {
>     Document luceneDoc = createDocumentAndSetFileSpecificFields( file, block.Text );
>     indexWriter.AddDocument( luceneDoc );
> }
>
> Can someone help me realize what I am missing?
>
> Thanks,
> Matt
>
