lucene-solr-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-1270) The FloatField (and probably others) field type takes any string value at index, but JSON writer outputs as numeric without checking
Date Wed, 22 Jul 2009 19:21:15 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734263#action_12734263 ]

Hoss Man commented on SOLR-1270:
--------------------------------

bq. What do people think about the attached patch?

+0

This slows down query-time response writing for the 95% case: people who have indexed clean
data and don't need it to be sanity checked. It seems like it violates the whole point of
FloatField: be fast and trust the data.

But if Yonik thinks it's worth the trade-off, I won't argue.
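
To make the trade-off concrete, here is a minimal sketch of the kind of output-side check
under discussion. It is not the attached patch and not Solr's JSON writer; the class
JsonFloatOutput and its method toJson are hypothetical names used only for illustration.
The point is that the pattern match runs for every value written, clean or not.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: decides how a stored FloatField
// string should appear in JSON output.
final class JsonFloatOutput {
    // JSON number grammar: optional minus, integer part without leading
    // zeros, optional fraction, optional exponent.
    private static final Pattern JSON_NUMBER =
        Pattern.compile("-?(0|[1-9][0-9]*)(\\.[0-9]+)?([eE][+-]?[0-9]+)?");

    static String toJson(String stored) {
        if (JSON_NUMBER.matcher(stored).matches()) {
            return stored;        // safe to emit as a bare numeric
        }
        return quote(stored);     // otherwise fall back to a JSON string
    }

    // Minimal JSON string quoting: escapes quotes, backslashes and control chars.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

Under these assumptions, JsonFloatOutput.toJson("1.5") comes back unquoted as 1.5, while
toJson("abc") and toJson("NaN") come back as quoted JSON strings.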

Personally: if we're going to make FloatField more paranoid, it seems like validating on
input (indexing) would be saner than validating on output, since input tends to happen less
often than output, and users are typically more concerned about query speed than indexing
speed.
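
For comparison, a rough sketch of checking on input instead. FloatInputCheck and
toIndexedForm are hypothetical names, not Solr's FloatField API; the idea is only that each
value pays the parsing cost once, at index time, and output code can then trust what is
stored.

```java
// Hypothetical helper, not Solr's FloatField: rejects bad values once,
// at index time, instead of on every response.
final class FloatInputCheck {
    static String toIndexedForm(String raw) {
        float f;
        try {
            f = Float.parseFloat(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a float: '" + raw + "'", e);
        }
        // NaN and the infinities parse as floats but have no JSON numeric form.
        if (Float.isNaN(f) || Float.isInfinite(f)) {
            throw new IllegalArgumentException("no JSON numeric form: '" + raw + "'");
        }
        // Store a normalized representation of the parsed value.
        return Float.toString(f);
    }
}
```

With this sketch, toIndexedForm("1.5") returns "1.5", while toIndexedForm("abc") and
toIndexedForm("NaN") throw IllegalArgumentException before anything reaches the index.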

Although it looks like Yonik already made a similar change to IntField and LongField back
in SOLR-424 (how did I miss seeing that before?), so I guess we should at least make all the
basic types consistent (which means we shouldn't forget DoubleField and ByteField).

(In an ideal world, FloatField would have been named "SimpleUncheckedFloatField" and the javadocs
would have made it clear that it was for backwards compatibility with existing Lucene indexes,
that it did no sanity checking of the input, and that its only distinction from StrField was to
preserve metadata about the datatype (i.e. float) for use by the response writers.  Then we
could have reserved the name "FloatField" for a much more stringent FieldType that sanity
checked the data coming in *and* going out -- which is essentially what SortableFloatField
is.)


> The FloatField (and probably others) field type takes any string value at index, but JSON writer outputs as numeric without checking
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1270
>                 URL: https://issues.apache.org/jira/browse/SOLR-1270
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 1.2, 1.3, 1.4
>         Environment: ubuntu 8.04, sun java 6, tomcat 5.5
>            Reporter: Donovan Jimenez
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: SOLR-1270.patch
>
>
> The FloatField field type takes any string value at index. These values aren't necessarily
> valid JSON numerics, but the JSON writer does not check their validity before writing them
> out as JSON numerics.
> I'm aware of the SortableFloatField, which does do index-time verification and conversion
> of the value, but the way the JSON writer is working seemed like either a bug that needs to
> be addressed or perhaps a gotcha that needs to be better documented?
> This issue originally came from my php client issue tracker: http://code.google.com/p/solr-php-client/issues/detail?id=13

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

