lucene-solr-dev mailing list archives

From Marcus Herou <marcus.he...@tailsweep.com>
Subject Re: Short is buggish ?
Date Mon, 08 Feb 2010 08:48:03 GMT
More info from the registry.jsp page

Solr Specification Version: 1.4.0
Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40
Lucene Specification Version: 2.9.1
Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25
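
To make the suspected fallback concrete, here is a minimal, self-contained sketch of the try/catch Grant quoted further down. The class name `ShortFallbackSketch` and the `render` helper are hypothetical stand-ins for Solr's response writer, not its actual API. It also assumes that the string reaching that code in the sharded case already carries the `java.lang.Short:` prefix; where that prefix comes from is not established in this thread.

```java
public class ShortFallbackSketch {

    // Hypothetical helper that mirrors the quoted try/catch.
    // It returns the XML element the writer would emit for a stored value.
    static String render(String name, String s) {
        try {
            short val = Short.parseShort(s);
            return "<short name=\"" + name + "\">" + val + "</short>";
        } catch (NumberFormatException e) {
            // Not parseable as a short: fall back to a plain string element.
            return "<str name=\"" + name + "\">" + s + "</str>";
        }
    }

    public static void main(String[] args) {
        // A clean short value is written as <short>.
        System.out.println(render("feedType", "1"));
        // Assumed sharded input: the prefix makes parseShort throw.
        System.out.println(render("feedType", "java.lang.Short:1"));
        // A value outside the short range (-32768..32767) also falls back.
        System.out.println(render("sentimentScore", "40000"));
    }
}
```

If this reading is right, the values themselves are fine and only the prefixed string fails to parse, which would match the fact that we found no out-of-range values in the shards.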



/M


On Mon, Feb 8, 2010 at 9:46 AM, Marcus Herou <marcus.herou@tailsweep.com> wrote:

> Hi. Thanks for the quick response.
>
> We have looked through the shards trying to find a value that cannot be
> parsed as a base-10 short, which would throw this exception. We did not find
> any. We have values between 0 and 100 in that field. Would Solr not complain
> if we tried to index a "non-short", for example a float or an integer-sized
> number?
>
> I just thought of something: would this really explain why sorting works
> without the "shards" parameter? It only spits out the string version when
> we use sharding.
>
> I will show with a few examples what I mean. Note that feedType and
> sentimentScore, which are the only short fields, both get skewed when
> using the shards param.
>
>
> -----------------------------------------------------------------------------------------------------------------------
> 1. A specific blog entry by id without sharding.
> GET "http://192.168.10.11:8110/solr/blogosphere-sv-2010Q1/select?q=feedItemId:137768916&rows=1&indent=on"
>
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>
> <lst name="responseHeader">
>  <int name="status">0</int>
>  <int name="QTime">1</int>
>  <lst name="params">
>   <str name="sort">sentimentScore asc</str>
>   <str name="indent">on</str>
>   <str name="q">feedItemId:137768916</str>
>   <str name="rows">1</str>
>  </lst>
> </lst>
> <result name="response" numFound="1" start="0">
>  <doc>
>   <str name="author"/>
>   <str name="description">hej bloggen!vissa av er klagar på att jag tar
> bort alla kommentarer, oh jag har ju sagt det flera gånger, ATT JAG TAR BORT
> ALLA ANONYMA KOMMENTARER!thats it!vafan ge er nån jävla gång, så jävla trött
> på att ni sitter bakom eran data o klagar på alla?visa ...</str>
>   <int name="feedId">2958282</int>
>   <long name="feedItemId">137768916</long>
>   <short name="feedType">1</short>
>   <str name="hashedLink">-92603838263017753</str>
>   <str name="link">http://emmathorslund.blogg.se/2010/january/ge-er-vafan.html</str>
>   <date name="publishedDate">2010-01-03T13:32:54Z</date>
>   <date name="publishedDateDay">2010-01-03T12:00:00Z</date>
>   <date name="publishedDateMonth">2009-12-01T12:00:00Z</date>
>   <date name="publishedDateWeek">2009-12-28T12:00:00Z</date>
>   <short name="sentimentScore">0</short>
>   <str name="title">ge er vafan!</str>
>   <date name="tstamp">2010-01-29T16:55:39.52Z</date>
>   <date name="tstampDay">2010-01-29T12:00:00Z</date>
>  </doc>
> </result>
> </response>
>
> -----------------------------------------------------------------------------------------------------------------------
> 2. A specific blog entry by id with sharding
> GET "http://192.168.10.11:8110/solr/blogosphere-sv-2010Q1/select?q=feedItemId:137768916&rows=1&shards=192.168.10.11:8110/solr/blogosphere-sv-2010Q1&indent=on"
>
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>
> <lst name="responseHeader">
>  <int name="status">0</int>
>  <int name="QTime">11</int>
>  <lst name="params">
>   <str name="shards">192.168.10.11:8110/solr/blogosphere-sv-2010Q1</str>
>   <str name="indent">on</str>
>   <str name="q">feedItemId:137768916</str>
>   <str name="rows">1</str>
>  </lst>
> </lst>
> <result name="response" numFound="1" start="0">
>  <doc>
>   <long name="feedItemId">137768916</long>
>   <int name="feedId">2958282</int>
>   <str name="feedType">java.lang.Short:1</str>
>   <date name="publishedDate">2010-01-03T13:32:54Z</date>
>   <date name="publishedDateDay">2010-01-03T12:00:00Z</date>
>   <date name="publishedDateWeek">2009-12-28T12:00:00Z</date>
>   <date name="publishedDateMonth">2009-12-01T12:00:00Z</date>
>   <date name="tstamp">2010-01-29T16:55:39.52Z</date>
>   <date name="tstampDay">2010-01-29T12:00:00Z</date>
>   <str name="author"/>
>   <str name="description">hej bloggen!vissa av er klagar på att jag tar
> bort alla kommentarer, oh jag har ju sagt det flera gånger, ATT JAG TAR BORT
> ALLA ANONYMA KOMMENTARER!thats it!vafan ge er nån jävla gång, så jävla trött
> på att ni sitter bakom eran data o klagar på alla?visa ...</str>
>   <str name="title">ge er vafan!</str>
>   <str name="link">http://emmathorslund.blogg.se/2010/january/ge-er-vafan.html</str>
>   <str name="hashedLink">-92603838263017753</str>
>   <str name="sentimentScore">java.lang.Short:0</str>
>  </doc>
> </result>
> </response>
>
> Hope it makes sense to you; it does not to us :)
>
> Oh and changing the field to "integer" solves the issue.
>
> Cheers
>
> //Marcus
>
>
>
> On Fri, Feb 5, 2010 at 9:19 PM, Grant Ingersoll <gsingers@apache.org> wrote:
>
>> In looking at the code, I see:
>> <code>
>> try {
>>   short val = Short.parseShort(s);
>>   writer.writeShort(name, val);
>> } catch (NumberFormatException e) {
>>   // can't parse - write out the contents as a string so nothing is lost
>>   // and clients don't get a parse error.
>>   writer.writeStr(name, s, true);
>> }
>> </code>
>>
>> And it makes me wonder if you are hitting the NFE.  Can you recreate this
>> in a self-contained test?
>>
>> -Grant
>>
>> On Feb 5, 2010, at 4:10 AM, Marcus Herou wrote:
>>
>> > Hi.
>> >
>> > When using the field type solr.ShortField in combination with sharding, we
>> > get results like this back:
>> > <str name="sentimentScore">java.lang.Short:40</str>
>> >
>> > This makes it impossible to sort on that value.
>> > Changing the field to IntegerField solves it.
>> >
>> > Example search:
>> >
>> > GET "http://127.0.0.1:8110/solr/blogosphere-sv-2010Q1/select?q=(title:hej+OR+description:hej)+AND+(publishedDate:[2010-01-01T00:00:00.000Z+TO+2010-02-04T23:59:59.099Z])&rows=5&shards=127.0.0.1:8110/solr/blogosphere-sv-2010Q1,127.0.0.1:8110/solr/blogosphere-sv-2010Q1&indent=on&sort=sentimentScore+asc"
>> >
>> > Cheers
>> >
>> > //Marcus Herou
>> >
>> >
>> > --
>> > Marcus Herou CTO and co-founder Tailsweep AB
>> > +46702561312
>> > marcus.herou@tailsweep.com
>> > http://www.tailsweep.com/
>>
>>
>
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> marcus.herou@tailsweep.com
> http://www.tailsweep.com/
>
>


-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
