lucene-solr-dev mailing list archives

From Marcus Herou <marcus.he...@tailsweep.com>
Subject Re: Short is buggish ?
Date Mon, 08 Feb 2010 08:46:37 GMT
Hi. Thanks for the quick response.

We have looked through the shards for a value that would not parse as a
base-10 short and would therefore throw this exception. We did not find any;
the field only holds values between 0 and 100. Wouldn't Solr complain if we
tried to index a "non-short", for example a float or an integer-sized number?
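
To make the question concrete, here is a minimal Java sketch of the fallback Grant quoted (see the quoted mail further down). The class ShortFallbackSketch and its render method are hypothetical stand-ins, not Solr code, and the sample strings are assumptions apart from the class-name-prefixed one, which is copied from the sharded response. It suggests that plain values between 0 and 100 would not hit the NumberFormatException branch, while a float, an out-of-range number, or an already prefixed string would.

```java
// Hypothetical stand-in for the quoted try/catch; this is not actual Solr code.
final class ShortFallbackSketch {

    // Emit a <short> element when the stored string parses as a short,
    // otherwise fall back to a <str> element so the value is not lost.
    // No XML escaping is done here; this is only an illustration.
    static String render(String name, String stored) {
        try {
            short val = Short.parseShort(stored);
            return "<short name=\"" + name + "\">" + val + "</short>";
        } catch (NumberFormatException e) {
            return "<str name=\"" + name + "\">" + stored + "</str>";
        }
    }

    public static void main(String[] args) {
        // "0" and "100" are in range; the others cannot be parsed as a short.
        String[] samples = {"0", "100", "40.5", "70000", "java.lang.Short:0"};
        for (String s : samples) {
            System.out.println(render("sentimentScore", s));
        }
    }
}
```

If this matches what the real code does, the string form would have to exist before the value reaches the parse, which fits with it only showing up when the shards parameter is used.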

Something else came to mind: would this really explain why sorting works
without the "shards" parameter? We only get the string version back when
we use sharding.

Two examples show what I mean. Note that feedType and sentimentScore, which
are the only short fields, both get skewed when the shards param is used.

-----------------------------------------------------------------------------------------------------------------------
1. A specific blog entry by id without sharding.
GET "http://192.168.10.11:8110/solr/blogosphere-sv-2010Q1/select?q=feedItemId:137768916&rows=1&indent=on"

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">1</int>
 <lst name="params">
  <str name="sort">sentimentScore asc</str>
  <str name="indent">on</str>
  <str name="q">feedItemId:137768916</str>
  <str name="rows">1</str>
 </lst>
</lst>
<result name="response" numFound="1" start="0">
 <doc>
  <str name="author"/>
  <str name="description">hej bloggen!vissa av er klagar på att jag tar bort
alla kommentarer, oh jag har ju sagt det flera gånger, ATT JAG TAR BORT ALLA
ANONYMA KOMMENTARER!thats it!vafan ge er nån jävla gång, så jävla trött på
att ni sitter bakom eran data o klagar på alla?visa ...</str>
  <int name="feedId">2958282</int>
  <long name="feedItemId">137768916</long>
  <short name="feedType">1</short>
  <str name="hashedLink">-92603838263017753</str>
  <str name="link">
http://emmathorslund.blogg.se/2010/january/ge-er-vafan.html</str>
  <date name="publishedDate">2010-01-03T13:32:54Z</date>
  <date name="publishedDateDay">2010-01-03T12:00:00Z</date>
  <date name="publishedDateMonth">2009-12-01T12:00:00Z</date>
  <date name="publishedDateWeek">2009-12-28T12:00:00Z</date>
  <short name="sentimentScore">0</short>
  <str name="title">ge er vafan!</str>
  <date name="tstamp">2010-01-29T16:55:39.52Z</date>
  <date name="tstampDay">2010-01-29T12:00:00Z</date>
 </doc>
</result>
</response>
-----------------------------------------------------------------------------------------------------------------------
2. A specific blog entry by id with sharding
GET "http://192.168.10.11:8110/solr/blogosphere-sv-2010Q1/select?q=feedItemId:137768916&rows=1&shards=192.168.10.11:8110/solr/blogosphere-sv-2010Q1&indent=on"

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">11</int>
 <lst name="params">
  <str name="shards">192.168.10.11:8110/solr/blogosphere-sv-2010Q1</str>
  <str name="indent">on</str>
  <str name="q">feedItemId:137768916</str>
  <str name="rows">1</str>
 </lst>
</lst>
<result name="response" numFound="1" start="0">
 <doc>
  <long name="feedItemId">137768916</long>
  <int name="feedId">2958282</int>
  <str name="feedType">java.lang.Short:1</str>
  <date name="publishedDate">2010-01-03T13:32:54Z</date>
  <date name="publishedDateDay">2010-01-03T12:00:00Z</date>
  <date name="publishedDateWeek">2009-12-28T12:00:00Z</date>
  <date name="publishedDateMonth">2009-12-01T12:00:00Z</date>
  <date name="tstamp">2010-01-29T16:55:39.52Z</date>
  <date name="tstampDay">2010-01-29T12:00:00Z</date>
  <str name="author"/>
  <str name="description">hej bloggen!vissa av er klagar på att jag tar bort
alla kommentarer, oh jag har ju sagt det flera gånger, ATT JAG TAR BORT ALLA
ANONYMA KOMMENTARER!thats it!vafan ge er nån jävla gång, så jävla trött på
att ni sitter bakom eran data o klagar på alla?visa ...</str>
  <str name="title">ge er vafan!</str>
  <str name="link">
http://emmathorslund.blogg.se/2010/january/ge-er-vafan.html</str>
  <str name="hashedLink">-92603838263017753</str>
  <str name="sentimentScore">java.lang.Short:0</str>
 </doc>
</result>
</response>

Hope this makes sense to you; it does not to us :)

Oh and changing the field to "integer" solves the issue.
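
For anyone who cannot change the field type right away, a client-side stopgap might look like the sketch below. It assumes the only damage is the "java.lang.Short:" prefix seen in the sharded response above; the class ShortPrefixWorkaround and the method parseShortField are hypothetical names, not part of Solr or SolrJ. It does not fix sorting on the server, it only recovers the number on the client.

```java
import java.util.OptionalInt;

// Hypothetical client-side helper; assumes values arrive either as "40"
// or as "java.lang.Short:40", as in the sharded response shown above.
final class ShortPrefixWorkaround {

    private static final String PREFIX = "java.lang.Short:";

    // Strip the class-name prefix if present, then parse the rest as a short.
    // Returns an empty OptionalInt when the remainder is not a valid short.
    static OptionalInt parseShortField(String raw) {
        String digits = raw.startsWith(PREFIX) ? raw.substring(PREFIX.length()) : raw;
        try {
            return OptionalInt.of(Short.parseShort(digits.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseShortField("java.lang.Short:40"));
        System.out.println(parseShortField("1"));
        System.out.println(parseShortField("not-a-number"));
    }
}
```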

Cheers

//Marcus

On Fri, Feb 5, 2010 at 9:19 PM, Grant Ingersoll <gsingers@apache.org> wrote:

> In looking at the code, I see:
> <code>
> try {
>      short val = Short.parseShort(s);
>      writer.writeShort(name, val);
>    } catch (NumberFormatException e){
>      // can't parse - write out the contents as a string so nothing is lost and
>      // clients don't get a parse error.
>      writer.writeStr(name, s, true);
>    }
> </code>
>
> And it makes me wonder if you are hitting the NFE.  Can you recreate this
> in a self-contained test?
>
> -Grant
>
> On Feb 5, 2010, at 4:10 AM, Marcus Herou wrote:
>
> > Hi.
> >
> > When using the field type solr.ShortField in combination with sharding we
> > get results like this back:
> > <str name="sentimentScore">java.lang.Short:40</str>
> >
> > Making it impossible to sort on that value.
> > Changing the field to IntegerField solves it.
> >
> > Example search:
> >
> > GET "http://127.0.0.1:8110/solr/blogosphere-sv-2010Q1/select?q=(title:hej+OR+description:hej)+AND+(publishedDate:[2010-01-01T00:00:00.000Z+TO+2010-02-04T23:59:59.099Z])&rows=5&shards=127.0.0.1:8110/solr/blogosphere-sv-2010Q1,127.0.0.1:8110/solr/blogosphere-sv-2010Q1&indent=on&sort=sentimentScore+asc"
> >
> > Cheers
> >
> > //Marcus Herou
> >
> >
> > --
> > Marcus Herou CTO and co-founder Tailsweep AB
> > +46702561312
> > marcus.herou@tailsweep.com
> > http://www.tailsweep.com/
>
>


-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
