lucene-solr-user mailing list archives

From david.dav...@correo.aeat.es
Subject Re: Problem with faceting
Date Fri, 06 Feb 2015 12:27:44 GMT
Hi Alvaro,

this is the definition:

<fieldType name="entidades" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="#"/>
  </analyzer>
</fieldType>
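
For anyone reading along, here is a rough Python sketch of what this analyzer should do to an ID_bent value at index time. It assumes the tokenizer behaves like a plain split on "#" that discards empty pieces; split_ids is a hypothetical helper for illustration, not part of Solr.

```python
import re


def split_ids(value, pattern="#"):
    """Approximate a split-style pattern tokenizer.

    Assumption: the field value is cut at every match of `pattern`
    and empty pieces (from the leading/trailing '#') are dropped.
    """
    return [token for token in re.split(pattern, value) if token]


# First ID_bent value from the response quoted further down in this thread.
doc = "#77762702P#77762953Y#77768200D#77763320M#77760725D#"
print(split_ids(doc))
```

Under that assumption each ID becomes exactly one term, so one document can contribute at most 1 to the facet count of any given ID.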


As you can see, we store all the IDs separated by a #. Normally this has 
worked fine, and I think the problem has nothing to do with the 
definition. 

Besides, I have seen that when the correct value in the facet field should 
be 2, Solr shows 4, and when it should be 1, it shows 2. In conclusion, for 
some reason the counts are being doubled. Why? I have no idea. And this 
doesn't always happen; what's more, it happens only with some queries or 
some documents. It's very weird. Maybe SolrCloud is merging the results 
from the two shards incorrectly in some situations, but I have no idea.
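
To make the doubling concrete, here is a small Python sketch that recomputes the facet counts I would expect from the two documents returned by the query quoted below and compares them with some of the counts Solr reported. It assumes faceting counts each ID once per matching document; expected_facets is a hypothetical helper, and the reported values are copied from the response.

```python
from collections import Counter

# The two ID_bent values returned by the query (numFound = 2).
doc1 = "#77762702P#77762953Y#77768200D#77763320M#77760725D#"
doc2 = (
    "#77760631F#77766156N#77760725D#77762702P#77765788N#48991207P"
    "#77762953Y#77760302T#12312312K#89890001K#77768200D#89890003T"
    "#11111111H#77763453T#99999999R#00020080J#Y4332393N#04889446Z"
    "#12345655Z#77763320M#11100336Z#Y4222970X#"
)


def expected_facets(values, sep="#"):
    """Count, for each ID, how many documents contain it."""
    counts = Counter()
    for value in values:
        # set() so that an ID is counted once per document
        counts.update({token for token in value.split(sep) if token})
    return counts


expected = expected_facets([doc1, doc2])

# A sample of the facet counts Solr actually returned.
reported = {
    "77760725D": 4, "77762702P": 4, "77762953Y": 4,
    "77763320M": 4, "77768200D": 4,
    "77760631F": 2, "77765788N": 2, "12312312K": 2,
}

for term, seen in reported.items():
    print(term, "expected", expected[term], "reported", seen)
```

Every sampled ID comes back with exactly twice the count computed here. One way to check the shard-merging suspicion would be to run the same query against each core separately with distrib=false and compare the per-core counts with the merged ones.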

Regards,
 

David Dávila Atienza
AEAT - Departamento de Informática Tributaria
Subdirección de Tecnologías de Análisis de la Información e Investigación 
del Fraude
Phone: 915828763
Extension: 36763



From:    Alvaro Cabrerizo <toporniz@gmail.com>
To:      "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
Date:    06/02/2015 12:34
Subject: Re: Problem with faceting



Hi David,

Yes, it sounds weird.

Just for testing purposes, it would be nice to have the ID_bent field type
definition.

Regards.

On Fri, Feb 6, 2015 at 9:05 AM, <david.davila@correo.aeat.es> wrote:

> Hello,
>
> we have been using faceting for a long time, but now I have discovered a
> problem that I can't understand:
>
> the issue is that, in a query with 2 results, Solr reports 4 results for
> some facet values. But faceting only applies to the result documents,
> so I think this makes no sense.
>
> This is the query and its response:
>
>
>   "responseHeader": {
>     "status": 0,
>     "QTime": 330,
>     "params": {
>       "facet": "true",
>       "fl": "ID_bent",
>       "indent": "true",
>       "q": "aitana",
>       "_": "1423207958751",
>       "facet.field": "ID_bent",
>       "wt": "json",
>       "fq": "ee_Procedimiento:ZZ12 AND ee_Referencia:\"CURSO\" AND doc_FormatoDocumento:PDF"
>     }
>   },
>   "response": {
>     "numFound": 2,
>     "start": 0,
>     "maxScore": 0.17735688,
>     "docs": [
>       {
>         "ID_bent": "#77762702P#77762953Y#77768200D#77763320M#77760725D#"
>       },
>       {
>         "ID_bent": "#77760631F#77766156N#77760725D#77762702P#77765788N#48991207P#77762953Y#77760302T#12312312K#89890001K#77768200D#89890003T#11111111H#77763453T#99999999R#00020080J#Y4332393N#04889446Z#12345655Z#77763320M#11100336Z#Y4222970X#"
>       }
>     ]
>   },
>   "facet_counts": {
>     "facet_queries": {},
>     "facet_fields": {
>       "ID_bent": [
>         "77760725D",
>         4,
>         "77762702P",
>         4,
>         "77762953Y",
>         4,
>         "77763320M",
>         4,
>         "77768200D",
>         4,
>         "00000336Z",
>         2,
>         "00020000J",
>         2,
>         "04889446Z",
>         2,
>         "11111111H",
>         2,
>         "12312312K",
>         2,
>         "12345655Z",
>         2,
>         "48261207P",
>         2,
>         "77760302T",
>         2,
>         "77760631F",
>         2,
>         "77763453T",
>         2,
>         "77765788N",
>         2,
>
>
> We are using Solr 4.7 in cloud configuration with 2 shards. Any idea
> what is happening?
>
> Thanks in advance,
>
> David Dávila Atienza
> AEAT - Departamento de Informática Tributaria
> Subdirección de Tecnologías de Análisis de la Información e Investigación
> del Fraude
>

