lucene-solr-user mailing list archives

From Briggs Thompson <w.briggs.thomp...@gmail.com>
Subject Re: DataImportHandler w/ multivalued fields
Date Thu, 01 Dec 2011 19:07:36 GMT
Hey Rahul,

Thanks for the response. I actually just figured it out, thankfully :). To
answer your question, the raw_tag is indexed and not stored (tokenized),
and then there is a copyField for raw_tag to "raw_tag_string" which would
be used for facets. That *should have* been displayed in the results.

The silly mistake I made was not camel casing "multiValued" in the schema
(I had written "multivalued"), which is clearly the source of the problem.
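
For anyone finding this thread later, here is a minimal sketch of the
corrected schema lines. It assumes nothing else in the schema changes; only
the casing of the attribute differs from what I originally posted:

```xml
<!-- Sketch of the corrected schema.xml lines: only the attribute casing changes. -->
<!-- XML attribute names are case-sensitive, so "multivalued" is not read as "multiValued". -->
<field name="raw_tag" type="text_en_lessAggressive" indexed="true"
       stored="false" multiValued="true"/>
<field name="raw_tag_string" type="string" indexed="false"
       stored="true" multiValued="true"/>
<copyField source="raw_tag" dest="raw_tag_string"/>
```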

The second email I sent, which changed the query and used splitBy for the
multivalued field, also had an error: the entity declaration was missing
the line
transformer="RegexTransformer"
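
Roughly, the entity from that second email would look like the sketch below
once the transformer is declared. The field list is trimmed to two columns
for brevity, and I am assuming the rest of the entity stays as posted. Note
that splitBy is treated as a regular expression, and since the query joins
tags with ', ', splitting on "," alone may leave a leading space on each
value after the first; a pattern such as ",\s*" is one possible adjustment,
not something tested here.

```xml
<!-- Sketch only: trimmed field list, same query as in the earlier email. -->
<entity name="site"
        transformer="RegexTransformer"
        query="select group_concat(freetags.raw_tag separator ', ') as raw_tag, site.*
               from site
               left outer join (freetags inner join freetagged_objects)
                 on (freetags.id = freetagged_objects.tag_id
                     and site.siteId = freetagged_objects.object_id)
               group by site.siteId">
    <field column="siteId" name="siteId" />
    <!-- splitBy only takes effect when RegexTransformer is declared on the entity. -->
    <field column="raw_tag" name="raw_tag" splitBy="," />
</entity>
```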

Anyhow, thanks for the quick response!

Briggs


On Thu, Dec 1, 2011 at 12:57 PM, Rahul Warawdekar <
rahul.warawdekar@gmail.com> wrote:

> Hi Briggs,
>
> By saying "multivalued fields are not getting indexed properly", do you mean
> that you are not able to search on those fields?
> Have you tried actually searching your Solr index for those multivalued
> terms to check whether it returns any results?
>
> One possibility could be that the multivalued fields are getting indexed
> correctly and are searchable.
> However, since your schema.xml has a "raw_tag" field whose "stored"
> attribute is set to false, you may not be able to see those fields.
>
>
>
> On Thu, Dec 1, 2011 at 1:43 PM, Briggs Thompson <
> w.briggs.thompson@gmail.com> wrote:
>
> > In addition, I tried a query like the one below and changed the column
> > definition to
> >            <field column="raw_tag" name="raw_tag" splitBy="," />
> > and still no luck. It is indexing the full content now, but not as
> > multivalued. It seems like the "splitBy" isn't working properly.
> >
> >    select group_concat(freetags.raw_tag separator ', ') as raw_tag, site.*
> > from site
> > left outer join
> >  (freetags inner join freetagged_objects)
> >     on (freetags.id = freetagged_objects.tag_id
> >       and site.siteId = freetagged_objects.object_id)
> > group  by site.siteId
> >
> > Am I doing something wrong?
> > Thanks,
> > Briggs Thompson
> >
> > On Thu, Dec 1, 2011 at 11:46 AM, Briggs Thompson <
> > w.briggs.thompson@gmail.com> wrote:
> >
> > > Hello Solr Community!
> > >
> > > I am implementing a data connection to Solr through the Data Import
> > > Handler and non-multivalued fields are working correctly, but multivalued
> > > fields are not getting indexed properly.
> > >
> > > I am new to DataImportHandler, but from what I could find, the entity is
> > > the way to go for multivalued fields. The weird thing is that data is being
> > > indexed for one row, meaning only the first raw_tag gets populated.
> > >
> > >
> > > Anyone have any ideas?
> > > Thanks,
> > > Briggs
> > >
> > > This is the relevant part of the schema:
> > >
> > >    <field name ="raw_tag" type="text_en_lessAggressive" indexed="true"
> > > stored="false" multivalued="true"/>
> > >    <field name ="raw_tag_string" type="string" indexed="false"
> > > stored="true" multivalued="true"/>
> > >    <copyField source="raw_tag" dest="raw_tag_string"/>
> > >
> > > And the relevant part of data-import.xml:
> > >
> > > <document name="merchant">
> > >         <entity name="site"
> > >                   query="select * from site ">
> > >             <field column="siteId" name="siteId" />
> > >             <field column="domain" name="domain" />
> > >             <field column="aliasFor" name="aliasFor" />
> > >             <field column="title" name="title" />
> > >             <field column="description" name="description" />
> > >             <field column="requests" name="requests" />
> > >             <field column="requiresModeration" name="requiresModeration" />
> > >             <field column="blocked" name="blocked" />
> > >             <field column="affiliateLink" name="affiliateLink" />
> > >             <field column="affiliateTracker" name="affiliateTracker" />
> > >             <field column="affiliateNetwork" name="affiliateNetwork" />
> > >             <field column="cjMerchantId" name="cjMerchantId" />
> > >             <field column="thumbNail" name="thumbNail" />
> > >             <field column="updateRankings" name="updateRankings" />
> > >             <field column="couponCount" name="couponCount" />
> > >             <field column="category" name="category" />
> > >             <field column="adult" name="adult" />
> > >             <field column="rank" name="rank" />
> > >             <field column="redirectsTo" name="redirectsTo" />
> > >             <field column="wwwRequired" name="wwwRequired" />
> > >             <field column="avgSavings" name="avgSavings" />
> > >             <field column="products" name="products" />
> > >             <field column="nameChecked" name="nameChecked" />
> > >             <field column="tempFlag" name="tempFlag" />
> > >             <field column="created" name="created" />
> > >             <field column="enableSplitTesting" name="enableSplitTesting" />
> > >             <field column="affiliateLinklock" name="affiliateLinklock" />
> > >             <field column="hasMobileSite" name="hasMobileSite" />
> > >             <field column="blockSite" name="blockSite" />
> > >             <entity name="merchant_tags" pk="siteId"
> > >                     query="select raw_tag, freetags.id,
> > >                                   freetagged_objects.object_id as siteId
> > >                            from freetags
> > >                            inner join freetagged_objects
> > >                              on freetags.id=freetagged_objects.tag_id
> > >                            where freetagged_objects.object_id='${site.siteId}'">
> > >                 <field column="raw_tag" name="raw_tag"/>
> > >             </entity>
> > >         </entity>
> > >     </document>
> > >
> >
>
>
>
> --
> Thanks and Regards
> Rahul A. Warawdekar
>
