lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: DataImportHandler - Multiple entities will step into each other
Date Wed, 05 Jan 2011 02:30:15 GMT
No, this is all #explicitly# defined in schema.xml. Solr has no required
fields.
If you're using the default schema.xml, the problem is that the file
defines the id field with the 'required="true"' option set, so any doc
that does not have an id field is rejected.

Id is used in the default schema in conjunction with
<uniqueKey>id</uniqueKey> to enforce document uniqueness.
Solr only uses <uniqueKey> to identify documents for replacement. That
is, if you add a document with a <uniqueKey> that is already in the index,
the old version of the document is removed and the new one added.
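
As a rough sketch of that replacement behavior (assuming the default
schema's id uniqueKey; the name field and the values are hypothetical), an
update like the following should leave only the second document in the
index, and the same should hold when the two documents arrive in separate
requests:

```xml
<add>
  <!-- first document with uniqueKey value item-1 -->
  <doc>
    <field name="id">item-1</field>
    <field name="name">first version</field>
  </doc>
  <!-- same uniqueKey value: this document replaces the one above -->
  <doc>
    <field name="id">item-1</field>
    <field name="name">second version</field>
  </doc>
</add>
```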

But <uniqueKey> is not required at all, and no field in Solr is required
unless required="true" is specified.
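
For illustration, the schema.xml entries in question look roughly like the
fragment below. This is a sketch, not a copy of any particular schema: the
type and indexed/stored attributes are assumptions, and table_id is a
hypothetical field named after the one mentioned in this thread.

```xml
<fields>
  <!-- required="true" is the option that makes Solr reject docs lacking id -->
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <!-- hypothetical extra field; the required option is not set here -->
  <field name="table_id" type="string" indexed="true" stored="true"/>
</fields>

<!-- names the field used to identify documents for replacement -->
<uniqueKey>id</uniqueKey>
```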

Best
Erick

On Tue, Jan 4, 2011 at 7:19 PM, yu shen <shenyu.sh@gmail.com> wrote:

> I tried the SQL approach, and it did not work as expected.
> According to my experiments, id is an implicit required field of Solr. If I
> change id to table_id and add the field definition in schema.xml, an error
> is reported during data import.
>
> Please correct me if I am wrong.
>
> 2011/1/5 Lance Norskog <goksron@gmail.com>
>
> > This SQL syntax should do it: "select id, field as table_id, field".
> >
> > On Tue, Jan 4, 2011 at 5:59 AM, yu shen <shenyu.sh@gmail.com> wrote:
> > > Thanks for the prompt reply. Let me try. Delete is not a big deal for
> > > the moment.
> > >
> > > 2011/1/4 Matti Oinas <matti.oinas@gmail.com>
> > >
> > >> I managed to do that by using TemplateTransformer
> > >>
> > >> <document>
> > >>  <entity name="company"..... transformer="TemplateTransformer">
> > >>     <field column="id" name="id" template="company-${company.id}" />
> > >> ...
> > >>  <entity name="item"..... transformer="TemplateTransformer">
> > >>     <field column="id" name="id" template="item-${item.id}" />
> > >> ...
> > >> </document>
> > >>
> > >> The only problem is that delta import fails to delete from the
> > >> index. It seems that TemplateTransformer is not applied when
> > >> performing deletes, so delete by id doesn't work.
> > >>
> > >>
> > >>
> > >> 2011/1/4 yu shen <shenyu.sh@gmail.com>:
> > >> > Hi All,
> > >> >
> > >> > I have a dataimporthandler config file as below. It contains
> > >> > multiple entities:
> > >> > <dataConfig>
> > >> >        <dataSource name="jdbc" driver="com.mysql.jdbc.Driver"
> > >> >                url="jdbc:mysql://localhost:1521/changan?useUnicode=true&amp;characterEncoding=utf8&amp;autoReconnect=true"...
> > >> >        />
> > >> >        <document>
> > >> >                <entity name="item" dataSource="jdbc" pk="id" query="...">
> > >> >                <entity name="company" dataSource="jdbc" pk="id" query="">
> > >> >                ....
> > >> >        </document>
> > >> > </dataConfig>
> > >> >
> > >> > All data come from a database. The problem is that item, company
> > >> > and the other entities all have the field 'id', with values
> > >> > running from 1 to n. In this case, item/company etc. will step
> > >> > into each other.
> > >> > Is there a way to prevent this from happening, such as assigning
> > >> > different entities to different partitions?
> > >> >
> > >> > One way I can think of is to separate the different entities into
> > >> > different instances, which is not an ideal solution IMO.
> > >> >
> > >> > Could someone point me to a reference, and also give some
> > >> > instructions?
> > >> >
> > >>
> > >
> >
> >
> >
> > --
> > Lance Norskog
> > goksron@gmail.com
> >
>
