lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Solr Composite Unique key from existing fields in schema
Date Tue, 28 May 2013 19:31:23 GMT
The order in the ID should be purely dependent on the order of the field 
names in the processor configuration:

<str name="source">docid_s</str>
<str name="source">userid_s</str>

-- Jack Krupansky

-----Original Message----- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 2:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

Jack,

No sure if this is the correct behaviour.
I set up updateRequestorPorcess chain as mentioned below, but looks like the 
compositeId that is generated is based on input order.

For example:
If my input comes in as
<field name="docid">1</field>
<field name="userid">12345</field>

I get the following compositeId1-12345.

If I reverse the input

<field name="userid">12345</field>

<field name="docid">1</field>
I get the following compositeId 12345-1 .


In this case the compositeId is not unique and I am getting duplicates.

Thanks,

Rishi.



-----Original Message-----
From: Jack Krupansky <jack@basetechnology.com>
To: solr-user <solr-user@lucene.apache.org>
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:

<updateRequestProcessorChain name="composite-id">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">docid_s</str>
    <str name="source">userid_s</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">id</str>
    <str name="delimiter">--</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Add documents such as:

curl
"http://localhost:8983/solr/update?commit=true&update.chain=composite-id" \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
  "docid_s": "doc-1",
  "userid_s": "user-1",
  "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
  "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone
update processor, and pick your composite key field name as well. And set
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid,
docid).

I used the standard Solr example schema, so I used dynamic fields for the
two ids, but use your own field names.

-- Jack Krupansky

-----Original Message----- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

  <field name="docid"        type="string"   indexed="true"  stored="true"
multiValued="false" required="true"/>
  <field name="userid"      type="string"   indexed="true"  stored="true"
multiValued="false" required="true"/>
<uniqueKey>docid</uniqueKey>

Wanted to change this to a composite key something like
<uniqueKey>userid-docid</uniqueKey>.
I know I can auto generate compositekey at document insert time, using
custom code to generate a new field, but wanted to know if there was an
inbuilt SOLR mechanism of doing this. That would prevent us from creating
and storing an extra field.

Thanks,

Rishi.







Mime
View raw message