lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Dynamic data model design questions
Date Tue, 16 Apr 2013 16:07:22 GMT
"'obj1/child1/prop1_ss'"

Try to stick to names that follow Java naming conventions: a letter or 
underscore followed by letters, digits, and underscores. There are places in 
Solr that restrict which names are allowed because they support additional 
syntax.

In this case, replace your slashes with underscores.
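
As a rough illustration of that naming rule, here is a minimal Java sketch. The class and method names (FieldNames, sanitize) are hypothetical and not part of Solr or SolrJ; the sketch simply assumes that every character other than an ASCII letter, digit, or underscore should become an underscore.

```java
import java.util.regex.Pattern;

public final class FieldNames {
    // Matches anything other than an ASCII letter, digit, or underscore.
    private static final Pattern ILLEGAL = Pattern.compile("[^A-Za-z0-9_]");

    private FieldNames() {}

    /**
     * Hypothetical helper: rewrites a path-like name so that it consists of a
     * letter or underscore followed by letters, digits, and underscores.
     */
    public static String sanitize(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("field name must not be empty");
        }
        String cleaned = ILLEGAL.matcher(name).replaceAll("_");
        // A Java-style identifier cannot start with a digit, so prefix an underscore.
        if (Character.isDigit(cleaned.charAt(0))) {
            cleaned = "_" + cleaned;
        }
        return cleaned;
    }
}
```

With this sketch, FieldNames.sanitize("obj1/child1/prop1_ss") yields "obj1_child1_prop1_ss". Note that distinct source names such as "a/b" and "a_b" would map to the same field name, so a real data model would need to rule out such collisions.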

In general, Solr is much more friendly towards static data models. Yes, you 
can use dynamic fields, but use them in moderation. The more heavily you 
lean on them, the more likely that you will eventually become unhappy with 
Solr.

How many fields are we talking about here?

The trick with Solr is not to brute-force flatten your data model (as you 
appear to be doing), but to REDESIGN your data model so that it is more 
amenable to a flat data model, and takes advantage of Solr's features. You 
can use multiple collections for different types of data. And you can 
simulate joins across tables by doing a sequence of queries (although it 
would be nice to have a SolrJ client-side method to do that in one API 
call.)

-- Jack Krupansky

-----Original Message----- 
From: Marko Asplund
Sent: Tuesday, April 16, 2013 11:17 AM
To: solr-user
Subject: Re: Dynamic data model design questions

Shawn Heisey wrote:

> Solr does have some *very* limited capability for doing joins between
> indexes, but generally speaking, you need to flatten the data.

thanks!

So, using a dynamic schema I'd flatten the following JSON object graph

{
  'id':'xyz123',
  'obj1': {
    'child1': {
      'prop1': ['val1', 'val2', 'val3'],
      'prop2': 123
    },
    'prop3': 'val4'
  },
  'obj2': {
    'child2': {
      'prop3': true
    }
  }
}

to a Solr document something like this?

{
'id':'xyz123',
'obj1/child1/prop1_ss': ['val1', 'val2', 'val3'],
'obj1/child1/prop2_i': 123,
'obj1/prop3_s': 'val4',
'obj2/child2/prop3_b': true
}

I'm using Java, so I'd probably push docs for indexing to Solr and do the
searches using SolrJ, right?


> Solr's ability to change your data after receiving it is fairly limited.
> The schema has some ability in this regard for indexed values, but the
> stored data is 100% verbatim as Solr receives it. If you will be using the
> dataimport handler, it does have some transform capability before sending
> to Solr. Most of the time, the rule of thumb is that changing the data on
> the Solr side will require contrib/custom plugins, so it may be easier to
> do it before Solr receives it.

So the data import handler is a Solr server-side feature, not a client-side
one? Does Solr or SolrJ have any support for doing transformations on the
client side?
Doing the above transformation should be fairly straightforward, so it
could also be done by code on the client side.
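
A client-side version of that flattening step might look roughly like the following Java sketch. It is only an illustration under stated assumptions: DocFlattener is a hypothetical class, not something SolrJ provides; the input is assumed to be already parsed into nested java.util.Map objects; the type suffixes (_ss, _i, _s, _b) follow the example document above; and path segments are joined with underscores rather than slashes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class DocFlattener {
    private DocFlattener() {}

    /** Flattens a nested map into a single-level map of suffixed field names. */
    public static Map<String, Object> flatten(Map<String, Object> source) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flattenInto("", source, flat);
        return flat;
    }

    private static void flattenInto(String prefix, Map<String, Object> node,
                                    Map<String, Object> out) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String key = entry.getKey();
            // Assumption: underscore as the path separator.
            String path = prefix.isEmpty() ? key : prefix + "_" + key;
            Object value = entry.getValue();
            if (value instanceof Map) {
                // Nested object: recurse, carrying the path so far as the prefix.
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                flattenInto(path, child, out);
            } else if (prefix.isEmpty() && key.equals("id")) {
                // The top-level id keeps its name without a type suffix.
                out.put(path, value);
            } else {
                out.put(path + suffixFor(value), value);
            }
        }
    }

    // Assumption: lists hold strings; anything unrecognized is indexed as a string.
    private static String suffixFor(Object value) {
        if (value instanceof List) return "_ss";
        if (value instanceof Integer) return "_i";
        if (value instanceof Boolean) return "_b";
        return "_s";
    }
}
```

Applied to the object graph above, this would produce the keys id, obj1_child1_prop1_ss, obj1_child1_prop2_i, obj1_prop3_s, and obj2_child2_prop3_b. Each entry of the resulting map could then be copied into a SolrJ SolrInputDocument with addField(name, value) before sending it for indexing.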

marko 

