lucene-solr-user mailing list archives

From Shawn Heisey <s...@elyograg.org>
Subject Re: Dynamic data model design questions
Date Tue, 16 Apr 2013 16:23:29 GMT
On 4/16/2013 9:17 AM, Marko Asplund wrote:
> Shawn Heisey wrote:
> So, using a dynamic schema I'd flatten the following JSON object graph
>
> {
>    'id':'xyz123',
>    'obj1': {
>      'child1': {
>        'prop1': ['val1', 'val2', 'val3'],
>        'prop2': 123
>      },
>      'prop3': 'val4'
>    },
>    'obj2': {
>      'child2': {
>        'prop3': true
>      }
>    }
> }
>
> to a Solr document something like this?
>
> {
> 'id':'xyz123',
> 'obj1/child1/prop1_ss': ['val1', 'val2', 'val3'],
> 'obj1/child1/prop2_i': 123,
> 'obj1/prop3_s': 'val4',
> 'obj2/child2/prop3_b': true
> }

How you flatten the data is up to you. You have to examine the data and 
how you want to use it in order to keep the number of fields to a 
manageable level but retain the flexibility you need.  Side note: I 
would not use anything in a field name other than ASCII alphanumeric and 
underscore characters.  Using special characters (like a slash) has been 
known to cause problems with some Solr features.  Because Solr uses 
HTTP, there are also potential URL escaping issues.
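
As a minimal sketch of the naming advice above: the helper below replaces
every character that is not an ASCII letter, digit, or underscore with an
underscore. The class and method names (FieldNames, sanitize) are
hypothetical and are not part of Solr or SolrJ.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Solr or SolrJ class.
final class FieldNames {
    // Matches any character outside ASCII letters, digits, and underscore.
    private static final Pattern UNSAFE = Pattern.compile("[^A-Za-z0-9_]");

    private FieldNames() {}

    // Replaces each unsafe character with one underscore.
    static String sanitize(String rawName) {
        return UNSAFE.matcher(rawName).replaceAll("_");
    }

    public static void main(String[] args) {
        // Field name taken from the quoted example document.
        System.out.println(sanitize("obj1/child1/prop1_ss"));
    }
}
```

Note that this mapping is not reversible, and two different source paths
(for example "a/b" and "a_b") would end up with the same field name, so
check your data for collisions before relying on it.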

Within a single index, Solr uses a flat model, like a single database 
table with no relational capability.  With two indexes, there is the 
limited join feature, but I am not familiar with how it works.

> I'm using Java, so I'd probably push docs for indexing to Solr and do the
> searches using SolrJ, right?

That would be the most sensible approach.  The SolrJ API is much more 
advanced than the APIs for other languages.  This is because it is 
actually part of the Solr codebase and used by Solr internally.

> The data import handler is a Solr server side feature and not a client side?
> Does Solr or SolrJ have any support for doing transformations on the client
> side?
> Doing the above transformation should be fairly straight forward, so it
> could be also done by code on the client side.

With SolrJ, you can do anything, because you write the code.  You can do 
whatever you like to the data, then send it to Solr.
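
A rough sketch of what that client-side transformation could look like,
using only plain Java collections. It flattens nested maps into one flat
map, joins path segments with underscores instead of slashes, and appends
the _ss, _i, _s and _b suffixes from the quoted example. The class and
method names (DocFlattener, flatten, suffixFor, sample) are hypothetical,
and the suffixes only work if the schema defines matching dynamic fields.
Sending the result to Solr with SolrJ is left out here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not a Solr or SolrJ class.
final class DocFlattener {
    private DocFlattener() {}

    // Flattens a nested map into a single-level map of field name to value.
    static Map<String, Object> flatten(Map<String, Object> source) {
        Map<String, Object> flat = new LinkedHashMap<>();
        walk("", source, flat);
        return flat;
    }

    private static void walk(String prefix, Map<String, Object> node,
                             Map<String, Object> out) {
        for (Map.Entry<String, Object> e : node.entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "_" + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                // Nested object: recurse with the extended path.
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                walk(path, child, out);
            } else if (prefix.isEmpty() && "id".equals(e.getKey())) {
                // The top-level id keeps its name, as in the quoted example.
                out.put("id", value);
            } else {
                out.put(path + suffixFor(value), value);
            }
        }
    }

    // Assumed suffix convention: list of strings, int, boolean, else string.
    static String suffixFor(Object value) {
        if (value instanceof List) return "_ss";
        if (value instanceof Integer) return "_i";
        if (value instanceof Boolean) return "_b";
        return "_s";
    }

    // Builds the object graph from the quoted example.
    static Map<String, Object> sample() {
        Map<String, Object> child1 = new LinkedHashMap<>();
        child1.put("prop1", Arrays.asList("val1", "val2", "val3"));
        child1.put("prop2", 123);
        Map<String, Object> obj1 = new LinkedHashMap<>();
        obj1.put("child1", child1);
        obj1.put("prop3", "val4");
        Map<String, Object> child2 = new LinkedHashMap<>();
        child2.put("prop3", true);
        Map<String, Object> obj2 = new LinkedHashMap<>();
        obj2.put("child2", child2);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "xyz123");
        doc.put("obj1", obj1);
        doc.put("obj2", obj2);
        return doc;
    }

    public static void main(String[] args) {
        flatten(sample()).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Each entry of the flattened map could then be copied into whatever
document object your client code sends to Solr.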

The dataimport handler is indeed a server-side feature.  It is a contrib
module included in the Solr distribution; you have to add a jar to Solr
to activate it.

Thanks,
Shawn

