lucene-solr-user mailing list archives

From solr-user <>
Subject question about schemas
Date Tue, 01 Dec 2009 23:27:10 GMT

I just started using Solr, and I am trying to figure out how to set up my
schema. I know that Solr doesn’t have JOINs, so I am having some
difficulty figuring out how I would set up a schema for the following
fictional situation.  For example, let us say that:

-	I have 10,000+ customers, each having some specific info (StoreId, Name,
Phone, Address, City, State, Zip, etc.)
-	Each customer has a subset of the 100+ products I am looking to track,
each product having some specific info (ProductId, Name, Width, Height,
Depth, Weight, Density, etc.)
-	I want to be able to search by the product info but have facets return the
number of customers, rather than the number of products, that meet my
search criteria
-	I want to display (and sort) customers based on my product search

In a relational database, I would simply create two tables (customer and
product) and JOIN them.  I could then craft a SQL query to count the number
of distinct StoreId values in the result (something like facets).
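
To make the relational version concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the sample rows, and the function name count_customers_by_state are all made up for illustration; the only point is that COUNT(DISTINCT StoreId) counts customers rather than matching product rows.

```python
import sqlite3

# Hypothetical schema and data: names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (StoreId INTEGER PRIMARY KEY, Name TEXT, State TEXT);
    CREATE TABLE product (ProductId INTEGER, StoreId INTEGER, Name TEXT,
                          Width INTEGER, Density INTEGER);
    INSERT INTO customer VALUES
        (1, 'Store A', 'NY'), (2, 'Store B', 'CA'), (3, 'Store C', 'NY');
    INSERT INTO product VALUES
        (10, 1, 'Widget', 50, 7),
        (11, 1, 'Gadget', 50, 7),
        (10, 2, 'Widget', 50, 7),
        (12, 3, 'Gizmo', 20, 3);
""")


def count_customers_by_state(conn, width, density):
    """Facet-like counts: distinct customers per State with a matching product."""
    rows = conn.execute(
        """
        SELECT c.State, COUNT(DISTINCT c.StoreId)
        FROM customer c
        JOIN product p ON p.StoreId = c.StoreId
        WHERE p.Width = ? AND p.Density = ?
        GROUP BY c.State
        ORDER BY c.State
        """,
        (width, density),
    ).fetchall()
    return dict(rows)


facets = count_customers_by_state(conn, 50, 7)
print(facets)
```

Store A has two matching products in this sample data but is still counted once, which is the behavior I would like facets to have.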

In Solr, however, there are no joins.  As far as I can tell, my options are

-	create two Solr instances, one with customer info and one with product
info; I would search the product Solr instance and identify the StoreId
values returned, and then use that info to search the customer Solr instance
to get the customer info.  The problem with this is the second query could
have ten thousand ORs (one for each StoreId returned by the first query)
-	create a single Solr instance that contains a denormalized version of the
data where each doc would contain both the customer info and the product
info for a given product.  The problem with this is that my facets would
return the number of products, not the number of customers
-	create a single Solr instance that contains a denormalized version of the
data where each doc contains the customer info and info for ALL products
that the customer might have (likely done via dynamic fields). The problem
with this is that my schema would be a bit messy and that my queries could
have hundreds of ANDs and ORs (one AND for each product field, and one OR
for each product); for example, q=((Width1:50 AND Density1:7) OR (Width2:50
AND Density2:7) OR …)
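
To show how large the queries in the first and third options get, here is a rough sketch that only builds the query strings; it does not talk to Solr. The helper names store_id_query and per_slot_query are hypothetical, and the numbered field names (Width1, Density1, ...) are the made-up ones from the example above.

```python
def store_id_query(store_ids):
    """Option 1, second step: match any StoreId returned by the product search."""
    return "StoreId:(" + " OR ".join(str(s) for s in store_ids) + ")"


def per_slot_query(criteria, slots):
    """Option 3: one AND group per numbered product slot, OR-ed together.

    `criteria` maps a base field name (e.g. "Width") to the wanted value;
    `slots` is how many numbered product slots each customer doc carries.
    """
    groups = []
    for i in range(1, slots + 1):
        clauses = " AND ".join(
            f"{field}{i}:{value}" for field, value in criteria.items()
        )
        groups.append(f"({clauses})")
    return " OR ".join(groups)


# Small inputs so the generated strings stay readable.
q_customers = store_id_query([3, 7, 42])
q_products = per_slot_query({"Width": 50, "Density": 7}, slots=2)
print(q_customers)
print(q_products)

# With 100 product slots the same search needs 100 OR-ed groups.
big = per_slot_query({"Width": 50, "Density": 7}, slots=100)
print(big.count(" OR ") + 1)
```

With three StoreIds the first query is StoreId:(3 OR 7 OR 42); with thousands of matching customers it grows by one clause per StoreId, which is the scaling problem I am worried about.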

Does anyone have any advice on this?  Are there other schemas that might
work?  Hopefully the example makes sense.
