lucene-dev mailing list archives

From Yonik Seeley <yo...@lucidimagination.com>
Subject Re: [jira] Commented: (SOLR-xxxx) Join
Date Fri, 25 Feb 2011 20:44:31 GMT
> You mentioned sorting checked every document regardless if the document contains a value for the field.

Sorting uses memory for every document, e.g. an int[maxDoc()] array for an integer sort field.
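
As a rough illustration of that cost, here is a small Python sketch (not Lucene code; the function name, the 0 placeholder, and the document counts are all made up for the example). The array is sized by the number of documents in the index, not by how many of them have a value:

```python
from array import array

def build_int_sort_cache(max_doc, values_by_doc):
    """Hypothetical stand-in for a per-field sort cache: one int slot per doc id."""
    # One slot for every document in the index; 0 is this sketch's placeholder
    # for documents that have no value in the sort field.
    cache = array("i", [0]) * max_doc
    for doc_id, value in values_by_doc.items():
        cache[doc_id] = value
    return cache

# 1,000,000 docs in the index, but only 3 of them have a value for the field.
cache = build_int_sort_cache(1_000_000, {7: 42, 19: 5, 123_456: 99})
bytes_used = len(cache) * cache.itemsize
print(len(cache), "slots,", bytes_used, "bytes")
```

The three populated slots and the 999,997 empty ones cost the same per slot, which is why adding documents that lack the field still grows the sort structure.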

> Is the same true for querying?

Nope - that's the beauty of an inverted index.  The word you are
looking for points directly to the documents containing that word.
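
A toy version of the idea in Python (a sketch only; the whitespace tokenizer and the sample documents are invented, and real Lucene postings are far more elaborate):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of doc ids that contain it."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # Naive tokenization: lowercase and split on whitespace.
        for term in set(text.lower().split()):
            index[term].append(doc_id)
    return index

docs = ["red apple", "green apple", "red car", "blue sky"]
index = build_inverted_index(docs)

# Lookup touches only the postings for the term, not every document.
matches = index.get("red", [])
print(matches)
```

A document that never mentions the term simply does not appear in that term's list, so it costs nothing at query time.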

I still don't see the functional difficulties of just putting all the
docs of different types into one index.
The only impact would be maybe some field renaming, and
speed/resources (since things like sort fields and filters are of size
maxDoc()).
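
To sketch what that could look like, here is a hypothetical Python model of one index holding both document types, with the join from the issue description (IDs matching a query mapped to another set of IDs) done in two steps. The "type" field, the renamed "parentDocumentId" field, and the sample values are assumptions made for illustration; this is not how the patch is implemented.

```python
def join(docs, from_field, to_field, matches):
    """Return docs whose to_field value equals the from_field value
    of some doc accepted by the inner query `matches`."""
    # Step 1: collect from_field values of the docs matching the inner query.
    keys = {doc[from_field] for doc in docs
            if from_field in doc and matches(doc)}
    # Step 2: keep docs whose to_field value is one of those keys.
    return [doc for doc in docs if doc.get(to_field) in keys]

# Both schemas in one index; the child's pointer field is renamed to avoid a clash.
docs = [
    {"id": "doc-1", "type": "document", "documentId": 1, "title": "manual"},
    {"id": "doc-2", "type": "document", "documentId": 2, "title": "brochure"},
    {"id": "prod-10", "type": "product", "productId": 10,
     "parentDocumentId": 1, "name": "widget"},
    {"id": "prod-11", "type": "product", "productId": 11,
     "parentDocumentId": 2, "name": "gadget"},
]

parents = join(docs, "parentDocumentId", "documentId",
               lambda d: d.get("name") == "widget")
parent_ids = [d["id"] for d in parents]
print(parent_ids)
```

Here a query on the product documents selects "widget", and the join returns the parent document it points to.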

Is there a concrete example you could give (say 4 small documents)?

-Yonik
http://lucidimagination.com


On Fri, Feb 25, 2011 at 3:09 PM, Briggs Thompson (JIRA) <jira@apache.org> wrote:
>
>    [ https://issues.apache.org/jira/browse/SOLR-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12999524#comment-12999524 ]
>
> Briggs Thompson commented on SOLR-2272:
> ---------------------------------------
>
> I was thinking more of the case where two indexes have completely different schemas, each with multiple fields that have a one-to-many relationship. For example, a Schema1 document below may have 100 Schema2 documents associated with it.
>
> Schema1:
> documentId : int (unique key)
> field1
> field2
> field3 ...
>
> Schema2:
> productId : int  (unique key)
> documentId : int
> field1
> field2
> field3 ...
>
> I guess what would be necessary to do this within a single index schema is to implement a custom class (solr.product), then have a multivalued field whose type uses that custom class. Are there examples where something similar is implemented? I would also have to get rid of the unique key (or create a copy field or something along those lines).
>
> You mentioned sorting checked every document regardless if the document contains a value for the field. Is the same true for querying? I am worried that even if the above would work, the performance would be impacted substantially, considering you are turning an index with X documents into an index with 2X documents, plus the join (I don't know what kind of performance impact that has).
>
> Thanks for your help Yonik!
> Briggs
>
>> Join
>> ----
>>
>>                 Key: SOLR-2272
>>                 URL: https://issues.apache.org/jira/browse/SOLR-2272
>>             Project: Solr
>>          Issue Type: New Feature
>>          Components: search
>>            Reporter: Yonik Seeley
>>             Fix For: 4.0
>>
>>         Attachments: SOLR-2272.patch, SOLR-2272.patch
>>
>>
>> Limited join functionality for Solr, mapping one set of IDs matching a query to another set of IDs, based on the indexed tokens of the fields.
>> Example:
>> fq={!join from=parent_ptr to=parent_id}child_doc:query
>
> --
> This message is automatically generated by JIRA.
> -
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>
