lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-4787) Join Contrib
Date Fri, 10 May 2013 17:43:16 GMT

     [ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein updated SOLR-4787:
---------------------------------

    Description: 
This contrib provides a place where different join implementations can be contributed to Solr.
This contrib currently includes three join implementations. The initial patch was generated from
the Solr 4.2.1 tag. Because of changes in the FieldCache API, this patch will only build with
Solr 4.2 or above.

*PostFilterJoinQParserPlugin aka "pjoin"*

The pjoin provides a join implementation that filters results in one core based on the results
of a search in another core. This is similar in functionality to the JoinQParserPlugin but
the implementation differs in a couple of important ways.

The first difference is that the pjoin is designed to work with integer join keys only. So, in
order to use pjoin, integer join keys must be included in both the to and from cores.

The second difference is that the pjoin builds memory structures that are used to quickly
connect the join keys. It also uses a custom SolrCache named "join" to hold intermediate DocSets
that are needed to build the join memory structures. So, the pjoin will need more memory
than the JoinQParserPlugin to perform the join.

The main advantage of the pjoin is that it can scale to join millions of keys between cores.

Because it's a PostFilter, it only needs to join records that match the main query.
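
The post-filter idea can be illustrated with a small conceptual sketch. It uses plain Python
lists and sets rather than Solr APIs, and the documents, field names and key values are all
hypothetical. It only shows the order of operations: the main query runs first, and the join
key check is applied to its hits alone. It says nothing about how the patch implements this.

```python
# Conceptual sketch only: plain Python stand-ins, not Solr APIs.

def post_filter_join(main_hits, from_keys, to_field):
    """Keep main-query hits whose integer `to_field` value is in `from_keys`.

    Returns the surviving hits and the number of key checks performed.
    """
    kept = []
    checks = 0
    for doc in main_hits:  # only documents that already matched the main query
        checks += 1
        if doc.get(to_field) in from_keys:
            kept.append(doc)
    return kept, checks


# Hypothetical main core of five documents; two of them match the main query.
main_core = [
    {"id_i": 1, "title": "red shoes"},
    {"id_i": 2, "title": "blue shoes"},
    {"id_i": 3, "title": "red hat"},
    {"id_i": 4, "title": "green hat"},
    {"id_i": 5, "title": "red scarf"},
]
main_hits = [d for d in main_core if "shoes" in d["title"]]
from_keys = {2, 3, 5}  # integer keys produced by the search in the from core

kept, checks = post_filter_join(main_hits, from_keys, "id_i")
print(kept)    # [{'id_i': 2, 'title': 'blue shoes'}]
print(checks)  # 2
```

Only two key checks are made even though the main core holds five documents, because the
other three never reach the filter.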

The syntax of the pjoin is the same as the JoinQParserPlugin except that the plugin is referenced
by the string "pjoin" rather than "join".

fq={!pjoin fromCore=collection2 from=id_i to=id_i}user:customer1

The example filter query above will search the fromCore (collection2) for "user:customer1".
This query will generate a list of values from the "from" field that will be used to filter
the main query. Only records from the main query whose "to" field value is present in the
"from" list will be included in the results.
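
When the filter query is sent over HTTP, the braces, spaces and other reserved characters in
it need URL encoding. A minimal sketch using Python's standard library follows. The match-all
main query (q=*:*) is an assumption added for illustration and is not part of the example above.

```python
from urllib.parse import parse_qs, urlencode

params = {
    "q": "*:*",  # hypothetical main query, not from the issue description
    "fq": "{!pjoin fromCore=collection2 from=id_i to=id_i}user:customer1",
}

# urlencode percent-encodes reserved characters and turns spaces into '+'.
query_string = urlencode(params)
print(query_string)

# Decoding the query string recovers the original filter query unchanged.
decoded = parse_qs(query_string)
print(decoded["fq"][0])
```

The encoded string can then be appended to the select URL of the main query core.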

The solrconfig.xml in the main query core must contain the reference to the pjoin.

<queryParser name="pjoin" class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>

And the join contrib jars must be registered in the solrconfig.xml.

<lib dir="../../../dist/" regex="solr-joins-\d.*\.jar" />

The solrconfig.xml in the fromCore must have the "join" SolrCache configured.

<cache name="join"
       class="solr.LRUCache"
       size="4096"
       initialSize="1024"/>


*JoinValueSourceParserPlugin aka vjoin*

The second implementation is the JoinValueSourceParserPlugin aka "vjoin". This implements
a ValueSource function query that can return values from a second core based on join keys.
This allows relevance data to be stored in a separate core and then joined in the main query.

The vjoin is called using the "vjoin" function query. For example:

bf=vjoin(fromCore, fromKey, fromVal, toKey)

This example shows "vjoin" being called by the edismax boost function parameter. This example
will return the "fromVal" from the "fromCore". The "fromKey" and "toKey" are used to link
the records from the main query to the records in the "fromCore".
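
The lookup described above can be sketched conceptually in plain Python. This is not the
plugin's code: the field names (id_i, boost_f), the sample documents, and the 0.0 returned
for a key with no match in the from core are all assumptions made for illustration.

```python
# Conceptual sketch only: plain Python stand-ins, not Solr APIs.

def build_join_map(from_docs, from_key, from_val):
    """Map each integer join key in the from core to the value to return."""
    return {doc[from_key]: doc[from_val] for doc in from_docs}


def vjoin(doc, join_map, to_key, default=0.0):
    """Value for one main-query document; `default` for unmatched keys is an assumption."""
    return join_map.get(doc.get(to_key), default)


# Hypothetical from core holding relevance data keyed by an integer id.
from_core = [
    {"id_i": 10, "boost_f": 2.5},
    {"id_i": 20, "boost_f": 0.5},
]
# Hypothetical main-query results.
main_docs = [{"id_i": 10}, {"id_i": 20}, {"id_i": 30}]

join_map = build_join_map(from_core, "id_i", "boost_f")
values = [vjoin(doc, join_map, "id_i") for doc in main_docs]
print(values)  # [2.5, 0.5, 0.0]
```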

As with the "pjoin", both the fromKey and toKey must be integers. Also like the pjoin, the
"join" SolrCache is used to hold the join memory structures.

To configure the vjoin you must register the ValueSource plugin in the solrconfig.xml as follows:

<valueSourceParser name="vjoin" class="org.apache.solr.joins.JoinValueSourceParserPlugin"/>

*JoinValueSourceParserPlugin2 aka vjoin2 aka Personalized ValueSource Join*

vjoin2 supports "personalized" ValueSource joins. The syntax is similar to vjoin but adds
an extra parameter so a query can be specified to join a specific record set from the fromCore.
This is designed to allow customer-specific relevance information to be added to the fromCore
and then joined at query time.


Syntax:

bf=vjoin2(fromCore,fromKey,fromVal,toKey,query)
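
As a conceptual illustration of the extra query parameter, the sketch below restricts the
from core to one customer's records before building the key-to-value map. It is plain Python
and not the plugin's code. The "user" and "boost_f" fields, the sample documents, the use of a
Python predicate in place of a Solr query, and the 0.0 default for unmatched keys are all
assumptions.

```python
# Conceptual sketch only: plain Python stand-ins, not Solr APIs.

def build_personal_join_map(from_docs, from_key, from_val, matches):
    """Build the key-to-value map from only those from-core docs that match the query."""
    return {doc[from_key]: doc[from_val] for doc in from_docs if matches(doc)}


# Hypothetical from core: the same join key can carry a different value per customer.
from_core = [
    {"id_i": 10, "boost_f": 3.0, "user": "customer1"},
    {"id_i": 10, "boost_f": 0.2, "user": "customer2"},
    {"id_i": 20, "boost_f": 1.5, "user": "customer1"},
]
# Hypothetical main-query results.
main_docs = [{"id_i": 10}, {"id_i": 20}]


def values_for(user):
    """Joined values for the main-query docs, personalized for one user."""
    join_map = build_personal_join_map(
        from_core, "id_i", "boost_f", lambda d: d["user"] == user
    )
    return [join_map.get(doc["id_i"], 0.0) for doc in main_docs]


print(values_for("customer1"))  # [3.0, 1.5]
print(values_for("customer2"))  # [0.2, 0.0]
```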







> Join Contrib
> ------------
>
>                 Key: SOLR-4787
>                 URL: https://issues.apache.org/jira/browse/SOLR-4787
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 4.2.1
>            Reporter: Joel Bernstein
>            Priority: Minor
>             Fix For: 4.2.1
>
>         Attachments: SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch,
SOLR-4787.patch, SOLR-4787.patch
>
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


