lucene-dev mailing list archives

From "James Dyer (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-3857) DIH: SqlEntityProcessor with "simple" cache broken
Date Fri, 15 Mar 2013 20:12:13 GMT

     [ https://issues.apache.org/jira/browse/SOLR-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-3857:
-----------------------------

    Attachment: SOLR-3857.patch

Here is a working patch based on the fix Sudheer Prem suggested on SOLR-4561.  All tests pass
and it restores pre-3.6 functionality.

The way this feature works (and always has) is by creating a new cache for every key.  With
the default cache impl, this means a 1-element SortedMap in memory in addition to your
data.  In addition, all of these 1-element caches are kept in a map, keyed by the query text
with tokens replaced.  This is why Sudheer's fix needs to replace tokens first and only then
check whether the cache exists: each version of the query gets its own cache.  With
SortedMapBackedCache (the default), this merely wastes memory (and may even be a net gain if
you end up caching far less data).  But the point of the recent cache refactorings is to allow
for pluggable cache implementations, including those that persist data to disk.  Clearly, one
cache per key is not going to work for the general case.
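
To make the per-key behavior described above concrete, here is a minimal sketch in plain Java.  It is not the DIH code: the class SimpleCacheSketch, its methods, and the single-token resolve() helper are all hypothetical stand-ins, and the "database" is just a function from query text to rows.  It only illustrates the order of operations (replace tokens first, then look for a cache for that exact query text) and why every distinct key ends up with its own 1-element cache.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical illustration only; none of these names are real DIH classes.
final class SimpleCacheSketch {
    // One small cache per resolved query text, held in an outer map.
    private final Map<String, SortedMap<Integer, List<Map<String, Object>>>> cachesByQuery =
            new HashMap<>();
    // Stand-in for the database: maps resolved query text to result rows.
    private final Function<String, List<Map<String, Object>>> database;
    int databaseHits = 0;

    SimpleCacheSketch(Function<String, List<Map<String, Object>>> database) {
        this.database = database;
    }

    // Assumed stand-in for token replacement; handles only the ${x.id} token.
    static String resolve(String template, int parentId) {
        return template.replace("${x.id}", String.valueOf(parentId));
    }

    List<Map<String, Object>> rowsFor(String template, int parentId) {
        // Replace tokens first...
        String resolved = resolve(template, parentId);
        // ...and only then check whether a cache exists for this query text.
        SortedMap<Integer, List<Map<String, Object>>> cache = cachesByQuery.get(resolved);
        if (cache == null) {
            databaseHits++;
            cache = new TreeMap<>();                        // a new cache for this key
            cache.put(parentId, database.apply(resolved));  // holding exactly one entry
            cachesByQuery.put(resolved, cache);
        }
        return cache.get(parentId);
    }

    int cacheCount() {
        return cachesByQuery.size();
    }

    public static void main(String[] args) {
        SimpleCacheSketch sketch = new SimpleCacheSketch(
                q -> Collections.singletonList(Collections.<String, Object>singletonMap("sql", q)));
        String template = "select * from y where xid=${x.id}";
        sketch.rowsFor(template, 1);
        sketch.rowsFor(template, 2);
        sketch.rowsFor(template, 1); // served from the cache built for xid=1
        System.out.println(sketch.cacheCount() + " caches, " + sketch.databaseHits + " queries");
    }
}
```

With a disk-backed cache implementation, each entry in the outer map would correspond to its own persisted cache, which is the general-case problem noted above.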

While the way it ought to work is easy to conceptualize, the DIH structure doesn't make it
easy.  The query's tokens get replaced several calls up the stack from the cache layer.

Those who want this functionality can apply and build with this patch.  But perhaps a better
way is simply to put a subselect in your child entity query.  For instance:

{code:xml}
<entity name="parent" query="SELECT * FROM PARENT" pk="ID">
 <entity name="child" cacheImpl="SortedMapBackedCache"
         query="SELECT * FROM CHILD WHERE CHILD_ID IN (SELECT CHILD_ID FROM PARENT)" />
</entity>
{code} 

Although this does not give you lazy loading, it does cause only the needed data to be cached.
                
> DIH: SqlEntityProcessor with "simple" cache broken
> --------------------------------------------------
>
>                 Key: SOLR-3857
>                 URL: https://issues.apache.org/jira/browse/SOLR-3857
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 3.6.1, 4.0-BETA
>            Reporter: James Dyer
>         Attachments: SOLR-3857.patch
>
>
> The wiki describes a usage of CachedSqlEntityProcessor like this:
> {code:xml}
> <entity name="y" query="select * from y where xid=${x.id}" processor="CachedSqlEntityProcessor">
> {code}
> This creates what the code refers to as a "simple" cache.  Rather than build the entire
> cache up-front, the cache is built on-the-go.  I think this has limited use cases but it would
> be nice to preserve the feature if possible.
> Unfortunately this was not included in any (effective) unit tests, and SOLR-2382 entirely
> broke the functionality for 3.6/4.0-alpha+.  At first glance, the fix may not be entirely
> straightforward.
> This was found while writing tests for SOLR-3856.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

