openjpa-dev mailing list archives

From "Pinaki Poddar" <ppod...@bea.com>
Subject RE: Extension to OpenJPA for distributed databases
Date Wed, 06 Feb 2008 21:07:20 GMT
This will require minimal change in code but some change in Docs. 

-----Original Message-----
From: Patrick Linskey [mailto:plinskey@gmail.com] 
Sent: Wednesday, February 06, 2008 2:34 PM
To: dev@openjpa.apache.org
Subject: Re: Extension to OpenJPA for distributed databases

>    <property name="slice.One.ConnectionURL" value="jdbc://URL1"/>
>    <property name="slice.Two.ConnectionURL" value="jdbc://URL2"/>

I think that we want to keep our properties in the OpenJPA namespace:

    <property name="openjpa.slice.<name>.<property>" value="..."/>
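
A minimal sketch of how keys in the proposed `openjpa.slice.<name>.<property>` form could be grouped by slice name. The class and method names (`SlicePropertySketch`, `bySlice`) are hypothetical and are not part of OpenJPA; the property values are the placeholder URLs quoted in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SlicePropertySketch {
    // Prefix proposed in this thread; everything after it is "<name>.<property>".
    static final String PREFIX = "openjpa.slice.";

    // Groups "openjpa.slice.<name>.<property>" entries by slice name (hypothetical helper).
    static Map<String, Map<String, String>> bySlice(Map<String, String> props) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            String key = e.getKey();
            if (!key.startsWith(PREFIX)) {
                continue; // not a slice property
            }
            String rest = key.substring(PREFIX.length());
            int dot = rest.indexOf('.');
            if (dot <= 0 || dot == rest.length() - 1) {
                continue; // malformed: missing slice name or property name
            }
            result.computeIfAbsent(rest.substring(0, dot), k -> new LinkedHashMap<>())
                  .put(rest.substring(dot + 1), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("openjpa.slice.One.ConnectionURL", "jdbc://URL1");
        props.put("openjpa.slice.Two.ConnectionURL", "jdbc://URL2");
        props.put("openjpa.ConnectionDriverName", "ignored");
        // prints {One={ConnectionURL=jdbc://URL1}, Two={ConnectionURL=jdbc://URL2}}
        System.out.println(bySlice(props));
    }
}
```

Keeping the slice name as one dot-delimited segment means a slice can be renamed by editing only the keys.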

-Patrick

On Jan 31, 2008 4:16 PM, Pinaki Poddar <ppoddar@bea.com> wrote:
> Hi Kevin,
>     Thank you for your interest and valued suggestion.
>
>    > Is this support meant for databases that do not support 
> partitions directly?
> Slice is targeted for environments with multiple stand-alone database 
> instances, possibly even heterogeneous. If an application wants to 
> bring data from these database instances into a *single* in-memory 
> persistence context then Slice can be useful.
> For a database vendor that supports horizontal partitioning, one is
> better off with standard OpenJPA, and of course, data distribution 
> then becomes a decision around the partition key rather than a 
> user-defined policy plug-in.
>
>  > The DistributionPolicy interface seems a bit limiting.
> The contract is: Slice calls back with the list of configured slices and a 
> new persistence-capable instance X, and the user tells Slice which slice 
> should store X.
>
> > The slice names in the configuration can not change without a
> > corresponding change in the DistributionPolicy callback.
>   Yes and No. I am still thinking about what to do with this issue, and 
> thank you for your input. However, one guiding principle I would like to 
> adhere to is "Entity classes must be agnostic of the partitioned database 
> environment".
>
>   Why I said No: Let us consider a concrete example. I am going to 
> store every Person whose name sorts before 'John Doe' in the first slice
> and the rest in another. So my DistributionPolicy implementation looks like
>
>   String distribute(Object pc, List<String> slices, Object ctx) {
>    // names that sort before "John Doe" go to the first configured slice
>    if (((Person)pc).getName().compareTo("John Doe") < 0)
>       return slices.get(0);
>    return slices.get(1);
>   }
>
>  In my configuration how the slices are logically named is immaterial 
> in such a case. I can call them
>    <property name="slice.One.ConnectionURL" value="jdbc://URL1"/>
>    <property name="slice.Two.ConnectionURL" value="jdbc://URL2"/>
>
>  And later edit them to
>    <property name="slice.ABC.ConnectionURL" value="jdbc://URL1"/>
>    <property name="slice.XYZ.ConnectionURL" value="jdbc://URL2"/>
>
> without any change in application behavior.
>
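
A self-contained sketch of the name-based policy described above, showing that the choice depends only on a slice's position in the list and not on its name. The `DistributionPolicy` interface here is an assumption modelled on the signature quoted in this message; `Person`, `NamePolicy` and `DistributionSketch` are hypothetical stand-ins, not OpenJPA classes.

```java
import java.util.Arrays;
import java.util.List;

public class DistributionSketch {
    // Hypothetical entity; it knows nothing about slices.
    static class Person {
        private final String name;
        Person(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Assumed callback shape, taken from the signature quoted in this thread.
    interface DistributionPolicy {
        String distribute(Object pc, List<String> slices, Object ctx);
    }

    // Names sorting before "John Doe" go to the first slice, the rest to the second.
    static class NamePolicy implements DistributionPolicy {
        public String distribute(Object pc, List<String> slices, Object ctx) {
            if (((Person) pc).getName().compareTo("John Doe") < 0) {
                return slices.get(0);
            }
            return slices.get(1);
        }
    }

    public static void main(String[] args) {
        DistributionPolicy policy = new NamePolicy();
        Person alice = new Person("Alice Smith");
        Person zed = new Person("Zed Young");
        System.out.println(policy.distribute(alice, Arrays.asList("One", "Two"), null)); // prints One
        System.out.println(policy.distribute(zed, Arrays.asList("One", "Two"), null));   // prints Two
        // After renaming the slices in configuration, the policy code is unchanged.
        System.out.println(policy.distribute(alice, Arrays.asList("ABC", "XYZ"), null)); // prints ABC
    }
}
```

The policy still depends on the order and number of configured slices, so reordering them would change where data lands even though renaming them does not.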
>
>   > Maybe the callback could return an opaque Object based on whatever
> (key?) that could then be used by our runtime to determine the proper 
> slice?  With ObjectGrid, we did this via a PartitionableKey interface 
> that the primary key would have to implement.
>
>    "via a PartitionableKey interface that the primary key would have 
> to implement." -- this is what possibly violates my guiding principle.
> But maybe I need to understand your suggested solution better.
>
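
For comparison, a rough sketch of the key-based alternative under discussion: the primary key class exposes an opaque partition value and the runtime maps it to a slice. This is a reconstruction from the description in this thread only. `PartitionableKey`, `PersonKey` and `sliceIndex` are hypothetical names and do not reproduce the ObjectGrid or OpenJPA APIs, and the hash-based mapping is just one possible choice.

```java
public class PartitionableKeySketch {
    // Hypothetical interface: the key supplies an opaque partition value.
    interface PartitionableKey {
        Object getPartition();
    }

    // Hypothetical primary key class; note that it must implement the interface,
    // which is the coupling the guiding principle above tries to avoid.
    static class PersonKey implements PartitionableKey {
        private final String name;
        PersonKey(String name) { this.name = name; }
        public Object getPartition() { return name; }
    }

    // Assumed runtime-side mapping from the opaque value to a slice index.
    static int sliceIndex(PartitionableKey key, int sliceCount) {
        return Math.floorMod(key.getPartition().hashCode(), sliceCount);
    }

    public static void main(String[] args) {
        PersonKey key = new PersonKey("John Doe");
        System.out.println("slice index: " + sliceIndex(key, 2));
    }
}
```

Here the slice names never appear in application code, but the key class is aware that it is being partitioned.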
>  > When you mention possible "parallel execution", are you assuming 
>  > the use of the openjpa "multithreaded" property for the EntityManagers?
>  > Or, would this parallel execution utilize separate EntityManagers?
>
>  Neither. A single EntityManager E uses a DistributedStoreManager DM 
> which in turn holds connections to many databases DB1, DB2 etc. Now when 
> a JPQL query Q is issued by E, DM runs the same SQL query against DB1 
> and DB2, but each SQL query is executed on a separate thread drawn from 
> a pool. The results of the queries are collected, merged with ordering and 
> returned to the caller as a single result list.
>
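
The fan-out and merge described above could look roughly like the following, using only `java.util.concurrent`. This is an illustrative sketch, not the DistributedStoreManager implementation: each `Callable` stands in for one per-database SQL query, the class and method names are hypothetical, and the merge simply re-sorts the combined list rather than doing a streaming merge of pre-ordered results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelQuerySketch {
    // Runs every per-slice query on a pooled thread, then merges into one ordered list.
    static List<String> queryAll(List<Callable<List<String>>> perSliceQueries) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, perSliceQueries.size()));
        try {
            // invokeAll blocks until all queries have completed.
            List<Future<List<String>>> futures = pool.invokeAll(perSliceQueries);
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                merged.addAll(f.get()); // rethrows a query failure as ExecutionException
            }
            Collections.sort(merged); // stand-in for "merged with ordering"
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<List<String>>> queries = new ArrayList<>();
        queries.add(() -> Arrays.asList("Adams", "Miller")); // pretend result from DB1
        queries.add(() -> Arrays.asList("Brown", "Young"));  // pretend result from DB2
        System.out.println(queryAll(queries)); // prints [Adams, Brown, Miller, Young]
    }
}
```

The caller sees a single list and a single blocking call, consistent with one EntityManager issuing one query.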
>   > On first read, this support looks to be very cool for top-down 
>   > development.  Depending on your response to the first bullet, I find it
>   > harder to understand how a customer might already have a 
>   > poor-man's version of partitioning and work upwards.  Just thinking out loud...
>
>   We have to wait for people to use it to know whether this makes sense.
> Andy Schlaikjer is our first user, trying it on 100 database instances.
> Maybe Andy should comment.
>
>
>   Regards and thanks again for your interest --
>
>
> Pinaki
>
>
>
>
>
> -----Original Message-----
> From: Kevin Sutter [mailto:kwsutter@gmail.com]
> Sent: Thursday, January 31, 2008 4:34 PM
> To: dev@openjpa.apache.org
> Subject: Re: Extension to OpenJPA for distributed databases
>
> Pinaki,
> I like the idea.  I used to be involved with the ObjectGrid project 
> here at IBM and we used a similar technique for partitioning our 
> in-memory cache.  I have a few questions about Slice, but for the most
> part, I am in favor of including it in the OpenJPA deliverable.
>
>    - Basic question.  Is this support meant for databases that do not
>    support partitions directly?  My experience has been that if a database
>    supports partitioning directly, then the interaction with the database
>    doesn't change at all.  That is, the application (or openjpa runtime in
>    this case) does not have to change to take advantage of the
>    partitioning.  It's transparent.  But, your documentation seems to
>    indicate required slice configuration and callbacks.  I'm just trying
>    to understand how you see this support fitting into the partitioned
>    database landscape.
>    - The DistributionPolicy interface seems a bit limiting.  The
>    application code is now very tightly linked with the configuration.
>    The slice names in the configuration can not change without a
>    corresponding change in the DistributionPolicy callback.  I would
>    prefer something more general.  Maybe the callback could return an
>    opaque Object based on whatever (key?) that could then be used by our
>    runtime to determine the proper slice?  With ObjectGrid, we did this
>    via a PartitionableKey interface that the primary key would have to
>    implement.  We would then callback on the getPartition() method to get
>    the Object value which we would then use to determine the partition.
>    This could be a String value, if so desired.  But, it also allowed
>    other Object types as well.
>    - When you mention possible "parallel execution", are you assuming the
>    use of the openjpa "multithreaded" property for the EntityManagers?
>    Or, would this parallel execution utilize separate EntityManagers?
>    - On first read, this support looks to be very cool for top-down
>    development.  Depending on your response to the first bullet, I find it
>    harder to understand how a customer might already have a poor-man's
>    version of partitioning and work upwards.  Just thinking out loud...
>
> Like I said up-front, I like the basic idea of Slice.  I think we 
> probably need a bit more discussion on how this fits into the overall 
> database landscape and architecture, but eventually I would like to 
> see this become part of OpenJPA.  Thanks and nice work.
>
> Kevin
>
>
> On Jan 30, 2008 5:44 PM, Pinaki Poddar <ppoddar@bea.com> wrote:
>
> > Hi,
> >  I would like to add an extension of OpenJPA that allows an 
> > application to transact against a set of distributed, possibly 
> > heterogeneous, horizontally-partitioned databases [2]. The project 
> > is named Slice and is similar in scope to Hibernate Shards.
> >  The development codebase has so far been maintained in the Apache Lab 
> > repository, and given its current state I propose to add the codebase
> > to a new openjpa-slice module.
> >
> >  I request you to review the current state of its implementation [1] and 
> > express your opinion/views on the feasibility of my proposal.
> >
> >  Regards --
> >
> > Pinaki
> >
> > [1] Slice website:
> > http://people.apache.org/~ppoddar/slice/site/index.html
> > [2] dev2dev blog:
> > http://dev2dev.bea.com/blog/pinaki.poddar/archive/2008/01/slice_openjpa_f_1.html
> >
> > Notice:  This email message, together with any attachments, may 
> > contain information of BEA Systems, Inc., its subsidiaries and 
> > affiliated entities, that may be confidential, proprietary, 
> > copyrighted and/or legally privileged, and is intended solely for 
> > the use of the individual or entity named in this message. If you are 
> > not the intended recipient, and have received this message in error,
> > please immediately return this by email and then delete it.
> >
>
>



--
Patrick Linskey
202 669 5907

