polygene-dev mailing list archives

From "zhuangmz08" <zhuangm...@qq.com>
Subject Re: Re: Large Scale Entity Store Database?
Date Wed, 15 Jun 2016 09:51:09 GMT
Hi,
I've been trying out the Elasticsearch index/query engine. However, I hit this exception:
MaxBytesLengthExceededException[bytes can be at most 32766 in length; got 79504]
I found an answer on Stack Overflow (https://stackoverflow.com/questions/24019868/utf8-encoding-is-longer-than-the-max-length-32766)
which says that Elasticsearch only supports short text.
Maybe I have to split the long field into smaller fields?
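
A minimal sketch of that splitting idea, assuming the 32766 figure in the exception is a limit on the UTF-8 bytes of a single indexed term. The class name Utf8Chunker and its methods are hypothetical helpers, not part of Zest or Elasticsearch. The sketch counts encoded bytes rather than characters, because non-ASCII text takes more than one byte per character, and it never cuts a character in half. Whether splitting is the right fix at all depends on how the field is mapped in the index.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class Utf8Chunker
{
    // Limit quoted in the exception message (bytes per indexed term).
    static final int MAX_TERM_BYTES = 32766;

    // Splits text into consecutive pieces whose UTF-8 encoding is at most maxBytes each.
    static List<String> split( String text, int maxBytes )
    {
        if( maxBytes < 4 )
        {
            throw new IllegalArgumentException( "maxBytes must be at least 4 (one code point)" );
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        int i = 0;
        while( i < text.length() )
        {
            int codePoint = text.codePointAt( i );
            int size = utf8Length( codePoint );
            if( currentBytes + size > maxBytes )
            {
                // The current chunk is full: close it and start a new one.
                chunks.add( current.toString() );
                current.setLength( 0 );
                currentBytes = 0;
            }
            current.appendCodePoint( codePoint );
            currentBytes += size;
            i += Character.charCount( codePoint );
        }
        if( current.length() > 0 )
        {
            chunks.add( current.toString() );
        }
        return chunks;
    }

    // Number of bytes UTF-8 needs for one code point (an upper bound for unpaired surrogates).
    static int utf8Length( int codePoint )
    {
        if( codePoint < 0x80 ) return 1;
        if( codePoint < 0x800 ) return 2;
        if( codePoint < 0x10000 ) return 3;
        return 4;
    }

    public static void main( String[] args )
    {
        // Build a text as long as the one reported in the exception.
        StringBuilder longText = new StringBuilder();
        for( int n = 0; n < 79504; n++ )
        {
            longText.append( 'a' );
        }
        for( String chunk : split( longText.toString(), MAX_TERM_BYTES ) )
        {
            System.out.println( chunk.getBytes( StandardCharsets.UTF_8 ).length + " bytes" );
        }
    }
}
```

Each chunk could then be stored as its own property or as an element of a collection property; how that maps onto the entity model is left open here.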




------------------ Original message ------------------
From: "Niclas Hedhman" <niclas@hedhman.org>
Sent: Tuesday, June 14, 2016, 6:26 PM
To: "dev" <dev@zest.apache.org>

Subject: Re: Re: Large Scale Entity Store Database?



Another way to do it, I just realized, is to create your own "Delegating
EntityStore", which implements EntityStore and uses some algorithm for
looking up one out of many stores and delegates the call accordingly. That
might actually be a lot easier to do.
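
The delegating idea can be sketched roughly as follows. This is only an illustration under loud assumptions: SimpleStore is a hypothetical, much simplified stand-in and NOT the real Zest EntityStore interface, and routing by identity prefix is just one possible lookup algorithm (routing by entity type would be another).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only; none of these types exist in Zest.
class DelegatingStoreSketch
{
    // Hypothetical, simplified stand-in for an entity store.
    interface SimpleStore
    {
        void put( String identity, String state );

        String get( String identity ); // returns null when the identity is unknown
    }

    // Trivial backing store used to make the sketch runnable.
    static class InMemoryStore implements SimpleStore
    {
        private final Map<String, String> data = new HashMap<>();

        public void put( String identity, String state )
        {
            data.put( identity, state );
        }

        public String get( String identity )
        {
            return data.get( identity );
        }
    }

    // Picks one of several stores per call and forwards the call to it.
    static class DelegatingStore implements SimpleStore
    {
        private final Map<String, SimpleStore> routes = new LinkedHashMap<>();
        private final SimpleStore fallback;

        DelegatingStore( SimpleStore fallback )
        {
            this.fallback = fallback;
        }

        // Assumed routing rule: identities starting with the prefix go to the target store.
        DelegatingStore route( String identityPrefix, SimpleStore target )
        {
            routes.put( identityPrefix, target );
            return this;
        }

        private SimpleStore select( String identity )
        {
            for( Map.Entry<String, SimpleStore> route : routes.entrySet() )
            {
                if( identity.startsWith( route.getKey() ) )
                {
                    return route.getValue();
                }
            }
            return fallback;
        }

        public void put( String identity, String state )
        {
            select( identity ).put( identity, state );
        }

        public String get( String identity )
        {
            return select( identity ).get( identity );
        }
    }

    public static void main( String[] args )
    {
        InMemoryStore users = new InMemoryStore();
        InMemoryStore orders = new InMemoryStore();
        SimpleStore store = new DelegatingStore( new InMemoryStore() )
            .route( "user-", users )
            .route( "order-", orders );

        store.put( "user-1", "{\"name\":\"alice\"}" );
        store.put( "order-7", "{\"total\":42}" );

        System.out.println( users.get( "user-1" ) );
        System.out.println( orders.get( "order-7" ) );
    }
}
```

The callers only ever see one store, so the Visibility rules stay simple; the routing decision lives entirely inside the delegating class.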

Cheers
Niclas

On Tue, Jun 14, 2016 at 6:24 PM, Niclas Hedhman <niclas@hedhman.org> wrote:

> Ok, some research done...
>
> The "entity store lookup" disappeared a long time ago. The algorithm is
> approx. like this;
>
> 1. When UnitOfWork.get(Class<T> type, String identity) (or a similar method)
> is called, the given type is searched for in all Modules visible from the
> Module where the call is being made.
>
> 2. If found in a Module, an EntityStore is looked up from that Module's
> Visibility.
>
> 3. An attempt is made to load the entity from that EntityStore.
>
> 4. If not successful, continue with next type found in search in step 1.
>
> 5. Return the found entity, or throw NoSuchEntityException (might become a
> different type at API level).
>
> And since EntityStores are only looked up by Service Type (i.e. the
> EntityStore interface), you must ensure that only one ES is visible from
> each Module that needs a different ES. Possibly do it within the Module,
> although that might be a bad idea (depends on where you are using this), so
> you might want additional layers...
>
> Cheers
> Niclas
>
> On Tue, Jun 14, 2016 at 6:04 PM, Niclas Hedhman <niclas@hedhman.org>
> wrote:
>
>> Good question....
>>
>> I vaguely recall that there was some explicit "By Type" support, at least
>> somewhere in the past, but not sure if that is still the case. I would need
>> to look at code, and possibly mail archives to clarify that.
>>
>> But yes, using Visibility is definitely possible, although it can be tricky
>> to set up.
>>
>> Niclas
>>
>> On Tue, Jun 14, 2016 at 5:58 PM, zhuangmz08 <zhuangmz08@qq.com> wrote:
>>
>>> How can I use multiple ES in my app? By setting a different visibility on
>>> each ES service?
>>> If I want to place 3 kinds of entities into 3 ES services, must I define 3
>>> different Zest modules, each exposed to one ES service?
>>>
>>>
>>>
>>>
>>> ------------------ Original message ------------------
>>> From: "Niclas Hedhman" <niclas@hedhman.org>
>>> Sent: Tuesday, June 14, 2016, 5:39 PM
>>> To: "dev" <dev@zest.apache.org>
>>>
>>> Subject: Re: Re: Large Scale Entity Store Database?
>>>
>>>
>>>
>>> If you look at
>>> https://zest.apache.org/java/develop/thirty-minutes-intro.html
>>>
>>>
>>> QueryBuilder<Order> builder = queryBuilderFactory.newQueryBuilder( Order.class );
>>>
>>> // Compute the cutoff date, 90 days before now.
>>> Calendar cal = Calendar.getInstance();
>>> cal.setTime( new Date() );
>>> cal.add( Calendar.DAY_OF_MONTH, -90 ); // add(), since roll() would leave the month unchanged
>>> Date last90days = cal.getTime();
>>> // Match Orders created after the cutoff.
>>> Order template = templateFor( Order.class );
>>> builder.where( gt( template.createdDate(), last90days ) );
>>> Query<Order> query = uow.newQuery(builder);
>>>
>>> for( Order order : query )
>>> {
>>>     report.addOrderToReport( order );
>>> }
>>>
>>>
>>> As you can see, from the point of view of the programmer (you), the Query
>>> returns Order instances. Under the hood, the Query is executed once and
>>> returns a collection of Identities disguised as the objects; when you
>>> access them, they are retrieved from the entity store.
>>>
>>> A consequence I forgot to mention is that you can possibly have multiple
>>> Entity Stores in your application, and yet only have a single Indexer,
>>> query across those stores, and still get it working. Say you have an LDAP
>>> store for User and something else for Order; the query could still be
>>> something like:
>>>
>>> QueryBuilder<Order> builder = queryBuilderFactory.newQueryBuilder( Order.class );
>>>
>>> Order template = templateFor( Order.class );
>>> builder.where( eq( template.enteredBy(), userOfInterest ) );
>>> Query<Order> query = uow.newQuery(builder);
>>>
>>> for( Order order : query )
>>> {
>>>     report.addOrderToReport( order );
>>> }
>>>
>>>
>>> where "userOfInterest" is an Entity in the LDAP store. That should work...
>>>
>>>
>>> Cheers
>>> Niclas
>>>
>>> On Tue, Jun 14, 2016 at 3:33 PM, zhuangmz08 <zhuangmz08@qq.com> wrote:
>>>
>>> > Hi, Paul,
>>> >
>>> >
>>> > Thanks for your sharing.
>>> >
>>> >
>>> > 1.
>>> > I've been testing :)
>>> > Mongo ES and RDF File engine.
>>> > It took 101 seconds to write and index 10000 quite simple entities and
>>> > 6 seconds to read these entities. [i5-4200U, 8GB RAM, SSD on Win10]
>>> > I feel it's too slow...
>>> >
>>> >
>>> > 2.
>>> > So, the index/query engine just figures out the Identities according to
>>> > my query, and then asks the ES to get the exact entities according to
>>> > those Identities.
>>> >
>>> >
>>> > ------------------ Original message ------------------
>>> > From: "Paul Merlin" <paul@nosphere.org>
>>> > Sent: Tuesday, June 14, 2016, 3:17 PM
>>> > To: "dev" <dev@zest.apache.org>
>>> >
>>> > Subject: Re: Re: Large Scale Entity Store Database?
>>> >
>>> >
>>> >
>>> > Hi,
>>> >
>>> > zhuangmz08 wrote:
>>> > > Hi,
>>> > >
>>> > >
>>> > > OK, writing entities and reading entities are separated, both in theory
>>> > > and in physical implementation.
>>> >
>>> > Entities are written *and* fetched from EntityStores.
>>> > Entities are indexed into index/query engines.
>>> > Queries are resolved by index/query engines that only return
>>> > identities, which are then used to fetch the actual entities from
>>> > EntityStores.
>>> >
>>> > > 1. It's acceptable to occupy large storage space (Disk is cheap).
>>> > > All entities are stored in a SINGLE table of the SQL database or in a
>>> > > SINGLE collection of the SINGLE database in Mongo.
>>> > > What are the key factors in writing? Which MapEntityStore is faster at
>>> > > writing entities? I mean, which is better for production use?
>>> > Just like Niclas said, most of the EntityStores are based on
>>> > JSONMapEntityStore, so they are built as simple key/value stores
>>> > whatever the underlying storage system is. SQL ES uses a single table,
>>> > Mongo ES uses a single collection, and so on. Which one is best for your
>>> > use case depends on your application and deployment constraints. I have
>>> > successfully used File ES, SQL ES, Mongo ES and Redis ES in production,
>>> > YMMV.
>>> >
>>> > > 2. Is reading speed related to the Indexer? I know something about
>>> > > search engines (Apache Solr). Could you explain more about the querying?
>>> > > When the query string matches some index, how will it interact with the
>>> > > entity database? Do we need to query the entity database internally? I
>>> > > would like to know the factors impacting read speed.
>>> > > Which is better for production use, OpenRDF or ElasticSearch?
>>> >
>>> > Indexing and querying speed is related to the Index/Query engine.
>>> > Fetching speed is related to the EntityStore.
>>> >
>>> > In any case, I'd suggest that you run speed/load tests on your
>>> > application. Zest's strength here is that you can cheaply swap your
>>> > EntityStore and Index/Query engines.
>>> >
>>> > HTH
>>> >
>>> > /Paul
>>> >
>>>
>>>
>>>
>>> --
>>> Niclas Hedhman, Software Developer
>>> http://zest.apache.org - New Energy for Java