polygene-dev mailing list archives

From "zhuangmz08" <zhuangm...@qq.com>
Subject Re: Large Scale Entity Store Database?
Date Tue, 14 Jun 2016 04:24:17 GMT

OK, so writing entities and reading entities are separated, both in theory and in the physical implementation.

1. Occupying a large amount of storage space is acceptable (disk is cheap).
All entities are stored in a SINGLE table of the SQL database, or in a SINGLE collection of
the SINGLE database in Mongo.
What are the key factors for write performance? Which MapEntityStore is faster at writing entities? I mean,
which is better for production use?

2. Is reading speed related to the indexer? I know a little about search engines (Apache
Solr). Could you explain more about querying? When the query string matches an index entry,
how does the index interact with the entity database? Does the entity database need to be queried internally?
I would like to know the factors that affect read speed.
Which is better for production use, OpenRDF or ElasticSearch?

Thanks a lot.

------------------ Original Message ------------------
From: "Niclas Hedhman" <hedhman@gmail.com>
Sent: Tuesday, June 14, 2016, 11:02 AM
To: "dev" <dev@zest.apache.org>

Subject: Re: Large Scale Entity Store Database?

In Zest, storage/retrieval and indexing/query are separate concerns. (Disk
is cheap.)
Just like it is on the world-wide web.
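
To make the web analogy concrete, here is a minimal sketch of the separation, assuming the index only maps search terms to entity identities and the store is then asked for the state by identity (much like a search engine returning URLs that are fetched afterwards). The class and method names are hypothetical and are not the Zest API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Zest.
final class SeparatedStoreAndIndex {
    // Storage/retrieval concern: identity -> serialized entity state.
    private final Map<String, String> entityStore = new HashMap<>();
    // Indexing/query concern: search term -> identities of matching entities.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Writes go to the store; the index is updated as a separate step.
    void put(String identity, String state, String... terms) {
        entityStore.put(identity, state);
        for (String term : terms) {
            index.computeIfAbsent(term, k -> new TreeSet<>()).add(identity);
        }
    }

    // A query first resolves identities in the index,
    // then loads each entity's state from the store by identity.
    List<String> query(String term) {
        List<String> results = new ArrayList<>();
        for (String identity : index.getOrDefault(term, Collections.emptySet())) {
            results.add(entityStore.get(identity));
        }
        return results;
    }
}
```

In this sketch the store never needs to understand queries, and the index never needs to hold the full state, so each side can be swapped independently.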

Now, the relatively simple Entity Stores that are based on the
MapEntityStore might be particularly wasteful with storage space, depending
on the underlying engine. However, nothing stops you from creating a
"native" ES for your favorite storage engine.

The Indexing/Query systems are much more complex (compare a website's
store/retrieve with Google's Search) and it is not trivial to make an
indexing extension that is complete (native queries are available as a

In Zest 2.x and earlier, the default is to index all properties, and you
can turn some of them off. In 3.x we intend to change the default to off,
and you indicate what needs indexing.

Final note, the requirements on the entity stores are that any "unknown"
state is preserved so that an update will not modify such state. This is
due to the fact that entities of the same identity can have more than one
(possibly incompatible) type. This complicates traditional ORM techniques
quite a bit.
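
A minimal sketch of that requirement, assuming entity state is kept as a map of property names to values: an update merges the properties the current type knows about into the stored state instead of replacing it, so properties written through another type survive. `StatePreservingStore` and the property names are hypothetical and are not the MapEntityStore API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the Zest MapEntityStore API.
final class StatePreservingStore {
    // identity -> full stored state (property name -> value)
    private final Map<String, Map<String, Object>> states = new HashMap<>();

    // Apply only the properties the current entity type knows about,
    // leaving every other ("unknown") stored property untouched.
    void update(String identity, Map<String, Object> knownChanges) {
        Map<String, Object> merged =
            new HashMap<>(states.getOrDefault(identity, Collections.emptyMap()));
        merged.putAll(knownChanges);
        states.put(identity, merged);
    }

    // Return a read-only copy of the complete stored state.
    Map<String, Object> load(String identity) {
        return Collections.unmodifiableMap(
            new HashMap<>(states.getOrDefault(identity, Collections.emptyMap())));
    }
}
```

A store that instead overwrote the whole record with only the known properties would silently drop state belonging to the other types of the same identity.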

On Jun 14, 2016 09:06, "zhuangmz08" <zhuangmz08@qq.com> wrote:

> Hi, I dug into the Postgres table and found that entities are actually
> stored as JSON-format strings, which seems to use the SQL database as a
> document database. I'm wondering how efficient queries are achieved. I'm
> going to insert and query millions of entities. Have you ever tested the
> performance? Should I use the Mongo-backed Entity Store instead? Thanks a lot.