hadoop-hdfs-user mailing list archives

From Luangsay Sourygna <luang...@gmail.com>
Subject Re: rules engine with Hadoop
Date Sat, 20 Oct 2012 18:03:42 GMT
Thanks for all the information. Many papers/books to read in my free time :)...

Just to get an idea, what is the maximum memory consumption you have ever
seen for a rule engine, and what were its characteristics (how many facts
loaded at the same time, how many rules and joins)?

On Sat, Oct 20, 2012 at 4:38 PM, Peter Lin <woolfel@gmail.com> wrote:
> All RETE implementations use RAM these days.
> There are older rule engines that used databases or file systems when
> there wasn't enough RAM. The key to scaling rulebase systems or expert
> systems efficiently is loading only the data you need. An expert
> system is inference engine + rules + functions + facts. Some products
> shamelessly promote their rule engine as an expert system, when they
> don't understand what the term means. Some rule engines are expert
> system shells, which provide a full programming environment without
> needing an IDE and a bunch of other stuff. For example CLIPS, JESS and
> Haley come to mind.
> I would suggest reading Gary Riley's book
> http://www.amazon.com/Expert-Systems-Principles-Programming-Fourth/dp/0534384471/ref=sr_1_1?s=books&ie=UTF8&qid=1350743551&sr=1-1&keywords=giarratano+and+riley+expert+systems
> In terms of nodes, that actually doesn't matter much due to the
> discrimination network produced by the RETE algorithm. What matters
> more is the number of facts and the percentage of those facts that
> match some of the patterns declared in the rules.
> Most RETE implementations materialize the join results, so that is
> the biggest factor in memory consumption. For example, if you had 1000
> rules, but only 3 have joins, then it doesn't make much difference. In
> contrast, if you had 200 rules and each has 4 joins, it will consume
> more memory for the same dataset.
> Proper scaling of rulebase systems requires years of experience and
> expertise, so it's not something one should rush. It's best to study
> the domain and methodically develop the rulebase so that it is
> efficient. I would recommend you use JESS. Feel free to email me
> directly if your company wants to hire an experienced rule developer to
> assist with your project.
> RETE rule engines are powerful tools, but they do require experience
> to scale properly.
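
To make the point about materialized joins concrete, here is a rough Python sketch. It is not how JESS, CLIPS or any real RETE engine is implemented: the fact layout `(type, key)`, the function names and the left-to-right join on a shared key are all assumptions made for illustration. It only counts how many partial matches would be stored as joins are added to a rule.

```python
def alpha_memory(facts, fact_type):
    """Facts that pass a single-pattern test (here: just the fact type)."""
    return [f for f in facts if f[0] == fact_type]


def materialized_tokens(facts, pattern_types):
    """Count the partial matches stored when the patterns of one
    hypothetical rule are joined left to right on the fact key."""
    tokens = [(f,) for f in alpha_memory(facts, pattern_types[0])]
    stored = 0
    for fact_type in pattern_types[1:]:
        right = alpha_memory(facts, fact_type)
        # Each join keeps (materializes) every combination that agrees on the key.
        tokens = [t + (f,) for t in tokens for f in right if t[-1][1] == f[1]]
        stored += len(tokens)
    return stored


# Hypothetical working memory: 3 fact types, 10 keys, 3 facts per key and type.
facts = [(t, k) for t in ("A", "B", "C") for k in range(10) for _ in range(3)]

no_join = materialized_tokens(facts, ["A"])
one_join = materialized_tokens(facts, ["A", "B"])
two_joins = materialized_tokens(facts, ["A", "B", "C"])
print(no_join, one_join, two_joins)
```

In this toy model a rule without joins stores no join results at all, while each extra join multiplies the stored partial matches for the same 90 facts. Real engines share nodes between rules and index their memories, so actual numbers will differ.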
