nifi-users mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: What kind of way do you recommend for the load balancing? (RPG/HTTP processors)
Date Wed, 21 Jun 2017 14:15:36 GMT
Hello,

Generally the RPG approach should be better in that it handles load
balancing and failover for you.

When testing the RPG, you shouldn't need the DistributeLoad processor
in front of it. You should be able to have GenerateFlowFile ->
RPG (with URL of any node) and then an Input Port -> some processor.
The RPG will figure out all of the nodes in the cluster and send data
to all of them.

On the RPG, if you go into the remote ports menu, each port has some
settings that will control how many flow files get sent per
transaction. Generally you will probably get a more even distribution
with a smaller batch size, but you will get much better performance
with a larger batch size. The RPG also factors in the number of flow files
on each node when determining where to send them, so if node 1 is the
primary node and has more flow files queued, then the RPG will likely
send more flow files to node 2.
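
To make the batch-size trade-off concrete, here is a toy model in Python.
It is only a sketch and is not NiFi's actual peer-selection code: the
function name distribute, the "send each batch to the node with the
fewest queued flow files" rule, and the assumption that queues never
drain during the run are all simplifications made up for illustration.

```python
def distribute(total_flowfiles, batch_size, queued):
    """Toy model: send each batch to the node with the fewest queued flow files.

    total_flowfiles: number of flow files to send
    batch_size:      flow files per transaction (hypothetical stand-in for
                     the remote port batch setting)
    queued:          flow files already queued on each node before sending
    Returns the number of flow files sent to each node.
    """
    queued = list(queued)          # copy so the caller's list is untouched
    sent = [0] * len(queued)
    remaining = total_flowfiles
    while remaining > 0:
        batch = min(batch_size, remaining)
        # Pick the least-loaded node; ties go to the lowest index.
        target = min(range(len(queued)), key=lambda i: queued[i])
        queued[target] += batch    # assumption: nothing is consumed meanwhile
        sent[target] += batch
        remaining -= batch
    return sent


print(distribute(1000, 1, [0, 0]))      # [500, 500]  small batches, even split
print(distribute(1000, 300, [0, 0]))    # [600, 400]  large batches, lumpier split
print(distribute(1000, 100, [400, 0]))  # [300, 700]  busier node 1 receives less
```

In this model a batch size of 1 splits 1000 flow files evenly, a batch
size of 300 leaves a 600/400 split, and a node that starts with 400
flow files queued ends up receiving fewer new ones.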

Hope that helps.

-Bryan


On Mon, Jun 19, 2017 at 10:38 PM, 진유리 <yuriticon@gmail.com> wrote:
> Hi All,
> I have some questions about load balancing for clustered NiFi v1.0.0 (2
> nodes)
>
>
> I consider 2 ways.
>
> 1) RPG way : Remote Process Group (HTTP/RAW) + InputPort
> 2) HTTP way : PostHTTP + ListenHTTP
>
>
>
> Could you tell me which one is better and explain why ?
> In my test case, the HTTP way is faster than the RPG way, but the HTTP way
> has the disadvantage of needing a unique port number for each ListenHTTP processor.
> (Actually, I don't understand why the HTTP way is faster than the RPG way.)
>
>
>
> Moreover, I found some strange things in my workflow.
> This is my NiFi workflow to compare performance between the RPG way and the HTTP way:
>
>
>
> ----------------------------------------------------------------------------------------------------------------------------------
> GenerateFlowFile(On Primary Node) -> DistributeLoad -> RPG (to node 1)
>                                                     -> RPG (to node 2)
>                                   -> DistributeLoad -> PostHTTP (to node 1)
>                                                     -> PostHTTP (to node 2)
> ----------------------------------------------------------------------------------------------------------------------------------
> InputPort  -> PutFile
> ListenHTTP -> PutFile
> ----------------------------------------------------------------------------------------------------------------------------------
>
> First, I got a SocketException on the 'PostHTTP' processor.
> (java.net.SocketException: Connection reset, Broken pipe (Write failed))
> I guess it is caused by calling it too many times.
>
> The PostHTTP processor shows an error mark and logs, but the RPG does not show anything.
> I guess both have the same problem because neither works anymore.
>
>
>
>
>
> I also selected the 'Round Robin' strategy for all 'DistributeLoad' processors.
> But the results of the two ways above differ per node.
>
> - RPG way : One node wrote twice as many files as the other node
> - HTTP way : Each node wrote almost the same number of files
>
>
> Please share your opinions or tips for load balancing.
>
> Thanks
> -Yuri Jin-
>
