hawq-user mailing list archives

From mailing-list-recv <mailing-list-r...@sequoiadb.com>
Subject Is there performance benchmark for PXF?
Date Thu, 29 Oct 2015 08:46:56 GMT
Hey guys,

Is there a performance benchmark for the PXF interface? I would like to understand the overhead
of a large table scan that goes through the PXF REST interface.

There appears to be no Parquet HDFS plugin, so there is no direct way to do a head-to-head
comparison with and without the PXF framework.

Are there any internal benchmark results you can share?

Also, since I haven't seen any detailed documentation on how exactly PXF works, can you correct
me if I'm wrong?
In my understanding, backend/access/external is the main component that handles PXF calls, so
any external table access invokes this module to send a request to the local PXF-SERVICE (on
the node where the master is located). PXF-SERVICE is responsible for picking up the correct
Java libraries and constructing filters. It first attempts to get the fragments and then assigns
them to the Segment processes (trying to match hostnames for data locality). Each Segment process
then talks to its local PXF-SERVER, which calls the Accessor class to fetch data from external
storage and passes the result back to the Segment process through the REST API.
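
To make the "match the hostname for data locality" step concrete, here is a rough sketch of how
I picture the assignment. This is only my mental model, not the actual HAWQ/PXF code: the function
name assign_fragments, the tuple layouts, and the least-loaded tie-breaking are all assumptions
made up for illustration.

```python
from collections import defaultdict


def assign_fragments(fragments, segments):
    """Hypothetical locality-aware assignment (not the real PXF algorithm).

    fragments: list of (fragment_id, [hosts holding a replica])
    segments:  list of (segment_id, host the segment runs on)
    Returns a dict mapping segment_id -> list of fragment_ids.
    """
    # Number of fragments assigned to each segment so far.
    load = {seg_id: 0 for seg_id, _ in segments}

    # Index segments by host so replica hosts can be matched quickly.
    segments_by_host = defaultdict(list)
    for seg_id, host in segments:
        segments_by_host[host].append(seg_id)

    assignment = defaultdict(list)
    for frag_id, replica_hosts in fragments:
        # Prefer segments running on a host that stores the fragment.
        local = [s for h in replica_hosts for s in segments_by_host.get(h, [])]
        # Fall back to every segment when no host matches (remote read).
        candidates = local or list(load)
        # Among the candidates, pick the least-loaded segment.
        chosen = min(candidates, key=lambda s: load[s])
        assignment[chosen].append(frag_id)
        load[chosen] += 1
    return dict(assignment)


segments = [("seg0", "host-a"), ("seg1", "host-b")]
fragments = [("f0", ["host-a"]), ("f1", ["host-b", "host-a"]), ("f2", ["host-c"])]
print(assign_fragments(fragments, segments))
```

In this toy run, f0 and f1 land on segments that share a host with one of their replicas, while f2
has no local segment and simply goes to the least-loaded one, which would mean a remote read.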

Is my understanding correct?

