accumulo-user mailing list archives

From Ara Ebrahimi <ara.ebrah...@argyledata.com>
Subject Re: OfflineScanner
Date Thu, 19 Feb 2015 17:51:59 GMT
OfflineScanner is package-private, so I'll need to hack it. If it proves to be at least 20%
faster, then it's worth having in the public API, perhaps even letting users ask for a
specific file to be scanned rather than steering the scan by carefully defining a range that
touches only the intended file.

Ara.
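
To make the "at least 20% faster" criterion above concrete, here is a minimal sketch of how
two measured scan times could be compared. It assumes "20% faster" means the candidate path
takes at least 20% less wall-clock time than the baseline, which is one possible reading of
the message. ScanSpeedup and its method names are hypothetical and not part of Accumulo.

```java
// Hypothetical helper, not part of Accumulo: decides whether one scan path
// beats another by a required margin, based on measured wall-clock times.
final class ScanSpeedup {

    // Fraction of the baseline's elapsed time that the candidate saves.
    // Assumption: "20% faster" is read as "takes 20% less time".
    static double timeSaved(long baselineNanos, long candidateNanos) {
        if (baselineNanos <= 0 || candidateNanos < 0) {
            throw new IllegalArgumentException(
                "baseline must be positive and candidate non-negative");
        }
        return (double) (baselineNanos - candidateNanos) / baselineNanos;
    }

    // True when the candidate saves at least minFraction of the baseline time.
    static boolean worthExposing(long baselineNanos, long candidateNanos,
                                 double minFraction) {
        return timeSaved(baselineNanos, candidateNanos) >= minFraction;
    }

    // Times a single run of a task with System.nanoTime().
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }
}
```

In practice each path would be timed with timeNanos over several runs against the same shard,
and the results passed to worthExposing with a threshold of 0.20. A single run is noisy, so
repeated measurements are advisable before drawing a conclusion.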

On Feb 19, 2015, at 8:15 AM, Keith Turner <keith@deenlo.com> wrote:



On Thu, Feb 19, 2015 at 12:57 AM, Ara Ebrahimi <ara.ebrahimi@argyledata.com> wrote:
Hi,

I'm trying to optimize a connector we've written for Presto. In some cases we need to perform
full table scans. This happens across all the nodes, but each node is assigned to process only
a sharded subset of the data. Each shard is stored in exactly one RFile. I'm looking at
AbstractInputFormat and OfflineIterator, and the code doesn't seem hard to use for this case.
Is there any drawback? It seems that if the table is offline, OfflineIterator is used, which
apparently reads the RFiles directly without any RPC, so I think it should be significantly
faster. Is that right? Is there any drawback to using it while the table is not offline but no
other app is modifying the table?

The code will throw an exception if the table is not offline (the intent is to ensure the files
are stable and not garbage collected). As others have stated, you can clone the table.

Currently, offline scanning is only supported in the public API with MapReduce. Out of
curiosity, would you be interested in seeing this in the public client API?


Thanks,
Ara.



________________________________

This message is for the designated recipient only and may contain privileged, proprietary,
or otherwise confidential information. If you have received it in error, please notify the
sender immediately and delete the original. Any other use of the e-mail by you is prohibited.
Thank you in advance for your cooperation.

________________________________




