accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Re: how can i optimize scan speed when use batch scan ?
Date Wed, 14 Jan 2015 02:32:08 GMT
You might need to set tserver.cache.data.size to a larger value. 
Depending on the amount of data, you might just churn through the cache 
without getting much benefit. I think you have to restart Accumulo after 
changing this property.

Can you show us the code you used to try to scan for a row ID and the 
data in the table you expected to be returned that wasn't?

覃璐 wrote:
> Yes, I received all the results I wanted by the time the program ended.
>
> But I do not know why the scan received 0 results when I am sure the
> row ID exists.
>
> I set table.cache.block.enable=true, but I did not see any noticeable
> change.
>
> Thanks
>
>
> Original message
> *From:* Eric Newton<eric.newton@gmail.com>
> *To:* user@accumulo.apache.org<user@accumulo.apache.org>
> *Sent:* January 14, 2015 (Wed) 00:17
> *Subject:* Re: Re: how can i optimize scan speed when use batch scan ?
>
> You should have received at least 1390 Key/Value pairs (#results=1390).
>
> If your application has many exact RowID look-ups, you may want to
> investigate Bloom filters.
>
> Consider turning on data block caching to reduce latency on future look-ups.
>
> -Eric
>
>
> On Mon, Jan 12, 2015 at 8:15 PM, 覃璐 <luq.java@gmail.com
> <mailto:luq.java@gmail.com>> wrote:
>
>     I am sorry, I did not know about the image.
>
>     The log is:
>
>
>     [17:50:38] TRACE
>     [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
>     [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)]
>     [21521] - tid=65 oid=675 Continuing multi scan,
>     scanid=-152589127623326551
>
>     [17:50:38] TRACE
>     [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
>     [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)]
>     [21544] - tid=65 oid=675 Got more multi scan results, #results=1390
>     scanID=-152589127623326551 in 0.023 secs
>
>     [17:50:38] TRACE
>     [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
>     [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)]
>     [21546] - tid=65 oid=676 Continuing multi scan,
>     scanid=-152589127623326551
>
>     [17:50:38] TRACE
>     [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
>     [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)]
>     [21555] - tid=45 oid=644 Got more multi scan results, #results=0
>     scanID=-4477962012178388198 in 1.002 secs
>
>     [17:50:38] TRACE
>     [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
>     [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)]
>     [21555] - tid=45 oid=677 Continuing multi scan,
>     scanid=-4477962012178388198
>
>     [17:50:38] TRACE
>     [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
>     [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)]
>     [21596] - tid=57 oid=645 Got more multi scan results, #results=0
>     scanID=-8718025066902358141 in 1.003 secs
>
>     [17:50:38] TRACE
>     [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
>     [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)]
>     [21596] - tid=57 oid=678 Continuing multi scan,
>     scanid=-8718025066902358141
>
>
>     The scan takes a long time but returns no results.
>
>
>     I use 1.6.1, and the config output is:
>
>
>     default | table.balancer ............................ |
>     org.apache.accumulo.server.master.balancer.DefaultLoadBalancer
>
>     default | table.bloom.enabled ....................... | false
>
>     default | table.bloom.error.rate .................... | 0.5%
>
>     default | table.bloom.hash.type ..................... | murmur
>
>     default | table.bloom.key.functor ................... |
>     org.apache.accumulo.core.file.keyfunctor.RowFunctor
>
>     default | table.bloom.load.threshold ................ | 1
>
>     default | table.bloom.size .......................... | 1048576
>
>     default | table.cache.block.enable .................. | false
>
>     default | table.cache.index.enable .................. | true
>
>     default | table.classpath.context ................... |
>
>     default | table.compaction.major.everything.idle .... | 1h
>
>     default | table.compaction.major.ratio .............. | 3
>
>     default | table.compaction.minor.idle ............... | 5m
>
>     default | table.compaction.minor.logs.threshold ..... | 3
>
>     table | table.constraint.1 ........................ |
>     org.apache.accumulo.core.constraints.DefaultKeySizeConstraint
>
>     default | table.failures.ignore ..................... | false
>
>     default | table.file.blocksize ...................... | 0B
>
>     default | table.file.compress.blocksize ............. | 100K
>
>     default | table.file.compress.blocksize.index ....... | 128K
>
>     default | table.file.compress.type .................. | gz
>
>     default | table.file.max ............................ | 15
>
>     default | table.file.replication .................... | 0
>
>     default | table.file.type ........................... | rf
>
>     default | table.formatter ........................... |
>     org.apache.accumulo.core.util.format.DefaultFormatter
>
>     default | table.groups.enabled ...................... |
>
>     default | table.interepreter ........................ |
>     org.apache.accumulo.core.util.interpret.DefaultScanInterpreter
>
>     table | table.iterator.majc.vers .................. |
>     20,org.apache.accumulo.core.iterators.user.VersioningIterator
>
>     table | table.iterator.majc.vers.opt.maxVersions .. | 1
>
>     table | table.iterator.minc.vers .................. |
>     20,org.apache.accumulo.core.iterators.user.VersioningIterator
>
>     table | table.iterator.minc.vers.opt.maxVersions .. | 1
>
>     table | table.iterator.scan.vers .................. |
>     20,org.apache.accumulo.core.iterators.user.VersioningIterator
>
>     table | table.iterator.scan.vers.opt.maxVersions .. | 1
>
>     default | table.majc.compaction.strategy ............ |
>     org.apache.accumulo.tserver.compaction.DefaultCompactionStrategy
>
>     default | table.scan.max.memory ..................... | 512K
>
>     default | table.security.scan.visibility.default .... |
>
>     default | table.split.threshold ..................... | 1G
>
>     default | table.walog.enabled ....................... | true
>
>
>     My tablet server has 4 cores and 32G.
>
>
>     Thanks
>
>
>     Original message
>     *From:* Josh Elser<josh.elser@gmail.com <mailto:josh.elser@gmail.com>>
>     *To:* user<user@accumulo.apache.org
>     <mailto:user@accumulo.apache.org>>
>     *Sent:* January 12, 2015 (Mon) 23:52
>     *Subject:* Re: Re: how can i optimize scan speed when use batch scan ?
>
>     FYI, images don't (typically) come across on the mailing list. Use some
>     external hosting and provide the link if it's important, please.
>
>     How many tabletservers do you have? What version of Accumulo are you
>     running? Can you share the output of `config -t your_table_name`?
>
>     Thanks.
>
>     覃璐 wrote:
>     >  I looked at the trace log.
>     >
>     >
>     >  Why does it receive 0 results and take so long?
>     >
>     >
>     >  Original message
>     >  *From:* 覃璐<luq.java@gmail.com  <mailto:luq.java@gmail.com>>
>     >  *To:* user<user@accumulo.apache.org  <mailto:user@accumulo.apache.org>>
>     >  *Sent:* January 12, 2015 (Mon) 17:05
>     >  *Subject:* how can i optimize scan speed when use batch scan ?
>     >
>     >  Hi all.
>     >
>     >  I have code like this:
>     >
>     >  List<Range>  rangeList=…..;
>     >  BatchScanner bs=conn.createBatchScanner();
>     >  bs.setRanges(rangeList);
>     >
>     >
>     >  The rangeList holds about 1000 ranges, each created for a random
>     >  row ID with Range.exact(new Text(…)),
>     >  but the scan is slow: it can take 2-3 s. How can I optimize it?
>     >
>     >  Thanks
>
>
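
When reading TRACE output like the log quoted above, it can help to total the "Got more multi scan results" lines per scan ID, so that scans returning nothing after about a second stand out from the ones that return data. The following is a minimal sketch using only the Java standard library. The class name `ScanTraceSummary`, the `Stats` holder, and the `summarize` method are hypothetical names chosen for illustration; they are not part of Accumulo. The regular expression assumes the message format shown in the quoted log and may need adjusting for other versions or log layouts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an Accumulo class: totals batch-scan trace lines per scan ID.
public class ScanTraceSummary {
    // Assumed format, taken from the quoted log:
    //   "... #results=1390 scanID=-152589127623326551 in 0.023 secs"
    // \s+ also matches a newline, so lines wrapped by a mail client still match.
    private static final Pattern RESULT_LINE = Pattern.compile(
        "#results=(\\d+)\\s+scanID=(-?\\d+)\\s+in\\s+([0-9.]+)\\s+secs");

    /** Running totals for one scan ID. */
    public static final class Stats {
        public int batches;     // number of "Got more multi scan results" lines seen
        public long results;    // sum of #results over those lines
        public double seconds;  // sum of the reported durations
    }

    /** Returns per-scan-ID totals in order of first appearance in the log text. */
    public static Map<String, Stats> summarize(String log) {
        Map<String, Stats> byScan = new LinkedHashMap<>();
        Matcher m = RESULT_LINE.matcher(log);
        while (m.find()) {
            Stats s = byScan.computeIfAbsent(m.group(2), k -> new Stats());
            s.batches++;
            s.results += Long.parseLong(m.group(1));
            s.seconds += Double.parseDouble(m.group(3));
        }
        return byScan;
    }

    public static void main(String[] args) {
        // Sample input copied from the quoted trace log (mail quoting removed).
        String log = String.join("\n",
            "[21544] - tid=65 oid=675 Got more multi scan results, #results=1390",
            "scanID=-152589127623326551 in 0.023 secs",
            "[21555] - tid=45 oid=644 Got more multi scan results, #results=0",
            "scanID=-4477962012178388198 in 1.002 secs",
            "[21596] - tid=57 oid=645 Got more multi scan results, #results=0",
            "scanID=-8718025066902358141 in 1.003 secs");
        summarize(log).forEach((id, s) -> System.out.printf(
            "scanID=%s batches=%d results=%d seconds=%.3f%n",
            id, s.batches, s.results, s.seconds));
    }
}
```

The summary only reorganizes what the log already says. It does not explain why a given scan returned no results, which still requires looking at the queried ranges and the table data.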
