accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: init method being called multiple times of WrappingIterator.
Date Fri, 03 Apr 2015 20:12:28 GMT
I think there's only one difference between invocation of an iterator
via scans and major compactions: the batching of Key-Value pairs being
returned to the client. A side effect of this is that after a batch of
data is returned from the server to the client, it's common that a new
instance of the Iterator will be instantiated. You could check whether a
lot of instances of your iterator are being created.
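
One way to check this is to count init calls and log the running total. The
sketch below is a minimal, dependency-free illustration: the class and method
names are hypothetical, it uses java.util.logging only so that it compiles
without Accumulo on the classpath, and in a real iterator the counting would
sit inside your WrappingIterator subclass's init method with a log4j or slf4j
Logger.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Logger;

/**
 * Hypothetical stand-in for a custom iterator. A real one would extend
 * Accumulo's WrappingIterator and do this bookkeeping in its init method.
 */
final class InstanceCountingSketch {
    private static final Logger LOG =
            Logger.getLogger(InstanceCountingSketch.class.getName());

    // Shared by every instance of this class loaded by the same class loader.
    private static final AtomicInteger INIT_CALLS = new AtomicInteger();

    /** Call from the iterator's init method; returns the running total. */
    int recordInit() {
        int n = INIT_CALLS.incrementAndGet();
        LOG.info("iterator init call #" + n);
        return n;
    }

    /** Total number of init calls recorded so far. */
    static int initCalls() {
        return INIT_CALLS.get();
    }
}
```

If the total in the tserver log grows roughly with the number of batches
returned to the client, the re-creation described above is likely what you
are seeing.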

Anything unique about the distribution of data? Very large values?

Depending on how you did your timings (at the client or within the 
iterator itself), you might have noticed extra time spent in what Thrift 
is doing (extra serialization).
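
To separate the two, you could accumulate the time spent inside the iterator
and compare it with the wall-clock time measured at the client. This is only
a sketch under assumptions: the class name and the idea of wrapping the body
of next in a Runnable are illustrative, not part of the Accumulo API.

```java
/**
 * Hypothetical helper that accumulates the time spent in iterator work,
 * so it can be compared with the elapsed time seen by the client.
 */
final class IteratorTimingSketch {
    private long nanosInWork = 0;
    private long calls = 0;

    /** Wrap the body of next (or seek) with this to accumulate its cost. */
    void timed(Runnable work) {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            nanosInWork += System.nanoTime() - start;
            calls++;
        }
    }

    /** Number of timed calls so far. */
    long calls() {
        return calls;
    }

    /** Total time spent inside the timed calls, in milliseconds. */
    long totalMillis() {
        return nanosInWork / 1_000_000;
    }
}
```

If the total logged by the iterator is much smaller than what the client
measures, the remainder is being spent outside your code (batching,
serialization, network).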

If you issued the major compaction through the client API, there is a
boolean option that makes the call wait for the compaction to finish.
Otherwise, compactions are asynchronous.
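
In the 1.x client API the call in question is
TableOperations.compact(String tableName, Text start, Text end, boolean flush,
boolean wait). The sketch below uses a local stand-in interface (Compactor,
a hypothetical name, with String rows in place of Hadoop's Text) so that it
compiles without Accumulo on the classpath; against a real cluster you would
pass connector.tableOperations() through a thin adapter instead.

```java
/** Sketch: block until a major compaction finishes and report how long it took. */
final class CompactionWaitSketch {

    /** Hypothetical stand-in for the single TableOperations method used here. */
    interface Compactor {
        void compact(String table, String startRow, String endRow,
                     boolean flush, boolean wait) throws Exception;
    }

    /**
     * Compacts the whole table (null start and end rows) and returns the
     * elapsed time in milliseconds. With wait=true the call returns only
     * after the compaction has completed, which marks its end.
     */
    static long compactAndTime(Compactor ops, String table) throws Exception {
        long start = System.nanoTime();
        ops.compact(table, null, null, true /* flush */, true /* wait */);
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

With wait set to false the same call returns immediately, so the return of
the method says nothing about when the compaction ends.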

shweta.agrawal wrote:
> On Tuesday 31 March 2015 06:00 PM, shweta.agrawal wrote:
>> On Monday 30 March 2015 08:03 PM, Josh Elser wrote:
>>> Why are you using a print writer to get output from your iterator?
>>> Just use a logger and look in
>>> $ACCUMULO_HOME/logs/tserver_$hostname.debug.log (or wherever you
>>> configured logging). Create a log4j or slf4j Logger and use that
>>> instead of a print writer. (It's possible that your print writer is
>>> also what is slowing things down)
>>>
>>> In most real deployments, iterators should be faster on the server
>>> side than your client because you have N servers performing the work
>>> instead of your one client.
>>>
>>> It's not unheard of that a programming error is slowing down your
>>> iterator. Looking at what your iterator does (via logging) should
>>> help. Alternatively, you can use a remote debugger, connect to the
>>> tabletserver, and set breakpoints inside your iterator.
>>>
>>> shweta.agrawal wrote:
>>>> On Monday 30 March 2015 09:58 AM, shweta.agrawal wrote:
>>>>> Hi,
>>>>>
>>>>> Actually I am working on an iterator, which I ran on the server side by
>>>>> packaging it as a jar, and also on the client side on the same data. The
>>>>> server-side jar runs slower than the client side, and I am not able to
>>>>> find what went wrong. Is it possible for the same logic to run faster on
>>>>> the client side than in Accumulo iterators?
>>>>>
>>>>> Time on client side: 8 s
>>>>> Time on server side: 30 s
>>>>>
>>>>> To get the output I am writing it to a text file through a print
>>>>> writer. To perform my task, I call my method from the next method and
>>>>> write the output to a file there. So I want to know which method is
>>>>> called last, so that I can write my output to a file after performing
>>>>> all the tasks.
>>>>>
>>>>> Thanks and Regards
>>>>> Shweta
>>>>
>>
>> Hi,
>>
>> Without the print writer it also takes the same time. I am trying
>> to use a remote debugger as you suggested, but I am facing a problem.
>>
>> To enable the remote debugger I changed this in the accumulo-env.sh file:
>> test -z "$ACCUMULO_TSERVER_OPTS" && export
>> ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx384m -Xms384m -Xdebug
>> -Xrunjdwp:transport=dt_socket,server=y,address=50095"
>>
>> But after changing this Accumulo is not working. The terminal shows
>> that it started, but when I go to the Accumulo shell it says there
>> are no tablet servers. So please help me out with this. Am I doing
>> something wrong?
>>
>> The monitor and tserver are not starting. Their logs are:
>> Monitor Logs:
>> 2015-03-31 17:36:09,516 [mortbay.log] INFO : Logging to
>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>> org.mortbay.log.Slf4jLog
>> 2015-03-31 17:36:09,535 [mortbay.log] INFO : jetty-6.1.26
>> 2015-03-31 17:36:09,607 [mortbay.log] WARN : failed
>> SocketConnector@shweta:50095: java.net.BindException: Address already
>> in use
>> 2015-03-31 17:36:09,608 [mortbay.log] WARN : failed Server@6555694:
>> java.net.BindException: Address already in use
>> 2015-03-31 17:36:09,608 [mortbay.log] INFO : Stopped
>> SocketConnector@shweta:50095
>>
>> Tserver Logs:
>> 2015-03-31 17:28:49,206 [tabletserver.TabletServer] INFO : unloaded
>> !0;~;!0<
>> 2015-03-31 17:28:49,298 [tabletserver.TabletServer] INFO : unloaded !0<;~
>> 2015-03-31 17:28:50,074 [tabletserver.TabletServer] INFO : unloaded
>> !0;!0<<
>> 2015-03-31 17:28:50,121 [tabletserver.TabletServer] FATAL: Lost tablet
>> server lock (reason = LOCK_DELETED), exiting.
>> 2015-03-31 17:28:50,122 [tabletserver.TabletServer] INFO : Master
>> requested tablet server halt
>>
>>
>> Thanks and Regards
>> Shweta
>>
> Hi,
>
> Thanks for all your help. I got the logs from
> $ACCUMULO_HOME/logs/tserver_$hostname.debug.log. Upon analysing them and
> setting the iterator to work at major compaction scope, I found out that
> the iterator speeds up and I was able to complete the computation in 887
> ms. So now I want to ask: why is there a difference in execution
> times when I run the same iterator at major compaction scope and scan
> scope? Also, is there a way to detect the end of a major compaction
> programmatically?
>
> Thanks and Regards
> Shweta
