incubator-chukwa-user mailing list archives

From Abhijit Dhar <abhijit.dhar...@gmail.com>
Subject Missing logs in hbase because of same timestamp
Date Sat, 21 Jan 2012 04:20:27 GMT
I noticed that TsProcessor uses the timestamp as the key when putting logs
into HBase. However, my logs arrive so fast that several entries share the
same timestamp, like this:

2012-01-20 20:03:14,041 [INFO] [communication thread]
[org.apache.hadoop.mapred.LocalJobRunner.statusUpdate()] 10 threads, 28
requests, 0 errors, 0 forbidden, 0.6 pages/s, 80 kb/s, 
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.doWork()] -activeThreads=10,
spinWaiting=7, fetchQueues.totalSize=649
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.feedQueueManager()] feeding 649 input
urls ...
2012-01-20 20:03:14,852 [INFO] [Thread-274]
[jcrawler.fetch.mapreduce.FetchMapper.logHeapUsage()] Fetcher feeding queue
manager. Heap usage: 327668152 out of 932118528 bytes.

I think that because of this the entries are being collapsed, so only one
log is kept for a given timestamp.
Any idea how to fix this?
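
To illustrate what I suspect is happening, here is a minimal sketch. It is
not the actual TsProcessor code: a plain TreeMap stands in for the HBase
table, and the class name, method names, and the "-000001" style sequence
suffix are all made up for the example. It only shows that keying on the
timestamp alone keeps one entry per timestamp, whereas a key that also
carries a per-record sequence number would keep them all. I do not know
whether a suffix like this is the right fix for Chukwa.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo; a TreeMap stands in for an HBase table (assumption).
public class RowKeyCollisionDemo {

    // Key = timestamp only. A later put with the same key overwrites the
    // earlier one, which is the behaviour I suspect I am seeing.
    static Map<String, String> storeByTimestamp(List<String[]> records) {
        Map<String, String> table = new TreeMap<>();
        for (String[] r : records) {
            table.put(r[0], r[1]);
        }
        return table;
    }

    // Key = timestamp plus a zero-padded sequence number (an assumed
    // workaround, not something Chukwa is known to do).
    static Map<String, String> storeWithSequence(List<String[]> records) {
        Map<String, String> table = new TreeMap<>();
        long seq = 0;
        for (String[] r : records) {
            String key = r[0] + "-" + String.format("%06d", seq++);
            table.put(key, r[1]);
        }
        return table;
    }

    // Timestamps taken from the log excerpt above; bodies are shortened.
    static List<String[]> sampleRecords() {
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"2012-01-20 20:03:14,041", "statusUpdate()"});
        records.add(new String[] {"2012-01-20 20:03:14,852", "doWork()"});
        records.add(new String[] {"2012-01-20 20:03:14,852", "feedQueueManager()"});
        records.add(new String[] {"2012-01-20 20:03:14,852", "logHeapUsage()"});
        return records;
    }

    public static void main(String[] args) {
        // Four records go in; compare how many rows survive in each case.
        System.out.println("timestamp key:     "
                + storeByTimestamp(sampleRecords()).size() + " rows");
        System.out.println("timestamp+seq key: "
                + storeWithSequence(sampleRecords()).size() + " rows");
    }
}
```

With the four sample records, the timestamp-only map ends up with two rows
and the timestamp-plus-sequence map with four.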

Thanks,
