cassandra-user mailing list archives

From pob <peterob...@gmail.com>
Subject Re: pycassa + celery
Date Tue, 19 Apr 2011 23:29:44 GMT
Hello,

Yes, the bug was in my code: I was using CL.ONE, so I sometimes got
incomplete data back.
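
To illustrate why CL.ONE can return incomplete data, here is a minimal sketch of the usual overlap rule: a read is only guaranteed to see the latest write when the replicas read plus the replicas written exceed the replication factor. The function names and the replication factor of 3 are assumptions made for illustration; the thread does not state the cluster's replication factor or whether CL.ONE was used for reads, writes, or both.

```python
def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: %r" % (level,))


def read_sees_latest_write(read_level, write_level, replication_factor):
    """True when the read and write replica sets are guaranteed to overlap."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


# Assumed replication factor of 3 (hypothetical, not taken from the thread).
rf = 3
for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    ok = read_sees_latest_write(read_level, write_level, rf)
    print("read=%s write=%s overlap guaranteed: %s" % (read_level, write_level, ok))
```

Under these assumptions, reading and writing at ONE gives no overlap guarantee, so a read shortly after a write may hit a replica that does not yet have the data, whereas QUORUM for both does guarantee overlap.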

Thanks.

2011/4/14 aaron morton <aaron@thelastpickle.com>

> This is going to be a bug in your code, so it's a bit tricky to know but...
>
> How and when is the email added to the DB?
> What does the rawEmail function do?
> Set a breakpoint: what are the two strings you are feeding into the hash
> functions?
>
> Aaron
> On 15 Apr 2011, at 03:50, pob wrote:
>
> Hello,
>
> I'm experiencing a really strange problem. I wrote data into a Cassandra
> cluster, and I'm trying to check that the data, once inserted and fetched
> back, is equal to the source data (a file). The code below is the Celery
> task that does the comparison with sha1(). The problem is that, from time
> to time, the Celery worker logs output like this during the comparison:
>
> [2011-04-14 17:24:33,225: INFO/PoolWorker-134]
> tasks.insertData[f377efdb-33a2-48f4-ab00-52b1898e216c]: [Error/EmailTest]
> Email corrupted.]
>
> If I execute the task code manually, the output is correct: [Email data
> test: OK].
>
> I thought the bug might be in multithreading, but I started the Celery
> workers with only one thread to rule that out.
>
>
> Another problem that occurs often is:
>
> [2011-04-14 12:46:49,682: INFO/PoolWorker-1] Connection 17810000 (IP:9160)
> in ConnectionPool (id = 15612176) failed: timed out
> [2011-04-14 12:46:49,844: INFO/PoolWorker-1] Connection 17810000 (IP:9160)
> in ConnectionPool (id = 15612176) failed: UnavailableException()
>
>
> I'm using pycassa connection pooling with the parameters pool_size=15
> (5 * number of nodes), max_retries=30, max_overflow=5, timeout=4.
>
>
> Any ideas where the problem might be? The client is pycassa 1.0.8, and I
> tried it with 1.0.6 too.
>
>
> Thanks
>
>
> Best,
> Peter
>
> #######
>
> import hashlib
>
> from celery.task import task
>
>
> @task(ignore_result=True)
> def checkData(key):
>     logger = insertData.get_logger()
>     logger.info("Reading email %s" % key)
>     logger.info("Task id %s" % checkData.request.id)
>
>     # Hash the source email as read from the file named by key.
>     with open(key, 'r') as f:
>         sEmail = f.read()
>     sHash = hashlib.sha1(sEmail).hexdigest()
>
>     # Fetch the same email from the DB (rawEmail is defined elsewhere)
>     # and hash it the same way.
>     email = rawEmail(key)
>     dHash = hashlib.sha1(email).hexdigest()
>
>     if sHash != dHash:
>         logger.info("[Error/EmailTest] Email corrupted.] < %s >" % key)
>     else:
>         logger.info("[Email data test: OK]")
>
>
>
