incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: pycassa + celery
Date Thu, 14 Apr 2011 20:16:53 GMT
This is most likely a bug in your code, so it's hard to diagnose from here, but...

How and when is the email added to the DB?
What does the rawEmail function do?
Set a breakpoint: what are the two strings you are feeding into the hash functions?
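
If a debugger is awkward inside a Celery worker, logging a few facts about the two strings gets you most of the way. This is only a rough sketch: describe_mismatch is a made-up helper name, not part of pycassa or Celery, and it is written for Python 3.

```python
def describe_mismatch(source, stored):
    """Return a short report on how two strings differ (hypothetical helper)."""
    report = {
        "source_type": type(source).__name__,
        "stored_type": type(stored).__name__,
        "source_len": len(source),
        "stored_len": len(stored),
        "first_diff": None,
    }
    # Walk both strings in step and remember the first offset that differs.
    for i, (a, b) in enumerate(zip(source, stored)):
        if a != b:
            report["first_diff"] = i
            break
    else:
        # The shared prefix is identical; a length mismatch means one
        # string is a truncated copy of the other.
        if len(source) != len(stored):
            report["first_diff"] = min(len(source), len(stored))
    return report
```

Logging the result next to the "Email corrupted" message would show whether the two inputs differ in type, in length, or at a particular offset (for instance a changed line ending).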

Aaron
On 15 Apr 2011, at 03:50, pob wrote:

> Hello,
> 
> I'm experiencing a really strange problem. I wrote data into a Cassandra cluster, and I'm trying to check whether the data inserted and then fetched is equal to the source data (a file). The code below is the Celery task that does the comparison with sha1(). The problem is that, from time to time during the comparison, the Celery worker returns output like this:
> 
> 2011-04-14 17:24:33,225: INFO/PoolWorker-134] tasks.insertData[f377efdb-33a2-48f4-ab00-52b1898e216c]: [Error/EmailTest] Email corrupted.]
> 
> If I execute the task code manually, the output is correct: [Email data test: OK].
> 
> I thought the bug might be in multithreading, but I start the Celery workers with only one thread to rule that out.
> 
> 
> Another problem that occurs often is:
> 
> [2011-04-14 12:46:49,682: INFO/PoolWorker-1] Connection 17810000 (IP:9160) in ConnectionPool (id = 15612176) failed: timed out
> [2011-04-14 12:46:49,844: INFO/PoolWorker-1] Connection 17810000 (IP:9160) in ConnectionPool (id = 15612176) failed: UnavailableException()
> 
> 
> I'm using pycassa connection pooling with the parameters pool_size=15 (5 * number of nodes), max_retries=30, max_overflow=5, timeout=4.
> 
> 
> Any ideas where the problem might be? The client is pycassa 1.0.8, and I tried it with 1.0.6 too.
> 
> 
> Thanks
> 
> 
> Best,
> Peter
> 
> #######
> 
> import hashlib
> 
> # `task`, `insertData` and `rawEmail` are defined elsewhere in the project (not shown).
> 
> @task(ignore_result=True)
> def checkData(key):
>     logger = insertData.get_logger()
>     logger.info("Reading email %s" % key)
>     logger.info("Task id %s" % checkData.request.id)
> 
>     # Hash the source file; binary mode avoids newline or encoding translation.
>     with open(key, 'rb') as f:
>         sEmail = f.read()
>     sHash = hashlib.sha1(sEmail).hexdigest()
> 
>     # Fetch the email from the DB and hash it the same way.
>     email = rawEmail(key)
>     dHash = hashlib.sha1(email).hexdigest()
> 
>     if sHash != dHash:
>         logger.info("[Error/EmailTest] Email corrupted.] < %s >" % key)
>     else:
>         logger.info("[Email data test: OK]")
> 
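
One thing the quoted task leaves open is whether both hashes are computed over the same kind of data: raw bytes from the file on one side, and whatever rawEmail returns on the other. The following is a minimal sketch of a byte-for-byte comparison, not the poster's code. The names sha1_of_file and matches_stored are hypothetical, the UTF-8 fallback is an assumption about how text might have been stored, and the code targets Python 3.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Hash a file's raw bytes without decoding or newline translation."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty byte string.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_stored(path, stored):
    """Compare a file on disk with the value fetched from the database."""
    if isinstance(stored, str):
        # Assumption: text values were stored as UTF-8; adjust if not.
        stored = stored.encode("utf-8")
    return sha1_of_file(path) == hashlib.sha1(stored).hexdigest()
```

If a check like this passes when run by hand but fails intermittently inside the worker, the difference is more likely in what was fetched (or when it was written) than in the hashing itself.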

