incubator-cassandra-user mailing list archives

From <cscetbon....@orange.com>
Subject Re: Hadoop jobs and data locality
Date Tue, 07 May 2013 09:43:50 GMT
I tried your quick workaround, but the job now takes much longer than before, even
though it runs 2 mappers in parallel. The reason is that there are now about 1000 map tasks.
Are you using vnodes? I haven't tried disabling them.

JobTracker status for job_201305070843_0001:

Kind     % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
map      7.64%        1025        945       2         78         0        0 / 0
reduce   2.53%        1           0         1         0          0        0 / 0




--
Cyril SCETBON

On May 5, 2013, at 8:45 AM, Shamim <srecon@yandex.ru> wrote:

Hello,
  We also came across this issue in our dev environment when we upgraded Cassandra from
1.1.5 to 1.2.1. I have mentioned it a few times on this list but haven't
received an answer yet. As a quick workaround, you can set pig.splitCombination to false in your
Pig script, but one of your tasks will then process a very large amount of data.
I can't figure out why this happens in the newer version of Cassandra. My guess is that something
goes wrong in Cassandra's implementation of LoadFunc or in Murmur3Partitioner.
Here are my earlier posts:
http://www.mail-archive.com/user@cassandra.apache.org/msg28016.html
http://www.mail-archive.com/user@cassandra.apache.org/msg29425.html
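
For readers trying the pig.splitCombination workaround, a minimal sketch of where the setting
could go in a Pig script follows. The keyspace and column family names are hypothetical
placeholders, and the sketch assumes the Cassandra connection settings (for example
PIG_INITIAL_ADDRESS and PIG_PARTITIONER) are already exported in the environment. It only
shows the setting and the load statement, not a full job.

```pig
-- Ask Pig not to combine small input splits into a single map task.
SET pig.splitCombination false;

-- 'MyKeyspace' and 'MyColumnFamily' are placeholder names, not from this thread.
rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily'
       USING org.apache.cassandra.hadoop.pig.CassandraStorage();
```

With split combination disabled, each input split reported by the storage handler is scheduled
as its own map task, so the number of map tasks can grow considerably.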

Any comments from the authors would be highly appreciated.
P.S. Please keep me posted on any solution or hints.

--
Best regards
  Shamim A.



03.05.2013, 19:25, "cscetbon.ext@orange.com" <cscetbon.ext@orange.com>:
Hi,
I'm using Pig to calculate the sum of a column from a column family (a scan of all rows), and
I've read at http://wiki.apache.org/cassandra/HadoopSupport that input data locality is supported.
However, when I execute my Pig script, Hadoop assigns only one mapper to the task instead of one
mapper on each node (replication factor = 1). FYI, I have 8 mappers available (2 per node).
Is there anything that could disable the data locality feature?

Thanks
--
Cyril SCETBON

_________________________________________________________________________________________________________________________
Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees
et ne doivent donc pas etre diffuses, exploites ou copies sans autorisation. Si vous avez
recu ce message par erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les
pieces jointes. Les messages electroniques etant susceptibles d'alteration, France Telecom
- Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
This message and its attachments may contain confidential or privileged information that may
be protected by law; they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message
and its attachments. As emails may be altered, France Telecom - Orange is not liable for messages
that have been modified, changed or falsified. Thank you.



