hadoop-common-user mailing list archives

From Matt Goeke <goeke.matt...@gmail.com>
Subject Re: Oozie apparent concurrency deadlocking
Date Thu, 15 Nov 2012 15:31:04 GMT
Andy,

Are you using the fair scheduler or the default FIFO scheduler? This
problem can be partially alleviated by routing the MR actions and the
launcher jobs to separate queues/pools. If both compete for the same
resources, you can run into a situation where all of the available slots
are taken up by launcher jobs, so the MR jobs they submit never get a
slot and the launchers wait on them forever: a permanent deadlock. I am
guessing, based on the numbers you gave, that your overall slot capacity
is small (fewer than 10 mappers total?), but if that isn't the case then
something else is probably going on as well. If you want to do this in a
sqoop node, the way to specify it is below:

<action name="sqoop-node">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${JOB_TRACKER}</job-tracker>
            <name-node>${NAME_NODE}</name-node>
            <prepare>
                <delete path="${NAME_NODE}/tmp/blah"/>
            </prepare>
            <configuration>
                <property>
                    <name>oozie.launcher.mapred.fairscheduler.pool</name>
                    <value>${LAUNCHER_POOL}</value>
                </property>
            </configuration>
...
</action>
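
As a rough sketch of the separation described above, the same
<configuration> block could also pin the MapReduce job that the Sqoop
action submits to its own pool. In Oozie, properties prefixed with
oozie.launcher. go to the launcher job and unprefixed ones go to the
action's own job. MR_POOL is a hypothetical workflow parameter used only
for illustration; it is not from the original message.

```xml
<!-- Sketch only: MR_POOL is a hypothetical parameter, not from the original message -->
<configuration>
    <!-- Pool for the Oozie launcher job (the property shown above) -->
    <property>
        <name>oozie.launcher.mapred.fairscheduler.pool</name>
        <value>${LAUNCHER_POOL}</value>
    </property>
    <!-- Pool for the MapReduce job that the Sqoop action itself submits -->
    <property>
        <name>mapred.fairscheduler.pool</name>
        <value>${MR_POOL}</value>
    </property>
</configuration>
```

This only helps if the launcher pool cannot grow to fill the whole
cluster, for example by capping it with maxRunningJobs in the fair
scheduler allocation file. With the FIFO or capacity scheduler, the
analogous properties would presumably be
oozie.launcher.mapred.job.queue.name and mapred.job.queue.name.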

I have also seen the oozie.service.CallableQueueService.callable.concurrency
property fix mentioned before, but I thought it applied only to Oozie's
internal control nodes (e.g. forks, decisions, etc.).
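
For reference, the override Andy describes would presumably look
something like the fragment below in oozie-site.xml. The value 5 is the
one from his message, and, as noted above, this setting may well have no
effect on the launcher deadlock.

```xml
<!-- oozie-site.xml fragment; the value is taken from the quoted message below -->
<property>
    <name>oozie.service.CallableQueueService.callable.concurrency</name>
    <value>5</value>
</property>
```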

Hope this helps

--
Matt


On Thu, Nov 15, 2012 at 8:44 AM, Kartashov, Andy <Andy.Kartashov@mpac.ca> wrote:

>  Guys,
>
>
>
> Have struggled for the last four days with this and still cannot find an
> answer even after hours of searching the web.
>
>
>
> I tried an Oozie workflow to execute my consecutive sqoop jobs in parallel.
> I use a fork that executes 9 sqoop-action-nodes.
>
>
>
> I had no problem executing the job on a pseudo-distributed cluster but
> with an added DN/TT node I ran into (what seems like) deadlocking.  Oozie
> web interface displays those jobs as “Running” indefinitely until I
> eventually kill the workflow.
>
>
>
> What I did notice was that if I reduce the number of
> sqoop-action-nodes to 3, all works fine.
>
>
>
> I read somewhere that the
> oozie.service.CallableQueueService.callable.concurrency property is set
> to 3 by default, which hinted to me that this must be it. I tried to
> override this property by increasing the value to 5 in oozie-site.xml,
> restarted the Oozie server, and then ran 4 sqoop-action-nodes in a fork,
> but the result was the same: 2 out of 4 nodes execute successfully (not
> in the same order every time) but the other 2 hang in “Running…”
> indefinitely.
>
>
>
> There were some suggestions about changing the queue name from the
> default, but nothing was clear as to what to change it to and where.
>
>
>
> In case someone found a solution to this please do share. I will greatly
> appreciate it.
>
>
>
> Thanks,
>
> AK47
