hadoop-common-user mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: Using REST to get ApplicationMaster info (Issue solved)
Date Thu, 26 Jul 2012 18:29:19 GMT
OK, I think I understand it now.  You probably have ACLs enabled, but no
web filter on the RM to let you sign in as a given user.  As a result, the
default filter makes you Dr. Who (or whoever the static user is set to), and
the ACL check in the web service rejects Dr. Who because that is not the
correct user.  You will probably run into this issue again if anyone other
than you runs something.  You could fix this either by disabling the ACL
check, which makes a lot of sense for a cluster without security, or by
implementing a servlet Filter for the RM that would let you sign on as
a given user.

--Bobby Evans
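
For reference, the two workarounds discussed in this thread can be sketched as
configuration fragments. This is only a sketch under stated assumptions. The
first property and its value are the ones reported as working in the quoted
message below. The second assumes that the ACL check is controlled by the
yarn.acl.enable property in yarn-site.xml; that property name is not mentioned
anywhere in this thread, so please verify it against the yarn-default.xml of
your own version before relying on it.

```xml
<!-- core-site.xml: the user that the default static web filter assigns to
     unauthenticated HTTP requests (defaults to dr.who). 'prajakta' is the
     value used in this thread; substitute the user who owns the application. -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>prajakta</value>
</property>

<!-- yarn-site.xml: ASSUMPTION, not taken from this thread. yarn.acl.enable is
     assumed here to be the switch that turns the ACL check off. -->
<property>
  <name>yarn.acl.enable</name>
  <value>false</value>
</property>
```

Either change would probably need a restart of the affected daemons to take
effect. Setting a static user only helps while a single user owns every
application, which is the limitation noted in the message above.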


On 7/26/12 12:48 AM, "Prajakta Kalmegh" <pkalmegh@gmail.com> wrote:

>Hi Bobby
>
>Thanks for the reply. My REST calls have been working fine since I set the
>'hadoop.http.staticuser.user' property to 'prajakta' instead of dr.who in
>core-site.xml. I didn't get time to figure out the reason behind it, as I
>just moved on to further coding :)
>
>Thanks,
>Prajakta
>
>
>
>On Thu, Jul 26, 2012 at 1:40 AM, Robert Evans <evans@yahoo-inc.com> wrote:
>
>> Hmm, that is very odd.  It only checks the user if security is enabled to
>> warn the user about potentially accessing something unsafe.  I am not sure
>> why that would cause an issue.
>>
>> --Bobby Evans
>>
>> On 7/9/12 6:07 AM, "Prajakta Kalmegh" <pkalmegh@gmail.com> wrote:
>>
>> >Hi Robert
>> >
>> >I figured out the problem just now. To avoid the below error, I had to set
>> >the 'hadoop.http.staticuser.user' property in core-site.xml (defaults to
>> >dr.who). I can now get runtime data from AppMaster using *curl* as well as
>> >in GUI.
>> >
>> >I wonder if we have to set this property even when we are not specifying
>> >the yarn web-proxy address (when it runs as part of RM by default) as well.
>> >If yes, was it documented somewhere which I failed to see? :(
>> >
>> >Anyways, thanks for your response so far.
>> >
>> >Regards,
>> >Prajakta
>> >
>> >
>> >
>> >On Mon, Jul 9, 2012 at 3:29 PM, Prajakta Kalmegh <pkalmegh@gmail.com> wrote:
>> >
>> >> Hi Robert
>> >>
>> >> I started the proxyserver explicitly by specifying a value for the
>> >> yarn.web-proxy.address in yarn-site.xml. The proxyserver did start and I
>> >> tried getting the JSON response using the following command:
>> >>
>> >> curl --compressed -H "Accept: application/json" -X GET \
>> >>   "http://localhost:8090/proxy/application_1341823967331_0001/ws/v1/mapreduce/jobs/job_1341823967331_0001"
>> >>
>> >> However, it refused connection and below is the excerpt from the
>> >> Proxyserver logs:
>> >> ---------
>> >> 2012-07-09 14:26:40,402 INFO org.mortbay.log: Extract jar:file:/home/prajakta/Projects/IRL/hadoop-common/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-yarn-common-3.0.0-SNAPSHOT.jar!/webapps/proxy to /tmp/Jetty_localhost_8090_proxy____.ak3o30/webapp
>> >> 2012-07-09 14:26:40,992 INFO org.mortbay.log: Started SelectChannelConnector@localhost:8090
>> >> 2012-07-09 14:26:40,993 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.webproxy.WebAppProxy is started.
>> >> 2012-07-09 14:26:40,993 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer is started.
>> >> 2012-07-09 14:33:26,039 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://prajakta:44314/ws/v1/mapreduce/jobs/job_1341823967331_0001 which is the app master GUI of application_1341823967331_0001 owned by prajakta
>> >> 2012-07-09 14:33:29,277 INFO org.apache.commons.httpclient.HttpMethodDirector: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server prajakta failed to respond
>> >> 2012-07-09 14:33:29,277 INFO org.apache.commons.httpclient.HttpMethodDirector: Retrying request
>> >> 2012-07-09 14:33:29,284 WARN org.mortbay.log: /proxy/application_1341823967331_0001/ws/v1/mapreduce/jobs/job_1341823967331_0001: java.net.SocketException: Connection reset
>> >> 2012-07-09 14:37:33,834 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://prajakta:19888/jobhistory/job/job_1341823967331_0001/jobhistory/job/job_1341823967331_0001 which is the app master GUI of application_1341823967331_0001 owned by prajakta
>> >> ---------------
>> >>
>> >> I am not sure why the http request object is setting my remoteUser to
>> >> dr.who. :(
>> >>
>> >> I gather from <https://issues.apache.org/jira/browse/MAPREDUCE-2858> that
>> >> this warning is posted only in case where security is disabled. I assume
>> >> that the proxy server is not disabled if security is disabled.
>> >>
>> >> Any idea what could be the reason for this I/O exception? Am I missing
>> >> setting any property for proper access? Please let me know.
>> >>
>> >> Regards,
>> >> Prajakta
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Fri, Jul 6, 2012 at 10:59 PM, Prajakta Kalmegh <pkalmegh@gmail.com> wrote:
>> >>
>> >>> I am using hadoop trunk (forked from github). It supports RESTful APIs as
>> >>> I am able to retrieve JSON objects for RM (cluster/nodes info) +
>> >>> Historyserver. The only issue is with AppMaster REST API.
>> >>>
>> >>> Regards,
>> >>> Prajakta
>> >>>
>> >>>
>> >>>
>> >>> On Fri, Jul 6, 2012 at 10:55 PM, Robert Evans <evans@yahoo-inc.com> wrote:
>> >>>
>> >>>> What version of hadoop are you using?  It could be that the version you
>> >>>> have does not have the RESTful APIs in it yet, and the proxy is working
>> >>>> just fine.
>> >>>>
>> >>>> --Bobby Evans
>> >>>>
>> >>>> On 7/6/12 12:06 PM, "Prajakta Kalmegh" <pkalmegh@gmail.com> wrote:
>> >>>>
>> >>>> >Robert, thanks for the response. If I do not provide any explicit
>> >>>> >configuration for the proxy server, do I still need to start it using
>> >>>> >'yarn start proxy server'? I am currently not doing it.
>> >>>> >
>> >>>> >Also, I am able to access the html page for proxy using the
>> >>>> ><http://localhost:8088/proxy/{appid}/mapreduce/jobs> URL. (Note this url
>> >>>> >does not have the '/ws/v1/' part in it.) I get the html response when I
>> >>>> >query for this URL at runtime.
>> >>>> >
>> >>>> >So I assume the proxy server must be starting fine since I am able to
>> >>>> >access this URL. I will try logging more details tomorrow from my office
>> >>>> >machine and will let you know the result.
>> >>>> >
>> >>>> >Regards,
>> >>>> >Prajakta
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> >On Fri, Jul 6, 2012 at 10:22 PM, Robert Evans <evans@yahoo-inc.com> wrote:
>> >>>> >
>> >>>> >> Sorry I did not respond sooner.  The default behavior is to have the proxy
>> >>>> >> server run as part of the RM.  I am not really sure why it is not doing
>> >>>> >> this in your case.  If you set the config yourself to be a URI that is
>> >>>> >> different from that of the RM then you need to launch a standalone proxy
>> >>>> >> server.  You can do this by running
>> >>>> >>
>> >>>> >> yarn start proxy server
>> >>>> >>
>> >>>> >> Without sitting down with you it is going to be somewhat difficult to
>> >>>> >> debug why this is happening.  However, in retrospect it would be nice to
>> >>>> >> add in some extra logging to help indicate why the proxy server is not
>> >>>> >> functioning as desired.  If you could file a JIRA to add in the logging I
>> >>>> >> would be happy to provide a patch to you and we can try and debug the
>> >>>> >> issue further.  Please file it under the MAPREDUCE JIRA project.
>> >>>> >>
>> >>>> >> --Bobby
>> >>>> >>
>> >>>> >> On 7/6/12 3:29 AM, "Prajakta Kalmegh" <pkalmegh@gmail.com> wrote:
>> >>>> >>
>> >>>> >> >Re-posting as I haven't got a solution yet. Sorry for spamming. I won't be
>> >>>> >> >able to proceed in my code until I get a JSON response using AppMaster REST
>> >>>> >> >URL. :(
>> >>>> >> >
>> >>>> >> >Thanks,
>> >>>> >> >Prajakta
>> >>>> >> >
>> >>>> >> >
>> >>>> >> >On Wed, Jul 4, 2012 at 5:55 PM, Prajakta Kalmegh <pkalmegh@gmail.com> wrote:
>> >>>> >> >
>> >>>> >> >> Hi Robert/Harsh
>> >>>> >> >>
>> >>>> >> >> Thanks for your reply.
>> >>>> >> >>
>> >>>> >> >> My RM is starting just fine. The problem is with the use of
>> >>>> >> >> http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce
>> >>>> >> >> to get the JSON response.
>> >>>> >> >>
>> >>>> >> >> As I said before, I had not configured the yarn.web-proxy.address
>> >>>> >> >> property in yarn-site.xml. I assumed it would use the RM's
>> >>>> >> >> yarn.resourcemanager.webapp.address property value as default. However,
>> >>>> >> >> it gives me a '404-Page not found error'.  Today I tried specifying a
>> >>>> >> >> value explicitly for the yarn.web-proxy.address property.
>> >>>> >> >>
>> >>>> >> >> On running the wordcount example, it even gives a url
>> >>>> >> >> <http://localhost:8090/proxy/{appid}/> to track the App Master info.
>> >>>> >> >> However, I am still not able to get a json response.
>> >>>> >> >>
>> >>>> >> >> Also, I tried to get the data from historyserver instead of runtime
>> >>>> >> >> using the instructions given on page
>> >>>> >> >> <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html>
>> >>>> >> >>
>> >>>> >> >> HistoryServer REST response does not give me jobids corresponding to an
>> >>>> >> >> application. It just lists all the jobs run until now. By the way, the
>> >>>> >> >> documentation does say
>> >>>> >> >>
>> >>>> >> >> ----------
>> >>>> >> >>
>> >>>> >> >> "Both of the following URI's give you the history server information,
>> >>>> >> >> from an application id identified by the appid value.
>> >>>> >> >>   * http://<history server http address:port>/ws/v1/history
>> >>>> >> >>   * http://<history server http address:port>/ws/v1/history/info"
>> >>>> >> >> ---------
>> >>>> >> >>
>> >>>> >> >> But there is no provision to specify the application id with these REST
>> >>>> >> >> URLs.
>> >>>> >> >>
>> >>>> >> >> Any idea how I can get the Application Master REST working and also
>> >>>> >> >> link jobids to application id using the HistoryServer REST API?
>> >>>> >> >>
>> >>>> >> >> Any help is appreciated. Thanks in advance.
>> >>>> >> >> Regards,
>> >>>> >> >> Prajakta
>> >>>> >> >>
>> >>>> >> >>
>> >>>> >> >>
>> >>>> >> >>
>> >>>> >> >> On Fri, Jun 29, 2012 at 8:55 PM, Robert Evans <evans@yahoo-inc.com> wrote:
>> >>>> >> >>
>> >>>> >> >>> Please don't file that JIRA.  The proxy server is intended to front the
>> >>>> >> >>> web server for all calls to the AM.  This is so you only have to go to a
>> >>>> >> >>> single location to get to any AM's web service.  The proxy server is a
>> >>>> >> >>> very simple proxy and just forwards the extra part of the path on to the
>> >>>> >> >>> AM.
>> >>>> >> >>>
>> >>>> >> >>> If you are having issues with this please include the version you are
>> >>>> >> >>> having problems with.  Also please look at the logs for the RM on startup
>> >>>> >> >>> to see if there is anything there indicating why it is not starting up.
>> >>>> >> >>>
>> >>>> >> >>> --Bobby Evans
>> >>>> >> >>>
>> >>>> >> >>> On 6/28/12 9:46 AM, "Harsh J" <harsh@cloudera.com> wrote:
>> >>>> >> >>>
>> >>>> >> >>> >As far as I can tell, the MR WebApp, as the name itself indicates on
>> >>>> >> >>> >its doc page, starts only at the MR AM (which may be running at any
>> >>>> >> >>> >NM), and it starts on an ephemeral port, logged in the AM logs
>> >>>> >> >>> >usually as:
>> >>>> >> >>> >
>> >>>> >> >>> >INFO Web app /mapreduce started at [PORT]
>> >>>> >> >>> >
>> >>>> >> >>> >That it starts its own server with an ephemeral access point makes
>> >>>> >> >>> >sense, since each job uses its own AM and having a common location may
>> >>>> >> >>> >not work with the form of REST API documented at your link. Can you
>> >>>> >> >>> >please file a JIRA to fix the doc and remove the proxy server refs,
>> >>>> >> >>> >which are misleading?
>> >>>> >> >>> >
>> >>>> >> >>> >Do correct me if I'm wrong.
>> >>>> >> >>> >
>> >>>> >> >>> >On Thu, Jun 28, 2012 at 6:13 PM, Prajakta Kalmegh <pkalmegh@gmail.com> wrote:
>> >>>> >> >>> >> Hi
>> >>>> >> >>> >>
>> >>>> >> >>> >> I am trying to get the ApplicationMaster info using the
>> >>>> >> >>> >> <http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/info>
>> >>>> >> >>> >> link as described on the
>> >>>> >> >>> >> <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html>
>> >>>> >> >>> >> page.
>> >>>> >> >>> >>
>> >>>> >> >>> >> I am able to access and retrieve JSON response for other modules
>> >>>> >> >>> >> (ResourceManager, NodeManager and HistoryServer). However, I am getting
>> >>>> >> >>> >> 'Page not found' when I try to use my ResourceManager Http address to
>> >>>> >> >>> >> access the ApplicationMaster info. I am using
>> >>>> >> >>> >> <http://localhost:8088/proxy/{appid}/ws/v1/mapreduce/info> to retrieve
>> >>>> >> >>> >> JSON response.
>> >>>> >> >>> >>
>> >>>> >> >>> >> The instructions say "The application master should be accessed via the
>> >>>> >> >>> >> proxy. This proxy is configurable to run either on the resource manager or
>> >>>> >> >>> >> on a separate host."
>> >>>> >> >>> >>
>> >>>> >> >>> >> My yarn-default.xml contains:
>> >>>> >> >>> >>  <property>
>> >>>> >> >>> >>    <description>The address for the web proxy as HOST:PORT, if this is not
>> >>>> >> >>> >>     given then the proxy will run as part of the RM</description>
>> >>>> >> >>> >>     <name>yarn.web-proxy.address</name>
>> >>>> >> >>> >>     <value/>
>> >>>> >> >>> >>  </property>
>> >>>> >> >>> >>
>> >>>> >> >>> >> and I did not set a value explicitly in yarn-site.xml. Any idea how I can
>> >>>> >> >>> >> get this working? Thanks in advance.
>> >>>> >> >>> >>
>> >>>> >> >>> >> Regards,
>> >>>> >> >>> >> Prajakta
>> >>>> >> >>> >
>> >>>> >> >>> >
>> >>>> >> >>> >
>> >>>> >> >>> >--
>> >>>> >> >>> >Harsh J
>> >>>> >> >>>
>> >>>> >> >>>
>> >>>> >> >>
>> >>>> >>
>> >>>> >>
>> >>>>
>> >>>>
>> >>>
>> >>
>>
>>

