hadoop-common-user mailing list archives

From Steve Lewis <lordjoe2...@gmail.com>
Subject Re: Querying a Prolog Server from a JVM during a MapReduce Job
Date Tue, 16 Apr 2013 21:01:14 GMT
Assuming that the server can handle high volume and multiple queries, there
is no reason not to run it on a large, powerful machine outside the
cluster. Nothing prevents your mappers from accessing a server or even,
depending on the design, a custom InputFormat from pulling data from the
server.
I would not try to run copies of the server on datanodes without a very
compelling reason.
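
As a rough sketch of the mapper-side approach (PrologClient below is a
stand-in for whatever Java-to-Prolog API you are using, e.g. JPL or
InterProlog, so its constructor and methods are illustrative, not a real
API): open one connection per task in setup(), reuse it for every record,
and close it in cleanup().

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PrologLookupMapper extends Mapper<LongWritable, Text, Text, Text> {

    private PrologClient prolog; // hypothetical wrapper around your Prolog Java API

    @Override
    protected void setup(Context context) throws IOException {
        // One connection per mapper JVM, created once rather than per record.
        String host = context.getConfiguration().get("prolog.server.host");
        int port = context.getConfiguration().getInt("prolog.server.port", 5000);
        prolog = new PrologClient(host, port);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Read-only lookup against the external server; the answer is
        // emitted alongside the original record.
        String answer = prolog.query(value.toString());
        context.write(value, new Text(answer));
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        prolog.close(); // release the connection when the task finishes
    }
}

The point is just that the connection lifecycle lives in setup()/cleanup(),
so a job processing millions of records does not open millions of
connections.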


On Tue, Apr 16, 2013 at 1:31 PM, Robert Spurrier
<spurrier.robert@gmail.com> wrote:

> Hello!
>
> I'm working on a research project, and I also happen to be relatively new
> to Hadoop/MapReduce. So apologies ahead of time for any glaring errors.
>
> On my local machine, my project runs within a JVM and uses a Java API to
> communicate with a Prolog server to do information lookups. I was planning
> on deploying my project as the mapper during the MR job, but I am unclear
> on how I would access the Prolog server at runtime. Would it be OK to
> just let the server live and run on each data node while my job is running,
> and have each mapper hit the server on its respective node? (Let's assume
> the server can handle the high volume of queries from the mappers.)
>
> I am not even remotely aware of what types of issues will arise when the
> mappers (each in its own JVM/process) query the Prolog server (running
> in its own single, separate process on each node). They will only be
> querying data from the server, not deleting/updating.
>
>
> Is there anything that would make this impossible, or anything I should
> be looking out for?
>
> Thanks
> -Robert
>


-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
