hadoop-general mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Moving files from JBoss server to HDFS
Date Sun, 13 May 2012 11:03:06 GMT
Please do not use the general@ address for user-oriented questions.
You previously asked the same on another list and many people did
respond to your question promptly. You can find an archive of that at

On Sun, May 13, 2012 at 1:48 PM, financeturd financeturd
<financeturd@yahoo.com> wrote:
> Hello,
> We have a large number of custom-generated files (not just web logs) that we need to move
> from our JBoss servers to HDFS.  Our first implementation ran a cron job every 5 minutes
> to move our files from the "output" directory to HDFS.
> Is this recommended?  We are being told by our IT team that our JBoss servers should
> not have access to HDFS for security reasons.  The files must be "sucked" to HDFS by other
> servers that do not accept traffic from the outside.  In essence, they are asking for a
> layer of indirection.  Instead of:
> {JBoss server} --> {HDFS}
> it's being requested that it look like:
> {Separate server} <-- {JBoss server}
> and then
> {Separate server} --> HDFS
> While I understand in principle
> what is being said, the security of having processes on JBoss servers
> writing files to HDFS doesn't seem any worse than having JBoss servers
> access a central database, which they do.
> Can anyone comment on what a recommended approach would be?  Should our JBoss servers
> push their data to HDFS or should the data be pulled by another server and then placed
> into HDFS?
> Thank you!
> FT

Harsh J
