flume-user mailing list archives

From burakkk <burak.isi...@gmail.com>
Subject Re: Transferring another server using flume
Date Thu, 30 Jan 2014 00:34:06 GMT
Hi Ed,
Syslog isn't available on the remote machines, and we want to avoid
installing any application or library on them if at all possible. I have to pull
data from the remote servers without depending on anything on the remote side.

The problem with rsync is that the remote servers generate so many small files
that rsync gets stuck at some point. It doesn't fail; it just sits there
waiting, doing nothing. So the problem is in getting the files from the
remote servers, not in loading them into HDFS.

After a brief review of Flume, using Scribe + Flume may solve my problem.
What do you think?

Best regards...

On Thu, Jan 30, 2014 at 1:58 AM, ed <edorsey@gmail.com> wrote:

> Hi Burak,
> Do the machines with the logs on them have syslog available (e.g.,
> rsyslog for RedHat/CentOS)? Can the remote servers do any kind of push, or
> do you have to pull data from them? If you have a syslog daemon
> available on the remote servers, then I would try configuring those to send
> the logs to the Flume multiport syslog TCP source.
> With regard to pulling data from the remote servers, what part of rsync is
> causing issues (assuming you're using rsync to pull data)? Is the problem
> with rsync itself, in getting the files from the remote servers,
> or is it related to getting the files into HDFS once you've pulled
> them to the main server? If the problem is related to getting the
> files into HDFS, you could try using the Spooling Directory Source and point
> it at the directory on your main server where you are aggregating the logs
> via rsync.
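
The Spooling Directory Source setup suggested above could look roughly like the following. This is only a minimal sketch, not a tested configuration: the agent name (agent), the component names (spool, ch, toHdfs), and both paths are placeholders invented for illustration, and it assumes a Flume 1.x agent running on the main server with the Hadoop client configuration on its classpath.

```properties
# Hypothetical agent "agent"; component names and all paths are placeholders.
agent.sources = spool
agent.channels = ch
agent.sinks = toHdfs

# Spooling Directory Source: watches the directory the pulled logs land in.
# Files must be complete and must not change after they appear here.
agent.sources.spool.type = spooldir
agent.sources.spool.spoolDir = /data/flume/spool
agent.sources.spool.channels = ch

# File channel: buffers events on local disk between source and sink.
agent.channels.ch.type = file

# HDFS sink: writes the events as plain text under the given HDFS path.
agent.sinks.toHdfs.type = hdfs
agent.sinks.toHdfs.channel = ch
agent.sinks.toHdfs.hdfs.path = /flume/logs
agent.sinks.toHdfs.hdfs.fileType = DataStream
agent.sinks.toHdfs.hdfs.writeFormat = Text
```

An agent with this file would typically be started with something like flume-ng agent --conf conf --conf-file spool.conf --name agent. Two caveats: with the default settings the source turns each line of a file into one event, so the files written to HDFS will not match the original files one to one; and because the source expects finished, immutable files, it is probably safer to have rsync write into a separate staging directory and move completed files into the spool directory.
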
> Best,
> Ed
> On Wed, Jan 29, 2014 at 11:24 PM, burakkk <burak.isikli@gmail.com> wrote:
>> Hi folks,
>> I have a question about flume-ng. Several different machines generate log
>> files. These log files are small (around 4-5 MB per file). I want to pull
>> these files from the remote servers into a specific directory on my main
>> server and then put them into HDFS. I can't install
>> any kind of application on the remote servers, so I can't use the Avro
>> or Thrift sources.
>> For now I use rsync to sync files between the two machines and put
>> them into HDFS using HDFS file commands such as hdfs dfs -put. But there are some
>> issues with rsync.
>> In order to solve this problem, what kind of source should I use and how
>> can I do that?
>> Thanks
>> Best Regards...
>> --
>> *BURAK ISIKLI* | *http://burakisikli.wordpress.com
>> <http://burakisikli.wordpress.com>*


*BURAK ISIKLI* | *http://burakisikli.wordpress.com
