hadoop-common-user mailing list archives

From Rasit OZDAS <rasitoz...@gmail.com>
Subject Re: Using HDFS for common purpose
Date Thu, 29 Jan 2009 09:03:04 GMT
Today Nitesh answered a similar thread, and his answer was what I wanted to
learn.
I'm posting it here to help others who have the same question.

HDFS is a file system for distributed storage, typically used for distributed
computing scenarios on Hadoop. For office use you will need a SAN
(Storage Area Network), an architecture that attaches remote computer storage
devices to servers in such a way that, to the operating system, the devices
appear as locally attached. Or you can even go for Amazon S3, if the data is
really authentic. For an open-source solution related to SAN, you can go with
any of the Linux server distributions (e.g. RHEL, SuSE) or Solaris (ZFS +
zones); perhaps the best plug-and-play solution (non-open-source) would be a
Mac Server + Xsan.

--nitesh

Thanks,
Rasit
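
For the get-and-put-only usage discussed in this thread, here is a minimal
sketch, assuming the `hadoop` command-line client is installed and on the PATH.
It only wraps the `hadoop fs -put` and `hadoop fs -get` shell commands; the
helper names (put_command, get_command, run) and any paths passed to them are
hypothetical and not taken from the thread.

```python
import subprocess


def put_command(local_path, hdfs_path):
    """Build the argv that copies a local file into HDFS."""
    return ["hadoop", "fs", "-put", local_path, hdfs_path]


def get_command(hdfs_path, local_path):
    """Build the argv that copies a file from HDFS to local disk."""
    return ["hadoop", "fs", "-get", hdfs_path, local_path]


def run(argv):
    """Run a command; raises CalledProcessError on a non-zero exit code.

    Assumption: the `hadoop` client is on the PATH and is configured to
    reach the cluster.
    """
    subprocess.run(argv, check=True)
```

Storing a file would then be run(put_command("report.pdf", "/office/report.pdf"))
and fetching it back run(get_command("/office/report.pdf", "report-copy.pdf")).
Each call starts a new client process, so this may suit occasional transfers
better than latency-sensitive access.
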

2009/1/28 Rasit OZDAS <rasitozdas@gmail.com>

> Thanks for responses,
>
> Sorry, I made a mistake: what I wanted is actually not a db. We need
> simple storage for files. Only get and put commands are enough (no queries
> needed). We don't even need append, chmod, etc.
>
> Probably from a thread on this list, I came across a link to a KFS-HDFS
> comparison:
> http://deliberateambiguity.typepad.com/blog/2007/10/advantages-of-k.html
>
> It's good that KFS is written in C++, but handling errors in C++ is
> usually more difficult.
> I need your opinion about which one would fit best.
>
> Thanks,
> Rasit
>
> 2009/1/27 Jim Twensky <jim.twensky@gmail.com>
>
>> You may also want to have a look at this to reach a decision based on your
>> needs:
>>
>> http://www.swaroopch.com/notes/Distributed_Storage_Systems
>>
>> Jim
>>
>> On Tue, Jan 27, 2009 at 1:22 PM, Jim Twensky <jim.twensky@gmail.com>
>> wrote:
>>
>> > Rasit,
>> >
>> > What kind of data will you be storing on HBase or directly on HDFS? Do you
>> > aim to use it as a data source to do some key/value lookups for small
>> > strings/numbers or do you want to store larger files labeled with some sort
>> > of a key and retrieve them during a map reduce run?
>> >
>> > Jim
>> >
>> >
>> > On Tue, Jan 27, 2009 at 11:51 AM, Jonathan Gray <jlist@streamy.com> wrote:
>> >
>> >> Perhaps what you are looking for is HBase?
>> >>
>> >> http://hbase.org
>> >>
>> >> HBase is a column-oriented, distributed store that sits on top of HDFS and
>> >> provides random access.
>> >>
>> >> JG
>> >>
>> >> > -----Original Message-----
>> >> > From: Rasit OZDAS [mailto:rasitozdas@gmail.com]
>> >> > Sent: Tuesday, January 27, 2009 1:20 AM
>> >> > To: core-user@hadoop.apache.org
>> >> > Cc: arif.yilmaz@uzay.tubitak.gov.tr; emre.gurbuz@uzay.tubitak.gov.tr;
>> >> > hilal.tarakci@uzay.tubitak.gov.tr; serdar.arslan@uzay.tubitak.gov.tr;
>> >> > hakan.kocakulak@uzay.tubitak.gov.tr; caglar.bilir@uzay.tubitak.gov.tr
>> >> > Subject: Using HDFS for common purpose
>> >> >
>> >> > Hi,
>> >> > I wanted to ask whether HDFS is a good solution just as a distributed db
>> >> > (no running jobs, only get and put commands).
>> >> > A review says that "HDFS is not designed for low latency" and besides,
>> >> > it's implemented in Java.
>> >> > Do these disadvantages prevent us from using it?
>> >> > Or could somebody suggest a better (faster) one?
>> >> >
>> >> > Thanks in advance..
>> >> > Rasit
>> >>
>> >>
>> >
>>
>
>
>
> --
> M. Raşit ÖZDAŞ
>



-- 
M. Raşit ÖZDAŞ
