hbase-user mailing list archives

From Ryan Smith <ryan.justin.sm...@gmail.com>
Subject Re: Crawling Using HBase as a back end --Issue
Date Sun, 26 Apr 2009 13:31:11 GMT
Derek Pappas,
If you are using Heritrix to crawl, you can try using hbase-writer to write
the crawled output to HBase (instead of ARC files).
http://code.google.com/p/hbase-writer/

On Sat, Apr 25, 2009 at 11:41 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:

> Man I really understand the frustration when something just
> _does_not_work_, especially if advertised to work so.
>
> But the thing to remember here is that hbase is cutting edge database stuff -
> highly clustered distributed databases are not super straightforward.  For
> example, take Oracle RAC - you won't be able to get that up and running at
> any interesting performance levels without paying Oracle or a highly
> experienced Oracle DBA to tune it just right.
>
> So, considering how few tuning parameters hbase has, and how scalable it is,
> I think it's a great deal for the price.
>
> On Sat, Apr 25, 2009 at 8:19 PM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> >
> > Right, well "hbase did not work" with no details as to why
> > does not help us to improve it. Please kindly consider
> > asking your colleague to forward details at your and/or
> > that person's convenience. Also, for future reference, HBase
> > has a responsive developer community and could have likely
> > helped for only the cost of time to file a bug report and
> > respond to inquiries for more information.
> >
> >   - Andy
> >
> > > From: Derek Pappas
> > > Subject: Re: Crawling Using HBase as a back end --Issue
> > > Date: Thursday, April 23, 2009, 11:35 PM
> > > Someone else in the company knows the details. Sorry did not
> > > mean to pan hbase. We are a very small startup and needed to
> > > get a prototype (version 2) working. We tried using hbase
> > > back in the Dec/Jan time frame.
