hadoop-common-user mailing list archives

From "Sebastien Rainville" <srainvi...@brightspark.com>
Subject hbase + nutch
Date Fri, 09 Nov 2007 16:54:27 GMT
Has anybody tried to use Nutch and HBase together yet?


Basically, I need to store structured information extracted from web
pages. Saving that data in a database like MySQL would work as a
temporary option, but in the long term the amount of information will
grow fast and I'll need a more scalable system. That's where HBase comes
into play. The next logical move would then be to modify Nutch to save
the pages in HBase as well, which would make the system very flexible.
Is that what you guys have in mind for the future of Nutch?


But for now, Nutch is not integrated with HBase... I can still write
Nutch extensions that save the structured data I need into HBase.
Is there a way to make the two interact smoothly? The first obvious
problem I have is that they are built on different versions of Hadoop.
Is there a good way of dealing with that?

