hadoop-common-user mailing list archives

From Mayuran Yogarajah <mayuran.yogara...@casalemedia.com>
Subject Re: indexing log files for adhoc queries - suggestions?
Date Thu, 01 Oct 2009 18:53:13 GMT
ishwar ramani wrote:
> Hi,
> I have a setup where logs are periodically bundled up and dumped into
> Hadoop DFS as a large sequence file.
> It works fine for all my map reduce jobs.
> Now I need to handle ad hoc queries for pulling out logs based on user
> and time range.
> I really don't need a full indexer (like Lucene) for this purpose.
> My first thought is to run a periodic mapreduce to generate a large
> text file sorted by user id.
> The text file will have (sequence file name, offset) to retrieve the logs ....
> I am guessing many of you ran into similar requirements... Any
> suggestions on doing this better?
> ishwar
Have you looked into Hive? It's perfect for ad hoc queries.
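
For comparison, the sorted-index idea from the quoted message could look roughly like the sketch below. It is only an in-memory illustration, not a MapReduce job and not anything Hive provides: the names `IndexEntry`, `build_index`, and `lookup`, the integer epoch timestamps, and the file names in the usage note are all assumptions made up for the example. Entries are sorted by (user id, timestamp), so a user and time range query becomes two binary searches.

```python
import bisect
from typing import List, NamedTuple, Tuple


class IndexEntry(NamedTuple):
    # Field order matters: tuples sort by user_id first, then timestamp.
    user_id: str
    timestamp: int   # assumed: epoch seconds of the log record
    seq_file: str    # hypothetical sequence file name
    offset: int      # byte offset of the record within that file


def build_index(entries: List[IndexEntry]) -> List[IndexEntry]:
    """Sort entries by (user_id, timestamp), as the periodic job would."""
    return sorted(entries)


def lookup(index: List[IndexEntry], user_id: str,
           start_ts: int, end_ts: int) -> List[Tuple[str, int]]:
    """Return (seq_file, offset) pairs for user_id with
    start_ts <= timestamp <= end_ts, using binary search."""
    # A shorter tuple sorts before any longer tuple sharing its prefix,
    # so (user_id, start_ts) lands on the first matching entry.
    lo = bisect.bisect_left(index, (user_id, start_ts))
    # end_ts + 1 makes the upper bound inclusive for integer timestamps.
    hi = bisect.bisect_left(index, (user_id, end_ts + 1))
    return [(e.seq_file, e.offset) for e in index[lo:hi]]
```

Usage would be something like `lookup(build_index(entries), "alice", 100, 300)`, which returns the file names and offsets to seek to. On real data the sorted index would live in a file on DFS rather than in a Python list, and the same binary search would run over that file.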

