hadoop-common-user mailing list archives

From Mayuran Yogarajah <mayuran.yogara...@casalemedia.com>
Subject Re: indexing log files for adhoc queries - suggestions?
Date Thu, 01 Oct 2009 18:53:13 GMT
ishwar ramani wrote:
> Hi,
>
> I have a setup where logs are periodically bundled up and dumped into
> Hadoop DFS as a large sequence file.
>
> It works fine for all my map reduce jobs.
>
> Now I need to handle ad hoc queries for pulling out logs based on user
> and time range.
>
> I really don't need a full indexer (like Lucene) for this purpose.
>
> My first thought is to run a periodic MapReduce job to generate a large
> text file sorted by user id.
>
> The text file will have (sequence file name, offset) to retrieve the logs ....
>
>
> I am guessing many of you have run into similar requirements... Any
> suggestions on doing this better?
>
> ishwar
>   
Have you looked into Hive? It's perfect for ad hoc queries.
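
For comparison, the hand-rolled index described in the quoted message could look roughly like the sketch below. It is only an illustration of the idea, not anything from Hadoop or Hive: the names IndexEntry, build_index and lookup are hypothetical, the index is held in memory rather than in a text file on DFS, and timestamps are assumed to be integers (for example epoch seconds). Entries are sorted by (user id, timestamp), so a user plus time range query becomes two binary searches that return the (sequence file name, offset) pairs to read.

```python
import bisect
from typing import List, NamedTuple, Tuple


class IndexEntry(NamedTuple):
    """One index row (hypothetical layout, not a Hadoop format)."""
    user_id: str
    timestamp: int   # assumed to be an integer such as epoch seconds
    seq_file: str    # name of the sequence file holding the log record
    offset: int      # offset of the record inside that file


def build_index(entries: List[IndexEntry]) -> List[IndexEntry]:
    """Sort entries by (user_id, timestamp), as the periodic job would."""
    return sorted(entries, key=lambda e: (e.user_id, e.timestamp))


def lookup(index: List[IndexEntry], user_id: str,
           start: int, end: int) -> List[Tuple[str, int]]:
    """Return (seq_file, offset) pairs for user_id with start <= timestamp <= end.

    The index must already be sorted by build_index.
    """
    # Rebuilt on every call to keep the sketch short; a real index
    # would keep the sort keys around instead.
    keys = [(e.user_id, e.timestamp) for e in index]
    lo = bisect.bisect_left(keys, (user_id, start))
    hi = bisect.bisect_right(keys, (user_id, end))
    return [(e.seq_file, e.offset) for e in index[lo:hi]]


if __name__ == "__main__":
    # Made-up sample rows, purely for illustration.
    idx = build_index([
        IndexEntry("u2", 100, "a.seq", 0),
        IndexEntry("u1", 300, "b.seq", 40),
        IndexEntry("u1", 100, "a.seq", 10),
        IndexEntry("u1", 200, "a.seq", 90),
    ])
    for seq_file, offset in lookup(idx, "u1", 100, 200):
        print(seq_file, offset)
```

With Hive, the same lookup would instead be written as a query over a table defined on the log data, so no separate index file has to be maintained.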

M
