hadoop-common-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject RE: Binary Search in map reduce
Date Mon, 07 Jan 2013 23:35:15 GMT
It depends.  What data is going into the table, and what keys will drive the lookup?

Let's suppose that you have a single JSON file holding a reasonable number of key/value
tuples, few enough to fit in a task's memory.  You could easily load a Hashtable to associate
the integer keys with the values (which appear to be lists of integers).  Each task in your
MapReduce job could then process its input tuples, doing a lookup by key and appending the
values to the output records, and that is a perfectly fine thing to do in MapReduce.  In this
model, the JSON file is effectively a constant singleton table for the entire MapReduce job.
You can just load it from HDFS or any other file system.  Specifying it as a cached file may
improve performance somewhat.
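
To make the idea concrete, here is a minimal sketch in plain Java, not a definitive implementation.  LookupSketch and its two methods are hypothetical stand-ins for a Mapper's setup() and map(); no Hadoop classes are used so that the example stays self-contained.  It assumes the JSON file has already been parsed into a map from long keys to lists of longs (the parsing is left out because the actual file layout is not shown), and the tab-separated output is likewise just an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Mapper; this is not a Hadoop class.
public class LookupSketch {
    // The "constant singleton table": filled once per task, then read-only.
    private final Map<Long, List<Long>> table = new HashMap<>();

    // Plays the role of Mapper.setup(): runs once per task, before any input.
    // Assumes the JSON side file was already parsed into parsedJson.
    public void setup(Map<Long, List<Long>> parsedJson) {
        table.putAll(parsedJson);
    }

    // Plays the role of Mapper.map(): one hash lookup per input record,
    // appending the matching values to the output record.
    public String map(long key, String record) {
        List<Long> values = table.get(key);
        if (values == null) {
            return record; // no match: pass the record through unchanged
        }
        StringBuilder out = new StringBuilder(record);
        for (long v : values) {
            out.append('\t').append(v);
        }
        return out.toString();
    }
}
```

Because the table is a hash map, each lookup takes constant expected time, so no explicit binary search is needed in this model.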

If you explain your intent we might be able to help better.


From: jamal sasha [mailto:jamalshasha@gmail.com]
Sent: Monday, January 07, 2013 4:21 PM
To: user@hadoop.apache.org
Subject: Binary Search in map reduce

I have data in JSON format like:

The keys and values are long ints.
Now, I want to do a fast lookup of a key.
How would I implement a binary search in the MapReduce abstraction?

Or am I not thinking about this correctly?
Any suggestions or advice?
