This is my first attempt at learning the MapReduce abstraction.
3710100022400,1350219887, 2011-09-10, 12:39:38.000, 99.00, 1, 0
3710100022400, 5045462785, 2011-09-06, 13:23:00.000, 70.63, 1, 0
What I want to do is count the number of transactions happening in every half-hour interval between 7 am and 11 am.
The intervals are numbered as follows:
7:30-8 -> 1
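The interval numbering can be sketched as a small helper. This is only a sketch under assumptions: the name `half_hour_interval` is hypothetical, the time is assumed to be the fourth comma-separated field in the `HH:MM:SS.mmm` format shown in the sample rows, and 7:00-7:30 is assumed to be interval 0 (consistent with 7:30-8 -> 1).

```python
from datetime import datetime


def half_hour_interval(time_str, start_hour=7, end_hour=11):
    """Return the 0-based half-hour interval for time_str.

    Hypothetical helper: assumes times look like '12:39:38.000' and that
    7:00-7:30 is interval 0. Returns None outside the 7 am to 11 am window.
    """
    t = datetime.strptime(time_str.strip(), "%H:%M:%S.%f")
    if not (start_hour <= t.hour < end_hour):
        return None
    # Two intervals per hour; the second one starts at minute 30.
    return (t.hour - start_hour) * 2 + (1 if t.minute >= 30 else 0)
```

For example, `half_hour_interval("07:45:00.000")` falls in the 7:30-8 interval, while both sample rows above (12:39 and 13:23) are outside the window and would be skipped.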
So ultimately what I am doing is creating a 2D dictionary:
d[id2][interval] = count_transactions
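A minimal sketch of that counting structure is below. It is not the attached reducer; the function name `count_by_id_and_interval` and the `(id2, interval)` pair format are assumptions made for illustration.

```python
from collections import defaultdict


def count_by_id_and_interval(pairs):
    """Build d[id2][interval] = number of transactions.

    `pairs` is an iterable of (id2, interval) tuples; this input shape is
    an assumption for the sketch, not the format of the attached code.
    """
    d = defaultdict(lambda: defaultdict(int))
    for id2, interval in pairs:
        d[id2][interval] += 1
    return d


# Made-up pairs, only to show the shape of the result.
counts = count_by_id_and_interval([("a", 1), ("a", 1), ("b", 3)])
```

Using nested `defaultdict`s avoids checking whether an id or interval has been seen before incrementing its count.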
My mapper and reducer are attached (along with sample input).
The code runs just fine if I run it via
cat input.txt | python mapper.py | sort | python reducer.py
This gives me the output, but when I run it on the cluster it throws an error that is not helpful (basically, the terminal just says the job was unsuccessful, reason NA).
Any suggestions on what I am doing wrong?