hadoop-general mailing list archives

From Abhishek Verma <ver...@illinois.edu>
Subject Hadoop job logs for research
Date Wed, 26 Jan 2011 03:50:03 GMT
Hi fellow Hadoop users and developers,

I am a third-year PhD student at the University of Illinois, working on
improving workload management and scheduling in Hadoop. I have tested some
of my ideas on synthetic workloads, GridMix, hadoop-examples, and a few of
my own applications. I am now looking for real workloads that run in
industry. Specifically, I am interested in the job logs (stored by default
on the JobTracker) of real workloads. If you are concerned about the
confidentiality of your application, note that these logs contain very
little information about the processed data or the application itself.
Anonymizing the job names (and their submission times, etc.) would not be
too much of a problem.

I would love to collaborate with folks from industry to understand these
workloads. I sincerely hope that the research I am conducting will benefit
everybody.

Thanks a lot.
