chukwa-dev mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: Backfilling and retroactive table creation in MySQL database
Date Fri, 12 Jun 2009 00:18:12 GMT
If this is implemented in the Metric Data Loader, both the MR job and the
post-process data loader can leverage the table creation.  Going forward, we
should remove the database-based aggregation: late-arriving data may belong
to a partition that has already expired, and if the data loader recreates
that table, the aggregated data will be inaccurate.  I filed CHUKWA-286 to
keep track of this.

Regards,
Eric

On 6/11/09 4:02 PM, "Jiaqi Tan" <tanjiaqi@gmail.com> wrote:

> Hi,
> 
> Answering my own question: I used a dirty hack that replaces the date
> argument passed to the TableCreator with the date at which the
> data-to-be-inserted was timestamped, so that the right tables, from
> that date forward, get created.
> 
> Two issues: 1. I'm not sure if that's the cleanest solution, and 2.
> perhaps this should be dealt with in conjunction with the backfilling
> work, especially if the backfilling process is aware of the timestamp
> of the data being backfilled into the Chukwa storage.
> 
> Jiaqi
> 
> On Thu, Jun 11, 2009 at 3:59 PM, Ariel Rabkin<asrabkin@gmail.com> wrote:
>> I'd be fine with doing retroactive table creation.  Though I'd sort of
>> like to see us move towards a cleaner database architecture -- the
>> whole system of having template tables, and then a whole series of
>> tables for time partitions always struck me as sort of awkward.
>> 
>> But I'm not really a database guy, so I don't know what the Right Thing is.
>> :(
>> 
>> On Thu, Jun 11, 2009 at 3:42 PM, Jiaqi Tan<tanjiaqi@gmail.com> wrote:
>>> Hi,
>>> 
>>> I've been trying to use the MDL to retroactively load data that was
>>> generated at an earlier time (somewhat like backfilling, but not,
>>> because I'm loading my own state-machine data), but when I create new
>>> tables and fire up dbAdmin.sh in the present time, it only creates
>>> tables into the future. Is there any way to create tables for time
>>> periods that have passed?
>>> 
>>> Also, taking a step back, if we have support for backfilling, should
>>> there also be support for retroactive table creation for the database,
>>> for instance, for bulk-loading trace data that wasn't collected using
>>> the Agent+Collector framework?
>>> 
>>> Jiaqi
>>> 
>> 
>> 
>> 
>> --
>> Ari Rabkin asrabkin@gmail.com
>> UC Berkeley Computer Science Department
>> 
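
The idea discussed in this thread, creating partition tables from the
timestamp of the data being loaded rather than from the current time, can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not Chukwa's actual TableCreator or MDL code: the table naming
scheme (`<name>_<index>_<suffix>`), the `<name>_template` convention, and all
function names are hypothetical. The only real syntax relied on is MySQL's
`CREATE TABLE IF NOT EXISTS ... LIKE ...`.

```python
from datetime import datetime, timedelta, timezone

# Partitions are counted in whole periods since the Unix epoch (assumption).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def partition_index(ts, period):
    """Return how many whole periods fit between the epoch and ts.

    ts must be a timezone-aware datetime; period is a timedelta.
    """
    return (ts - EPOCH) // period


def partition_table(name, ts, period, suffix):
    """Build a hypothetical partition table name for the given timestamp."""
    return f"{name}_{partition_index(ts, period)}_{suffix}"


def create_statements(name, timestamps, period=timedelta(days=7), suffix="week"):
    """Return one CREATE TABLE statement per distinct partition.

    The partitions are derived from the timestamps of the records to be
    loaded, so tables for past periods are created as well as future ones.
    """
    tables = sorted({partition_table(name, ts, period, suffix) for ts in timestamps})
    return [
        f"CREATE TABLE IF NOT EXISTS `{table}` LIKE `{name}_template`;"
        for table in tables
    ]


if __name__ == "__main__":
    # Two records from the same week and one from an earlier week.
    records = [
        datetime(2009, 6, 1, 12, 0, tzinfo=timezone.utc),
        datetime(2009, 6, 11, 22, 42, tzinfo=timezone.utc),
        datetime(2009, 6, 12, 0, 18, tzinfo=timezone.utc),
    ]
    for statement in create_statements("system_metrics", records):
        print(statement)
```

Because the statements use `IF NOT EXISTS`, running the sketch again for data
in the same period is harmless. It does not address the concern raised above
about aggregated data becoming inaccurate when an expired partition is
recreated.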

