hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: runtimePartitioning in GraphJobRunner
Date Mon, 10 Dec 2012 19:54:15 GMT
Sorry, text is exception.

On Tue, Dec 11, 2012 at 4:52 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> Again ... User can create their own InputFormatter to read records as
> a <Writable, ArrayWritable> from text file or sequence file, or
> NoSQLs.
>
> You can use K, V pairs and sequence file. Why do you want to use text
> file? Should I always write text file and parse them using
> VertexInputReader?
>
>
> On Tue, Dec 11, 2012 at 4:48 AM, Thomas Jungblut
> <thomas.jungblut@gmail.com> wrote:
>>>
>>> It's a gap in experience, Thomas.
>>
>>
>> Most probably you should read some good books on data extraction and then
>> choose your tools accordingly.
>> I never think that BSP is and will be a good extraction technique for
>> unstructured data.
>>
>> But these are just my two cents here- there seems to be somewhat more
>> political problems in this game than using tools appropriately.
>>
>> 2012/12/10 Thomas Jungblut <thomas.jungblut@gmail.com>
>>
>>> Yes, if you preprocess your data correctly.
>>> I have done the same unstructured extraction with the movie database from
>>> IMDB and it worked fine.
>>> That's just not a job for BSP, but for MapReduce.
>>>
>>> 2012/12/10 Edward J. Yoon <edwardyoon@apache.org>
>>>
>>>> It's a gap in experience, Thomas. Do you think you can extract Twitter
>>>> mention graph using parseVertex?
>>>>
>>>> On Tue, Dec 11, 2012 at 4:34 AM, Thomas Jungblut
>>>> <thomas.jungblut@gmail.com> wrote:
>>>> > I have trouble understanding you here.
>>>> >
>>>> > How can I generate large sample without coding?
>>>> >
>>>> >
>>>> > Do you mean random data generation or real-life data?
>>>> > Personally I think it is really convenient to transform unstructured data
>>>> > in a text file to vertices.
>>>> >
>>>> >
>>>> > 2012/12/10 Edward <edward@udanax.org>
>>>> >
>>>> >> I mean, With or without input reader. How can I generate large sample
>>>> >> without coding?
>>>> >>
>>>> >> It's unnecessary feature. As I mentioned before, only good for simple and
>>>> >> small test.
>>>> >>
>>>> >> Sent from my iPhone
>>>> >>
>>>> >> On Dec 11, 2012, at 3:38 AM, Thomas Jungblut <thomas.jungblut@gmail.com>
>>>> >> wrote:
>>>> >>
>>>> >> >>
>>>> >> >> In my case, generating test data is very annoying.
>>>> >> >
>>>> >> >
>>>> >> > Really? What is so difficult to generate tab separated text data? ;)
>>>> >> > I think we shouldn't do this, but there seems to be very little interest in
>>>> >> > the community so I will not block your work on it.
>>>> >> >
>>>> >> > Good luck ;)
>>>> >>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards, Edward J. Yoon
>>>> @eddieyoon
>>>>
>>>
>>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon
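
The thread's disagreement turns partly on how much work it takes to produce tab-separated test input. As a rough sketch only: assuming one vertex per line in the form `vertexId<TAB>neighbor<TAB>neighbor...` (an assumed layout for illustration; the thread does not specify what any particular VertexInputReader expects), a random sample of arbitrary size could be generated as below. The function names and parameters are hypothetical.

```python
import random


def generate_adjacency_lines(num_vertices, max_edges, seed=0):
    """Yield one tab-separated line per vertex: its id, then neighbor ids.

    The line layout is an assumption made for this sketch.
    """
    rng = random.Random(seed)  # seeded so the sample is reproducible
    for vertex in range(num_vertices):
        # Sample from the other (num_vertices - 1) ids without building a list.
        others = range(num_vertices - 1)
        degree = rng.randint(0, min(max_edges, len(others)))
        picked = rng.sample(others, degree)
        # Shift ids at or above `vertex` by one so a vertex never links to itself.
        neighbors = [v if v < vertex else v + 1 for v in picked]
        yield "\t".join(str(v) for v in [vertex] + neighbors)


def write_sample(path, num_vertices, max_edges, seed=0):
    """Write the generated lines to a plain text file, one vertex per line."""
    with open(path, "w", encoding="utf-8") as out:
        for line in generate_adjacency_lines(num_vertices, max_edges, seed):
            out.write(line + "\n")
```

Calling `write_sample("sample.txt", 1000000, 10)` would produce a million-line file. This covers only synthetic data; it does not address the other side of the argument, extracting a graph from unstructured real-world input.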
