asterixdb-dev mailing list archives

From Jianfeng Jia <jianfeng....@gmail.com>
Subject Re: Need Feed experts' help with an hanging issue
Date Tue, 01 Dec 2015 01:47:23 GMT
Thanks a ton!

> On Nov 30, 2015, at 5:24 PM, abdullah alamoudi <bamousaa@gmail.com> wrote:
> 
> I know exactly what is going on here. The problem you pointed out is
> caused by the duplicate keys. If I remember correctly, the main issue is
> that locks placed on the primary keys are not released.
> 
> I will start fixing this issue tonight.
> Cheers,
> Abdullah.
> 
> Amoudi, Abdullah.
> 
> On Mon, Nov 30, 2015 at 4:52 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
> wrote:
> 
>> Dear devs,
>> 
>> I hit a weird issue that is reproducible, but only if the data has
>> duplicates and is also large enough. Let me explain it step by step:
>> 
>> 1. The dataset is very simple and has only two fields.
>> DDL AQL:
>> —————————————
>> drop dataverse test if exists;
>> create dataverse test;
>> use dataverse test;
>> 
>> create type t_test as closed{
>>  fa: int64,
>>  fb : int64
>> }
>> 
>> create dataset ds_test(t_test) primary key fa;
>> 
>> create feed fd_test using socket_adapter
>> (
>>    ("sockets"="nc1:10001"),
>>    ("address-type"="nc"),
>>    ("type-name"="t_test"),
>>    ("format"="adm"),
>>    ("duration"="1200")
>> );
>> 
>> set wait-for-completion-feed "false";
>> connect feed fd_test to dataset ds_test using policy AdvancedFT_Discard;
>> 
>> ——————————————————————————————
>> 
>> The AdvancedFT_Discard policy ignores exceptions from the insertion
>> and keeps ingesting.
>> 
>> 2. The data is ingested by a very simple socket adapter client which reads
>> the records one by one from an adm file. The source is here:
>> https://github.com/JavierJia/twitter-tracker/blob/master/src/main/java/edu/uci/ics/twitter/asterix/feed/FileFeedSocketAdapterClient.java
>> The data and the app package are provided here:
>> https://drive.google.com/folderview?id=0B423M7wGZj9dYVQ1TkpBNzcwSlE&usp=sharing
>> To feed the data you can run:
>> 
>> ./bin/feedFile -u 172.17.0.2 -p 10001 -c 5000000 ~/data/twitter/test.adm
>> 
>> -u for server URL
>> -p for server port
>> -c for the number of lines you want to ingest
>> 
>> 3. After ingestion, all requests against ds_test hang. There is no
>> exception and no response for hours. However, the system still responds
>> to queries on other datasets, like Metadata.
>> 
>> The data contains some duplicate records, which should trigger the insert
>> exception. If I lower the count from 5000000 to, let’s say, 3000000,
>> there is no problem, although that data contains duplicates as well.
>> 
>> Do any feed experts have a hint on which part could be wrong? The cc and
>> nc logs are attached. Thank you!
>> 
>> 
>> 
>> 
>> 
>> 
>> Best,
>> 
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
>> 
>> 
>> 



Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine
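
For readers who cannot open the linked client, step 2 of the quoted message (read records one by one from an adm file and push them to the feed socket, stopping after a given count) might look roughly like the sketch below. This is a Python illustration written for this archive, not the actual Java FileFeedSocketAdapterClient. The function names take_records and feed_file are hypothetical, and the assumptions of one record per line and a newline after each record are mine.

```python
import socket
from typing import Iterable, Iterator


def take_records(lines: Iterable[str], limit: int) -> Iterator[str]:
    """Yield at most `limit` non-empty lines (assumed: one record per line)."""
    sent = 0
    for line in lines:
        if sent >= limit:
            break
        record = line.strip()
        if record:  # skip blank lines; they do not count toward the limit
            sent += 1
            yield record


def feed_file(host: str, port: int, path: str, limit: int) -> int:
    """Send up to `limit` records from `path` to host:port; return how many were sent."""
    count = 0
    with socket.create_connection((host, port)) as conn, \
            open(path, encoding="utf-8") as src:
        for record in take_records(src, limit):
            # Assumption: the socket adapter accepts newline-delimited records.
            conn.sendall((record + "\n").encode("utf-8"))
            count += 1
    return count
```

With the values from the command in the quoted message, the call would be feed_file("172.17.0.2", 10001, path_to_test_adm, 5000000), where path_to_test_adm stands for the local path of test.adm. The sketch does not filter duplicates: records with a repeated primary key are sent as they appear, which is the condition the report depends on.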

