asterixdb-dev mailing list archives

From Chen Li <che...@gmail.com>
Subject Re: Need Feed experts' help with an hanging issue
Date Tue, 01 Dec 2015 18:03:10 GMT
Cool.  The prototype Jianfeng is building revealed quite a few issues in
the system :-)

On Mon, Nov 30, 2015 at 5:24 PM, abdullah alamoudi <bamousaa@gmail.com>
wrote:

> I know exactly what is going on here. The problem you pointed out is
> caused by the duplicate keys. If I remember correctly, the main issue is
> that locks that are placed on the primary keys are not released.
>
> I will start fixing this issue tonight.
> Cheers,
> Abdullah.
>
> Amoudi, Abdullah.
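
The failure mode described in the reply above can be illustrated with a toy model. This is only a sketch and not AsterixDB code: `ToyDataset`, `DuplicateKeyError`, `insert_leaky`, `insert_safe`, `ingest`, and `stuck_keys` are hypothetical names, and a plain `threading.Lock` per key stands in for the real lock manager. It shows how a duplicate-key exception raised between acquire and release can leave a primary-key lock held while a discard-style ingestion loop keeps going.

```python
import threading


class DuplicateKeyError(Exception):
    """Raised when a primary key already exists (hypothetical name)."""


class ToyDataset:
    """Toy stand-in for a dataset with per-primary-key locks; not AsterixDB code."""

    def __init__(self):
        self.rows = {}
        self.locks = {}

    def _lock_for(self, key):
        return self.locks.setdefault(key, threading.Lock())

    def _put(self, key, value):
        if key in self.rows:
            raise DuplicateKeyError(key)
        self.rows[key] = value

    def insert_leaky(self, key, value):
        lock = self._lock_for(key)
        lock.acquire()
        self._put(key, value)   # a duplicate key raises here ...
        lock.release()          # ... so this release is skipped

    def insert_safe(self, key, value):
        lock = self._lock_for(key)
        with lock:              # released even if _put raises
            self._put(key, value)


def ingest(insert, records):
    """Discard-style ingestion: drop records whose insert fails and keep going."""
    discarded = 0
    for key, value in records:
        try:
            insert(key, value)
        except DuplicateKeyError:
            discarded += 1
    return discarded


def stuck_keys(dataset):
    """Keys whose lock is still held after ingestion has finished."""
    return sorted(k for k, lock in dataset.locks.items() if lock.locked())
```

In this model, ingesting `[(1, 10), (2, 20), (1, 11)]` through `insert_leaky` discards one record and leaves the lock for key 1 held, so any later operation that needs that lock would block indefinitely. The same input through `insert_safe` leaves no lock held. The model does not attempt to explain why the hang in the report only appears at the larger record count.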
>
> On Mon, Nov 30, 2015 at 4:52 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
> wrote:
>
> > Dear devs,
> >
> > I hit a weird issue that is reproducible, but only if the data has
> > duplicates and is also large enough. Let me explain it step by step:
> >
> > 1. The dataset is very simple and has only two fields.
> > DDL AQL:
> > —————————————
> > drop dataverse test if exists;
> > create dataverse test;
> > use dataverse test;
> >
> > create type t_test as closed{
> >   fa: int64,
> >   fb: int64
> > }
> >
> > create dataset ds_test(t_test) primary key fa;
> >
> > create feed fd_test using socket_adapter
> > (
> >     ("sockets"="nc1:10001"),
> >     ("address-type"="nc"),
> >     ("type-name"="t_test"),
> >     ("format"="adm"),
> >     ("duration"="1200")
> > );
> >
> > set wait-for-completion-feed "false";
> > connect feed fd_test to dataset ds_test using policy AdvancedFT_Discard;
> >
> > ——————————————————————————————
> >
> > The AdvancedFT_Discard policy ignores exceptions raised by insertions
> > and keeps ingesting.
> >
> > 2. The data is ingested by a very simple socket adapter client that reads
> > records one by one from an ADM file. The source is here:
> >
> > https://github.com/JavierJia/twitter-tracker/blob/master/src/main/java/edu/uci/ics/twitter/asterix/feed/FileFeedSocketAdapterClient.java
> > The data and the app package are provided here:
> >
> > https://drive.google.com/folderview?id=0B423M7wGZj9dYVQ1TkpBNzcwSlE&usp=sharing
> > To feed the data you can run:
> >
> > ./bin/feedFile -u 172.17.0.2 -p 10001 -c 5000000 ~/data/twitter/test.adm
> >
> > -u for the server URL
> > -p for the server port
> > -c for the number of lines to ingest
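
For readers without the linked Java client, the behaviour described in step 2 can be approximated as follows. This is a minimal sketch and not the `FileFeedSocketAdapterClient` source: `send_records` and `feed_file` are hypothetical names, and it assumes the ADM file holds one record per line and that the feed socket accepts newline-delimited records.

```python
import socket


def send_records(lines, sink, count):
    """Write at most `count` non-empty lines to `sink`, one record per line."""
    sent = 0
    for line in lines:
        if sent >= count:
            break
        record = line.strip()
        if not record:
            continue  # skip blank lines; they carry no record
        sink.write(record.encode("utf-8") + b"\n")
        sent += 1
    return sent


def feed_file(path, host, port, count):
    """Stream the first `count` records of a file to host:port over TCP."""
    with socket.create_connection((host, port)) as conn, \
            conn.makefile("wb") as sink, \
            open(path, encoding="utf-8") as lines:
        sent = send_records(lines, sink, count)
        sink.flush()
    return sent
```

Under those assumptions, `feed_file("test.adm", "172.17.0.2", 10001, 5000000)` would mirror the command above. Keeping the record loop separate from the socket setup makes it possible to exercise `send_records` against an in-memory buffer.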
> >
> > 3. After ingestion, all requests against ds_test hang. There is no
> > exception and no response for hours. However, the system still responds to
> > queries on other datasets, like Metadata.
> >
> > The data contains some duplicate records, which should trigger the insert
> > exception. If I lower the count from 5000000 to, say, 3000000, there is no
> > problem, although that subset contains duplicates as well.
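
One way to compare the two runs is to count how many primary keys repeat within the first 3000000 and the first 5000000 records. The following is a rough sketch under stated assumptions: `duplicate_keys` and `FA_PATTERN` are hypothetical names, and the pattern assumes each line carries the key as `"fa": <integer>`, which may not match how the actual ADM file writes int64 values.

```python
import re
from collections import Counter

# Assumed record shape: {"fa": 123, "fb": 456}. Adjust the pattern if the
# ADM file encodes int64 values differently.
FA_PATTERN = re.compile(r'"fa"\s*:\s*(-?\d+)')


def duplicate_keys(lines, count):
    """Return {key: occurrences} for keys seen more than once in the first `count` records."""
    seen = Counter()
    read = 0
    for line in lines:
        if read >= count:
            break
        match = FA_PATTERN.search(line)
        if match is None:
            continue  # not a record line under the assumed format
        seen[int(match.group(1))] += 1
        read += 1
    return {key: n for key, n in seen.items() if n > 1}
```

Calling it as `duplicate_keys(open(path), 3000000)` and again with `5000000` would show whether the larger prefix adds many more repeated keys. All distinct keys are held in memory, so this is meant for a one-off check on a development machine.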
> >
> > Do any feed experts have a hint about which part could be wrong? The cc and
> > nc logs are attached. Thank you!
> >
> >
> >
> >
> >
> >
> > Best,
> >
> > Jianfeng Jia
> > PhD Candidate of Computer Science
> > University of California, Irvine
> >
> >
> >
>
