asterixdb-users mailing list archives

From Sattam Alsubaiee <salsuba...@gmail.com>
Subject Re: Indexes not performing well
Date Sun, 29 May 2016 06:32:07 GMT
Creating indexes on fields with high selectivities (fields such as
hourOfDay and dayOfWeek, where each value matches a large fraction of the
records) is not encouraged at all. Each secondary index lookup has to
probe the primary index to fetch the other fields in the record. It
would be much more efficient to just perform scans instead of
accessing secondary indexes when querying such fields.

I would recommend that you drop at least the following indexes:
drop index posdata.hour;
drop index posdata.day;

Also, I would highly recommend that you use AsterixDB filters, which are a
very good optimization (they could save up to 99% of query time) when you
deal with time-correlated fields such as timestamps:
https://asterixdb.apache.org/docs/0.8.8-incubating/aql/filters.html
http://dl.acm.org/citation.cfm?id=2786007
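
As a minimal sketch of what declaring a filter could look like, following
the syntax described on the filters page linked above: the type name
PosDataType and the fields id and timestamp are hypothetical, since the
actual DDL has not been shared in this thread. The filter is declared in
the create dataset statement, so this assumes the dataset is created this
way before the data is loaded.

```aql
// Hypothetical type: only the fields the sketch needs are declared;
// an open type lets records carry additional fields.
create type PosDataType as open {
  id: int64,
  timestamp: datetime
}

// Declare the filter on the time-correlated field at dataset creation.
create dataset posdata(PosDataType)
  primary key id with filter on timestamp;

// A range predicate on the filter field gives the system a chance to
// skip storage components whose timestamp range cannot match.
for $p in dataset posdata
where $p.timestamp >= datetime("2016-05-01T00:00:00")
  and $p.timestamp < datetime("2016-05-08T00:00:00")
return $p
```

Whether this helps depends on the data arriving roughly in timestamp
order, so that each storage component covers a narrow time range.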

Cheers,
Sattam

On Sun, May 29, 2016 at 8:58 AM, Michael Carey <mjcarey@ics.uci.edu> wrote:

> @Pouria: Please share your findings here when you check this out - this is
> quite strange, since none of the other performance results that have been
> obtained on the system have looked anything like this.  (I will try to look
> at this too at some point, but will unfortunately be MIA from June 1-15
> first.)  Weird....
>
> On 5/26/16 9:20 AM, Pouria Pirzadeh wrote:
>
> Hi Magnus,
>
> Thanks for your email and sharing the information.
> If it is OK with you, would you please share with us the exact DDL
> (including type definitions, dataset and index definitions) and the exact
> AQL queries that you ran against AsterixDB?
> I am just interested in checking the query plans and seeing what ended up
> being run as jobs.
>
> Thanks.
> Pouria
>
> On Thu, May 26, 2016 at 4:59 AM, Magnus Kongshem <kongshem@online.ntnu.no>
> wrote:
>
>> Hi,
>>
>> There have been a lot of questions from me regarding AsterixDB and I thank
>> all of you who have answered me. So it is time for me to contribute with
>> some observations. I am writing my master's thesis where I test multiple
>> databases on a large data set. I should also mention that I have installed
>> AsterixDB on a single machine.
>>
>> What I have observed is that AsterixDB has a "poorer" read performance
>> when I specify indexes on the data set compared to not implementing any
>> indexes. See the attachment for details; it's an excerpt of my thesis
>> explaining and describing the queries, the indexes and the test results.
>> Any thoughts on these test results?
>>
>> I also cannot help but notice that the read performance for queries
>> touching a small portion, medium portion and large portion of the data set
>> is very similar. The largest query finds 75 million records and the
>> smallest query finds 3.5 million records, but they have almost the same
>> read performance. How can this be?
>>
>> Perhaps you can use these test results in the future development of
>> asterixDB.
>>
>> If you would like, I can send you my final thesis when it's done.
>>
>> --
>>
>> Mvh
>>
>> Magnus Alderslyst Kongshem
>> +47 415 65 906
>>
>
>
>
