phoenix-dev mailing list archives

From "Lars Hofhansl (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-4148) COUNT(DISTINCT(...)) should have a memory size limit
Date Mon, 11 Sep 2017 18:12:00 GMT


Lars Hofhansl commented on PHOENIX-4148:

I have a clumsy patch. One can check the size of an aggregator, but getSize() turns out
to be very expensive for the count-distinct aggregator, so I changed it to track the size
incrementally instead. The patch is clumsy, hence I am not posting it.
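
To illustrate the incremental idea (this is not the patch itself), here is a minimal Java
sketch. The class name BoundedDistinctCounter, the per-entry overhead constant, and the
exception type are all hypothetical and are not Phoenix classes or settings. The point is
only that the size estimate is updated on each insert, so reading it is O(1) and the limit
can be checked for every new distinct value.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; not Phoenix's actual count-distinct aggregator. */
public class BoundedDistinctCounter {
    // Assumed per-entry bookkeeping cost; a guess for illustration, not a measured value.
    private static final long ENTRY_OVERHEAD_BYTES = 64;

    private final Set<String> seen = new HashSet<>();
    private final long maxBytes;
    private long estimatedBytes = 0; // updated on every insert, never recomputed

    public BoundedDistinctCounter(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Records a value; fails fast once the estimated size would pass the limit. */
    public void add(String value) {
        if (seen.contains(value)) {
            return; // duplicates cost nothing extra
        }
        long cost = ENTRY_OVERHEAD_BYTES + value.getBytes(StandardCharsets.UTF_8).length;
        if (estimatedBytes + cost > maxBytes) {
            throw new IllegalStateException(
                "distinct-count state would exceed " + maxBytes + " bytes");
        }
        seen.add(value);
        estimatedBytes += cost;
    }

    /** O(1): returns the running estimate instead of walking the set. */
    public long getSize() {
        return estimatedBytes;
    }

    public long count() {
        return seen.size();
    }
}
```

With a check like this, a query over a very high-cardinality column would fail with an
error once it hits the limit, rather than continuing to grow the set on the server.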

Also, I still managed to kill the RS by using up all its heap, so perhaps the default maximum
memory that Phoenix is allowed to use is too large.

I'll continue to look, but it does seem like this needs a bigger discussion. Nothing Phoenix
does should ever cause a RegionServer to exhaust its heap. Exhausting the heap would fail the
Phoenix query anyway, so it is far better to fail the query before that point. The memory
management in Phoenix is pretty smart, I think; it just seems that it needs to be adjusted.

> COUNT(DISTINCT(...)) should have a memory size limit
> ----------------------------------------------------
>                 Key: PHOENIX-4148
>                 URL:
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
> I just managed to kill (hang) a region server by issuing a COUNT(DISTINCT(...)) query
> over a column with very high cardinality (20m in this case).
> This is perhaps not a useful thing to do, but Phoenix should nonetheless not allow a
> query to make a server fail.
> [~jamestaylor], I see there is a GlobalMemoryManager, but I do not quite see how I'd get a
> reference to one; one needs a tenant id, etc.
