phoenix-dev mailing list archives

From Jaanai Zhang <cloud.pos...@gmail.com>
Subject Re: [DISCUSS] Suggestions for Phoenix from HBaseCon Asia notes
Date Tue, 18 Sep 2018 16:08:53 GMT
>
> I don't understand what performance issues you think exist based solely on
> the above. Those numbers appear to be precisely in line with my
> expectations. Can you please describe what issues you think exist?
>

1. Performance: the thick client is roughly 1~4x faster than the thin
client, and the thin client's throughput degrades as the number of
concurrent requests increases. For some web-server applications, this is
not enough.
2. An HA (high-availability) thin client.
3. A SQL audit function.

Many developers like using the thin client, since it has a lower
maintenance cost on the client side. Sorry, that's all that comes to mind. :)
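On item 2: since PQS instances are stateless, HA could be approximated
today with a client-side wrapper that fails over across several PQS URLs.
A minimal sketch of that idea, assuming a pluggable `connect` callable and
made-up URLs (this is not a real Phoenix API):

```python
class HAThinClient:
    """Failover across several Phoenix Query Server URLs.

    Sketch only: `connect` is any callable that opens a connection to
    one PQS URL, or raises on failure. Not a real Phoenix API.
    """

    def __init__(self, urls, connect):
        self._urls = list(urls)
        self._connect = connect

    def connection(self):
        # Try each PQS instance in order; return the first that answers.
        errors = []
        for url in self._urls:
            try:
                return self._connect(url)
            except Exception as exc:
                errors.append((url, exc))
        raise ConnectionError(f"all PQS instances failed: {errors}")


# Demo with a fake connector where the first server is "down".
def fake_connect(url):
    if url.endswith(":8765"):
        raise OSError("connection refused")
    return f"connected to {url}"


client = HAThinClient(
    ["http://pqs1:8765", "http://pqs2:8766"], fake_connect
)
print(client.connection())  # falls over to the second URL
```

A real version would also need health checks and retry backoff, but the
point is that nothing in the thin-client protocol prevents doing this
outside the server.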

Please be more specific. Asking for "more documentation" doesn't help us
> actually turn this around into more documentation. What are the specific
> pain points you have experienced? What topics do you want to know more
> about? Be as specific as possible.
>

About documents:
1. I think we could add documents about migration tools and migration case
studies, since many users migrate non-transactional applications from an
RDBMS (MySQL/PostgreSQL/SQL Server) to Phoenix.
2. How to design primary keys and indexes.
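On point 2, even a short page of annotated DDL would help. A hypothetical
example (table, schema, and column names are made up) of the patterns such
a document could explain: put the most common, most selective filter column
first in the PK, use salting for monotonically increasing keys, and add a
covered index for the alternate access path:

```sql
-- Hypothetical schema: events queried mostly by (tenant, time range).
-- The leading PK column matches the always-present filter;
-- SALT_BUCKETS spreads hot, time-ordered writes across regions.
CREATE TABLE metrics.events (
    tenant_id  VARCHAR NOT NULL,
    event_time TIMESTAMP NOT NULL,
    event_id   VARCHAR NOT NULL,
    payload    VARCHAR
    CONSTRAINT pk PRIMARY KEY (tenant_id, event_time, event_id)
) SALT_BUCKETS = 16;

-- A covered secondary index for lookups by event_id alone,
-- so that access path does not degrade into a full table scan.
CREATE INDEX events_by_id ON metrics.events (event_id)
    INCLUDE (payload);
```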

About pain points:
Stability is a big problem. Most people use Phoenix like a common RDBMS and
execute queries casually; they don't even know why the server crashes when
a full table scan is executed. So defining the usage boundary of Phoenix is
important: reject some queries and report the reason back to the user's
client.
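One possible shape for such a boundary: Phoenix's EXPLAIN output already
says when a query degenerates into a full scan, so a gateway (or a hook in
front of PQS) could inspect the plan and reject the query before execution.
A sketch in Python; the plan strings mimic Phoenix's EXPLAIN output, but
the guard itself is a hypothetical illustration, not an existing feature:

```python
# Reject queries whose EXPLAIN plan indicates a full table scan.
DISALLOWED_MARKERS = ("FULL SCAN OVER",)

def check_plan(explain_lines):
    """Return (allowed, reason) for a list of EXPLAIN output lines."""
    for line in explain_lines:
        for marker in DISALLOWED_MARKERS:
            if marker in line.upper():
                return False, (
                    f"rejected: plan contains '{marker}'; "
                    "add a leading-PK filter or use an index"
                )
    return True, "ok"

# A plan with an unbounded scan is rejected with an explanation...
print(check_plan(
    ["CLIENT 16-CHUNK PARALLEL 16-WAY FULL SCAN OVER EVENTS"]
))
# ...while a bounded range scan passes.
print(check_plan(
    ["CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER EVENTS ['t1']"]
))
```

Reporting the reason back to the client, as suggested above, matters as
much as the rejection itself: the user learns what to change instead of
just seeing a failure.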

Are you referring to the hbase-spark (and thus, Spark SQL) integration? Or
> something that some company is building?
>

Some companies are building on Spark SQL to access Phoenix in order to
support both OLAP and OLTP requirements. This puts a heavy load on the
HBase cluster when Spark reads Phoenix tables. My co-workers want to read
the HFiles of Phoenix tables directly for some offline workloads, but that
depends on a more flexible Phoenix API.

Also, I got some feedback that certain features are important to users. For
example, "ALTER TABLE ... MODIFY COLUMN" can avoid reloading data, which is
an expensive operation for a massive table. I have uploaded patches to JIRA (
PHOENIX-4815 <https://issues.apache.org/jira/browse/PHOENIX-4815>), but
nobody has responded to me :(.

Now I am devoting my time to chaos testing and PQS stability (developed on
my company's branch; these patches will be contributed to the community
after they have run stably). If you have any suggestions, please tell me
what you are thinking. I would appreciate your reply.


----------------------------------------
   Jaanai Zhang
   Best regards!



On Tue, Sep 18, 2018 at 12:03 AM, Josh Elser <josh.elser@gmail.com> wrote:

>
>
> On Mon, Sep 17, 2018 at 9:36 AM zhang yun <cloud.poster@gmail.com> wrote:
>
>> Sorry for replying late. I attended HBaseCon Asia as a speaker and took
>> some notes. I think Phoenix's pain points are as follows:
>>
>> 1. The thick client isn't as popular as the thin client. For some
>> applications: 1. Users need to spend a lot of time resolving
>> dependencies. 2. Users worry about stability, since some computation is
>> processed within the thick client. 3. Some people want clients in
>> multiple programming languages, such as Go, .NET, Python, etc. Other
>> benefits: 1. It is easy to add a SQL audit function. 2. It can recognize
>> invalid SQL and report it to the user... As you said, this is definitely
>> a big issue worth paying more attention to. However, the thick client
>> has some problems; here is recent test data on performance:
>>
>>
> I don't understand what performance issues you think exist based solely on
> the above. Those numbers appear to be precisely in line with my
> expectations. Can you please describe what issues you think exist?
>
>
>> 2. Actually, Phoenix has a higher barrier to entry for beginners than a
>> common RDBMS; users need to learn HBase before using Phoenix. Most
>> people don't know how to use it reasonably, so we need more detailed
>> documents to make Phoenix easier to use.
>>
>
> Please be more specific. Asking for "more documentation" doesn't help us
> actually turn this around into more documentation. What are the specific
> pain points you have experienced? What topics do you want to know more
> about? Be as specific as possible.
>
>
>> 3. HBase 3.0 has a plan for native SQL; does Phoenix have a plan? Also,
>> many people don't know that HBase has a SQL layer called Phoenix, so can
>> we put a link on the HBase website?
>>
>
> Uh, I have no idea what you're referring to here about "native SQL". I am
> not aware of any such effort that does this solely inside of HBase, nor
> does it seem inline with HBase's "do one thing well" mantra.
>
> Are you referring to the hbase-spark (and thus, Spark SQL) integration? Or
> something that some company is building?
>
> How about submitting a patch to HBase to modify
> https://hbase.apache.org/poweredbyhbase.html ? :)
>
>
>>
>> On 2018/08/27 18:03:30, Josh Elser <e...@apache.org> wrote:
>> > (bcc: dev@hbase, in case folks there have been waiting for me to send
>> > this email to dev@phoenix)
>> >
>> > Hi,
>> >
>> > In case you missed it, there was an HBaseCon event held in Asia
>> > recently. Stack took some great notes and shared them with the HBase
>> > community. A few of them touched on Phoenix, directly or in a related
>> > manner. I think they are good "criticisms" that are beneficial for us
>> > to hear.
>> >
>> > 1. The phoenix-$version-client.jar size is prohibitively large
>> >
>> > In this day and age, I'm surprised that this is a big issue for people.
>> > I know we have a lot of cruft, most of it coming from Hadoop. We have
>> > gotten better here over recent releases, but I would guess that there
>> > is more we can do.
>> >
>> > 2. Can Phoenix be the de-facto schema for SQL on HBase?
>> >
>> > We've long asserted "if you have to ask how Phoenix serializes data,
>> > you shouldn't be doing it" (a nod that you have to write lots of
>> > code). What if we turn that on its head? Could we extract our
>> > PDataType serialization, composite row-key, column encoding, etc.
>> > into a minimal API that folks with their own itches can use?
>> >
>> > With the growing integrations into Phoenix, we could embrace them by
>> > providing an API to make what they're doing easier. In the same vein,
>> > we cement ourselves as a cornerstone of doing it "correctly".
>> >
>> > 3. Better recommendations to users to not attempt certain queries.
>> >
>> > We definitively know that there are certain types of queries that
>> > Phoenix cannot support well (compared to optimal Phoenix use-cases).
>> > Users very commonly fall into such pitfalls on their own and this
>> > leaves a bad taste in their mouth (thinking that the product
>> > "stinks").
>> >
>> > Can we do a better job of telling the user when and why it happened?
>> > What would such a user-interaction model look like? Can we supplement
>> > the "why" with instructions of what to do differently (even if in the
>> > abstract)?
>> >
>> > 4. Phoenix-Calcite
>> >
>> > This was mentioned as a "nice to have". From what I understand, there
>> > was nothing explicitly wrong with the implementation or approach,
>> > just that it was a massive undertaking to continue with little
>> > immediate gain. Would this be a boon for us to try to continue in
>> > some form? Are there steps we can take that would help push us along
>> > the right path?
>> >
>> > Anyways, I'd love to hear everyone's thoughts. While the concerns
>> > were raised at HBaseCon Asia, the suggestions that accompany them
>> > here are largely mine ;). Feel free to break them out into their own
>> > threads if you think that would be better (or say that you disagree
>> > with me -- that's cool too)!
>> >
>> > - Josh
>>
>
