incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: OPP and controlling partitioning
Date Wed, 17 Nov 2010 19:51:41 GMT
1) I don't have anything to hand, other than to say that with the OPP the tokens are the keys themselves.
If your tokens are UUIDs, are they the sequential (version 1) UUIDs? If so, you can create a UUID
for a known time, with the non-type bytes all set high, and use that as a token value. Otherwise
you will need to create UUIDs that evenly partition the entire space of UUID values.

2) Someone else had a similar question and Jonathan pointed them to the StorageProxy API: http://www.mail-archive.com/user@cassandra.apache.org/msg07305.html and http://wiki.apache.org/cassandra/StorageProxy
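
To make point 1 concrete, here is a minimal Python sketch of evenly spaced tokens and the range lookup. It assumes row keys start with the 32-character lowercase hex form of the vertex UUID and that the partitioner compares keys and tokens as plain strings; `even_uuid_tokens` and `owner_index` are hypothetical helper names, not Cassandra calls.

```python
import bisect
import uuid


def even_uuid_tokens(node_count):
    """Return node_count tokens that split the 128-bit UUID space evenly.

    Tokens are fixed-width lowercase hex strings, so string order matches
    numeric order (assumption: keys use the same hex form as a prefix).
    """
    step = 2 ** 128 // node_count
    return [format(step * (i + 1) - 1, "032x") for i in range(node_count)]


def owner_index(tokens, key):
    """Index of the node whose range (previous token, own token] holds key."""
    i = bisect.bisect_left(tokens, key)
    # Keys that sort after the last token wrap around to the first node.
    return i % len(tokens)


tokens = even_uuid_tokens(4)
vertex_id = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8").hex
keys = [vertex_id + "_name", vertex_id + "_gender", vertex_id + "_out_knows_e1"]
owners = {owner_index(tokens, k) for k in keys}
print(owners)
```

Because every token is exactly as long as the UUID prefix, the comparison between a key and a token is decided by the prefix alone (a key whose prefix equals a token sorts just after it), so all rows for one vertex fall into the same range.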

I kind of understand where you're going but I would still recommend (if you have not already)
building a prototype with the RandomPartitioner and just throwing requests at it. The work
required to answer the queries will be distributed around the cluster.
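
For comparison, a rough sketch of how hashing spreads keys around the ring. It only approximates the RandomPartitioner (MD5 of the key reduced to a token below 2**127); the node tokens, key names and helper functions are made up for illustration.

```python
import bisect
import hashlib
from collections import Counter

RING_SIZE = 2 ** 127


def md5_token(key):
    """Approximate a hashed token: MD5 of the key, folded into the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def node_for(key, node_tokens):
    """Index of the first node whose token is >= the key's token."""
    i = bisect.bisect_left(node_tokens, md5_token(key))
    return i % len(node_tokens)


# Four hypothetical nodes with evenly spaced tokens.
node_tokens = [RING_SIZE * (i + 1) // 4 for i in range(4)]
keys = ["vertex%04d_prop%d" % (v, p) for v in range(250) for p in range(4)]
load = Counter(node_for(k, node_tokens) for k in keys)
print(sorted(load.items()))
```

Rows that share a prefix end up on different nodes here, which is why the load evens out without any manual token planning.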

IMHO the best way to learn how to do the things you are talking about is to build the basic
app first and fail fast. There are a lot of new mistakes to make when starting out with Cassandra.

Hope that helps. 
Aaron

On 18 Nov, 2010, at 02:22 AM, Claudio Martella <claudio.martella@tis.bz.it> wrote:

@Adi:

Yes, that's exactly the reason for the OPP in the Subject :)

@Aaron:

Thanks for the complete answer.

1) In my case "vertexid_" is a UUID. Could you send me some reference on
how to achieve this partitioning based on this prefix and the
OrderPreservingPartitioner? I can't find docs about it.
2) Ah, that's a pity, I guess I'll have to extend Cassandra's API to
expose this call through Thrift. Do you think it would be an interesting
addition to Cassandra's API?


About your comment, in reality I want to embrace the fact that Cassandra
replicates. I do want a strict partitioning based on prefix, but I also
want these partitions to be replicated, so there are no single points of failure.

About the network I/O, I do agree that it might not be the biggest
problem (although it's the slowest medium in the whole pipeline), but in
my case I also want to achieve distribution of computation (it's
something similar, if you want, to moving computation to data in M/R).
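
To illustrate what I mean by partitioned but replicated, a small sketch under my own assumptions: string tokens as with the OPP, and replicas placed on the owner plus the next nodes clockwise on the ring (a simple rack-unaware placement). The addresses and the `natural_endpoints` helper are hypothetical, not an existing Cassandra or Thrift call.

```python
import bisect


def natural_endpoints(ring, key, replication_factor):
    """Return the addresses that would hold key.

    ring is a list of (token, address) pairs sorted by token. The first
    replica is the node whose range contains the key, the others are the
    following nodes on the ring.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, key) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node ring with evenly spaced 32-hex-digit tokens.
ring = [
    ("3" + "f" * 31, "10.0.0.1"),
    ("7" + "f" * 31, "10.0.0.2"),
    ("b" + "f" * 31, "10.0.0.3"),
    ("f" * 32, "10.0.0.4"),
]
key = "c0ffee".ljust(32, "0") + "_name"
replicas = natural_endpoints(ring, key, 3)
print(replicas)
```

The application node could send the query to any of the returned addresses and still read the vertex data locally.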


Thanks again for your support.


Claudio

On 11/15/10 9:16 PM, Adi wrote:
>
> >>1) "So if your node tokens are set as "vertexid_" all keys with the
> same prefix will be in the same range."
> Adding to Aaron's comment -
> This will be the case if you use OrderPreservingPartitioner.
> RandomPartitioner (the default) hashes the keys, which distributes them
> randomly across nodes.
>
>
>
>
> On Mon, Nov 15, 2010 at 2:47 PM, Aaron Morton <aaron@thelastpickle.com> wrote:
>
> Rows are distributed around the cluster according to the ordering
> from the Partitioner used, and the Replication Strategy. All data
> for the same key will be stored together, and then replicated RF
> times. 
>
> To answer your questions...
> 1) Each node is responsible for the keys between the previous
> node's token and its own. So if your node tokens are set as
> "vertexid_" all keys with the same prefix will be in the same
> range. Note that the row data will be stored on RF replicas, and
> not just on the node with the appropriate token. 
>
> 2) I *think* you want to look at
> o.a.c.s.StorageService.getNaturalEndpoints() , this is not exposed
> to the outside world though. However *every* read or write request
> is sent to all replicas, even those at CL ONE. There is no concept
> of one node being the only place that a row is stored.
>
> FWIW it sounds like you want to disable some of the fine work
> cassandra does to ensure your data is replicated and available. By
> deciding that one machine will be responsible for a portion of the
> data you are introducing a single point of failure. Try writing
> your app against a cluster and let cassandra take care of things,
> then dive into the details. For example I cannot remember anyone
> on the list having serious issues with network overhead. 
>
> You may also want to consider FlockDB from Twitter, which sits on
> top of a sharded MySQL database: https://github.com/twitter/flockdb
>
> Hope that helps. 
> Aaron
>
>
> On 16 Nov, 2010, at 03:53 AM, Claudio Martella <claudio.martella@tis.bz.it> wrote:
>
>> Hello list,
>>
>> I'm in the process of writing an application which uses cassandra as a
>> "storage" backend. The application is a graph database and it's supposed
>> to be a baseline application for further development in the field.
>>
>> The idea is to implement a property graph: a multigraph (multiple edges
>> connecting two vertices are possible) with properties in the form of
>> name/value for edges and vertices. The idea is to traverse the graph
>> with queries like "give me all the women that are liked by men i know",
>> something like:
>> Vertex[name=claudio]=>outgoingEdge[type=knows]=>Vertex[gender=male]=>outgoingEdge[type=likes]=>Vertex[gender=female].
>> This is basically a step by step expansion/filtering based on properties.
>>
>> In my architecture my application-logic node is coupled with the
>> cassandra node storing its data. I'd like to have some kind of "atomic
>> set" of data that is "granted" to be stored on the same cassandra node
>> (in my case the vertex, its adj list, its properties, its edges and
>> their properties), so that i can issue the required filtering and
>> expansion to a particular node which will issue the logic behind it (and
>> i can route such request with the same logic cassandra routes its
>> requests).
>> This is in an effort to (a) minimize network i/o (i'd be able to send
>> the query token to the application node which would issue a local get to
>> its local cassandra) and (b) distribute computation (i'd be able to
>> distribute filtering between all the nodes storing for example the
>> node's neighborhood). This is still not optimal, but it would be a good
>> start.
>>
>> For this reason i thought about a datamodel that has composite keys:
>>
>> vertexid and edgeid are uuids while propertyname is a string.
>>
>> CF vertices {
>>
>> vertexid_propertyname {
>>
>> propertyvalue: null
>> }
>> }
>>
>>
>> CF edges {
>>
>> vertexid_[in|out]_propertyname_edgeid {
>>
>> propertyvalue: othervertexid
>> }
>> }
>>
>> With this datamodel i could easily and efficiently issue slices and
>> ranges to cassandra with the equality predicates on properties i need.
>> What i need now is to partition my data on the prefix "vertexid_". Such
>> a datamodel does have a concept of "ascending ordering", so i thought
>> about OPP, but to my understanding OPP does not grant that all the data
>> starting with the same prefix will end up in the same cassandra node,
>> but only some of it. My set of data about a vertex could still be split
>> between two cassandra nodes in case the token ends up being a key in the
>> middle of the set, right?
>>
>> What i require exactly is:
>>
>> (1) to have all the rows belonging to the same vertexid (which is a
>> uuid) on the same cassandra node. Can i achieve this?
>> (2) given this partitioning, know the IP of the cassandra node storing
>> that vertex data, from outside of cassandra. This is the logic cassandra
>> uses to route requests for keys and i have to access it from outside.
>>
>> Can anybody comment about these?
>>
>>
>> Thanks
>>
>>
>> Claudio
>>
>>
>> Unit Research & Development - Analyst
>>
>> TIS innovation park
>> Via Siemens 19 | Siemensstr. 19
>> 39100 Bolzano | 39100 Bozen
>> Tel. +39 0471 068 123
>> Fax +39 0471 068 129
>> claudio.martella@tis.bz.it
>> http://www.tis.bz.it
>>
>> Short information regarding use of personal data. According to
>> Section 13 of Italian Legislative Decree no. 196 of 30 June 2003,
>> we inform you that we process your personal data in order to
>> fulfil contractual and fiscal obligations and also to send you
>> information regarding our services and events. Your personal data
>> are processed with and without electronic means and by respecting
>> data subjects' rights, fundamental freedoms and dignity,
>> particularly with regard to confidentiality, personal identity
>> and the right to personal data protection. At any time and
>> without formalities you can write an e-mail to privacy@tis.bz.it
>> in order to object the processing of
>> your personal data for the purpose of sending advertising
>> materials and also to exercise the right to access personal data
>> and other rights referred to in Section 7 of Decree 196/2003. The
>> data controller is TIS Techno Innovation Alto Adige, Siemens
>> Street n. 19, Bolzano. You can find the complete information on
>> the web site www.tis.bz.it.
>>
>>
>


-- 
Claudio Martella
Digital Technologies
Unit Research & Development - Analyst

TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fax +39 0471 068 129
claudio.martella@tis.bz.it http://www.tis.bz.it



