incubator-cassandra-user mailing list archives

From jef...@gmail.com
Subject Re: higher layer library makes things faster?
Date Wed, 19 Sep 2012 16:59:37 GMT
Actually, it's not uncommon at all.  Any caching implemented at a higher level will generally
improve speed at a cost in memory.

Beware common wisdom; it's seldom very wise.
Sent from my Verizon Wireless BlackBerry

-----Original Message-----
From: "Hiller, Dean" <Dean.Hiller@nrel.gov>
Date: Wed, 19 Sep 2012 07:35:07 
To: user@cassandra.apache.org<user@cassandra.apache.org>
Reply-To: user@cassandra.apache.org
Subject: higher layer library makes things faster?

So there is this interesting case where a higher layer library makes things faster.  This
is counter-intuitive, as an abstraction usually makes things slower in exchange for an increase
in productivity.  It would be cool if more and more libraries supported something to help with
this scenario, I think.

The concept: once in a while you end up needing a new query into a NoSQL data model that
already exists, and you do something like this:

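// Fetch the full user row, then the full role mapping rows it points to
UserRow user = fetchUser(rowKey);
List<RoleMappingRow> mappings = fetchRoleMappings(user.getRoleMappingIds());
// Collect the group row keys from every mapping
List<GroupIdRowKeys> rowKeys = new ArrayList<GroupIdRowKeys>();
for (RoleMappingRow m : mappings) {
   rowKeys.addAll(m.getGroupIds());
}
// Finally read the group rows themselves
List<GroupRow> groups = fetchGroupRows(rowKeys);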

It turns out that if you index the data, this is much faster in a lot of cases.  Instead of
fetching full rows, you can scan just 2 index rows to find the keys for the groups and then
read them in.  Basically, you do one scan on the RoleMapping index, where each mapping has a FK
to UserRow, and you now have a list of primary keys for your RoleMapping rows.  Next, you do a
scan of the GroupRow index, feeding in those RoleMapping primary keys, which gives back the
GroupRow primary keys that you need to look up.  In this way, you not only skip a lot of coding
but also avoid getting all the UserRow data back, and can simply scan the indexes.
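
The two index scans described above could be sketched roughly as follows. This is only an
illustration under assumptions: the two in-memory maps stand in for the two index rows, and
the names IndexScanSketch, mappingsByUser, groupsByMapping and groupKeysForUser are
hypothetical, not part of playOrm or any Cassandra client. It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for two index rows; not a real library API.
class IndexScanSketch {
    // Index 1 (assumed): user row key -> primary keys of the RoleMapping rows with a FK to that user.
    static final Map<String, List<String>> mappingsByUser = new HashMap<>();
    // Index 2 (assumed): RoleMapping primary key -> primary keys of the GroupRows it refers to.
    static final Map<String, List<String>> groupsByMapping = new HashMap<>();

    // Returns the GroupRow primary keys for a user without reading any full row.
    static List<String> groupKeysForUser(String userKey) {
        List<String> groupKeys = new ArrayList<>();
        // Scan 1: RoleMapping index, keyed by the user the mapping points to.
        for (String mappingKey : mappingsByUser.getOrDefault(userKey, Collections.emptyList())) {
            // Scan 2: GroupRow index, fed with the RoleMapping primary keys from scan 1.
            groupKeys.addAll(groupsByMapping.getOrDefault(mappingKey, Collections.emptyList()));
        }
        return groupKeys;
    }
}
```

Only the resulting group keys are then used to read the GroupRow data; the UserRow and
RoleMapping rows themselves are never fetched in this sketch.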

That said, every time you update a row, you need to remove the old value from the index and
add the new one to the index.  Inserts only need to add.  Removes only need to remove.
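
A minimal sketch of that bookkeeping, assuming a simple in-memory index from an indexed value
to the row keys that hold it. The class IndexMaintenanceSketch and its method names are
hypothetical and only illustrate the add/remove rules above; a real library would do this
against index rows in the store. It assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory index: indexed value -> keys of the rows holding that value.
class IndexMaintenanceSketch {
    final Map<String, Set<String>> index = new HashMap<>();

    // Insert: only an add is needed.
    void onInsert(String rowKey, String value) {
        index.computeIfAbsent(value, v -> new HashSet<>()).add(rowKey);
    }

    // Remove: only a remove is needed; empty entries are dropped.
    void onRemove(String rowKey, String value) {
        Set<String> keys = index.get(value);
        if (keys != null) {
            keys.remove(rowKey);
            if (keys.isEmpty()) {
                index.remove(value);
            }
        }
    }

    // Update: remove the old value from the index, then add the new one.
    void onUpdate(String rowKey, String oldValue, String newValue) {
        onRemove(rowKey, oldValue);
        onInsert(rowKey, newValue);
    }
}
```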

Anyways, I have found this to be quite an interesting paradigm shift, as right now doing the
latter manually requires a lot more code (but then, that is why I think more and more libraries
like playOrm will exist in the future, as they make the above very simple to do yourself).

Later,
Dean