Java web developer's thoughts: Maybe we need both RDBMS and key-value storage together?

A recent trend in the high-scalability community is to move from the relational storage model to key-value pairs. The problem with this approach appears when you need to store lists of references between objects:

1. When you need to add or remove a value in the list, you have to read and write the whole list, and it can be big.

2. Conflicts between concurrent changes to the same list are very difficult to resolve.

3. It is complex to keep track of references to objects when you delete them.
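
To make the first two problems concrete, here is a minimal sketch. It assumes a hypothetical in-memory map standing in for a real key-value store that can only read and write whole values; the class name, method names, and key format are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a key-value store: whole values only, no partial updates.
class ListInKeyValueStore {
    private final Map<String, List<Long>> store = new HashMap<>();

    // Returns a copy, the way a remote store hands back a freshly deserialized value.
    List<Long> get(String key) {
        List<Long> current = store.get(key);
        return current == null ? new ArrayList<Long>() : new ArrayList<Long>(current);
    }

    void put(String key, List<Long> value) {
        store.put(key, new ArrayList<Long>(value));
    }

    // Problem 1: adding a single reference reads and rewrites the whole list.
    void addReference(String key, long id) {
        List<Long> all = get(key);
        all.add(id);
        put(key, all);
    }

    // Problem 2: two writers start from the same snapshot, and the second put
    // silently overwrites the first writer's change.
    static List<Long> lostUpdateDemo() {
        ListInKeyValueStore kv = new ListInKeyValueStore();
        kv.put("user:1:foos", Arrays.asList(10L, 11L));
        List<Long> writerA = kv.get("user:1:foos");
        List<Long> writerB = kv.get("user:1:foos");
        writerA.add(12L);
        writerB.add(13L);
        kv.put("user:1:foos", writerA);
        kv.put("user:1:foos", writerB);
        return kv.get("user:1:foos"); // 12 is gone
    }
}
```

A real store may offer versioned writes or compare-and-set to detect the conflict, but the application still has to decide how to merge the two lists.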

So here is the idea: combine an RDBMS with key-value storage and use each storage paradigm for the part it does best. The RDBMS can manage references between objects using object IDs, and the key-value storage can hold the serialized object content. This way we keep the tooling an RDBMS provides for handling references between objects and still move a big part of the I/O load to easily scalable key-value storage.

This should take care of problems #1 and #2 very well. Problem #3 is a little more difficult. Without sharding it is easy to find the records that refer to an object being deleted, and skipping sharding might be an option for a mid-scale site, since a significant part of the I/O has already moved to the key-value storage. For a high-scale site, though, sharding is a must, and then you will probably have to set up some sort of background garbage collection process that removes references to deleted objects in all shards.
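
The garbage collection pass could look roughly like the sketch below. It is only an illustration under assumptions: each shard is modeled as an in-memory list of reference rows, the key-value storage as a map, and the names `ReferenceGarbageCollector`, `Ref`, and `sweep` are made up here. In a real deployment each shard would be a separate database and the removal would be a SQL delete.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class ReferenceGarbageCollector {
    // One row of a hypothetical reference table: owner ID -> referenced object ID.
    static final class Ref {
        final long ownerId;
        final long fooId;

        Ref(long ownerId, long fooId) {
            this.ownerId = ownerId;
            this.fooId = fooId;
        }
    }

    // Walks every shard and drops references whose target object is no longer
    // present in the key-value storage. Returns the number of removed references.
    static int sweep(List<List<Ref>> shards, Map<Long, String> keyValueStore) {
        int removed = 0;
        for (List<Ref> shard : shards) {
            for (Iterator<Ref> it = shard.iterator(); it.hasNext();) {
                Ref ref = it.next();
                if (!keyValueStore.containsKey(ref.fooId)) {
                    it.remove();
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

Until the sweep runs, a reference may point at an object that no longer exists, so readers have to tolerate missing objects.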
An important part of this idea is how to implement the API for such a system. I'm now working on an implementation within the Dynamic Dao (ddao.sf.net) framework. At this point I plan to make it look like this:

public interface FooDao {
    // First argument: name of the key-value storage; second: SQL that selects the object IDs.
    @SelectCachedBeans("keyValueStorageName",
            "select foo_id from foo_ref where id=#0# start #1# limit #2#")
    List getFooForUser(long userId, int start, int limit);
}

This will execute the given SQL statement through JDBC, get the list of IDs, retrieve the cached objects for those IDs, and return them in a list.
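
The flow behind such a method might be sketched as follows. This is not the Dynamic Dao implementation: `CachedBeanLookup`, `IdQuery`, and `load` are hypothetical names, the JDBC call is hidden behind a small interface so the example runs without a database, and skipping IDs that are missing from the key-value storage is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class CachedBeanLookup {
    // Stand-in for the JDBC step: returns one page of object IDs for a user.
    interface IdQuery {
        List<Long> selectIds(long userId, int start, int limit);
    }

    // Runs the ID query, then resolves each ID against the key-value storage,
    // preserving the order returned by the query.
    static <T> List<T> load(IdQuery query, Map<Long, T> keyValueStore,
                            long userId, int start, int limit) {
        List<T> result = new ArrayList<T>();
        for (Long id : query.selectIds(userId, start, limit)) {
            T bean = keyValueStore.get(id);
            if (bean != null) { // assumption: dangling references are skipped
                result.add(bean);
            }
        }
        return result;
    }
}
```

The relational side only ever returns IDs, so its rows stay small, while the bulk of the data is read from the key-value storage.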

Sunday, April 19, 2009