Friday, March 12, 2010

Awesomely fast key value store design

I've got an interesting idea for an awesomely fast key-value store.

Writing data:
  1. When a data change request comes to the Client API, it stores the changed data in Temp Storage. The change request is also stored in the Persistent Queue.
  2. The Queue Processing Job picks up the queued update and applies it to the data in persistent storage.
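
Here is a minimal sketch of the write path, just to make the two steps concrete. Everything in it is an assumption for illustration: plain dicts stand in for Temp Storage and the persistent store, a deque stands in for the Persistent Queue, and the names write and process_queue are made up.

```python
from collections import deque

# In-memory stand-ins (assumptions): a real setup would use something like
# memcached, a durable queue, and a database.
temp_storage = {}
persistent_queue = deque()
persistent_store = {}

def write(key, value):
    """Step 1: put the change in Temp Storage and enqueue the change request."""
    temp_storage[key] = value
    persistent_queue.append((key, value))

def process_queue():
    """Step 2: apply queued updates to the persistent store, oldest first."""
    applied = 0
    while persistent_queue:
        key, value = persistent_queue.popleft()
        persistent_store[key] = value
        applied += 1
    return applied
```

The caller only pays for the Temp Storage write and the enqueue; the persistent store is updated later, whenever process_queue runs.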

Reading data:
  1. A read request comes to the DB Client API, which checks whether data for this request is available in Temp Storage. If it is, it uses that data; otherwise it goes to the Persistent Store.
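
The read path could look roughly like this. It is only a sketch: the function name read is hypothetical, and both stores are passed in as plain dicts rather than real memcached or DB clients.

```python
def read(key, temp_storage, persistent_store):
    """Prefer Temp Storage (it holds the newest changes); otherwise use the persistent store."""
    if key in temp_storage:
        return temp_storage[key]
    # Not in Temp Storage: either never changed recently or already evicted.
    return persistent_store.get(key)
```

For example, read("a", {"a": 2}, {"a": 1}) returns the newer value from Temp Storage, while a key missing from both stores comes back as None.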

This approach allows writing to a distributed DB at the speed of writing to a persistent queue, which is usually way faster than DB updates. And reads can be easily scaled by adding additional replicas.

Potential issues with this design:
  1. Temp Storage will eventually start overflowing. It's hard to get memcached storage capacity as big as Persistent Storage capacity. When that happens, the DB Client is going to fall back to the Persistent DB for records that have been pushed out of Temp Storage, so we need to make sure that Queue Processing is already done for the matching queued records.
  2. Temp Storage based on memcached is not that reliable, and if it goes down we might lose data consistency for a short period of time, until the data currently in the queue is propagated to the persistent DB. Its reliability can be improved, but let's look at what might happen in this scenario. The first thing that will happen is that users will temporarily lose their changes for records that are still in the queue. This might not be that bad, considering it'll happen just for a short period of time. But if a user who already has a change waiting in the queue submits yet another change to the system at that moment, it might lead to a situation where a change is permanently lost.
  3. This design relies very heavily on the Queue Processing job being reliable and fast enough, so it should be well designed. On the other hand, this design allows the Queue Processing job to be temporarily stopped (for some maintenance tasks) without affecting the end user. That is, as long as Temp Storage is big enough and the job is fast enough to catch up with the queued changes later.
  4. Like any key-value store, this design has difficulty dealing with concurrent updates to the same record.
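
To make issue 2 more concrete, here is a small simulation of one way a change could end up permanently lost. It rests on assumptions that are not part of the design above: each write stores the whole record, the user's second change is built from whatever a read returns, and the record "user1" with its name and email fields is made up for the example.

```python
from collections import deque

def simulate_lost_update():
    # In-memory stand-ins (assumptions) for the three components.
    persistent_store = {"user1": {"name": "A", "email": "x"}}
    temp_storage = {}
    queue = deque()

    def write(key, record):
        temp_storage[key] = dict(record)
        queue.append((key, dict(record)))

    def read(key):
        if key in temp_storage:
            return dict(temp_storage[key])
        return dict(persistent_store[key])

    # First change: rename. It sits in the queue, not yet applied.
    record = read("user1")
    record["name"] = "B"
    write("user1", record)

    # Temp Storage goes down and comes back empty.
    temp_storage.clear()

    # Second change is built on the stale record from the persistent store.
    record = read("user1")
    record["email"] = "y"
    write("user1", record)

    # The Queue Processing job catches up, applying changes in order.
    while queue:
        key, queued = queue.popleft()
        persistent_store[key] = queued

    return persistent_store["user1"]
```

In this run the rename to "B" is applied first and then overwritten by the second queued record, which still carries the old name, so the final record has name "A" and email "y".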

