Storing a file in its current state is easy. Split the file into small chunks, calculate the hash of each chunk, compress and encrypt it, and save each resulting piece using the hash as its name.
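A minimal sketch of that pipeline in Python, under a few assumptions of mine: a fixed 64 KiB chunk size, SHA-256 as the hash, zlib for compression, and a local directory standing in for the remote store. The names `store_file`, `encrypt` and `CHUNK_SIZE` are made up for illustration, and `encrypt` is only a placeholder because the standard library ships no cipher.

```python
import hashlib
import zlib
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # assumed size; the text only says "small"


def encrypt(data: bytes) -> bytes:
    """Placeholder (assumption): a real tool would apply a cipher here."""
    return data


def store_file(path: Path, blob_dir: Path) -> list[str]:
    """Split a file into chunks and save each one under its hash.

    Returns the chunk hashes in file order.
    """
    blob_dir.mkdir(parents=True, exist_ok=True)
    hashes = []
    with path.open("rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            # Hash the plain chunk so equal contents get equal names.
            digest = hashlib.sha256(chunk).hexdigest()
            blob_path = blob_dir / digest
            if not blob_path.exists():  # identical chunks are stored once
                blob_path.write_bytes(encrypt(zlib.compress(chunk)))
            hashes.append(digest)
    return hashes
```

Naming pieces by their hash means a chunk that appears twice is only written once, which is a nice side effect of this scheme.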
What about its metadata? You also want to store two other structures: an ordered list of chunks (or blocks, or blobs) needed to restore the contents, and the file's metadata (name, creation time, permissions...).
This leads to at least three types of object, in two categories.
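Those two structures could look roughly like this in Python. The class and field names (`ChunkList`, `FileMetadata`, `created`, `mode`) are my own guesses for illustration, and serializing them to JSON is just one possible choice.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ChunkList:
    # Chunk hashes, in the order needed to rebuild the file.
    chunks: list[str]


@dataclass
class FileMetadata:
    name: str
    created: float  # creation time, seconds since the epoch
    mode: int       # permission bits, e.g. 0o644


def serialize(obj) -> bytes:
    """Turn either structure into bytes that can be stored like any blob."""
    return json.dumps(asdict(obj), sort_keys=True).encode("utf-8")
```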
First category: contents. We store them remotely and don't use them again until we have to restore the file.
Second category: metadata. In addition to storing it remotely, we need to keep a copy locally in order to detect changes to the filesystem and keep track of them.
A clear architectural separation seems possible: backup logic doesn't need to know about storage, and vice versa.
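To make the "detect changes" part concrete, here is a small Python sketch that compares a fresh scan of a directory against a previously saved one. Using size and modification time as the change signal is an assumption on my part, and `snapshot` and `diff` are hypothetical names.

```python
import os
from pathlib import Path


def snapshot(root: Path) -> dict[str, tuple[int, int]]:
    """Map each file's path (relative to root) to (size, mtime in ns)."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            st = full.stat()
            index[str(full.relative_to(root))] = (st.st_size, st.st_mtime_ns)
    return index


def diff(old: dict, new: dict) -> tuple[list[str], list[str], list[str]]:
    """Return (added, removed, changed) paths between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, changed
```

The old snapshot is exactly the metadata that has to live locally: without it there is nothing to compare the current filesystem against.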
Groups, replication and communication with peers should be done by the storage subsystem.
So... first decision made: create a storage subsystem. It could be a program on its own. It should only know about blobs and locations. You should be able to configure groups of peers in it and then tell it to store a blob in a group. Should restoration be done through it too? Not really sure, but it seems just fine. The same subsystem could be asked to get a blob and would know where to find it. Not sure yet whether the name of the blob should be enough or whether the group that stores it should be specified. Shouldn't all that group stuff be invisible to the backup subsystem? I think so, but it'll require further consideration.
It would be nice to explore the storage subsystem used by camlistore. Could I develop a storage server as a backend that could be plugged into camlistore? A local storage subsystem could be developed too. Or maybe I could just use camlistore's storage subsystem?
Maybe many instances of the storage subsystem could exist, each one associated with a group of peers. The outcome is that the storage subsystem doesn't need to know anything about groups; the backup must handle them.
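A rough in-memory sketch of such a subsystem in Python. Everything here is hypothetical (the class and method names, and peers modelled as plain dictionaries instead of remote machines), and it picks one of the open options: the blob name alone is enough to get a blob back, because the subsystem remembers which group holds it.

```python
class StorageSubsystem:
    """In-memory stand-in: knows only blobs, peers and groups of peers."""

    def __init__(self):
        self._groups = {}     # group name -> list of peer names
        self._peers = {}      # peer name -> {blob name: bytes}
        self._locations = {}  # blob name -> group name

    def configure_group(self, group, peers):
        self._groups[group] = list(peers)
        for peer in peers:
            self._peers.setdefault(peer, {})

    def store(self, name, data, group):
        # Replicate the blob to every peer in the group.
        for peer in self._groups[group]:
            self._peers[peer][name] = data
        self._locations[name] = group

    def get(self, name):
        # The caller gives only the name; the location is looked up here.
        group = self._locations[name]
        for peer in self._groups[group]:
            if name in self._peers[peer]:
                return self._peers[peer][name]
        raise KeyError(name)
```

Usage would be something like `s.configure_group("family", ["alice", "bob"])`, then `s.store("abc123", data, "family")` and later `s.get("abc123")`.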
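For comparison, the one-instance-per-group variant might look like this in Python. Again, `GroupStore` and `Backup` are names I made up, and peers are just dictionaries. Here the storage side has no notion of groups at all, and the backup side keeps the mapping from blob to group.

```python
class GroupStore:
    """One storage instance bound to a fixed set of peers; no group concept."""

    def __init__(self, peers):
        self._peers = {peer: {} for peer in peers}

    def store(self, name, data):
        # Replicate to every peer this instance talks to.
        for blobs in self._peers.values():
            blobs[name] = data

    def get(self, name):
        for blobs in self._peers.values():
            if name in blobs:
                return blobs[name]
        raise KeyError(name)


class Backup:
    """The backup side picks the instance, so it is the one handling groups."""

    def __init__(self, stores):
        self._stores = stores  # group name -> GroupStore
        self._where = {}       # blob name -> group name

    def put(self, name, data, group):
        self._stores[group].store(name, data)
        self._where[name] = group

    def fetch(self, name):
        return self._stores[self._where[name]].get(name)
```

The trade-off shows up right away: the storage code gets simpler, but the backup code now has to remember where every blob went.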
As I said, I need to think about it a little bit longer.