Thursday, November 29, 2012

Deathstar and reinventing the wheel

I've spent most of the week reading papers about techniques for distributed backups. There seems to be plenty of theory about it, but no free software application in active development.
The only existing code I have found is DIBS (Distributed Internet Backup System), but its last commit seems to date back to 2009. I'll study its code to learn about the problem I'm trying to solve.
After this bit of research I've come to the conclusion that the program I'm writing is comparable to building a Deathstar, not a kennel as I first thought. So, even sticking to the initial idea of "evolutive development", I need to gather quite a lot of information before producing any actual code.
Another issue that keeps bothering me is the idea that I may be reinventing the wheel. It seems that the rsync protocol is well suited to the problem of transmitting data between peers. If that's the case, I will try to use it. Even better, I will try to find an existing implementation or library in Go and use it. I need to make an effort to remind myself: It's free software. Reuse when possible. This mantra should be extended to the whole development process: try to rely on existing standards and working implementations as much as possible.
So, for a while, expect activity on this blog, but not much in the repository. It's time for reading and summarizing information.

Saturday, November 24, 2012

Directory syncing

Directory syncing, as well as finding time, has proven to be harder than I expected three weeks ago. Short version: I failed my first two-week goal.
But the project is alive and I have learned quite a lot. Right now the program can compare two folders, reporting which files should be copied, which ones are OK and which ones should be erased. This functionality isn't finished, because it doesn't check modification dates yet.
I'm using filepath.Walk to traverse the directory tree. Two independent goroutines are launched, one per directory, and the main function reads information from both of them. As filepath.Walk visits files in lexical order, the main function can tell which action should be taken with each file.
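A minimal sketch of that idea, not the actual peerbackup code: the names walk, emit, lessPath, compareTrees and Action are made up for illustration. One detail worth noting is that the order in which filepath.Walk visits paths is lexical per directory, so paths have to be compared component by component rather than as plain strings (otherwise "a.txt" would sort before "a/b").

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Action is a hypothetical record of what should happen to one path.
type Action struct {
	Op   string // "copy", "ok" or "delete"
	Path string
}

// walk sends every path below root (relative to root) on a channel,
// in the order filepath.Walk visits them. Walk errors are dropped here
// to keep the sketch short.
func walk(root string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		_ = filepath.Walk(root, func(p string, info os.FileInfo, err error) error {
			if err != nil {
				return err
			}
			rel, err := filepath.Rel(root, p)
			if err != nil {
				return err
			}
			if rel != "." {
				out <- rel
			}
			return nil
		})
	}()
	return out
}

// emit feeds a fixed list of paths into a channel (handy for experiments).
func emit(paths ...string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, p := range paths {
			out <- p
		}
	}()
	return out
}

// lessPath compares two relative paths component by component,
// which matches the visiting order of filepath.Walk.
func lessPath(a, b string) bool {
	as := strings.Split(filepath.ToSlash(a), "/")
	bs := strings.Split(filepath.ToSlash(b), "/")
	for i := 0; i < len(as) && i < len(bs); i++ {
		if as[i] != bs[i] {
			return as[i] < bs[i]
		}
	}
	return len(as) < len(bs)
}

// compareTrees merges the two ordered streams and decides an action per path.
func compareTrees(src, dst <-chan string) []Action {
	var actions []Action
	s, sOK := <-src
	d, dOK := <-dst
	for sOK || dOK {
		switch {
		case !dOK || (sOK && lessPath(s, d)): // only in source
			actions = append(actions, Action{"copy", s})
			s, sOK = <-src
		case !sOK || lessPath(d, s): // only in destination
			actions = append(actions, Action{"delete", d})
			d, dOK = <-dst
		default: // present in both (contents and dates are not checked)
			actions = append(actions, Action{"ok", s})
			s, sOK = <-src
			d, dOK = <-dst
		}
	}
	return actions
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: compare SRC DST")
		return
	}
	for _, a := range compareTrees(walk(os.Args[1]), walk(os.Args[2])) {
		fmt.Printf("%-6s %s\n", a.Op, a.Path)
	}
}
```

Like the program described here, the sketch only compares names; it does not look at modification dates or file contents.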
But this walk-and-compare implementation isn't good enough, for several reasons.
First of all, filepath.Walk's behaviour is perfect for copying and creating directories and files, but not appropriate for deleting them: it visits a directory before its contents, whereas we must delete the files first and their parent directory afterwards. This problem could be fixed by keeping a list of files to be deleted and performing the deletion at the end, but honestly I don't think this would be a good solution. So a more customized version of filepath.Walk is needed.
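One possible shape for such a customized walk, as a sketch under assumptions rather than a decided design: a recursive traversal that visits the contents of a directory before the directory itself. The names walkPostOrder and demo are hypothetical, and os.ReadDir and os.MkdirTemp need a reasonably recent Go release.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// walkPostOrder calls visit for every entry below dir, always visiting
// the contents of a directory before the directory itself.
// The root directory dir is not passed to visit.
func walkPostOrder(dir string, visit func(path string) error) error {
	entries, err := os.ReadDir(dir) // entries come back sorted by name
	if err != nil {
		return err
	}
	for _, e := range entries {
		p := filepath.Join(dir, e.Name())
		if e.IsDir() {
			if err := walkPostOrder(p, visit); err != nil {
				return err
			}
		}
		if err := visit(p); err != nil {
			return err
		}
	}
	return nil
}

// demo builds a small temporary tree, deletes it entry by entry and
// returns the order in which the paths were visited.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "postorder")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	if err := os.MkdirAll(filepath.Join(root, "a", "b"), 0o755); err != nil {
		return nil, err
	}
	if err := os.WriteFile(filepath.Join(root, "a", "b", "f.txt"), []byte("x"), 0o644); err != nil {
		return nil, err
	}
	var order []string
	err = walkPostOrder(root, func(p string) error {
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		order = append(order, filepath.ToSlash(rel))
		return os.Remove(p) // would fail on a directory that is not empty
	})
	return order, err
}

func main() {
	order, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range order {
		fmt.Println("deleted", p)
	}
}
```

Because os.Remove refuses to delete a non-empty directory, the demo only succeeds if every file is removed before its parent, which is exactly the ordering that filepath.Walk does not give.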
Secondly, once the program has evolved and the remote encrypted copy is made, it will be inefficient, if not impossible, to check the remote tree as we do now. A local database with information about the last backup must be kept, and the local program should verify files against this database. Looking after the remote files will be the task of the remote daemon.
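To make the idea a bit more concrete, here is a rough sketch of what checking against such a database could look like. Everything in it is an assumption: the database is reduced to an in-memory map, and the names Entry, Manifest and changes, as well as the choice of size plus modification time as the change signal, are invented for illustration and not a design decision.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical record of what was known about a file
// at the time of the last backup.
type Entry struct {
	Size    int64
	ModTime int64 // modification time as Unix seconds
}

// Manifest maps a relative path to its recorded Entry.
type Manifest map[string]Entry

// changes compares the manifest of the last backup with the current
// local state, without touching the remote copy. It returns the paths
// that are new or modified (upload) and the paths that no longer exist
// locally (remove), both sorted.
func changes(last, current Manifest) (upload, remove []string) {
	for p, e := range current {
		if old, ok := last[p]; !ok || old != e {
			upload = append(upload, p)
		}
	}
	for p := range last {
		if _, ok := current[p]; !ok {
			remove = append(remove, p)
		}
	}
	sort.Strings(upload)
	sort.Strings(remove)
	return upload, remove
}

func main() {
	last := Manifest{
		"notes.txt": {Size: 10, ModTime: 100},
		"old.txt":   {Size: 20, ModTime: 100},
		"photo.jpg": {Size: 30, ModTime: 100},
	}
	current := Manifest{
		"notes.txt": {Size: 12, ModTime: 200}, // modified
		"photo.jpg": {Size: 30, ModTime: 100}, // unchanged
		"new.txt":   {Size: 5, ModTime: 300},  // added
	}
	upload, remove := changes(last, current)
	fmt.Println("upload:", upload)
	fmt.Println("remove:", remove)
}
```

A real version would have to persist the manifest on disk and would probably want a stronger change signal than size and date, such as a checksum.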
I'm finishing the current directory-comparison implementation anyway, in order to detect more pitfalls and learn file manipulation in Go. When it's finished I will start working on the database.

Friday, November 2, 2012

About Peerbackup

Almost forgot about it. What's this blog about?
I'm writing a program to back up files remotely between peers. The idea is simple: I want to back up my files automatically to some remote place and I don't want to rely on private companies to store them.
Here's how it should work: a group of people install the program and agree to dedicate an amount of hard drive space to storing each other's encrypted files. The program runs in the background, as a daemon, detects changes in the original files and synchronizes them to the remote backup.
So, I need to learn about communications, cryptography and filesystems.
The project is free software and is hosted on GitHub (https://github.com/libercv/peerbackup)
Don't expect it to be useful in the near future. It's a project for learning, so I'm sure I will make lots of silly mistakes.
Of course, help and guidance would be much appreciated.

Thursday, November 1, 2012

Two week goal

A friend of mine told me last week that in his new job they use a two-week planning system. They set goals to be achieved in two weeks. When those two weeks have passed, they repeat the process.
I think it's a good way to start, so I'll set the first two-week goal.
Deadline: Nov 18th 2012.
Achievement: Synchronize files between two directories

Let's go!