A better filesystem with tagging?

Recently I had cause to reflect on just how inconvenient today's file systems are for organising information. I don't know about you, but my desktop and home directory rapidly fill up with odds and ends that I have downloaded or created. I can start off with a clean home directory and within the space of six months it's a complete mess.

File systems contain very little user focused meta-data

Often it is a new technology that I wish to become familiar with, or an interesting tutorial or discussion paper. Other times it's correspondence with the insurance company or some other institution that I want to file for later reference. If you are like me, you don't spend much time designing an information architecture for your storage needs and start off with general folders like "downloads" or "documents".

Pretty soon you realise this categorisation is too general, but you don't have the time to re-organise all those files. It gets worse because you start storing information in different places, and trying to find that ebook or tar file becomes almost impossible. Over time, as you learn more about a subject area, your classifications start to change and you realise that something has been mis-classified. For example, you may put Hibernate and JPA under Java but later want to put them under ORM, or under both.

The fundamental problem here is that filesystems do not contain user focused meta-data. There is plenty of meta-data for creation time, last access, last modification, read-only status and so on, but from a user's point of view this information has very little relevance.

The folder analogy does not scale

What file systems need to do is hide the implementation details of where a file is stored and provide a convenient user interface. I can't help but feel the folder analogy for file systems is outdated and does not scale.

Desktop search has helped

Sure, search is becoming more sophisticated. Beagle can be a real help with searching the contents of PDF and spreadsheet files, but often the word you search for is not the word you used in your documents. It is also completely useless for binary files, and being a developer I have a lot of these.

Social Bookmarking for file systems?

All this got me thinking that what file systems need is a way to store user meta-data, so that a user can tag files with additional information much as del.icio.us does for bookmarks. This would let the user give a file multiple categorisations, which aids search.
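To make the idea concrete, here is a minimal sketch of what a tag layer over ordinary file paths might look like. It is only an in-memory illustration, not a real filesystem feature: the `TagIndex` class, its method names and the example paths are all made up for this post.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory many-to-many mapping between file paths and tags."""

    def __init__(self):
        self._tags_by_file = defaultdict(set)
        self._files_by_tag = defaultdict(set)

    def tag(self, path, *tags):
        """Attach one or more tags to a file; tags are stored in lower case."""
        for name in tags:
            name = name.lower()
            self._tags_by_file[path].add(name)
            self._files_by_tag[name].add(path)

    def tags_for(self, path):
        """Return the tags attached to a file, sorted alphabetically."""
        return sorted(self._tags_by_file.get(path, set()))

    def find(self, *tags):
        """Return the files that carry every one of the given tags."""
        if not tags:
            return []
        matches = [self._files_by_tag.get(name.lower(), set()) for name in tags]
        return sorted(set.intersection(*matches))


# Example paths are invented; the same file can sit under several categories.
index = TagIndex()
index.tag("downloads/hibernate-tutorial.pdf", "java", "orm")
index.tag("downloads/jpa-notes.pdf", "java", "orm", "jpa")
index.tag("documents/insurance-letter.pdf", "insurance")

print(index.find("orm"))
print(index.find("java", "jpa"))
```

The point of the sketch is that the Hibernate tutorial no longer has to live under either Java or ORM: it carries both tags, and a later re-classification is just one more call to `tag`.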

In addition, it would be beneficial to have an external database, keyed by MD5 checksum, that stored common tags for files, like any social bookmarking site. Then, at a later date, after my conceptual model of a problem domain has developed, I could search my files with tags I did not assign myself. The MD5 idea would only work for binary files and publicly available material such as tutorials. Of course people will have security concerns about sending their tags and a summary of their filesystem content to an external server, but millions of people are already doing this for mail and bookmarks.
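As a rough sketch of the checksum idea, the snippet below hashes a file with Python's standard `hashlib` and looks the digest up in a shared tag store. The `SharedTagStore` class is purely hypothetical: it is a local dictionary standing in for the external database, and no such public service is assumed to exist.

```python
import hashlib
from collections import Counter


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class SharedTagStore:
    """Hypothetical stand-in for the external database: checksum -> tag counts."""

    def __init__(self):
        self._counts = {}

    def publish(self, checksum, tags):
        """Record that some user applied these tags to the file with this checksum."""
        counter = self._counts.setdefault(checksum, Counter())
        counter.update(name.lower() for name in tags)

    def common_tags(self, checksum, limit=5):
        """Return the most frequently applied tags for a checksum, most common first."""
        counter = self._counts.get(checksum)
        if counter is None:
            return []
        return [name for name, _ in counter.most_common(limit)]
```

Because the key is a checksum of the content, two people who downloaded the same tutorial get the same tags regardless of what they named the file or where they stored it. A file that has been edited locally hashes differently, which is why the idea only suits unchanged, widely shared files. MD5 is enough for this kind of matching, but it is not collision resistant, so it should not be relied on where someone might deliberately craft a clashing file.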

I guess there is someone out there doing research on this. Let's hope there is a significant development in file systems soon that focuses on the user. Let's also hope it's not patented to death.

I also think that in the long run the evolution of filesystems will be linked to the evolution of storage. It is just a matter of time before we are storing our entire filesystem online, with access from any location and no need for users to worry about organising their data, as the filesystem will do it for them.



by Dr. Radut.