thingelstad.com

Death by Metadata

My software needs to start doing more work for me. Really. Either that or I need to get a digital librarian. What for, you ask? I feel like I’m dying under the weight of metadata: data about data.

The vast archive of 33,000 digital photos that I have, the 400GB of music I’ve ripped from my CD collection and the hours of digital video are all great. It’s awesome, and I wouldn’t go back for anything. However, all this digital data has metadata as well, and that metadata doesn’t just magically exist. It has to be placed there somehow.

For example, I’ve recently moved all of my photos into Adobe Lightroom. Lightroom is awesome and is light-years (no pun intended) ahead of the file-system approach I used to manage my photos with. One of the great things it can do is tag photos (bill, bob, cat, dog, house, etc.) and put them in collections (Wedding Photos, France Vacation, etc.). After tagging, finding those pictures of Bill, Bob and a cat is as easy as a simple query. No more digging through hundreds or thousands of files. Want a picture of a squirrel? One click will do.
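To make that "simple query" concrete, here is a minimal sketch in Python of what a tag lookup boils down to. It is not how Lightroom actually stores or searches its catalog; the file names, the photo_tags dictionary and the find_photos function are all made up for illustration.

```python
# Hypothetical photo library: file name -> set of tags.
# Every name here is invented for illustration; this is not Lightroom's format.
photo_tags = {
    "IMG_0001.jpg": {"bill", "bob", "cat"},
    "IMG_0002.jpg": {"bill", "house"},
    "IMG_0003.jpg": {"squirrel"},
    "IMG_0004.jpg": {"bill", "bob", "cat", "dog"},
}


def find_photos(library, *wanted):
    """Return the sorted file names whose tags include every wanted tag."""
    wanted_tags = set(wanted)
    # A photo matches when the wanted tags are a subset of its own tags.
    return sorted(name for name, tags in library.items() if wanted_tags <= tags)


print(find_photos(photo_tags, "bill", "bob", "cat"))
print(find_photos(photo_tags, "squirrel"))
```

The lookup itself is trivial. All of the effort is in filling in those tag sets in the first place.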

The rub, though, is that those tags have to be created, and who’s going to do that? I want my software to get smarter and start doing this for me. Facial recognition isn’t easy, but it is possible. My Mac Pro sits powered on for hours at night. Let’s burn some CPU cycles identifying faces in photos. I’m happy to train it.

Music has a similar problem. ID3 tags are a must for a large music collection. If you don’t have them, or if they are wrong, forget it. Luckily, software has already done some work for me here, and it’s pretty easy to tag things from the big central databases on the net. However, now I face the harsh reality that 70% of my music by volume is tagged as “Rock”. Why? Because a music catalog would call it all Rock, even though I think it’s totally different. And here there is a big problem, because no group of people will all agree on what genre AC/DC is – Rock, Metal, Classic Rock, Garbage.

I want to leverage all this great power of the digital world, and the promise is amazing, but software needs to give people simple, easy and quick ways to start associating metadata with their files. The photography example is obvious to me. I don’t care if it takes a year to do 30,000 pictures – just as long as it does it. And I’m fine sitting down for a few minutes and identifying people and things for it as it trains. This stuff exists; it just needs to be integrated in the right way.

Hopefully that will happen soon.


| 2007 |