Thinking About Ensuring Content Longevity

Just listened to Dave Winer's podcast about building a For The Record Blogging System. As he rightly points out, the problem with such a system isn't the actual technology to create / build the post, but ensuring that the post lives on forever. We have lost so much of the history of the web as companies who hosted our content have gone out of business or changed their focus. People use Medium for this today, but what guarantee do we have that they will be up and running tomorrow / next month / next year?

My initial thought about this was "people should host their own content on their own domain"... and then I thought about it. If I were to die tomorrow, everything I host would slowly fade away:

  1. When my credit cards are cancelled, my host will discontinue service and *poof* my content will be stricken from the web. Same goes for my domains: eventually they'll expire and then "Bye Bye".
  2. If I make arrangements to make sure hosting and domain renewal aren't a problem, eventually there'll be a problem somewhere on the server. Maybe a security update will be released that never gets applied and the server gets hacked. Maybe a drive failure. Maybe DB corruption. At some point the lack of a sysadmin will become a problem.
  3. If I make it through this hurdle, eventually things will slowly slip away via entropy. HTML, CSS and JavaScript features will be deprecated and removed. File formats will fall out of favor with new and shiny things replacing them (think Flash).
  4. Eventually my estate will run out of money and #1 will happen.

Perhaps by random coincidence, the next podcast that popped up on my player was Jason Scott's. For those not in the know, Jason works at the Internet Archive, a non-profit founded by Brewster Kahle which is attempting to archive ... well ... everything. You may know the archive from the Wayback Machine or its collection of older games. Those are only a small part; they are truly trying to be a modern day Library of Alexandria. If it can be digitized and archived, the IA is trying to archive it.

Now, the Wayback Machine isn't the direct solution to this problem, but the fact that these things popped up back to back got me thinking about the problem.