Is Vint Cerf wrong?

ubuysa

The BSOD Doctor
For those who don't know, Vint Cerf and Bob Kahn developed the protocols that became TCP/IP for the DARPA network (ARPANET) back in the 1970s, and that led directly to the development of the Internet we know today. My tiny claim to fame is that I once attended a seminar given by Vint Cerf many years ago whilst I was working in the USA.

But I digress.

It's reported today (http://www.bbc.com/news/science-environment-31450389) that Mr Cerf is concerned about a future 'data black hole' because the hardware and software platforms to which we entrust our digital data will be so obsolete in the future that historians may never be able to access the data. He is suggesting that we preserve a 'snapshot' (his words) of the data and the hardware/OS/application platform together so that the data created by the application can be accessed in the future.

I think he's wrong (and I think there's more than a whiff of Google in what he says - who, for example, will keep these 'snapshots'? Care to guess?).

For a man who built his reputation on standards (TCP/IP is not a physical thing, it's a set of standards), I'd have thought he'd be much keener on developing more open standard data formats. I'm thinking of the Portable Document Format (PDF, created by Adobe and published as the ISO 32000 standard since 2008), the Open Document Format (ODF, used by OpenOffice and others), as well as JPEG, MPEG, etc. I don't see that the best way to preserve our digital data is to save hardware/OS/application images; it's simply to use open-standard data formats exclusively when writing the data. All we then need to do is ensure we maintain a detailed description of the standards into the future, and we'll always be able to recover the data regardless of the hardware/OS/application platform then in use. So I'm very surprised to see someone of Mr Cerf's pedigree talking about system 'snapshots' rather than open standards (TCP/IP is an open standard, after all).
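To make the open-standards point a little more concrete, here is a minimal sketch in Python of reading text out of an ODF-style file using nothing but generic ZIP and XML tools. It assumes only what the OpenDocument specification publishes: the file is a ZIP package containing a `mimetype` entry and a `content.xml` whose paragraphs are `text:p` elements. The helper names (`make_sample_package`, `extract_paragraphs`) are hypothetical, and the sample package is deliberately simplified (a real .odt also carries a manifest, styles and so on), so treat this as an illustration of the idea rather than a complete ODF reader.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces published in the OpenDocument specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def make_sample_package(paragraphs):
    """Build a tiny ODF-style package in memory (hypothetical test data).

    Simplified on purpose: a real .odt also contains
    META-INF/manifest.xml, styles.xml, meta.xml, etc.
    """
    root = ET.Element(f"{{{OFFICE_NS}}}document-content")
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    for paragraph in paragraphs:
        ET.SubElement(text, f"{{{TEXT_NS}}}p").text = paragraph

    buf = io.BytesIO()
    # ZipFile stores entries uncompressed by default (ZIP_STORED).
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr("content.xml", ET.tostring(root, encoding="utf-8"))
    return buf.getvalue()


def extract_paragraphs(package_bytes):
    """Recover the MIME type and paragraph text without any office suite."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        mimetype = zf.read("mimetype").decode("ascii")
        root = ET.fromstring(zf.read("content.xml"))
    # itertext() also picks up text nested in child elements such as spans.
    paragraphs = ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
    return mimetype, paragraphs


if __name__ == "__main__":
    data = make_sample_package(["Is Vint Cerf wrong?", "Open standards last."])
    print(extract_paragraphs(data))
```

Nothing here depends on the application, OS or hardware that wrote the file; the reader needs only the published description of the container and the XML vocabulary, which is the whole argument for preserving the standard rather than the platform.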

To be fair, at the end of the BBC article linked above he does mention standards, but in a rather off-hand way. I rather think he's speaking not as Vint Cerf, Father of the Internet, but as Vint Cerf, Vice-President of Google Inc.

What does the forum think? Is Vint Cerf right? Are hardware/OS/application repositories the best way to preserve our digital data, or are public open standards more future-proof?
 

Tom DWC

Moderator
I was reading that article on the bus yesterday, and it did get me thinking. My take on it...

There are a few situations I can think of where things I own have already become inaccessible, in the form of a few old games. I installed Empire Earth (2001) for a bit of nostalgia this week only to find it won't run from Windows 7 onwards, or on a virtual machine either - with no known fixes. It's fairly uncommon at the moment, but such instances are only going to increase as software and hardware continue to change, making backwards compatibility more difficult - especially with more complex media such as games. While at the moment it would still be possible to buy a second-hand XP machine, or install XP natively, it's inevitable that the availability of old second-hand machines will eventually fade out, and older operating systems will cease to function on the hardware of the day. At this point I guess the game could be considered 'lost' to time, and that could well happen within the next 15-20 years. Though even then, the game will always have a documented history on the internet, through Wikipedia and countless other websites. It definitely wouldn't be entirely lost to future generations, and the information will still be out there if they wanted to find it. The internet already provides the snapshot; its accessibility has declined, but it's certainly not lost in a data black hole.

In most other instances that come to my mind, if for example you back up your CDs to .flac and your Blu-rays to .mkv, and assuming you take precautions against hardware failure, they aren't going anywhere. You have your digital data preserved in a lossless, open format that can be converted to any other format in future.

I'm probably not thinking on a grand enough scale or far enough ahead. I mean, in the long term, technology is likely to evolve beyond anything like what we have now, and the way we do things will change. But if you look at our history, we're pretty good at keeping a record of stuff, and no doubt we'll manage to find a way with digital content too.
 