Robert Kahn, one of the inventors of the Internet, has a plan for how it can evolve further – and be both open and secure at the same time
For much of its early years, the Internet was an unalloyed good-news story: Scientists collaborating, companies growing, communities forming. But lately we’ve been seeing more of its dark side: hacking, trolling, fake news. So what’s next?
Robert E. Kahn, an American software engineer credited as one of the “fathers” of the Internet, has an idea or two about that. While working for the Pentagon’s Defense Advanced Research Projects Agency (DARPA), Kahn organised the first public demonstration of what became the Internet in 1972. A year later, he and Vinton Cerf (now at Google) developed the Internet’s key architectural idea, the Transmission Control Protocol (now called TCP/IP), an approach to connect networks and the computers attached to them, no matter where they are or how they’re built. Others piled new ideas on top of that – browsers and the Web, encryption and social media, the cloud and e-commerce.
Kahn left DARPA in late 1985, and now heads a US-based non-profit organisation, the Corporation for National Research Initiatives. In an interview with Science|Business, edited for clarity, Kahn outlines what he thinks should be the next big idea to improve the Internet: the “Digital Object Architecture”. Kahn also elaborated on his ideas on 25 September at a Science|Business conference in Brussels on open science and innovation.
Q. The Internet has grown up. So what’s the next challenge?
A. We have around three billion devices on the Internet. The Internet of Things (in which billions of sensors and machines – from factories to farms to houses – will be connected) is on our immediate horizon. That’s projected to be much bigger than the current Internet. This is a big opportunity space, and there is no agreement as to who, if anyone, will provide the conceptual leadership that may be needed. This is similar to the situation with the Internet in the early days.
Q. How is that?
A. When we were starting to work on computer networking, many of the people in the US likely to be most affected by our work showed little or no interest. We had one dominant carrier at the time, AT&T. I had meetings with them, but they said they’d take the lead in their own time when the business prospects were clearer and that, in the meantime, they could provide us with telecommunication lines to play around with if we could pay for them. A few years later there were many different corporate solutions proposed for computer networking, and each company aspired to play a major role in the market. Suddenly interoperability became important.
I have never taken a position about a specific choice of vendors or detailed technology choices. But, whatever choices are made, something will be needed that makes them work together in some sense. We had all these different computers and networks to deal with, and we wanted to simplify the interoperability aspect. And that’s what we did by introducing TCP/IP and routers (which we called gateways back then). TCP/IP was adopted because it worked and wasn’t burdened by proprietary claims from any of the big companies or, indeed, anyone else. As an architecture, it was publicly available and without charge.
The Internet architecture has remained remarkably stable since that time. Almost 50 years. Meanwhile the underlying technology has scaled up by a factor of about 10 million: we were using dial-up modems in the 300-bit-per-second range back then, and now we’re in the gigabit-per-second range. All the parameters have been going up, yet the basic architecture has largely remained unchanged – which means it has lasted through many generations of change in the underlying technologies. We just tried to make it easy for the bits to get from one computer to another regardless of which networks they were on.
Q. So why change anything? What is your proposal?
A. Our approach is a logical extension of today’s Internet. The Digital Object Architecture is a way of naming and organising everything online so it can be easily found. Everything online – a newspaper article in a library, a smart electric meter in a home, or even a person online – when represented in digital form, would have its own unique digital identifier. And with that identifier, you could easily find it online regardless of where in the world it is or how it’s connected.
Well, this approach has already been adopted in many quarters. Like the basic Internet architecture, the Digital Object Architecture is independent of the various components that interoperate using it. Documents could be in Microsoft Word, or whatever. Storage can be in databases, or cloud services. The architecture doesn’t even guarantee that you know what information is incorporated in a digital object, as it may be encrypted or subject to some proprietary means of coding.
In the Internet of Things, you might have 100 billion devices online all of which may be generating digital information. How will this work in the future? Certainly, no single system can handle that large a load. And nobody’s going to want to store it all in any one place. If it’s all distributed how can it be discovered or obtained with sufficient access controls or security? Well, if all this information is structured as digital objects, and all of it is potentially accessible from its own identifier, we have the makings of a workable system – provided the security is adequate.
The key thing about a digital object is the unique, persistent identifier associated with it. If I tell you the identifier, you can get that object wherever it happens to be. It’s just like the way the Internet today masks all the details of what computers you’re using, what country you’re in, what routers are there. You take technology substantially out of the picture.
Here’s how it works. Each organisation would obtain a unique number for a prefix that identifies an object as part of that organisation’s system. You could be a company, a university, a governmental body or even an individual – you control your information yourself. A decentralised registry keeps track of the prefixes, so if I allotted your organisation a prefix – say 1015 – the registry would understand that identifiers starting with that prefix had to be from you. Then every digital object your organisation controls would have a unique suffix – say, “12345” or “living room temperature” or “sensor xyz” – and a local registry has metadata about that digital object. That metadata is information about where the digital object is, in what repository. And then you or programs acting on your behalf go get it if you have the authorisation to access it. And you’re done.
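The lookup Kahn describes can be sketched roughly in Python. This is only an illustration under stated assumptions: the “/” separator between prefix and suffix, the registry contents, the repository host names and the function names are all hypothetical and are not taken from the interview.

```python
# Hypothetical global registry: prefix -> organisation that was allotted it.
PREFIX_REGISTRY = {"1015": "example-org"}

# Hypothetical local registries: organisation -> suffix -> metadata record.
LOCAL_REGISTRIES = {
    "example-org": {
        "12345": {"repository": "repo-a.example.org"},
        "living room temperature": {"repository": "repo-b.example.org"},
    }
}


def split_identifier(identifier):
    """Split 'prefix/suffix' at the first '/' (the separator is an assumption)."""
    prefix, sep, suffix = identifier.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix identifier: {identifier!r}")
    return prefix, suffix


def resolve(identifier):
    """Return the metadata record that says where the digital object lives."""
    prefix, suffix = split_identifier(identifier)
    organisation = PREFIX_REGISTRY.get(prefix)
    if organisation is None:
        raise KeyError(f"unknown prefix: {prefix}")
    record = LOCAL_REGISTRIES[organisation].get(suffix)
    if record is None:
        raise KeyError(f"unknown suffix for prefix {prefix}: {suffix}")
    return {"organisation": organisation, **record}


if __name__ == "__main__":
    print(resolve("1015/living room temperature"))
```

The point of the two-step lookup is that the caller only ever needs the identifier; which repository holds the object can change without the identifier changing.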
Q. But if there are more objects online, won’t that just expand the hacking problem?
A. No, this system has security built in. For instance, maybe it’s your medical record, and maybe it’s only you and your doctor who should be able to access it; if so, that access information is stored with the digital object in the registry. Then, if you ask to see the record, the system checks: Are you authorised to access it? The system asks to validate who you are. Or it blocks you because you aren’t authenticated for access. The encryption could be anything in the future – perhaps involving quantum technology after that becomes widely available.
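A minimal Python sketch of that check follows, assuming the registry record carries a set of authorised parties alongside the object's location. The class, field and function names are hypothetical, and real authentication would involve credentials rather than a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryRecord:
    """Hypothetical registry entry: where the object is and who may access it."""
    repository: str
    authorised: set = field(default_factory=set)


def fetch(record, requester, authenticated):
    """Allow access only to a validated requester listed on the record."""
    if not authenticated:
        # The system could not validate who the requester is.
        raise PermissionError("identity not validated")
    if requester not in record.authorised:
        # Identity is known, but this party is not on the access list.
        raise PermissionError(f"{requester} is not authorised")
    return f"fetch from {record.repository}"


if __name__ == "__main__":
    medical = RegistryRecord("repo.example.org", {"patient", "doctor"})
    print(fetch(medical, "doctor", authenticated=True))
```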
Q. What happens as technology changes?
A. One important attribute is persistence. You could have a 100-year-old database attempting to interact with a brand new cloud. If I were to give you a spreadsheet in VisiCalc (an early format), would you know what to do with it? It’s a proprietary data structure. It’s somewhat like having it encrypted. If you are trying to manage that over very long periods of time, you’ve got to manage not only the programme but also the application software and the environment in which it runs. That’s probably an impossible task to accomplish in general. I would rather store that spreadsheet not in a proprietary form, but in a way that is more easily understood over long time frames. For example: I have an object whose type is matrix, with so many rows and columns, with ij entries such and such. And then I can interpret it, and feed it into a programme that might exist 100 or even 1,000 years from now and say: These are the parameters. This future programme can deal with other programmes of the day, and the printer user interfaces of the day. What that brings to science is a kind of consistency over very long periods of time.
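As a sketch of the matrix example, the Python below stores a spreadsheet as a plain, self-describing object (a type, the number of rows and columns, and the entries) rather than in an application's own format. The choice of JSON and the field and function names are assumptions made for illustration only.

```python
import json


def describe_matrix(rows):
    """Serialise a rectangular list of rows as a typed, self-describing object."""
    n_cols = len(rows[0]) if rows else 0
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same number of columns")
    return json.dumps(
        {"type": "matrix", "rows": len(rows), "columns": n_cols, "entries": rows}
    )


def read_matrix(text):
    """Recover the entries, checking the declared type and shape first."""
    obj = json.loads(text)
    if obj.get("type") != "matrix":
        raise ValueError("object is not of type matrix")
    entries = obj["entries"]
    if len(entries) != obj["rows"] or any(len(r) != obj["columns"] for r in entries):
        raise ValueError("entries do not match the declared shape")
    return entries


if __name__ == "__main__":
    stored = describe_matrix([[1, 2], [3, 4]])
    print(stored)
    print(read_matrix(stored))
```

A reader that has never seen the original spreadsheet programme needs only the declared type and shape to interpret the entries.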
Q. How much is this system in use today?
A. We are well along in its adoption. There’s the early work in the publishing industry. In the 1990s, when everything was going digital, they decided they needed a better way to manage information, but they were worried. At that time half of the URLs didn’t work after perhaps a year or two; and perhaps as many as 95 per cent of them didn’t work after five years. So, if you’re going to put out a journal in digital form, you don’t want one where in five years the references basically don’t work anymore. So, the publishers set up a foundation to manage their use of handles (the architecture’s identifiers) and called them DOIs. It’s a big success. Virtually every major publisher uses them.
An interesting recent application in the U.K. involves using the architecture to identify things in buildings. My understanding is that it was intended to include everything to do with buildings: the rugs on the floor, the paint on the walls, the air conditioning systems, the knobs on the doors, the steel beams. With digital objects, however branded, they could trace the provenance, in perpetuity, as long as the information was properly managed in digital form over time. So if you came back 100 years from now you could still know what’s in the building, even if the building no longer exists.
Q. But this idea has critics. One argument is that, if every person has an identifier, it becomes easy for an authoritarian government to track you down and block you. No hiding.
A. If you’ve got a government that wants to be heavy-handed, there is little that technology can do to stop it. They can unplug the Internet in their country. Anything you create can be used for good or bad. You can use fire to cook your food or burn you up. Name something that doesn’t have the ability to be misused.
There is also increased security for a person’s information in digital form, since each object representing such information may be encrypted or otherwise protected from unauthorised access.
Q. But why do we need this at all? Will it make my life easier?
A. It makes access to information easier. Suppose there are only one or two telephones in the world and somebody has a switch, and you ask, what’s the value of that switch? Well, perhaps not that much. But, if I have three telephones or more, I may be in a completely different situation. At some point, the economics may support the use of switching. You would have to visualise how best to accommodate many telephones, and how the future may unfold technically and socially, as well as businesswise. It’s the same with digital objects and the future Internet: The more objects we’re trying to get to online, the more we need a system to handle that complexity – and that, I believe, is the Digital Object Architecture.
In the 1980s, some people asked: Was the Internet going to make my life easier or better? Well, it was a technology project initially that had more widespread potential. At the time, I said it likely won’t save money, as you’ll still spend a lot of money on computers and infrastructure. And you’ll waste a lot more time because, with word processing, you can type and retype whereas before you wrote it just once and finished. But the test is: If you put it in place, would you ever go back? No. Nobody considered that option seriously.