At least as important as the official talks and lectures at the Heidelberg Laureate Forum are the opportunities for informal conversations with the laureates, guests, and young researchers. Among the most interesting situations are those where lectures and conversations complement each other.
The semantic connections of HLF
On Friday, September 29th, the HLF moved to the St. Leon-Rot Campus of SAP, and during the coffee break, I had a chance to talk with Alexei Efros, who had given one of the two lectures on Deep Learning on Tuesday (here is a nice post by Nana Liu on the other, by John Hopcroft; I myself briefly touched upon Efros’s work, namely an erroneously zebra-striped Vladimir Putin, in my own post on Deep Learning, Is this text strangely reading?).
I am typing this while listening with one ear, and an unspecified and so far unspecifiable number of neurons, to Leslie Valiant’s talk about computer science and neuroscience, and I am struck, as I was during Manuel Blum’s talk Can a Machine Be Conscious? on Thursday, by (pardon my youthful arrogance) how old-fashioned these two laureates sound when they talk about the brain. We hear about sub-processes, about the brain implementing different algorithms, about random access tasks, about storage allocation and so on, borrowing the vocabulary of classical programming and classical computer architecture.
Talking about consciousness without talking about consciousness
In contrast, in the talks on Deep Learning, even though neither speaker mentioned consciousness, we got some glimpses, I suspect, of how computer scientists might talk about understanding consciousness in ten years or so. After all, what are these Deep Learning networks doing? In one of the (conceptually) simplest examples of unsupervised learning, such a network gets an input picture that excites its input nodes, and the task of the output nodes is to reproduce this picture as closely as possible. The key is that such a network will not (and, depending on the number of nodes in the intermediate layers, cannot) take the trivial route of linking each output pixel to the corresponding input pixel. Instead, the network will develop representations that, as far as we can tell, show some similarity with the way humans would talk about an image. Let’s take an image that is representative of typical (safe-for-work) Internet content:
When describing that image, we will never start with a pixel-by-pixel description. Chances are, we will not even mention the boring white pixels in the top left corner, the natural starting point for a pixel-by-pixel narrative. Instead, we will say that there is a cat, on its side, yawning, and we might go into a few more details from there.
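The reconstruction task described above can be sketched in a few lines. Here is a minimal, hypothetical example: a tiny linear autoencoder whose four-node bottleneck rules out the trivial copy-each-pixel route (all sizes and names are illustrative, not taken from the lectures):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 16 "pixels" squeezed through a 4-node bottleneck,
# so the network cannot simply wire each input pixel to an output pixel.
n_pixels, n_hidden = 16, 4
X = rng.normal(size=(200, n_pixels))     # stand-in for 200 tiny images

W_enc = rng.normal(scale=0.1, size=(n_pixels, n_hidden))
W_dec = rng.normal(scale=0.1, size=(n_hidden, n_pixels))

def reconstruction_error(X):
    # encode each image to 4 numbers, decode back to 16, compare
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

before = reconstruction_error(X)
lr = 0.1
for _ in range(500):
    H = X @ W_enc                        # compressed internal representation
    G = 2 * (H @ W_dec - X) / X.size     # gradient of the mean squared error
    grad_dec = H.T @ G                   # chain rule through the decoder
    grad_enc = X.T @ G @ W_dec.T         # chain rule through the encoder
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
after = reconstruction_error(X)
```

Run this and the reconstruction error shrinks: the network has learned a compressed internal description of its inputs, which is exactly where interesting representations have room to form.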
A deep learning network is likely to develop, without our explicit prompting, representations of cat shapes, or penguin shapes, or dog shapes, or representations of parts of these animals. And a deep learning network whose purpose is to translate a text from one language to another (as in Is this text strangely reading?), or to transcribe audio into sentences, will develop representations not only of words as sequences of letters, or of phonemes, but representations that capture how certain words are connected and occur in certain combinations but not in others, representations that surely cannot help but incorporate some of the semantics of those words.
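The which-words-go-together part of this is visible in bare statistics, before any network enters the picture. A hypothetical mini-example (toy corpus, illustrative names): simple bigram counts already capture which combinations occur and which never do, the raw material such learned representations can pick up:

```python
import numpy as np

# Toy corpus; in a real system this would be a large text collection.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {word: i for i, word in enumerate(vocab)}

# counts[i, j]: how often word i is immediately followed by word j
counts = np.zeros((len(vocab), len(vocab)), dtype=int)
for a, b in zip(corpus, corpus[1:]):
    counts[idx[a], idx[b]] += 1

# "sat" is followed by "on" (twice), but never by "cat":
print(counts[idx["sat"], idx["on"]], counts[idx["sat"], idx["cat"]])  # prints: 2 0
```

A trained network goes far beyond such raw counts, of course, but this is the kind of regularity its internal representations cannot help absorbing.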
Aren’t we there yet?
To me, that sounds as if we are nearly there. Already, in translation tasks for instance, we use recurrent neural networks, that is, networks that loop back onto themselves, feeding their own outputs back in as inputs. Manuel Blum, when discussing human consciousness in his talk, stressed the importance of our internal monologue, the way we constantly talk to ourselves in the privacy of our own heads, immortalized in literature by the novels of Dorothy Richardson, James Joyce and others as the stream-of-consciousness narrative mode.
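The looping-back idea can be sketched minimally (hypothetical sizes and names; a real recurrent network would also be trained, which is omitted here):

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden = 3, 5                    # illustrative sizes
W_in = rng.normal(scale=0.5, size=(n_in, n_hidden))
W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))

def run(sequence):
    h = np.zeros(n_hidden)               # internal state, blank at first
    states = []
    for x in sequence:
        # the new state depends on the current input AND the previous state,
        # which is the output of the network looped back in:
        h = np.tanh(x @ W_in + h @ W_rec)
        states.append(h.copy())
    return states

# The same input presented twice yields different internal states, because
# the network carries along a memory of what came before:
x = rng.normal(size=n_in)
first, second = run([x, x])
print(np.allclose(first, second))        # prints: False
```

That recurrence is what gives such a network something resembling an ongoing internal state, rather than a fixed input-to-output mapping.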
Now consider a deep learning network built for executing a certain task in real time, its representation patterns activating and de-activating as the network goes about its business, taking in new information as it becomes available and executing certain actions in response. Isn’t it tempting to describe this stream of patterns as “thinking”? If the network’s capabilities, commensurate with its assigned tasks, include output in the shape of text, written or vocalized, we should be able to test this hypothesis.
Whether or not the system is speaking or writing at this very moment, after all, it is likely that the internal representations linked to specific words, and sequences of words, will be activated by association. A trivial example: if we want our system to recognize pictures of cats and, if asked, tell us in proper English whether a certain picture is that of a cat or not, the system needs to have made the appropriate connection between recognizing the cat and expressing that fact in words. Somehow, the representations of cat shapes and of the word or words used to describe a cat need to be linked. The simplest solution would be for the image-recognizing part of the system to proffer the sequence of words “picture of cat recognized!” to the answer-articulating part of the system, whether or not an articulation has been asked for at this point in time – and it is tempting to relate this to our own stream of consciousness, which flows along as our internal way of talking to ourselves, and which is “switched to speaker” whenever we want to articulate our thoughts out loud. “Listen” to the un-articulated activated words, and you might be able to tell whether the result really does amount to some kind of stream of consciousness.
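The “switched to speaker” arrangement can be caricatured in a few lines; everything here, names and messages included, is made up purely for illustration:

```python
internal_stream = []                     # the un-articulated monologue

def recognize(label):
    """Stand-in for the image-recognizing part: it proffers its finding
    to the internal stream whether or not anyone asked for it."""
    if label == "cat":
        internal_stream.append("picture of cat recognized!")
    else:
        internal_stream.append(f"picture of {label} seen")

def articulate(speaker_on):
    """Stand-in for the answer-articulating part: it only voices the
    most recent internal 'thought' when the speaker is switched on."""
    if speaker_on and internal_stream:
        return internal_stream[-1]
    return None                          # the thought stays internal

recognize("cat")
print(articulate(speaker_on=False))      # prints: None (thought, not spoken)
print(articulate(speaker_on=True))       # prints: picture of cat recognized!
```

The point of the caricature: the internal stream flows regardless; the speaker switch only decides whether it becomes audible.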
Hype, counter-hype, non-hype?
Speculations of this kind are rife when one looks at the recent advances in Deep Learning. Many people, and among them rather smart people, seem to be very excited about where this new direction could take us. And as with any other topic that creates excitement, there are people, and among them apparently again rather smart ones, who raise a cautious voice. Is AI riding a one-trick pony, as one article asks? (A “thank you” to reader Martin Holzherr who pointed to this link.) Are deep learning networks really so smart? Or are they less smart than they seem, and are we, in consequence, further away from understanding intelligence than the optimists might think?
It is interesting that both John Hopcroft and Alexei Efros are immediately cautious when I ask them about linking their work to consciousness. Hopcroft, in conversation, is very clear in stating that the neural networks he has been training on images are recognizing shapes, no more. Efros is more openly interested in the implications, and remarks that it would be nice if philosophers who study consciousness were to take a closer look at deep learning. He is somewhat cautious about the concept of consciousness, but wouldn’t be surprised if consciousness, or at least something that one could identify as consciousness from the outside (which, to be fair, is no more and no less than the way we judge our fellow human beings to be conscious), were to emerge from a suitably complex artificial neural network.
Hype or not? That is difficult to judge at this point in time. But one way or the other, we are likely to know more at, say, the 15th Heidelberg Laureate Forum in 10 years’ time. My bet is still that we will hear talks about conscious (as far as we can tell!) dynamic deep learning networks at that future HLF.