# McClelland at Stanford's Computational Neuroscience Journal Club

One of the privileges and pleasures that living in Silicon Valley affords is the opportunity to attend fascinating talks by giants in academia and industry alike. I'm cognizant of the fact that there are some who may wish to attend, but are unable to do so, and so I decided to share my notes digitally and freely.

Disclaimer: the following notes capture the talk as I perceived it and do not necessarily accurately capture the speaker's point as intended.

## Speaker

James McClelland is a professor of psychology at Stanford University, perhaps best known for co-authoring Parallel Distributed Processing, a book that sparked a resurgence in connectionism and neural-network-based modeling. The ambiguously attributed third author, the "PDP Research Group," included luminaries such as Geoffrey Hinton, Francis Crick, Michael Jordan and many more.

## Hosting Organization

Stanford's Computational Neuroscience Journal Club (CNJC) is a cross-departmental, biweekly meeting group that discusses recent papers in and related to computational neuroscience. The club is organized by Saurabh Vyas and Alex Williams.

## Content

### Late 1950s

• Early approaches to artificial intelligence diverged
• Rosenblatt (creator of the delta rule) worked on the perceptron (statistical)
• Chomsky worked on syntactic structure (logical)
• Extreme contrast between the two
• Simon and Newell claimed to have created artificial intelligence with their General Problem Solver, but really all it did was transform logical propositions into other logical propositions

### 1970s

• McClelland did undergrad (in something related to animal conditioning?)
• McClelland remembers a quote in the 1967 default undergraduate psychology textbook by Neisser: the discoveries made by cognitive psychologists were "only of peripheral interest" towards understanding intelligence
• Learned that phenomena could be explained through interactions between neurons, e.g. horseshoe crab neurons firing in response to light stimuli
• Researchers moved away from artificial neural networks because they were hugely computationally expensive
• Two notables continued working on neural networks: Grossberg and Anderson
• Grossberg was the original Schmidhuber: made numerous contributions, took credit for many, many more
• Grossberg showed how competitive pools of neurons could capture many perceptual phenomena. He also introduced competitive learning
• Anderson used a matrix/vector-based approach to modeling (which we take for granted today)
• Anderson worked on attractor dynamics. He built models of 50-100 neurons and demonstrated attractors: certain inputs would get sucked into a corner of the hypercube
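
The "sucked into a corner of the hypercube" behavior can be illustrated with a toy model. The sketch below is not taken from the talk: it assumes a setup in the style of Anderson's brain-state-in-a-box, where each unit's activity is clipped to [-1, 1] and positive feedback through a Hebbian weight matrix pushes the state toward a stored pattern. The function names, the 4-unit size, and the `alpha` feedback gain are all illustrative choices.

```python
def outer_product_weights(pattern):
    """Hebbian weights storing one bipolar pattern (illustrative choice)."""
    n = len(pattern)
    return [[pattern[i] * pattern[j] / n for j in range(n)] for i in range(n)]


def bsb_step(weights, state, alpha=0.5):
    """One update: add feedback through the weights, then clip to [-1, 1]."""
    n = len(state)
    new_state = []
    for i in range(n):
        feedback = sum(weights[i][j] * state[j] for j in range(n))
        value = state[i] + alpha * feedback
        new_state.append(max(-1.0, min(1.0, value)))  # walls of the "box"
    return new_state


def settle(weights, state, steps=50):
    """Iterate the update a fixed number of times."""
    for _ in range(steps):
        state = bsb_step(weights, state)
    return state


stored = [1, -1, 1, -1]               # one corner of the 4-d hypercube
W = outer_product_weights(stored)
start = [0.2, -0.1, -0.05, -0.3]      # weak, noisy input; third unit has the wrong sign
final = settle(W, start)
print(final)  # [1.0, -1.0, 1.0, -1.0]
```

The weak input starts near the center of the box with one unit pointing the wrong way, and the feedback both amplifies it and corrects the wrong unit until the state sits exactly on the stored corner.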

### Late 1970s

• Enter Rumelhart. The guy was wicked smart, and McClelland felt intimidated. He had created his own version of LISP because he disliked some of the language's features.
• Rumelhart was dissatisfied with symbolic AI. He felt that perception/comprehension required graded constraint satisfaction (what is this?), which wasn't easily captured with LISP-like processing
• Geoffrey Hinton somehow entered the picture. He came from studying holograms - specifically, pattern completion with neural networks
• Hinton had some project of extracting rudimentary semantics(?)

### Early 1980s

• Group gathered at UCSD: Francis Crick, Rumelhart, McClelland, Michael Jordan, Hinton
• Rumelhart and McClelland wrote the PDP book, while Hinton and Sejnowski worked on Boltzmann machines, and Rumelhart, Hinton and Williams worked on adapting backpropagation to neural networks
• Tension when writing the book because Crick wanted to stick to the data and backprop is biologically implausible (synapses are directed)

### Are Rules a Thing of the Past?

• This was a random aside. I don't know how it fit into the talk at this point
• Psychologists had observed children creating new past-tense conjugations of verbs e.g. "taked" instead of "took"
• Someone (McClelland?) tried to teach neural networks to learn past-tense conjugations, i.e. given the present tense, output the past tense
• Trained the neural network with regular verbs, then irregular verbs
• The network made mistakes similar to children's while training, and similarly overcame those mistakes as training went on
• The network could successfully generalize to unseen inputs. For example, it could output "weep->wept" based on having seen "keep->kept"

### Mid 1980s

• Pinker and Fodor invited McClelland to a debate at MIT on theories as laid out in PDP
• Pinker and Fodor got to give back-to-back opening remarks (an hour each), then gave the audience a break, at which point everyone left

### Backpropagation

• Rumelhart was interested in the negation problem, i.e. a single layer couldn't negate its input
• Rumelhart suggested a hidden layer to the side that could override the input
• Neurobiologists rejected backprop as implausible
• Artificial neural networks led some neurobiologists to a crisis of faith. Many had hoped for desirable computation, but while neural networks were extremely deterministic, they were almost impossible to interpret
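
To make the hidden-unit idea concrete, here is a small sketch that is not from the talk. It assumes the "negation problem" refers to cases like exclusive-or (XOR), where the right answer for one input flips depending on the other, and it uses hand-picked weights rather than anything learned by backpropagation. The function names and the weight grid are illustrative.

```python
def step(x):
    """Threshold unit: fires (1) when its net input is positive."""
    return 1 if x > 0 else 0


def single_layer(x1, x2, w1, w2, bias):
    """A single output unit connected directly to the two inputs."""
    return step(w1 * x1 + w2 * x2 + bias)


def with_hidden_unit(x1, x2):
    """Same output unit plus one hidden unit 'to the side'."""
    hidden = step(x1 + x2 - 1.5)             # fires only when both inputs are on
    return step(x1 + x2 - 2 * hidden - 0.5)  # hidden unit overrides the direct input


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# With the hidden unit, the network computes XOR.
xor_outputs = [with_hidden_unit(a, b) for a, b in inputs]
print(xor_outputs)  # [0, 1, 1, 0]

# Without it, no weight setting on this (illustrative) grid reproduces XOR.
grid = [k / 2 for k in range(-4, 5)]
single_layer_solves_xor = any(
    all(single_layer(a, b, w1, w2, bias) == (a ^ b) for a, b in inputs)
    for w1 in grid for w2 in grid for bias in grid
)
print(single_layer_solves_xor)  # False
```

The grid search is only a spot check, but the failure is general: XOR is not linearly separable, so no single threshold unit can compute it. Finding weights like the hand-picked ones automatically is the job backpropagation was adapted for.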

### Cognitive Neuropsychology

• In the 1970s, some researchers decided to use artificial neural networks to model brain damage
• Deep dyslexics can't read non-words e.g. Vint. Shallow dyslexics make different errors.
• Modeled brain behavior with three neural networks (semantics, orthography, phonology), with connections between pairs.
• Lesioning specific pathways reproduced the problems observed in patients, demonstrating that brain damage could be modeled

### Late 90s

• Neural networks entered a research winter
• Neural networks couldn't solve problems that people claimed they could, e.g. visual object recognition
• Depth was viewed as an enemy, as depth frequently slowed down learning multiplicatively
• Bayesian approaches became popular. By early 2000s, Stanford had stopped teaching neural networks.

### New Directions

• McClelland thinks promising research lies in the direction of experience-driven development of language, motor skills and cognitive systems
• This will require datasets like those available to children
• Specifically says he's most impressed with the work of Tim Lillicrap, Alex Graves and Greg Wayne (can't find a webpage)

## Notes

I appreciate any and all feedback. If I've made an error or if you have a suggestion, you can email me or comment on the Reddit thread.