Monthly Archives / November 2013

  • Nov 24 / 2013

Book Preview: Chapter 1 – Some Context for Machine Intelligence

The following is the draft of Chapter One of my upcoming book, Real Machine Intelligence with NuPIC – Using Neuroscience to Build Truly Intelligent Machines. The book is intended as an introduction to Jeff Hawkins’ Hierarchical Temporal Memory theory, which seeks to explain in detail the principles underlying the human brain, and the open source software he’s built based on those principles. The book, aimed at the interested non-expert, will be out on Amazon in early December. You might like to read the Introduction first.


This book is about a new theory of how the brain works, and a piece of software which uses this theory to solve real-world problems intelligently in the same way that the brain does. In order to understand both the theory and the software, a little context is useful. That’s the purpose of this chapter.

Before we start, it’s important to scotch a couple of myths which surround both Artificial Intelligence (AI) and Neuroscience.

The first myth is that AI scientists are gradually working towards a future human-style intelligence. They’re not. Despite what they tell us (and they themselves believe), what they are really doing is building computer programs which merely appear to behave in a way which we might consider “smart” or “intelligent” as long as we ignore how they work. Don’t get me wrong, these programs are very important in our understanding of what constitutes intelligence, and they also provide us with huge improvements in understanding the nature and structure of problems solved by brains. The difficulty is that brains simply don’t work the way computer programs do, and there is no reason to believe that human-style intelligence can be approached just by adding more and more complex computer programs.

The other myth is that Neuroscience has figured out how our brains work. Neuroscience has collected an enormous amount of data about the brain, and there is good understanding of some detailed mechanisms here and there. We know (largely) how individual cells in the brain work. We know that certain regions of the brain are responsible for certain functions, for example, because people with damage there exhibit reduced efficiency in particular tasks. And we know to some extent how many of the pieces of the brain are connected together, either by observing damaged brains or by using modern brain-mapping technologies. But there is no systematic understanding which could be called a Theory of Neuroscience, one which explains the working of the brain in detail.

In order to understand how traditional AI does not provide a basis for human-like intelligence, let’s take a look inside a digital computer.

A computer chip contains a few billion very simple components called transistors. A transistor acts as a kind of switch, in that it can allow a signal through or not, based on a control signal sent to it. Computer chip, or hardware, designers produce detailed plans for how to combine all these switches to produce the computer you’re reading this on. Some of these transistors are used to produce the logic in the computer, making decisions and performing calculations according to a program written by others: software engineers. The program, along with the data it uses, is stored in yet more chips – the memory – using transistors which are either on or off. The on or off states of these memory “bits” form a code which stands for data – whether numbers, text, image pixels, or instructions telling the computer what operation to perform at a particular time.
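As a toy illustration of this point (my example, not from the book): the very same byte of memory can be read as a number, a character, or a row of pixels, depending entirely on how the program chooses to interpret it.

```python
# One byte: eight transistor on/off states.
bits = 0b01000001

as_number = bits                  # interpreted as an integer
as_text = chr(bits)               # interpreted as a character
as_pixels = format(bits, "08b")   # interpreted as a row of 8 on/off pixels

print(as_number, as_text, as_pixels)
```

The bits themselves carry no meaning; the meaning lives entirely in the program that reads them.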

If you open up a computer, you can clearly see the different parts. There’s a big chip (usually with a fan on top to cool it), called the Central Processing Unit or CPU, which is where the hardware logic is housed. Separate from this, a bank of many smaller chips houses the Random Access Memory (RAM) which is the fastest kind of memory storage. There will also be either a hard disk (HD) or a solid state disk (SSD, a kind of chip-based long-term memory, faster than a HD, bigger but slower than RAM) which is where all your bulk data (programs, documents, photos, music and video) is stored for use by the computer. When your computer is running, the CPU is constantly fetching data from the memory and disks, doing some work on it, and writing the results back out to storage.

Computers have clearly changed the world. With these magical devices, we can calculate in one second with a spreadsheet program what would have taken months or years to do by hand. We can fly unflyable aircraft. We can predict the weather 10 days ahead. We can create 3D movies in high definition. We can, using other electronic “senses”, observe the oxygen and sugar consumption inside our own brains, and create a “map” of what’s happening when we think.

We write programs for these computers which are so well thought out that they appear to be “smart” in some way. They look like they’re able to out-think us; they look like they can be faster on the draw. But it turns out that they’re only good at certain things, and they can only really beat us at those things. Sure, they can calculate how to fly through the air and get through anti-aircraft artillery defences, or they can react to other computer programs on the stock exchange. They seem to be “superhuman” in some way, yet the truth is that there is no “skill” involved, no “knowledge” or “understanding” of what they’re doing. Computer programs don’t learn to do these amazing things, and we don’t teach them. We must provide exhaustive lists of absolutely precise instructions, detailing exactly what to do at any moment. The programs may appear to behave intelligently, but internally they are blindly following the scripts we have written for them.

The brain, on the other hand, cannot be programmed, and yet we learn a million things and acquire thousands of skills during our lives. We must be doing it some other way. The key to figuring this out is to look in some detail at how the brain is put together and how this structure creates intelligence. And just like we’ve done with a computer, we will examine how information is represented and processed by the structures in the brain. This examination is the subject of Chapter Two. Meanwhile, let’s have a quick look at some of the efforts people have made to create an “artificial brain” over the past few decades.

Artificial Intelligence is a term which was coined in the early 1950s, but people had been thinking about building intelligent machines for over two thousand years. This remained in the realm of fantasy and science fiction until the dawn of the computer age, when machines suddenly became available which could provide the computational power needed to build a truly intelligent machine. It is fitting that some of the main ideas about AI came from the same legendary intellects behind the invention of digital computers themselves: Alan Turing and John von Neumann.

Turing, who famously helped to break the Nazi Enigma codes during WWII, theorised about how a machine could be considered intelligent. As a thought experiment, he suggested a test involving a human investigator who communicates by text with an unknown entity – either another human or a computer running an AI program. If the investigator is unable to tell whether he is talking to a human or not, then the computer has passed the test and, by this definition, must be regarded as “intelligent”. This became known as the Turing Test, and it has unfortunately served as a kind of Holy Grail for AI researchers for more than sixty years.

Meanwhile, the burgeoning field of AI attracted some very smart people, each dreaming of soon becoming the designer of a machine one could talk to and which could help one solve real-world problems. All sorts of possibilities seemed within easy reach, and so researchers often made grand claims about what was “just around the corner” for their projects. For instance, one “milestone” was a computer which could beat the World Chess Champion, a goal promised as “within 5 years” every year from the mid-1950s, and achieved only in 1997, using a huge computer and a mixture of “intelligent” and “brute-force” techniques, none of which resembled how Garry Kasparov’s brain worked.

Everyone recognised early on that intelligence at the level of the Turing Test would have to wait, so they began by trying to break things down into simpler, more achievable tasks. Having no clue about how our brains and minds worked as machines, they decided instead to theorise about how to perform some of the tasks which we can perform. Some of the early products included programs which could play Noughts and Crosses (tic-tac-toe) and Draughts (checkers), programs which could “reason” about placing blocks on top of other blocks (in a so-called micro-world), and a program called Eliza which used clever and entertaining tricks to mimic a psychiatrist interviewing a patient.

Working on these problems, developing all these programs, and thinking about intelligence in general has had profound effects beyond Computer Science in the last sixty years. Our understanding of the mind as a kind of computer or information processor is directly based on the knowledge and understanding gained from AI research. We have AI to thank for Noam Chomsky’s foundational Universal Grammar, and a grounding in Computational Linguistics is now essential for anyone seeking to understand linguistics and human language in general. Brain surgeons use the computational model of the brain to identify and assess birth defects, the effects of disease and brain injuries, all in terms of the “functional modules” which might be affected. Cognitive psychology is now one of the basic ways to understand the way that our perceptions and internal processes operate. And the list goes on. Many, many fields have benefited indirectly from the intense work of AI researchers since 1950.

However, traditional AI has failed to live up to even its own expectations. At every turn, it seems that the “last 10%” of the problem is bigger than the first 90%. A lot of AI systems require vast amounts of programmer intelligence and do not genuinely embody any real intelligence themselves. Many such systems are incapable of flexibly responding to new contexts or situations, and they do not learn of their own accord. When they fail, they do not do so in a graceful way like we do, because they are brittle and capable only of operating “on rails” in some way. In short, they are nothing like us.

Yet AI researchers kept on going, hoping that some new program or some new technique would crack the code of intelligent machine design. They have built ever-more-complex systems, accumulated enormous databases of information, and employed some of the most powerful hardware available. The celebrated triumphs of Deep Blue (beating Kasparov at chess) and Watson (winning at the Jeopardy quiz game) have been the result of combining huge, ultra-fast computers with enormous databases and vast, complex, intricate programs costing tens of millions of dollars. While impressive, neither of these systems can do anything else which could be considered intelligent without similar resources being invested in the development of new programs.

It seems to many that this is leading us away from true machine intelligence, not towards it. Human brains are not running huge, brittle programs, nor consulting vast databases of tabulated information. Our brains are just like those of a mouse, and it seems that we differ from mice only in the size and number of pieces (or regions) of brain tissue, and not in any fundamental way.

It appears very likely that intelligence is produced in the brain by the clever arrangement of brain regions, which appear to organise themselves and learn how to operate intelligently. This can be proven in the lab, when experimenters cut connections, shut down some regions, breed mutants and so on. There is very little argument in Neuroscience that this is how things work. The question then is: how do these regions work in detail? What are they doing with the information they are processing? How do they work together? If we can answer these questions, it is possible that we can both learn how our brains work and build truly intelligent machines.

I believe we can now answer these questions. That’s what this book claims to be about, after all!

  • Nov 13 / 2013

Book Preview: Introduction to “Real Machine Intelligence with NuPIC”

The following is the (draft) Introduction to my upcoming book, Real Machine Intelligence with NuPIC – Using Neuroscience to Build Truly Intelligent Machines. The book is intended as an introduction to Jeff Hawkins’ Hierarchical Temporal Memory theory, which seeks to explain in detail the principles underlying the human brain, and the open source software he’s built based on those principles. The book, aimed at the interested non-expert, will be out on Amazon in early December.

This book is about a true learning machine you can start using today. This is not science fiction, and it’s not some kind of promised technology we’re hoping to see in the near future. It’s already here, ready to download and use. It is already being used commercially to help save energy, predict mechanical breakdowns, and keep computers running on the Internet. It’s also at the centre of a vibrant open source community with growing links to leading-edge academic and industrial research. Based on more than a decade of research and development by Jeff Hawkins and his team at Grok, NuPIC is a system built on the principles of the human brain, a theory called Hierarchical Temporal Memory (or HTM).

NuPIC stands for Numenta Platform for Intelligent Computing. On the face of it, it’s a piece of software you can download for free, do the setup, and start using right away on your own data, to solve your own problems. This book will give you the information you need to do just that. But, as you’ll learn, the software (and its usefulness to you as a product) is only a small part of the story.

NuPIC is, in fact, a working model in software of a developing theory of how the brain works, Hierarchical Temporal Memory. Its design is constrained by what we know of the structure and function of the brain. As with an architect’s miniature model, a spreadsheet in the financial realm, or a CAD system in engineering, we can experiment with and adjust the model in order to gain insights into the system we’re modelling. And, just as with those tools, we can also do useful work, solve real-world problems, and derive value from using them.

And, as with other modelling tools, we can use NuPIC as a touchstone for a growing discussion of the basic theory of what is going on inside the brain. We can compare it with all the facts and understanding from decades of neuroscience research, a body of knowledge which grows daily. We believe that the theories underlying NuPIC are the best candidates for a true understanding of human intelligence, and that NuPIC is already providing compelling evidence that these theories are valid.

This book begins with an overview of how NuPIC fits into the worlds of Artificial Intelligence and Neuroscience. We’ll then delve a little deeper into the theory of the brain which underlies the project, including the key principles which we believe are both necessary and sufficient for intelligence. In Chapter 3, we’ll see how the design of NuPIC corresponds to these principles, and how it works in detail. Chapter 4 describes the NuPIC software at the time of writing, as well as its commercial big brother, Grok. Finally, we’ll describe what the near future holds for HTM, NuPIC and Grok, and how you can get involved in this exciting work. The details of how to download and operate NuPIC are found in the Appendices, along with details of how to join the NuPIC mailing list.

  • Nov 13 / 2013

Adding Prediction to the NuPIC Spatial Pooler

Jeff’s theory describes an interaction between prediction and feedforward activation, whereby a cell is partially depolarised by coincident activity on its distal (predictive) dendrites. Predictive cells get a head start when they receive feedforward inputs, and are thus the most likely to fire, compared both with the other cells in their column and with non-predictive cells in neighbouring columns.

For some reason, this is not completely implemented in NuPIC. The Spatial Pooler (SP) does not take prediction into account at all, and in fact acts as if it were one big cell with a single feedforward dendrite and no distal dendrites.

I propose the following changes from the existing (diagram below, left) to the proposed (right) CLA SP design:

1. Each cell now has its own feedforward dendrite.
2. Cells in a column have identical potential inputs (fanout).
3. Permanences initialised identically for all cells in a column.
4. Potential activation for each cell is sum of predictive and feedforward potential.
5. Cell with highest total activation provides column’s activation for inhibition.
6. Same cell is chosen for activation if column becomes active.
7. Feedforward permanences will diverge depending on correlations for each cell.

[Diagram: New Column – existing (left) vs. proposed (right) CLA SP design]


Anticipated Advantages of this Design

1. More Accurate Neural Model

The real neocortex has a feedforward dendrite per cell. The reason the cells share a similar feedforward response is that the feedforward axons pass up through the column together, so they form similar (but not identical) synapses with all the cells in the column.

Cells in a column will all have different histories of activation, so the permanences of their synapses with a given feedforward axon will not be identical. Each cell will learn to match its own activation history with the corresponding inputs.

In the real neocortex, prediction is implemented by a partial depolarisation of the cell membrane. This lowers the amount of feedforward potential needed to fire the cell. The cells with the highest total of predictive and feedforward potential will fire first and be in the SDR for the region.

2. More Informed Spatial Pooler Selection

The current SP ignores prediction, so it lacks the extra information which the region already holds: which sequence are we in, and at what position? This information is a significant factor in reducing ambiguity and constraining the space of likely patterns, and the real neocortex uses it universally.

In addition, each cell now gets to tune its feedforward response more precisely to its actual inputs (i.e. the ones which occur in its sequence memory). Inputs which contribute to patterns found only in other cells’ sequences will be treated as noise and disconnected. This improves the overall noise suppression and spatial recognition.

3. Easier Reconstruction of Inputs

Because each cell has its own feedforward dendrite, the permanences on that dendrite will evolve to become a statistical representation of the bits associated with that cell. This makes it easier to reconstruct the inputs from the activation pattern (or from an SDR imposed on the region from above).

The current per-column dendrite represents the collective statistics of inputs to all the cells, and thus contains noise which confuses reconstruction at present.
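A rough sketch of the reconstruction idea, assuming each cell keeps a permanence vector over the input bits (the `CONNECTED` threshold and the function name here are hypothetical, not NuPIC's API):

```python
import numpy as np

CONNECTED = 0.2   # assumed permanence threshold for a connected synapse

def reconstruct(active_cells, permanences):
    """Estimate the input bits implied by a set of active cells.

    permanences: shape (n_cells, n_input_bits) - one feedforward
    dendrite per cell, as in the proposed design. A bit is
    reconstructed as on if any active cell has a connected synapse to it.
    """
    connected = permanences[active_cells] >= CONNECTED
    return connected.any(axis=0).astype(int)

perms = np.array([
    [0.9, 0.1, 0.8, 0.0],   # cell 0 listens to bits 0 and 2
    [0.0, 0.7, 0.0, 0.6],   # cell 1 listens to bits 1 and 3
])
print(reconstruct([0], perms))   # only cell 0's bits: [1 0 1 0]
```

With per-column dendrites, the two cells' statistics would be mixed into one vector and the reconstruction for cell 0 alone would be polluted by cell 1's bits.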

4. Better Anomaly Detection

The added precision in reducing ambiguity through the use of sequence information in the SP will also improve anomaly detection. The region will be more sensitive to out-of-sequence events.

Potential Downsides

1. Resource Costs

Clearly, NuPIC will have a bigger memory requirement and a longer cycle time. In return, learning and prediction will improve in both quality and accuracy, so techniques such as swarming may be used to decide which SP variant suits a given problem.

2. Slow Learning

It is possible that learning will slow as a result of this change, since only one cell’s dendrite will be updated per input record, instead of updating them all in parallel as at present.

This may be mitigated by copying updates to all cells in a column for the first N input records (or the first N activations of the column). This will hopefully replicate the current SP’s ability to learn the data. After that, we switch to the new per-cell updating policy to fine-tune the permanences.
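The mitigation could look something like this sketch (`N_BOOTSTRAP`, the function, and its arguments are hypothetical names of mine, not NuPIC code):

```python
import numpy as np

N_BOOTSTRAP = 1000   # assumed cutoff: column activations before per-cell updates

def update_column(cell_perms, winner, delta, n_activations):
    """Sketch of the proposed two-phase learning rule.

    cell_perms: shape (n_cells_per_column, n_input_bits).
    For the first N activations every cell receives the same update,
    mimicking the current per-column SP; afterwards only the winning
    cell's dendrite is fine-tuned.
    """
    if n_activations < N_BOOTSTRAP:
        cell_perms += delta          # broadcast to all cells in the column
    else:
        cell_perms[winner] += delta  # per-cell fine-tuning
    return cell_perms

perms = np.zeros((3, 4))
delta = np.array([0.1, 0.0, 0.1, -0.1])
update_column(perms, winner=1, delta=delta, n_activations=5)      # all cells move
update_column(perms, winner=1, delta=delta, n_activations=2000)   # only cell 1
print(perms[0], perms[1])
```

After the bootstrap phase, the winning cell's dendrite drifts away from its siblings', which is exactly the divergence described in point 7 of the design.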

I’ve been looking through the (Python) code to find out where all the changes need to be made. Here’s what I’ve found out:

The SP doesn’t know anything about TPs. In fact, it thinks it has only one cell per column – or rather, that a cell and a column are the same thing. That’s why it has one feedforward dendrite per column. The TP only knows about the SDR of columns; it doesn’t see inside the SP, so it can’t see any feedforward dendrites. So, if we want to do this, we have to connect the column and its cells (at least virtually). This would happen in the Region (which owns both).

Here’s how I think we should perform the hack (in Python):

In the SP:

1. Store a value for each column which is the predictive potential for the predicting cell (if any, so usually zero). Call this array _predictivePotentials.
2. Calculate the feedforward potential as usual, store in _overlaps, but then do _overlaps += _predictivePotentials.
3. Everything else is the same (or so the SP thinks!).

In the TP: No change!

In the Region:

For the first N steps, just use the standard SP and TP to learn good per-column dendrites from the data. After that, clone the column dendrites from the SP for the cells. Call this _cellFFDendrites.

1. Take the results from the TP (the predictive cells) and get their FF dendrites from _cellFFDendrites.
2. Overwrite the dendrites in the SP for those columns with the cell dendrites. The SP is now working with the per-cell dendrite.

_saveDendrites = SP._dendrites[TP._predictiveCols]  # keep the per-column dendrites safe
SP._dendrites[TP._predictiveCols] = _cellFFDendrites[_predictiveCells]  # swap in the per-cell dendrites

3. Update the SP’s _predictivePotentials:

SP._predictivePotentials[:] = 0  # zero in place (assigning a bare 0 would rebind the array)
SP._predictivePotentials[TP._predictiveCols] = TP._predictiveActivity[TP._predictiveCols]

4. (On the next step, we do SP before TP) Run SP as above.

5. Copy back out the (possibly refined) dendrites to the cells.

_cellFFDendrites[_predictiveCells] = SP._dendrites[TP._predictiveCols]  # cells keep any refinement
SP._dendrites[TP._predictiveCols] = _saveDendrites  # restore the per-column dendrites
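Putting the Region steps together, here is a self-contained sketch of the swap-in/swap-out bookkeeping, using plain numpy arrays as stand-ins for the SP and TP state (the attribute names follow this post's proposals, not NuPIC's actual internals):

```python
import numpy as np

n_cols, n_cells, n_inputs = 3, 2, 4

# Stand-ins for SP/TP state.
sp_dendrites = np.arange(n_cols * n_inputs, dtype=float).reshape(n_cols, n_inputs) / 10
cell_ff_dendrites = np.tile(sp_dendrites[:, None, :], (1, n_cells, 1))  # cloned per cell

predictive_cols = np.array([1])    # columns containing a predicting cell
predictive_cells = np.array([0])   # which cell in each of those columns

# Steps 1-2: swap the predicted cells' dendrites into the SP.
saved = sp_dendrites[predictive_cols].copy()
sp_dendrites[predictive_cols] = cell_ff_dendrites[predictive_cols, predictive_cells]

# ... the SP would now run, seeing per-cell dendrites for those columns;
# pretend its learning rule nudged the permanences slightly:
sp_dendrites[predictive_cols] += 0.01

# Step 5: copy the refined dendrites back to the cells, restore the SP.
cell_ff_dendrites[predictive_cols, predictive_cells] = sp_dendrites[predictive_cols]
sp_dendrites[predictive_cols] = saved
```

The key point of the hack is that the SP's own code never changes: it always sees one dendrite per column, but for predicted columns that dendrite temporarily belongs to the predicting cell.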

  • Nov 13 / 2013

Welcome

Welcome! On this blog I’ll be sharing my thoughts and experiences on technology, and in particular on developments surrounding NuPIC, a very exciting new technology for machine intelligence based on the principles of the brain.