Temporal Constraints on Visual Learning: A Computational Model

James V. Stone and Nicol Harper

Given a constant stream of perceptual stimuli, how can the underlying invariances associated with a given input be learned? One approach consists of using generic truths about the spatiotemporal structure of the physical world as constraints on the types of quantities learned. The learning methodology employed here embodies one such truth: that perceptually salient properties (such as stereo disparity) tend to vary smoothly over time. Unfortunately, the units of an artificial neural network tend to encode superficial image properties, such as individual grey-level pixel values, which vary rapidly over time. However, if the states of the units are constrained to vary slowly, then the network is forced to learn a smoothly varying function of the training data. We implemented this {\em temporal smoothness} constraint in a backpropagation (BP) network which learned stereo disparity from random-dot stereograms. Temporal smoothness was formalised using regularisation theory, by modifying the standard cost function minimised during network training. Temporal smoothness was found to be similar in effect to other techniques for improving generalisation, such as early stopping and weight decay. In contrast to these, however, the theoretical underpinnings of temporal smoothing are intimately related to fundamental characteristics of the physical world. Results are discussed in terms of regularisation theory, and of the physically realistic assumptions upon which temporal smoothing is based.
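The modified cost function described above can be sketched in a minimal form. The sketch below assumes the smoothness term is a squared-difference penalty on successive outputs weighted by a coefficient `lam`; this is an illustrative choice, not necessarily the paper's exact formulation.

```python
import numpy as np

def temporal_smoothness_cost(outputs, targets, lam=0.1):
    """Standard sum-squared error plus a temporal smoothness penalty.

    outputs, targets: sequences of length T giving the network output
    and target over T consecutive time steps. The penalty form and the
    weighting `lam` are illustrative assumptions.
    """
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    data_term = np.sum((outputs - targets) ** 2)    # standard BP cost
    smooth_term = np.sum(np.diff(outputs) ** 2)     # penalise rapid change over time
    return data_term + lam * smooth_term
```

With this cost, an output sequence that tracks its targets but fluctuates rapidly incurs a higher total cost than one that varies slowly, which is what drives the network towards slowly varying (and hence perceptually salient) functions of the input.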