"Size Constancy Scaling" as a Psychological Explanation for Disappointments with Wide-Angle Lens Images


Many photographers suffer deep disappointment with the results of their attempts to photograph the world through wide-angle lenses. Can we account for some of this disappointment by considering the difference between how a camera/lens/film (or image sensor) system 'sees' the world, and how the human eye/brain system 'sees' the world?

A foxglove dwarfs the Langdale Pikes. Elterwater Common towards sunset.

It is very common for a photographer to visit a dramatic landscape (the Isle of Skye, or the Alps, for example) and to wish to photograph the scene. Because of the big hills, and the viewpoint often being low down in the valley, a wide-angle lens may seem the obvious choice to allow everything to fit in the photograph. With the unaided eye, the photographer sees tall, imposing mountains towering closely overhead; when the photograph comes back from the processing lab, or is viewed for the first time on a computer screen, those towering mountain peaks have been reduced to tiny pimples on a horizon that seems much more distant than the remembered actuality (how well I remember those early disappointments!). Where did those mountains go? I remember seeing them! They were not that small and far away! So why doesn't the photograph match what I remember actually seeing? And, as an aside, if we were able to place a photographer and a painter side-by-side, would they produce similar images? (In this juxtaposition, I am excluding that modern species of painter, the one who paints from a photograph; it may well take a painter longer than a photographer to produce an image, but the painting that I am interested in seeing is the one produced from what the painter sees, not one that merely replicates a photograph.)

The American mountaineering photographer, Galen Rowell, was a powerful influence on me when, as a climber and would-be photographer, I gravitated towards mountains and landscape photography. I now recognise that Rowell was technically very accomplished, able to fully realise the ultimate potential of chemistry, optics and technology in his photography; he seemed, though, not willing (or perhaps not able) to address the more psychological, and thus intangible, elements. His greatest skill was his ability to critically analyse the technical components of his photographic craft when discussing his images: his best-known work, "Mountain Light: In Search of the Dynamic Landscape", was crucial in my formative years as a photographer. The commentaries accompanying the photographs in the book address issues of photographic quality primarily in terms of chemistry, optics, or technology; occasionally, an effort is made to explore the emotional content or effect of a photograph, but by and large the psychological aspects of the mechanism by which photographs exert their influence are not addressed. My own approach to my personal photography was very sympathetic to this technocratic methodology, so it was very easy for me to absorb and learn. I simply adopted the "received wisdom" and trotted out the standard explanations for why photographs seldom match what we remember (particularly later, when I took on the role of teacher of photography). In that tendency I was, however, not alone - most photographic textbooks, many of the "How to be a better Photographer"-type books, and virtually every one of the magazines on the newsagents' stands, take the same line - that there is some technical or technological reason for the discrepancy between what I see in the field, and what I see in the photograph.

So - this is what I used to say, in what I now think of as an incomplete attempt to explain why photographs so seldom meet the expectations of the photographer:

How do we see?
In simple terms, when a scene is scanned by the eye, light from the part of the scene currently being scanned is focussed by the eye onto the retina, and the signal from the retina is transmitted to the brain where it is translated into a visual picture that the viewer can make sense of. As the eye moves over the scene, taking in objects that are close by, and objects that are far away, the eye constantly refocusses the image it is receiving (thus creating the impression that all objects are simultaneously sharply-focussed), and the brain constantly reprocesses the information it is receiving. Even in conditions of near-darkness the eye can distinguish enough detail to allow the viewer to avoid obstacles, because the brain is able to interpolate the imprecise data and fill in the gaps from previous experience.

How does a camera’s way of “seeing” differ from the eye’s way of seeing?
The eye is often compared with a camera, and there are some similarities. Both use a lens to focus light; both use a light-sensitive medium to record the impression of the light; but there the similarities end. Aside from the obvious difference of the eye seeing a constantly changing and moving image, the two principal differences that are of significance to the photographer are:
the ability of the eye to constantly re-focus;
and the ability of the eye to see (via the brain’s interpretive abilities) in a very wide range of light conditions.

The ability of the eye to rapidly change focus, and thus for nearby objects to be perceived in sharp focus virtually simultaneously with more distant objects, is not shared by the camera. It is only possible for the camera lens to be focussed on one single plane at any one time; therefore, in a photograph, only objects lying on this plane of focus will be objectively sharply focussed. In practice, we all know of photographs that appear to be sharp throughout the depth of the scene, so there is obviously a technique which allows the photographer to mimic this ability of the eye.

However, the ability of the eye to see detail in a much wider range of light conditions than a light-sensitive emulsion or digital sensor can is one which poses many more problems for the photographer. Generally speaking, the brightness range (scene contrast) that can be perceived by the eye/brain combination can be as much as 10,000:1 - that is to say, the brightest part of the scene can be 10,000 times brighter than the darkest, and the eye can still distinguish detail. In the case of a colour negative film, this contrast range is reduced to around 1,000:1, and in the case of a transparency film, it is reduced to around 100:1. Combine that with the contrast reduction involved in producing a paper print of a photograph and it should be evident that a photographic representation of the real world is not in any way equivalent to literal reality. Quite the contrary, in fact - a photograph, by its very nature, is always at best an interpretation of reality.
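Photographers usually think in stops rather than ratios, so it may help to see the contrast figures quoted above converted. The short Python sketch below does only that arithmetic. The three ratios are the ones given in the text; the conversion (one stop is a doubling of brightness, so stops = log2 of the ratio) is the usual photographic convention, and the function name is my own choice for illustration.

```python
import math

# Contrast ratios quoted in the text (brightest : darkest).
CONTRAST_RATIOS = {
    "eye/brain": 10_000,
    "colour negative film": 1_000,
    "transparency film": 100,
}


def ratio_to_stops(ratio: float) -> float:
    """Convert a brightness ratio to stops (each stop is a doubling of light)."""
    return math.log2(ratio)


for medium, ratio in CONTRAST_RATIOS.items():
    print(f"{medium}: {ratio}:1 is roughly {ratio_to_stops(ratio):.1f} stops")
```

On these figures, each tenfold drop in contrast ratio costs a little over three stops of usable range.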

Now, in itself, there is actually nothing inaccurate in any of the above. But, and this is the big but, there is nothing in there to account for the tiny, pimple-like mountains. The eye/brain system is patently different to the lens/camera system, but nothing in the above comparison even begins to explain the difference between what we see and what a photograph shows.

So, I turned to science to see if I could find an answer. The first book that I read (and, incidentally, I learned of it from a reference in another Galen Rowell book) was: Eye, Brain, and Vision by David H. Hubel; 1988. Scientific American Library. David Hubel (1926 – 2013) and his colleague Torsten N. Wiesel were awarded the Nobel Prize in Physiology or Medicine in 1981 for their discoveries concerning information processing in the visual system; they shared the prize with Roger W. Sperry, who was honoured for separate work on the functional specialisation of the cerebral hemispheres. There is some very interesting consideration given in the book to the importance of learning to see as an infant (my emphasis); but as the following extract illustrates, this book is still principally a study of the eye and brain as a physiological machine.

"The eye has often been compared to a camera. It would be more appropriate to compare it to a TV camera attached to an automatically tracking tripod—a machine that is self-focussing, adjusts automatically for light intensity, has a self-cleaning lens, and feeds into a computer with parallel-processing capabilities so advanced that engineers are only just starting to consider similar strategies for the hardware they design. The gigantic job of taking the light that falls on the two retinas and translating it into a meaningful visual scene is often curiously ignored, as though all we needed to see was an image of the external world perfectly focussed on the retina. Although obtaining focussed images is no mean task, it is modest compared with the work of the nervous system—the retina plus the brain. [ … ] the contribution of the retina itself is impressive. By translating light into nerve signals, it begins the job of extracting from the environment what is useful and ignoring what is redundant. No human inventions, including computer-assisted cameras, can begin to rival the eye."

This book is a very worthwhile read, and it has given me many insights into the visual mechanisms and how we see. But it still gave me no clue as to how the mountains become pimples inside a camera. For a real discovery about what might be going on, I had to wait for Eye and Brain: The Psychology of Seeing by R. L. Gregory; 1979, Third Edition. Weidenfeld and Nicolson.

Richard Langton Gregory FRS (1923 – 2010) was Emeritus Professor of Neuropsychology at the University of Bristol. A website about his work can be found here:


I was fortunate to find this book in the Lancaster Oxfam bookshop, and the thing that caught my eye was the word psychology. I had been contemplating for some years the possibility that the explanation for the incredible shrinking mountains might lie somewhere in the psyche, but desultory searches through second-hand bookshops, and occasional Google searches through the internet had failed to turn up anything significant. Here are Professor Gregory’s first couple of paragraphs.

"We are so familiar with seeing, that it takes a leap of imagination to realise that there are problems to be solved. But consider it. We are given tiny, distorted, upside-down images in the eyes, and we see separate, solid objects in surrounding space. From the patterns of stimulation on the retinas we perceive the world of objects, and this is nothing short of a miracle.

The eye is often described as like a camera, but it is the quite uncamera-like features of perception that are most interesting. How is information from the eyes coded into neural terms, into the language of the brain, and reconstituted into experience of surrounding objects? The task of eye and brain is quite different from either a photographic or a television camera converting objects merely into images. There is a temptation, which must be avoided, to say that the eyes produce pictures in the brain. A picture in the brain suggests the need of some kind of internal eye to see it—but this would need a further eye to see its picture … and so on in an endless regress of eyes and pictures. This is absurd. What the eyes do is to feed the brain with information coded into neural activity—chains of electrical impulses—which, by their code and the patterns of brain activity, represent objects. We may take an analogy from written language: the letters and words on this page have certain meanings, to those who know the language. They affect the reader's brain appropriately, but they are not pictures. When we look at something, the pattern of neural activity represents the object, and to the brain is the object. No internal picture is involved."

At last! In the first couple of sentences the author had become the first that I had read to raise the obvious, though to me unanswerable, question of how what the eyes see is translated into the experience of vision. And in paragraph two, although he initially appears to trot out the old standard about the eye being like a camera, he is actually telling us that it is the uncamera-like aspects of the visual system that are significant.

This book, like Hubel’s, is well worth the read, conducting a thorough review of the history of the development of our current understanding of the visual system. If you are interested in the visual system, I recommend that you read the book. For my present enquiry, though, I will draw your attention to chapter 9, Illusions, and a very strange perceptual phenomenon called 'Size Constancy'.

According to the laws of optics, when a lens projects an image of an object (the lens of the eye onto the retina, the lens of a camera onto the film), the size of the image is inversely proportional to the distance of the object from the lens. If the distance is halved, then the image size is doubled - and this is just as true of the retinal image as it is of the camera image. The weird thing is that, to the naked eye, two similar objects placed at different distances look the same size! The brain, or more accurately the perceptual system, attempts to correct for incongruities in the apparent size of similar objects! Try this simple experiment. Stretch out your right arm so that you are looking at your right hand at arm's length. Now extend your left arm and place your left hand next to your right elbow - your left hand is thus at half the distance of your right hand. Despite the fact that the retinal image of the left hand is twice the size of the retinal image of the right hand, the two hands look to be the same size! (If the left hand is now moved to overlap the right hand, Size Constancy is defeated, and it is immediately obvious that the nearer hand looks considerably larger). Now compare what you can see in reality with the photograph below.


Size Constancy illustration. Photo by Warren Matthews.
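The inverse proportionality behind the two-hands experiment can be checked with a few lines of Python. This is only a sketch under a simple pinhole approximation (image height = focal length × object height ÷ distance), which holds when the object is much further away than the focal length. The hand size, the arm lengths and the 17 mm focal length for the eye are illustrative assumptions of mine, not measurements from the text.

```python
def image_height_mm(object_height_m: float, distance_m: float,
                    focal_length_mm: float) -> float:
    """Pinhole approximation: projected image height in millimetres."""
    return focal_length_mm * object_height_m / distance_m


# Hypothetical values, for illustration only.
HAND_HEIGHT_M = 0.18       # length of a hand
ARMS_LENGTH_M = 0.70       # right hand at full stretch
ELBOW_DISTANCE_M = 0.35    # left hand at half that distance
EYE_FOCAL_LENGTH_MM = 17   # a commonly quoted rough figure for the eye

far_hand = image_height_mm(HAND_HEIGHT_M, ARMS_LENGTH_M, EYE_FOCAL_LENGTH_MM)
near_hand = image_height_mm(HAND_HEIGHT_M, ELBOW_DISTANCE_M, EYE_FOCAL_LENGTH_MM)

print(f"far hand image:  {far_hand:.2f} mm")
print(f"near hand image: {near_hand:.2f} mm")
print(f"ratio near/far:  {near_hand / far_hand:.2f}")
```

Whatever focal length is assumed, the ratio depends only on the two distances: halve the distance and the projected image doubles. That doubling is what the camera records, and what Size Constancy quietly corrects for in the naked-eye view.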

Another example of the same phenomenon: look at yourself in the mirror, and judge how large your face appears to be. If you then extend your arm and measure the height of your face between finger and thumb, you discover the alarming possibility that your face may only be two inches high!

So there it is. A credible explanation for the disparity between the view that I distinctly remember seeing, and the bizarrely distorted, shrunken mountains that appear in the photographs. The retinal image has exactly the same characteristics as the camera image, but I remember the scene as distinctly different because my brain shows it to me in a distinctly different way to how my camera lens ‘sees’ the scene. In my photography classes I did point out that there was a particular feature of the images created by wide-angle lenses - wide-angle lenses accentuate small objects in the foreground, making them appear overly-large and dominant, while large, distant objects become much smaller and appear to recede. However, because I was unaware of the role of the brain (and particularly its Constancy Scaling function), that explanation was essentially a technocratic cover-up for the fact that I was unable to account for why what I saw was not what I photographed.
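To put rough numbers on the shrunken mountains, here is a small Python sketch of how much of the frame a distant hill and a nearby foxglove occupy at two focal lengths. It uses the same pinhole approximation (image height = focal length × object height ÷ distance) and describes only what the lens projects, with no Constancy Scaling applied. Every figure in it is a hypothetical of mine: a hill rising 600 m above the viewpoint at 4 km, a 1.2 m foxglove at 1.5 m, and a 24 mm-high frame (35 mm format held horizontally).

```python
def frame_fraction(object_height_m: float, distance_m: float,
                   focal_length_mm: float,
                   frame_height_mm: float = 24.0) -> float:
    """Fraction of the frame height filled by an object (pinhole approximation)."""
    image_height_mm = focal_length_mm * object_height_m / distance_m
    return image_height_mm / frame_height_mm


# Hypothetical scene, for illustration only: (height in m, distance in m).
MOUNTAIN = (600.0, 4000.0)
FOXGLOVE = (1.2, 1.5)

for focal_length_mm in (24, 50):
    mountain = frame_fraction(*MOUNTAIN, focal_length_mm)
    foxglove = frame_fraction(*FOXGLOVE, focal_length_mm)
    print(f"{focal_length_mm} mm lens: mountain fills {mountain:.0%} of the frame height, "
          f"foxglove {foxglove:.0%}, foxglove/mountain = {foxglove / mountain:.1f}")
```

Under these assumptions the foxglove-to-mountain ratio is the same at both focal lengths, because it is fixed by the two distances alone. The wide-angle lens simply makes everything smaller in the frame, and a value above 100% for the foxglove means it no longer fits. The camera records that projection faithfully; it is the remembered view that has been rescaled.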

The next question is, obviously, what can I do about this? But that's for another time – watch this space.
