### Perspective, Shadow, Point of View

For an excellent, non-technical introduction to ray-tracing, see William J. Mitchell's The Reconfigured Eye, MIT (1993), chapter 7.
This is a brief tutorial/workshop which uses POV raytracings to illustrate the basic parts of perspective and orthographic representations ("projections") of 3-dimensional space. It moves from single objects to arrangements in space. POVRAY3 provides atmospheres and blurrings which can be used to indicate depth, but I have not used them here. These raytracings are thus even less naturalistic than what a camera would provide. Both camera and POV "camera" views are rigidly monocular.

The three most used methods of representing extension and location in three-dimensional space are not the devices employed in the graphic at the left but oblique, orthographic, and perspective projections. Perspective here means the vanishing-point perspective codified by Leon Battista Alberti in the 15th century. For photographic and digitally generated imagery, the latter two schemes are the most important. Cameras make images with vanishing-point perspective, and the "cameras" of ray tracing and VRML representation can be set for either orthographic or perspective rendering.

Some of these examples illustrate other points as well; just ignore the extra complexity for the moment. The term viewing plane is used here where Marr uses "frontal plane."

1. Perspective: Perspective has two cardinal traits: convergence of parallels (aka "linear perspective") and foreground exaggeration. With perspective, all parallel lines not parallel to the viewing plane converge to vanishing points, so the number of vanishing points varies according to the angle of view. Consider the options with a rectangular solid, say a cube. If the cube is viewed dead on (i.e., with the direction of gaze passing through the center of a face at right angles to it) and the face is opaque, the face occludes the other faces and all you see is a square with both pairs of parallel sides parallel to the viewing plane: hence no convergence.
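
The convergence rule can be checked numerically. The following is a minimal sketch, assuming an idealized pinhole camera at the origin that looks along +z with the viewing plane at z = 1. The function names and this coordinate setup are illustrative choices and are not taken from POV-Ray.

```python
def project(point):
    """Perspective-project a 3D point onto the viewing plane z = 1."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (x / z, y / z)


def vanishing_point(direction, eps=1e-12):
    """Image point toward which all lines with this 3D direction converge.

    Returns None when the direction is parallel to the viewing plane
    (no z component): such lines stay parallel in the image.
    """
    dx, dy, dz = direction
    if abs(dz) < eps:
        return None
    return (dx / dz, dy / dz)


def point_on_line(start, direction, t):
    """Point reached by travelling t units of `direction` from `start`."""
    return tuple(s + t * d for s, d in zip(start, direction))


# Two parallel edges receding from the viewer, followed very far into the scene.
receding = (0.0, 0.0, 1.0)
left_edge = project(point_on_line((-1.0, 1.0, 2.0), receding, 1e6))
right_edge = project(point_on_line((1.0, 1.0, 2.0), receding, 1e6))
```

Followed far enough, both edges land next to the same image point, which is what `vanishing_point(receding)` returns. An edge direction such as `(1, 0, 0)` lies in the viewing plane and has no vanishing point at all.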

1.1 Convergence to a single point. If you remain directly in front of the cube but shift the viewing point up so that some of the top is visible, the receding "parallel" edges of the top face will converge to a vanishing point. At left is a ray-traced image of a box frame containing a glass ball--we will be making much use of this rather nice-looking object. For now, the point is the converging edges of the top (and indeed of the checkerboard files as well): the POV ray tracer is working in default mode, which renders with perspective. The angle of view is 110 degrees, approximately that of normal human vision and much more than even a wide-angle camera lens can give. The checkerboard squares are 1x1 units on an xz plane.

1.2 Convergence to two points. What have we here? Now the front face shows convergence: the verticals will meet somewhere down below the floor. The only difference from the previous image is that the focus of view has been set on the center of the object, which tips the viewing plane so that it is no longer parallel to the front face.

Here the view point is moved to the right as well as up, but directed straight ahead, so that the viewing plane remains parallel to the front face. Both the top and the side show convergence (and a sort of stretching) but the front face remains a true square and its verticals appear parallel with the right vertical of the rear face that is visible. That is: as long as the viewing plane is vertical, all vertical lines are parallel to it and will not converge. If we turn the focus in toward the center of the ballbox, as in the next image, none of the faces are parallel to the viewing plane and all edges converge. The stretching is peripheral distortion arising from the focus of view being directed past the ballbox. (See Parker and Deregowski, p. 105.)
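
The difference between these views comes down to how many of the box's three edge directions remain parallel to the viewing plane. A small sketch, assuming an axis-aligned box and a camera described only by the direction it looks along (the function name `count_vanishing_points` and the sample directions are hypothetical):

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def count_vanishing_points(look_dir,
                           edge_dirs=((1, 0, 0), (0, 1, 0), (0, 0, 1)),
                           eps=1e-9):
    """Count the edge directions that converge for a camera looking along look_dir.

    The viewing plane is perpendicular to look_dir, so an edge direction is
    parallel to the viewing plane exactly when its dot product with look_dir
    is zero. Every other edge direction gets its own vanishing point.
    """
    look = normalize(look_dir)
    return sum(
        1 for edge in edge_dirs
        if abs(sum(a * b for a, b in zip(look, edge))) > eps
    )


straight_ahead = count_vanishing_points((0, 0, 1))        # only receding edges converge
tilted_down = count_vanishing_points((0, -1, 2))          # verticals converge too
turned_and_tilted = count_vanishing_points((-1, -1, 2))   # all three sets converge
```

Note that moving the viewpoint sideways or upward without turning the camera leaves `look_dir` unchanged, which is why the verticals stay parallel in the view described above.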

1.3 Convergence to 3 points. Here none of the faces are square to the viewing plane. The two sets of sides on the top converge at opposite points on the horizon, and the verticals still join at some point beneath the checkerboard floor. There is notable foreshortening culminating in the right front upper vertex, which appears to be pointing at the viewer. This point is nearest the "camera." The floor squares are also elongated by foreshortening. Foreshortening is greater the closer you are, and when one wants to minimize foreshortening in a perspective view, the distance of the camera on the z-axis is usually increased manyfold and then compensated for by narrowing the angle of view, thus achieving a telephoto effect of "flattening." Of course, that also reduces the dramatic convergence of parallels. And if you really want to stop the foreshortening and converging, you can use an orthographic projection instead of perspective.
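
The telephoto trade-off can be put in numbers. This is a rough sketch under simple pinhole-camera assumptions; the distances, the 60 degree starting angle, and the function names are invented for illustration and do not describe the scenes shown here.

```python
import math


def image_fraction(length, distance, angle_deg):
    """Fraction of the image width covered by a segment of the given length
    that faces the camera at the given distance."""
    visible_width = 2 * distance * math.tan(math.radians(angle_deg) / 2)
    return length / visible_width


def angle_for_same_framing(near_distance, near_angle_deg, far_distance):
    """Narrower view angle that keeps the object the same size in the image
    after the camera is pulled back from near_distance to far_distance."""
    half_width = near_distance * math.tan(math.radians(near_angle_deg) / 2)
    return math.degrees(2 * math.atan(half_width / far_distance))


def front_to_back_ratio(distance_to_front, depth):
    """How much larger the front face of a box is drawn than its back face."""
    return (distance_to_front + depth) / distance_to_front


# A unit cube seen from 2 units away, then from 20 units away with a long lens.
near_ratio = front_to_back_ratio(2.0, 1.0)
far_ratio = front_to_back_ratio(20.0, 1.0)
long_lens_angle = angle_for_same_framing(2.0, 60.0, 20.0)
```

With these numbers the front face is drawn 1.5 times as large as the back face from close up, but only about 1.05 times as large from the distant camera, while the narrowed angle keeps the front face the same size in the frame.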

1.4 Is perspective "the way we see?" Perspective is an idealized, limiting case of three-dimensional perception for two reasons:

• Perspective assumes a unique, stationary point of view. That is, by looking into a scene with perspective, we know where "we" are viewing it from (ground level or higher, to the left or the right of some object in the scene, etc.). The scene looks exactly like that only from that point, so each viewer at a different position would see a slightly different arrangement. This is why, with perspective paintings and with photographs, we imagine ourselves in the position of the artist or camera lens. It is also why 20th century artists have complained that perspective is rigid. In any case, we are not entirely monocular, nor are our eyes, head, or body completely stationary for very long. One of the first things people do when they "get" a stereogram, red/blue, or other "3D" image is to move their eyes and heads around just to watch the objects change their relative position. (Beyond a certain distance, of course, binocular displacement dwindles to insignificance.)
• Perspective does presumably capture features of the way an observed scene is projected onto the retina, but information coming from the eye is often corrected in perception for constancy of shape and size. That is, once we recognise a bounded region as an object of a certain kind, we adjust what we see for shape and scale. Thus, we often do not notice convergence and foreshortening and have to have them pointed out or thrust in our face before we concede that those aspects are indeed distorting what we are seeing. Stephen Palmer describes a tendency of subjects to compromise between a perspectivally distorted depiction and a more normal one even when they know that what was presented was the distorted image. Similarly, John Willats describes classical painters using a camera obscura to force their perception of perspective, especially of exaggerated, foreshortened objects, and in fact takes the presence of such objects as a sign that the painter was probably using a camera obscura; for example, the near corner folds of the table cloth in Vermeer's Music Lesson bulk up rather hugely in foreshortened view, bigger than most people would draw them. When the trained eyes of artists require apparatus to show us what we see, is that what we see? Parker and Deregowski are very good on the non-determination of seeing by perspective.

Sources: Stephen E. Palmer, Vision Science; John Willats, title; D. M. Parker and J. B. Deregowski, Perception and Artistic Style, c. 4: The Limits of Perspective.

2. Light and Shadow: If we have directional light in a scene, we have a second set of clues to extension and location in space. And we can go both ways, inferring the location and nature of the light or lights playing on the scene even when we cannot see them. The ballbox scene is lighted by two fairly tightly focused spotlights. Can you tell where they are? Never mind, the next section should make it easier. Is the box resting on the ground? How do you know? Since the box is pretty tight on the ball and the ball appears to project through openings in the frame, does that mean the ball projects through the floor? Wouldn't you like to look underneath? (and if you did, blue because?) For a good introduction to shadows in perception and art, check out Roberto Casati's shadowmill. On the Web, one might say, there is a spotlight behind your left shoulder, since the default for drop shadows seems to have them cast to the right and down.
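
Going "both ways" between light and shadow is simple geometry when the light is treated as a single point. The sketch below assumes a point light, a floor at y = 0, and made-up coordinates; it is not the lighting setup of the ballbox scene, and the function names are hypothetical.

```python
def shadow_on_floor(light, point):
    """Where a point light casts the shadow of `point` on the floor y = 0."""
    lx, ly, lz = light
    px, py, pz = point
    if ly <= py:
        raise ValueError("the light must be higher than the point")
    t = ly / (ly - py)  # ray parameter at which the light ray reaches y = 0
    return (lx + t * (px - lx), 0.0, lz + t * (pz - lz))


def locate_light(point_a, shadow_a, point_b, shadow_b):
    """Estimate the light position from two points and their shadows.

    Each shadow-to-point line passes through the light, so the light is
    where the two lines meet (the midpoint of their closest approach is
    used so that small measurement errors are tolerated).
    """
    def sub(u, v):
        return tuple(a - b for a, b in zip(u, v))

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    d1, d2 = sub(point_a, shadow_a), sub(point_b, shadow_b)
    w = sub(shadow_a, shadow_b)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("the two shadow lines are parallel")
    t = (b * e - c * d) / denom
    u = (a * e - b * d) / denom
    q1 = tuple(s + t * k for s, k in zip(shadow_a, d1))
    q2 = tuple(s + u * k for s, k in zip(shadow_b, d2))
    return tuple((m + n) / 2 for m, n in zip(q1, q2))


# Hypothetical light and two corners of an object, in scene units.
light = (2.0, 5.0, -1.0)
corner_a, corner_b = (0.0, 1.0, 0.0), (1.0, 2.0, 1.0)
estimate = locate_light(corner_a, shadow_on_floor(light, corner_a),
                        corner_b, shadow_on_floor(light, corner_b))
```

A shadow that touches the object at floor level also answers the question of whether the object rests on the ground: a point with y = 0 is its own shadow.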

3. Movement: To test how much moving around objects in the world helps us grasp what is represented, I used a Java applet that lets you move the camera eye around the ballbox by moving your cursor in two dimensions. (This takes a little while to load; if it hangs, just hit "reload".) There are two directions of movement possible.

You can circle the ballbox around the y-axis at a distance of 5 from its center; dragging the cursor to the right moves you counterclockwise at 20 degree intervals. The red arrows represent the angle of the spotlights.

The other movement possible is from near the floor up to 60 degrees above the floor--all the way around the ballbox. The view angles are as indicated. Taken together, the applet lets you fly around, up and down, the ballbox.
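
The set of camera positions behind such a fly-around can be generated in a few lines. This sketch takes the distance of 5 and the 20 degree steps around the y-axis from the description above; the 20 degree step in elevation, the constant radius at every elevation, the starting position on the negative z side, and the direction of rotation are all assumptions made for illustration.

```python
import math


def orbit_position(azimuth_deg, elevation_deg, radius=5.0):
    """Camera position on a sphere around an object centered at the origin.

    azimuth_deg rotates the camera around the y-axis; elevation_deg raises
    it above the floor plane. Azimuth 0 is assumed to sit on the -z side.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = radius * math.cos(el)  # distance from the y-axis
    return (horizontal * math.sin(az),
            radius * math.sin(el),
            -horizontal * math.cos(az))


azimuths = range(0, 360, 20)   # 18 stops around the ballbox
elevations = range(0, 61, 20)  # assumed steps from floor level up to 60 degrees
views = [orbit_position(az, el) for el in elevations for az in azimuths]
```

Each entry in `views` would be one pre-rendered frame, with the camera always aimed at the origin; the cursor position then simply selects which frame to display.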

4. Orthographic projection: Orthographic projections have neither convergence of parallels nor foreshortening. They are often used in engineering because they preserve dimensions. These projections are called orthographic (or orthogonic) because the rays from the object are not assumed to converge in the eye but to remain parallel. Natural light would behave this way if viewers were infinitely distant from the object. As noted above, as distance increases in perspective projection, foreshortening and convergence decrease. Of particular interest is the isometric projection, in which the camera is rotated 45 degrees on the x-axis and y-axis, so that it is looking directly down the diagonal of an object, say a ballbox, located at the origin.

Left is a standard perspective of the ballbox along the diagonal. Next to it is also a perspective view, but with distant camera and long lens. It looks like the isometric projection below it, but you can see convergence if you follow the diagonals of the floor tiles. Top and bottom edges are somewhat stretched by foreshortening.

This is the isometric, non-perspective view, in which all edges are the same length in the projection. The result is that the lower edges appear longer and the ballbox frame slightly spread. This is something of an optical illusion depending on our expectation of foreshortening from the position above the box. The floor tiles show zero convergence.
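
The equal edge lengths can be verified directly. Below is a minimal sketch of an orthographic projection along the cube diagonal, assuming a unit cube at the origin and a camera looking along the direction (-1, -1, -1); the helper names are illustrative and do not correspond to any POV-Ray keyword.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)


def orthographic_basis(view_dir, up=(0, 1, 0)):
    """Two unit vectors spanning the viewing plane perpendicular to view_dir."""
    forward = normalize(view_dir)
    right = normalize(cross(up, forward))
    true_up = cross(forward, right)
    return right, true_up


def project_orthographic(point, basis):
    """Drop the depth component: parallel rays, so distance plays no part."""
    right, true_up = basis
    return (dot(point, right), dot(point, true_up))


basis = orthographic_basis((-1, -1, -1))  # looking down the cube diagonal
cube_edges = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
projected_lengths = [math.hypot(*project_orthographic(e, basis)) for e in cube_edges]
```

All three entries of `projected_lengths` come out equal, at the square root of 2/3 of the true edge length. Because `project_orthographic` never divides by depth, shifting an edge farther from the camera leaves its projected length unchanged, which is why the floor tiles show no convergence.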

Much more interesting than individual objects and the viewer's position in seeing them are the ways different projections alter a scene with multiple objects. The first projection on the left is a relatively close perspective view of a nice little still life by Luis Muñoz. Notice that the relatively high, tight angle is bringing about convergence of the verticals. Next to it is another perspective take with longer lens, which so reduces the convergence that the objects lose their Cubist look.

Here once again we have an isometric, non-perspectival view. It is most interesting that the actual position of the camera is the same as in the first, perspectival view. Note the top of the bottle--that too is distinctly Cubist. (See Willats's discussion of Miró's table scene "The Breakfast Table".) I am not particularly fond of this isometric view since it seems to make the bottle very stubby and just arranges objects in a line, where the others, especially the first, evoke a world in which the objects are in use or were so until recently. For a painting with a pure isometric projection see Charles Sheeler's "Interior".

Further Exploration: Check out Camillo Trevisan's excellent free program Cartesio which is available for Windows and Mac systems and runs well under Wine in Linux.
