Computer Graphics

Emmanuel Kondela

January 11, 2013

Contents

1 Introduction To Computer Graphics
  1.1 Introduction
  1.2 Basic elements
  1.3 History
  1.4 Application
  1.5 Graphic rendering pipeline
  1.6 Input devices
  1.7 Designing and Coding Graphics Programs

2 Image Processing
  2.1 Image Formation in the Eye and Camera
    2.1.1 Eye
    2.1.2 Camera models
  2.2 What is pixel?
  2.3 Primary Color
  2.4 Secondary Color

3 Vector Graphics
  3.1 Drawing Algorithms
    3.1.1 Rasterization
    3.1.2 Pixelization
    3.1.3 Scan conversion
  3.2 Point drawing algorithm
  3.3 Line drawing algorithm
  3.4 Midpoint circle algorithm
  3.5 Ellipse drawing algorithm

4 Clipping
  4.1 Need for Clipping and Windowing
  4.2 Point clipping
  4.3 Line clipping
  4.4 Polygon clipping

5 Geometric Transformation
  5.1 What is this?
  5.2 Euclidean transformation
  5.3 Affine transformation
  5.4 Projective transformation
  5.5 Matrix Multiplication and Transformation

6 Curves
  6.1 Parametric curves

7 Fractals
  7.1 Introduction
  7.2 Generation of fractals
  7.3 Geometric fractals
  7.4 Recapitulation
  7.5 Random fractals
  7.6 Applications

8 Computer Animation
  8.1 Introduction
  8.2 Conventional Animation
  8.3 Real time vs image
  8.4 Animation technique
  8.5 Rotoscopy
  8.6 Key framing
  8.7 Spline driven animation
  8.8 Morphing
  8.9 Particle systems

9 Rendering
  9.1 Illumination model
  9.2 Visibility
  9.3 Polygon shading
  9.4 Solid models

Chapter 1 Introduction To Computer Graphics

1.1 Introduction

The term computer graphics describes any use of computers to create and manipulate images. This book introduces the algorithmic and mathematical tools that can be used to create all kinds of images: realistic visual effects, informative technical illustrations, or beautiful computer animations. Graphics can be two- or three-dimensional; images can be completely synthetic or can be produced by manipulating photographs. This book is about the fundamental algorithms and mathematics, especially those used to produce synthetic images of three-dimensional objects and scenes.

Actually doing computer graphics inevitably requires knowing about specific hardware, file formats, and usually a graphics API (see Section 1.3) or two. Computer graphics is a rapidly evolving field, so the specifics of that knowledge are a moving target. Therefore, in this book we do our best to avoid depending on any specific hardware or API. Readers are encouraged to supplement the text with relevant documentation for their software and hardware environment. Fortunately, the culture of computer graphics has enough standard terminology and concepts that the discussion in this book should map nicely to most environments. This chapter defines some basic terminology, and provides some historical background as well as information sources related to computer graphics.

Figure 1.1: Modeling

Computer graphics is the use of computers to define, store, manipulate, interrogate, and present pictorial output. Although computer graphics is a vast field that encompasses almost any graphical aspect, we are mainly interested in the generation of images of 3-dimensional scenes. Computer imagery has applications for film special effects, simulation and training, games, medical imagery, flying logos, etc.

Computer graphics relies on an internal model of the scene, that is, a mathematical representation suitable for graphical computations. The model describes the 3D shapes, layout and materials of the scene. This 3D representation then has to be projected to compute a 2D image from a given viewpoint; this is the rendering step. Rendering involves projecting the objects (perspective), handling visibility (which parts of objects are hidden) and computing their appearance and lighting interactions. Finally, for animated sequences, the motion of objects has to be specified. We will discuss animation in this document.

1.2 Basic elements

• Modeling - shape (geometry)
• Rendering - display (shading, illumination, color, texture, etc.)
• Animation - movement (dynamics)

1.3 History

• 1963: Sutherland, first graphics workstation
• 1969: First SIGGRAPH (ACM)
• Early 1970s: Raster Graphics, Shading, Illumination
• Late 1970s: Texture Mapping, Ray Tracing
• Early 1980s: Realism in Rendering
• Late 1980s: Physically Based Animation
• 1989: Tin Toy (Pixar) wins Academy Award
• 1990s: Interaction, Scientific Visualization, Virtual Reality, Augmented Reality, Multimedia, etc.
• 2000s: Real-time Visualization of Large Data Sets, Data Compression, Vision and Graphics, etc. 1.4 Application Almost any endeavor can make some use of computer graphics, but the major consumers of computer graphics technology include the following industries: • Video games increasingly use sophisticated 3D models and rendering al- gorithms. • Cartoons are often rendered directly from 3D models. Many traditional 2D cartoons use backgrounds rendered from 3D models, which allows a continuously moving viewpoint without huge amounts of artist time. • Visual effects use almost all types of computer graphics technology. Almost every modern film uses digital compositing to superimpose backgrounds with separately filmed foregrounds. Many films also use 3D mod- eling and animation to create synthetic environments, objects, and even characters that most viewers will never suspect are not real. • Animated films use many of the same techniques that are used for visual effects, but without necessarily aiming for images that look real. 10 CHAPTER 1. INTRODUCTION TO COMPUTER GRAPHICS • CAD/CAM stands for computer-aided design and computer-aided manufacturing. These fields use computer technology to design parts and prod- ucts on the computer and then, using these virtual designs, to guide the manufacturing process. For example, many mechanical parts are designed in a 3D computer modeling package and then automatically produced on a computer-controlled milling device. • Simulation can be thought of as accurate video gaming. For example, a flight simulator uses sophisticated 3D graphics to simulate the experience of flying an airplane. Such simulations can be extremely useful for initial training in safety-critical domains such as driving, and for scenario training for experienced users such as specific fire-fighting situations that are too costly or dangerous to create physically. • Medical imaging creates meaningful images of scanned patient data. For example, a computed tomography (CT) dataset is composed of a large 3D rectangular array of density values. Computer graphics is used to create shaded images that help doctors extract the most salient information from such data. • Information visualization creates images of data that do not necessarily have a natural visual depiction. For example, the temporal trend of the price of ten different stocks does not have an obvious visual depiction, but clever graphing techniques can help humans see the patterns in such data. • Computational biology • Computational physics • Computer-aided design • Computer simulation • Digital art • Education • Graphic design • Infographics 1.5. GRAPHIC RENDERING PIPELINE 11 Figure 1.2: Figure shows example of computer animation produced using motion capture • Information visualization • Rational drug design • Scientific visualization • Video Games • Virtual reality • Web design 1.5 Graphic rendering pipeline Every desktop computer today has a powerful 3D graphics pipeline. This is a special software/hardware subsystem that efficiently draws 3D primitives in perspective. Usually these systems are optimized for processing 3D triangles with shared vertices. The basic operations in the pipeline map the 3D vertex locations to 2D screen positions and shade the triangles so that they both look realistic and appear in proper back-to-front order. 
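To make the idea of "mapping 3D vertex locations to 2D screen positions" concrete, here is a minimal sketch in C++, not tied to any particular API. The type names (Vec3, Pixel), the focal-length scale f, the 640 x 480 viewport, and the assumption that the point lies in front of the camera are all choices made for this illustration, not details taken from the text.

#include <cstdio>

struct Vec3  { double x, y, z; };   // a vertex position in camera space
struct Pixel { int x, y; };         // a position on the screen

// Project a camera-space point onto a width x height pixel screen.
// 'f' plays the role of a focal length / field-of-view scale.
Pixel projectToScreen(const Vec3& p, double f, int width, int height) {
    // Perspective divide: points farther away move toward the center.
    double xNdc = f * p.x / p.z;    // roughly in [-1, 1] for visible points
    double yNdc = f * p.y / p.z;

    // Viewport transform: map [-1, 1] to pixel coordinates.
    Pixel s;
    s.x = static_cast<int>((xNdc + 1.0) * 0.5 * (width - 1));
    s.y = static_cast<int>((1.0 - yNdc) * 0.5 * (height - 1)); // flip y: screen y grows downward
    return s;
}

int main() {
    Vec3 v{0.5, 0.25, 2.0};
    Pixel s = projectToScreen(v, 1.0, 640, 480);
    std::printf("vertex maps to pixel (%d, %d)\n", s.x, s.y);
    return 0;
}

A real pipeline expresses these same steps with the 4 x 4 homogeneous matrices discussed below, so that projection, rotation and translation can all be composed into one matrix.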
Although drawing the triangles in valid back-to-front order was once the most important research issue in computer graphics, it is now almost always solved using the z-buffer, which uses a special memory buffer to solve the problem in a brute-force manner. 12 CHAPTER 1. INTRODUCTION TO COMPUTER GRAPHICS It turns out that the geometric manipulation used in the graphics pipeline can be accomplished almost entirely in a 4D coordinate space composed of three traditional geometric coordinates and a fourth homogeneous coordinate that helps with perspective viewing. These 4D coordinates are manipulated using 4 4 matrices and 4-vectors. The graphics pipeline, therefore, contains much machinery for efficiently processing and composing such matrices and vectors. This 4D coordinate system is one of the most subtle and beautiful constructs used in computer science, and it is certainly the biggest intellectual hurdle to jump when learning computer graphics. A big chunk of the first part of every graphics book deals with these coordinates. The speed at which images can be generated depends strongly on the number of triangles being drawn. Because interactivity is more important in many applications than visual quality, it is worthwhile to minimize the number of triangles used to represent a model. In addition, if the model is viewed in the distance, fewer triangles are needed than when the model is viewed from a closer distance. This suggests that it is useful to represent a model with a varying level of detail (LOD). Rendering is the convesrion of a scene into an image. Most implementations of OpenGL have a similar order of operations, a series of processing stages called the OpenGL rendering pipeline. This ordering, as shown in Figure 1-2, is not a strict rule of how OpenGL is implemented but provides a reliable guide for predicting what OpenGL will do. If you are new to threedimensional graphics, the upcoming description may seem like drinking water out of a fire hose. You can skim this now, but come back to Figure 1-2 as you go through each chapter in this book. The following diagram shows the Henry Ford assembly line approach, which OpenGL takes to processing data. Geometric data (vertices, lines, and polygons) follow the path through the row of boxes 1.6. INPUT DEVICES 13 Figure 1.3: Rendering pipeline 1.6 Input devices Physical Devices: Logical Devices: Locator Valuator Pick Choice String Keyboard, Mouse, tablet, etc. High level interface with the user program Tablet Dial Light Pen Function Keys Keyboard 1.7 Designing and Coding Graphics Programs Certain common strategies are often useful in graphics programming. In this section we provide some advice that you may find helpful as you implement the methods you learn about in this book. A key part of any graphics program is to have good classes or routines for geometric entities such as vectors and matrices, as well as graphics entities such as RGB colors and images. These routines should be made as clean and efficient as possible. A universal design question is whether locations and displacements should be separate classes because they have different operations, e.g., a location multiplied by one-half makes no geometric sense 14 CHAPTER 1. INTRODUCTION TO COMPUTER GRAPHICS while one-half of a displacement does (Goldman, 1985; DeRose, 1989). There is little agreement on this question, which can spur hours of heated debate among graphics practitioners, but for the sake of example lets assume we will not make the distinction. 
This implies that some basic classes to be written include: • vector2. A 2D vector class that stores an x- and y-component. It should store these components in a length-2 array so that an indexing operator can be well supported. You should also include operations for vector addition, vector subtraction, dot product, cross product, scalar multiplication, and scalar division. • vector3. A 3D vector class analogous to vector2. • hvector. A homogeneous vector with four components. • rgb. An RGB color that stores three components. You should also include operations for RGB addition, RGB subtraction, RGB multiplication, scalar multiplication, and scalar division. • transform. A 4 4 matrix for transformations. You should include a matrix multiply and member functions to apply to locations, directions, and surface normal vectors. As shown in Chapter 6, these are all different. • image. A 2D array of RGB pixels with an output operation. In addition, you might or might not want to add classes for intervals, orthonormal bases, and coordinate frames. Chapter 2 Image Processing 2.1 Image Formation in the Eye and Camera Biological vision is the process of using light reflected from the surrounding world as a way of modifying behavior. Generally, with humans, we say that the surrounding environment is interpreted by visual input. This usually implies some form of conscious understanding of the 3D world from the 2D projection that it forms on the retina of the eye. However much of our visual computation is carried out unconsciously and often our interpretations can be fallacious. In this section we will briefly overview the human visual system and try to understand the ways in which this system uses computation as a means of interpreting its input. To do this, we will also try to understand how images are formed on cameras, how images are stored in computers, and how computations can be carried out efficiently. Although not strictly correct, this analogy between machine vision and biological vision is currently the best model available. Moreover, the models interact in an ever increasing fashion: we use the human visual system as an existence proof that visual interpretation is even possible in the first place, and its response to optical illusions as a way to guide our development of algorithms that replicate the human system; and we use our understanding of machine vision and our ability to generate ever more complex computer images as a way of modifying, or evolving, our visual system in its efforts to interpret the visual world. For example, image morphing scenes would be difficult to interpret for the naive viewer. 15 16 CHAPTER 2. IMAGE PROCESSING Figure 2.1: Sketch of a cross-section of the eye 2.1.1 Eye Any understanding of the function of the human eye serves as an insight into how machine vision might be solved. Indeed, it was some of the early work by Hubel and Wiesel on the receptive fields in the retina that has led to the fundamental operation of spatial filtering that nowadays dominates so much of early image processing. There are many good references to the function of the eye, although Frisby gives an excellent overview with a computational flavour. The eye is considered by most neuroscientists as actually part of the brain. It consists of a small spherical globe of about 2cm in diameter, which is free to rotate under the control of 6 extrinsic muscles. 
Light enters the eye through the transparent cornea, passes through the aqueous humor, the lens, and the vitreous humor, where it finally forms an image on the retina. It is the muscular adjustment of the lens, known as accommodation, that focuses the image directly on the retina. If this adjustment is not correctly accomplished, the viewer suffers from either nearsightedness or farsightedness. Both conditions are easily corrected with optical lenses. The retina itself is a complex tiling of photoreceptors. These photoreceptors are known as rods and cones. When these photoreceptors are stimulated by light, they produce electrical signals that are transmitted to the brain via the optic nerve. The location of the optic nerve on the retina obviously prohibits the existence of photoreceptors at this point. This point is known as the blind 2.1. IMAGE FORMATION IN THE EYE AND CAMERA 17 Figure 2.2: Sketch of a cross-section of the retina spot and any light that falls upon it is not perceived by the viewer. Most people are unaware of their blind spot, although it is easy to demonstrate that it exists. And its existence has been known about for many years as it is reputed that the executioners in France during the revolution used to place their victim so that his or her head fell onto the blind spot, thus illiciting a pre-guillotine perception of the poor character without a head. The rods and cones do not have a continuous physical link to the optic nerve fibres. Rather, they communicate through three distinct layers of cells, via junctions known as synapses. These layers of cells connect the rods and cones to the ganglion cells, which respond to the photostimulus according to a certain receptive field. We can see from Figure 2 that the rods and cones are at the back of the retina. Thus the light passes through the various cell layers to these receptive fields, and is then transmitted via various synaptic junctions back towards the optic nerve fibre. It has long been known that the spatial organization of some of the receptive fields of the retinal ganglion cells is circularly symmetric, with either an excitatory central region and an inhibitory surround, or with an inhibitory centre and an excitatory surround. Such cells are known to computational vision researchers as “mexican-hat operators”. Their existence inspired Marr 18 CHAPTER 2. IMAGE PROCESSING [6] to develop a computational approach to physiological vision understanding, since their response to a light signal was analogous to the convolution of the signal with the second derivative of a Gaussian. Marr’s theory of edge detection will be explored in a later lecture. However, there are also other spatial organisations of receptive fields. There are orientationally selective receptive fields, with an excitatory lobe to one side and an inhibitory lobe on the other, so as to form an antisymmetric field. These cells exist over a wide variety of orientations. Both the even and odd symmetric receptive fields are known as simple cells. They respond strongly to luminance edges and line features in images, respectively. Other types of receptive fields, known as complex cells, are also found in the retina. Their behaviour is more complex, combining in a non-linear fashion the responses of the even and odd symmetric filter responses. And end-stopped cells appear to act as simple differentiation operators. How the behaviour of these cells might be combined into a single computational model will be outlined in a future lecture. 
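To make the computational analogy concrete, the following small sketch (an illustration added here, not part of the original notes) convolves a 1D signal with a discrete "mexican-hat" style kernel, i.e. a rough approximation to the negated second derivative of a Gaussian. The kernel values and the test signal are invented for the example.

#include <cstdio>
#include <vector>

// Convolve a 1D signal with a small kernel (zero padding at the borders).
std::vector<double> convolve(const std::vector<double>& signal,
                             const std::vector<double>& kernel) {
    int r = static_cast<int>(kernel.size()) / 2;   // kernel radius
    std::vector<double> out(signal.size(), 0.0);
    for (int i = 0; i < static_cast<int>(signal.size()); ++i)
        for (int k = -r; k <= r; ++k) {
            int j = i + k;
            if (j >= 0 && j < static_cast<int>(signal.size()))
                out[i] += signal[j] * kernel[k + r];
        }
    return out;
}

int main() {
    // A step edge: a dark region followed by a bright region.
    std::vector<double> signal = {0, 0, 0, 0, 1, 1, 1, 1};
    // Crude center-surround kernel (positive center, negative surround); example values only.
    std::vector<double> kernel = {-1, -2, 6, -2, -1};
    std::vector<double> response = convolve(signal, kernel);
    for (double v : response) std::printf("%5.1f ", v);
    std::printf("\n");   // the response changes sign (a zero-crossing) exactly at the edge
    return 0;
}

The zero-crossing at the edge is the behaviour Marr's edge-detection theory, mentioned above, builds on.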
The human eye is a remarkable organ, whose sensitivity and performance characteristics approach the absolute limits set by quantum physics. The eye is able to detect as little as a single photon as input, and is capable of adjusting to ranges of light that span many orders of magnitude. No camera has been built that even partially matches this performance.

Very little is known about what happens to the optic signal once it begins its voyage down the optic nerve. The optic nerve has inputs arriving from both the left and right sides of both eyes, and these inputs split and merge at the optic chiasma. Moreover, what is seen by one eye is slightly different from what is seen by the other, and this difference is used to deduce depth in stereo vision. From the optic chiasma, the nerve fibres proceed in two groups to the striate cortex, the seat of visual processing in the brain. A large proportion of the striate cortex is devoted to processing information from the fovea.

2.1.2 Camera models

To understand how vision might be modeled computationally and replicated on a computer, we need to understand the image acquisition process. The role of the camera in machine vision is analogous to that of the eye in biological systems. The pinhole camera is the simplest, and the ideal, model of camera function. It has an infinitesimally small hole through which light enters before forming an inverted image on the camera surface facing the hole. To simplify things, we usually model a pinhole camera by placing the image plane between the focal point of the camera and the object, so that the image is not inverted.

Figure 2.3: Perspective projection in the pinhole camera model

This mapping of three dimensions onto two is called a perspective projection, and perspective geometry is fundamental to any understanding of image analysis. Euclidean geometry is a special case of perspective geometry, and the use of perspective geometry in computer vision makes for a simpler and more elegant expression of the computational processes that render vision possible. A superb overview of the geometric viewpoint in computer vision is given by Faugeras.

A perspective projection is the projection of a three-dimensional object onto a two-dimensional surface by straight lines that pass through a single point. Simple geometry shows that if we denote the distance of the image plane to the centre of projection by f, then the image coordinates (xi, yi) are related to the object coordinates (xo, yo, zo) by

xi = f xo / zo and yi = f yo / zo.

These equations are non-linear. They can be made linear by introducing homogeneous transformations, which is effectively just a matter of placing the Euclidean geometry into the perspective framework. Each point (x, y, z) in three-space is mapped onto a line in four-space given by (wx, wy, wz, w), where w is a dummy variable that sweeps out the line (w ≠ 0). In homogeneous coordinates, the perspective projection onto the plane maps (x, y, z, 1) to (f x, f y, z); dividing by the last component recovers the image coordinates above.

We now introduce some notation that will be useful in later sections. The projective plane is used to model the image plane. A point in the plane is represented by a 3-vector (x1, x2, x3) of numbers, not all zero; they define a vector x up to a scale factor. A line l is also defined by a triplet of numbers (u1, u2, u3), not all zero, and satisfies the equation u1 x1 + u2 x2 + u3 x3 = 0. A point on a line is therefore given by the relations l · x = 0, or lT x = 0, or xT l = 0.
Two points define a line by the equation l = p ∧ q,where ∧ denotes the vector product. Likewise, two lines define a point by the equation x = l ∧ m. This duality between lines and points in the image plane is often exploited in homogeneous notation. As well, a vector product can also be written in matrix notation, by writing the vector as a skew-symmetric matrix. Thus, Likewise, projective space is used as a model for Euclidean 3-space. Here, points and planes are represented by quadruplets of numbers not all zero, and are duals of each other. 2.2 What is pixel? A computer image is usually represented as a discrete grid of picture elements known as pixels (picture elements). It is smallest unit of picture that can be presented or controlled. Each pixel has its address. The address of a pixel corresponds to its coordinates. Pixels are normally arranged in ana array of locations (2-dimensional grid), and are represented using dots or squares. Each pixel is a sample of the original image, more samples provide more accurate representations of the original image. The number of pixels determines the resolution of the image, though has a more specific definition. Pixel counts can be expressed as a single number, as in a ”three-megapixel” digital camera, which has a nominal three million pixels, or as a pair of numbers, as in a ”640 by 480 display”, which has 640 pixels from side to side and 480 from top to bottom (as in a VGA display), and therefore has a total number of 640 480 = 307,200 pixels or 0.3 megapixels. Typical resolutions range from 320*200 to 2000*1500. The pixels, or color samples, that form a digitized image (such as a JPEG file used on a web page) may or may not be in one-to-one correspondence with screen pixels, depending on how a computer displays an image. In computing, an image composed of pixels is known as a bitmapped image or a raster image. The word raster originates from television scanning patterns, and has been widely used to describe similar halftone printing and storage techniques. 2.2. WHAT IS PIXEL? 21 Figure 2.4: Pixels on color image. Computers can use pixels to display an image, often an abstract image that represents a GUI. The resolution of this image is called the display resolution and is determined by the video card of the computer. LCD monitors also use pixels to display an image, and have a native resolution. Each pixel is made up of triads, with the number of these triads determining the native resolution. On some CRT monitors, the beam sweep rate may be fixed, resulting in a fixed native resolution. Most CRT monitors do not have a fixed beam sweep rate, meaning they do not have a native resolution at all, instead they have a set of resolutions that are equally well supported. To produce the sharpest images possible on an LCD, the user must ensure the display resolution of the computer matches the native resolution of the monitor. As stated above, if the total number of pixels horizontal and vertical increase this means the resolution of rbg color image increase and the reduction of total number of pixels means poor resolution. The intensity of pixel is a variable, for a black and white image, a number describes the intensity of each pixel. It can be expressed between 0.0 (black) and 1.0 (white). However, for internal binary representation reasons, it is usually stored as an integer between 0 (black) and 255 (white). 22 CHAPTER 2. IMAGE PROCESSING Figure 2.5: Geometry of color elements of various CRT and LCD displays. 
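As a small illustration of the two intensity representations just described (a real number between 0.0 and 1.0 versus an 8-bit integer between 0 and 255), the sketch below converts between them. The function names and the rounding convention are assumptions made for the example.

#include <algorithm>
#include <cmath>
#include <cstdio>

// Quantize a normalized intensity in [0.0, 1.0] to an 8-bit value in [0, 255].
unsigned char toByte(double intensity) {
    intensity = std::min(1.0, std::max(0.0, intensity));   // clamp out-of-range input
    return static_cast<unsigned char>(std::lround(intensity * 255.0));
}

// Recover an approximate normalized intensity from an 8-bit value.
double toNormalized(unsigned char value) {
    return value / 255.0;
}

int main() {
    std::printf("0.0 -> %d, 0.5 -> %d, 1.0 -> %d\n",
                toByte(0.0), toByte(0.5), toByte(1.0));   // 0, 128, 255: black, mid-grey, white
    return 0;
}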
The number of distinct colors that can be represented by a pixel depends on the number of bits per pixel (bpp). A 1 bpp image uses 1 bit for each pixel, so each pixel can be either on or off. Each additional bit doubles the number of colors available, so a 2 bpp image can have 4 colors, and a 3 bpp image can have 8 colors:

• 1 bpp, 2^1 = 2 colors (monochrome)
• 2 bpp, 2^2 = 4 colors
• 3 bpp, 2^3 = 8 colors
• 8 bpp, 2^8 = 256 colors
• 16 bpp, 2^16 = 65,536 colors ("Highcolor")
• 24 bpp, 2^24 = 16.8 million colors ("Truecolor")

In color image systems, a color is typically represented by three or four component intensities such as red, green, and blue (RGB color space) or cyan, magenta, yellow, and black (CMYK color space).

For an RGB color space image, each pixel carries separate red, green and blue intensities; its overall brightness is often taken as an average of the three. That is to say, the RGB color components have 256 levels each: 0-255 for the red channel, 0-255 for the green channel and 0-255 for the blue channel. Thus, with 8 bits for red, 8 bits for green and 8 bits for blue, a pixel consists of 24 bits in total. A 24-bit depth allows 8 bits per component. On some systems, 32-bit depth is available: this means that each 24-bit pixel has an extra 8 bits to describe its opacity (for purposes of combining with another image). For a color image, each pixel is described by a triple of numbers representing the intensity of red, green and blue. For example, pure red is (255, 0, 0) and purple is (255, 0, 255). For color depths of 15 or more bits per pixel, the depth is normally the sum of the bits allocated to each of the red, green, and blue components. Highcolor, usually meaning 16 bpp, normally has five bits for red and blue, and six bits for green, as the human eye is more sensitive to errors in green than in the other two primary colors. For applications involving transparency, the 16 bits may be divided into five bits each of red, green, and blue, with one bit left for transparency.

Because the image is represented by a discrete array of pixels, aliasing problems may occur. The most classical form of aliasing is the jaggy aspect of lines; antialiasing techniques are thus required. In the case of a line, antialiasing consists in using intermediate gray levels to smooth the appearance of the line. Another form of aliasing can be observed on television when people wear shirts with a fine striped texture: a flickering pattern is observed because the size of the pattern is on the same order of magnitude as the pixel size.

Figure 2.6: Anti-aliasing and aliasing problem

2.3 Primary Color

Primary colors are sets of colors that can be combined to make a useful range of colors. For human applications, three primary colors are usually used, since human color vision is trichromatic. For additive combination of colors, as in overlapping projected lights or in CRT displays, the primary colors normally used are red, green, and blue. For subtractive combination of colors, as in mixing of pigments or dyes, such as in printing, the primaries normally used are cyan, magenta, and yellow, though the set of red, yellow, blue is popular among artists. See RGB color model, CMYK color model, and RYB color model for more on these popular sets of primary colors.

Figure 2.7: Additive color mixing
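As a tiny illustration of additive combination, the sketch below adds RGB primaries component-wise, clamping at the maximum intensity. The representation and the helper name are assumptions for the example, not part of the original text.

#include <algorithm>
#include <cstdio>

struct RGB { int r, g, b; };   // each component in 0..255

// Additive mixing: overlapping lights add their intensities.
RGB addLight(RGB a, RGB b) {
    return RGB{ std::min(255, a.r + b.r),
                std::min(255, a.g + b.g),
                std::min(255, a.b + b.b) };
}

int main() {
    RGB red{255, 0, 0}, green{0, 255, 0}, blue{0, 0, 255};
    RGB yellow  = addLight(red, green);    // (255, 255, 0)
    RGB magenta = addLight(red, blue);     // (255, 0, 255)
    RGB cyan    = addLight(green, blue);   // (0, 255, 255)
    std::printf("yellow=(%d,%d,%d) magenta=(%d,%d,%d) cyan=(%d,%d,%d)\n",
                yellow.r, yellow.g, yellow.b,
                magenta.r, magenta.g, magenta.b,
                cyan.r, cyan.g, cyan.b);
    return 0;
}

Each pairwise combination of the additive primaries yields one of the subtractive primaries, which is exactly the point made in Section 2.4 about secondary colors.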
Any particular choice for a given set of primary colors is derived from the spectral sensitivity of each of the human cone photoreceptors; three colors that fall within each of the sensitivity ranges of each of the human cone cells are red, green, and blue. Other sets of colors can be used, though not all will well approximate the full range of color perception. For example, an early color photographic process, autochrome, typically used orange, green, and violet primaries. However, unless negative amounts of a color are allowed the gamut will be restricted by the choice of primaries. The combination of any two primary colors creates a secondary color. The most commonly used additive color primaries are the secondary colors of the most commonly used subtractive color primaries, and vice versa. Primary colors are not a fundamental property of light but are related to the physiological response of the eye to light. Fundamentally, light is a con- 2.3. PRIMARY COLOR 25 tinuous spectrum of the wavelengths that can be detected by the human eye, an infinite-dimensional stimulus space. However, the human eye normally contains only three types of color receptors, called cone cells. Each color receptor responds to different ranges of the color spectrum. Humans and other species with three such types of color receptors are known as trichromats. These species respond to the light stimulus via a three-dimensional sensation, which generally can be modeled as a mixture of three primary colors. Before the nature of colorimetry and visual physiology were well understood, scientists such as Thomas Young, James Clark Maxwell, and Hermann von Helmholtz expressed various opinions about what should be the three primary colors to describe the three primary color sensations of the eye. Young originally proposed red, green, and violet, and Maxwell changed violet to blue; Helmholtz proposed ”a slightly purplish red, a vegetation-green, slightly yellowish (wave-length about 5600 tenth-metres), and an ultramarine-blue (about 4820)”. In modern understanding, the human cone cells do not correspond to any real primary colors. Species with different numbers of receptor cell types would have color vision requiring a different number of primaries. For example, for species known as tetrachromats, with four different color receptors, one would use four primary colors. Since humans can only see to 380 nanometers (violet), but tetrachromats can see into the ultraviolet to about 300 nanometers, this fourth primary color for tetrachromats is located in the shorter-wavelength range. Many birds and marsupials are tetrachromats, and it has been suggested that some human females are tetrachromats as well, having an extra variant version of the long-wave (L) cone type. The peak response of human color receptors varies, even among individuals with ”normal” color vision;[12] in non-human species this polymorphic variation is even greater, and it may well be adaptive.[13] Most placental mammals other than primates have only two types of color receptors and are therefore dichromats; to them, there are only two primary colors. It would be incorrect to assume that the world ”looks tinted” to an animal (or human) with anythig other than the human standard of three color receptors. To an animal (or human) born that way, the world would look normal to it, but the animal’s ability to detect and discriminate colors would be different from that of a human with normal color vision. 
If a human and an animal both look at a natural color, they see it as natural; however, if both look at a color reproduced via primary colors, such as on a color television screen, the human may see it as matching the natural color, while the animal does not, since the primary colors have been chosen to suit human capabilities. 26 CHAPTER 2. IMAGE PROCESSING Color practice technology is usefully contrasted with color theory science because science assumes perfect conditions, whereas commercially available products must deliver impressive results at affordable prices. Some recent TV and computer displays are starting to add a fourth ”primary” of yellow, often in a four point square pixel area, to get brighter pure yellows and larger color gamut. Even the four-primary technology does not yet reach the range of colors the human eye is theoretically capable of perceiving (as defined by the sample-based estimate called the Pointer Gamut[17]), with 4-primary LED prototypes providing typically about 87% and 5-primary prototypes about 95%. Several firms, including Samsung and Mitsubishi, have demonstrated LED displays with five or six primaries, or color LED point light sources per pixel. A recent academic literature review claims a gamut of 99% can be achieved with 5-primary LED technology. While technology for achieving a wider gamut appears to be within reach, other issues remain, for example affordability, dynamic range, brilliance. An even bigger problem is that there exists hardly any source material recorded in this wider gamut, nor is it possible to somehow recover this information in existing pictures, as it was never stored. Regardless, industry is still exploring a wide variety of ”primary” active light sources (per pixel) with the goal of matching the capability of human color perception within a broadly affordable price. One example of a potentially affordable, but yet unproven active light hybrid places a LED screen over a plasma light screen, each with different ”primaries”. Because both LED and plasma technologies are many decades old (plasma pixels going back to the 1960s) and because sales are verging on a billion, both have become so affordable that they could be combined. 2.4 Secondary Color In the printing industry, to produce the varying colors the subtractive primaries cyan, magenta, and yellow are applied together in varying amounts. Before the color names cyan and magenta were in common use, these primaries were often known as blue-green and purple, or in some circles as blue and red, respectively, and their exact color has changed over time with access to new pigments and technologies.[ Mixing yellow and cyan produces green colors; mixing yellow with magenta produces reds, and mixing magenta with cyan produces blues. In theory, mixing equal amounts of all three pigments 2.4. SECONDARY COLOR 27 Figure 2.8: Subtractive color mixing should produce grey, resulting in black when all three are applied in sufficient density, but in practice they tend to produce muddy brown colors. For this reason, and to save ink and decrease drying times, a fourth pigment, black, is often used in addition to cyan, magenta, and yellow. The resulting model is the so-called CMYK color model. The abbreviation stands for cyan, magenta, yellow, and keyblack is referred to as the key color, a shorthand for the key printing plate that impressed the artistic detail of an image, usually in black ink. In practice, colorant mixtures in actual materials such as paint tend to be more complex. 
Brighter or more saturated colors can be created using natural pigments instead of mixing, and natural properties of pigments can interfere with the mixing. For example, mixing magenta and green in acrylic creates a dark cyansomething which would not happen if the mixing process were perfectly subtractive. In the subtractive model, adding white to a color, whether by using less colorant or by mixing in a reflective white pigment such as zinc oxide, does not change the color’s hue but does reduce its saturation. Subtractive color printing works best when the surface or paper is white, or close to it. A system of subtractive color does not have a simple chromaticity gamut analogous to the RGB color triangle, but a gamut that must be described in three dimensions. There are many ways to visualize such models, using various 2D chromaticity spaces or in 3D color spaces. 28 CHAPTER 2. IMAGE PROCESSING Chapter 3 Vector Graphics 3.1 Drawing Algorithms Vector graphics is the use of geometrical primitives such as points, lines, curves, and shapes or polygon(s), which are all based on mathematical expressions, to represent images in computer graphics. Vector, in this context, implies more than a straight line. Vector graphics are based on images made up of vectors (also called paths, or strokes) which lead through locations called control points. Each of these points has a definite position on the x and y axes of the work plan. Each point, as well, is a variety of database, including the location of the point in the work space and the direction of the vector (which is what defines the direction of the track). Each track can be assigned a color, a shape, a thickness and also a fill. This does not affect the size of the files in a substantial way because all information resides in the structure; it describes how to draw the vector. There are three drawing algorithms such as rasterization, pixelization and scan conversion. 3.1.1 Resterization Rasterisation (or rasterization) is the task of taking an image described in a vector graphics format (shapes) and converting it into a raster image (pixels or dots) for output on a video display or printer, or for storage in a bitmap file format. 29 30 CHAPTER 3. VECTOR GRAPHICS In normal usage, the term refers to the popular rendering algorithm for displaying three-dimensional shapes on a computer. Rasterization is currently the most popular technique for producing real-time 3D computer graphics. Real-time applications need to respond immediately to user input, and generally need to produce frame rates of at least 24 frames per second to achieve smooth animation. Compared with other rendering techniques such as ray tracing, rasterization is extremely fast. However, rasterization is simply the process of computing the mapping from scene geometry to pixels and does not prescribe a particular way to compute the color of those pixels. Shading, including programmable shading, may be based on physical light transport, or artistic intent. The process of rasterising 3D models onto a 2D plane for display on a computer screen is often carried out by fixed function hardware within the graphics pipeline. This is because there is no motivation for modifying the techniques for rasterisation used at render time and a special-purpose system allows for high efficiency. In computer graphics, a raster graphics image, or bitmap, is a data structure representing a generally rectangular grid of pixels, or points of color, viewable via a monitor, paper, or other display medium. 
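A minimal sketch of such a raster image as a data structure: a rectangular grid of RGB pixels with a bounds-checked pixel write. The class and member names (RasterImage, setPixel) are illustrative and not taken from any particular library.

#include <cstdint>
#include <vector>

struct RGB { std::uint8_t r, g, b; };

// A raster image: a width x height grid of RGB samples stored row by row.
class RasterImage {
public:
    RasterImage(int width, int height)
        : width_(width), height_(height), pixels_(width * height, RGB{0, 0, 0}) {}

    // Write one pixel; silently ignores coordinates outside the grid.
    void setPixel(int x, int y, RGB color) {
        if (x < 0 || x >= width_ || y < 0 || y >= height_) return;
        pixels_[y * width_ + x] = color;        // row-major addressing
    }

    RGB getPixel(int x, int y) const { return pixels_[y * width_ + x]; }

    int width() const  { return width_; }
    int height() const { return height_; }

private:
    int width_, height_;
    std::vector<RGB> pixels_;
};

The frame buffer, image map, or bitmap discussed in the following paragraphs is essentially this structure held in (video) memory.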
Raster images are stored in image files with varying formats. The convention in this course will follow that of OpenGL, placing the origin in the lower left corner, with that pixel being at location (0, 0). Be aware that placing the origin in the upper left is another common convention. One of 2N intensities or colors are associated with each pixel, where N is the number of bits per pixel. Greyscale typically has one byte per pixel, for 28 = 256 intensities. Color often requires one byte per channel, with three color channels per pixel: red, green, and blue. Color data is stored in a frame buffer. This is sometimes called an image map or bitmap. A bitmap corresponds bit-for-bit with an image displayed on a screen, generally in the same format used for storage in the display’s video memory, or 3.1. DRAWING ALGORITHMS 31 Figure 3.1: Raster image maybe as a device-independent bitmap. A bitmap is technically characterized by the width and height of the image in pixels and by the number of bits per pixel (a color depth, which determines the number of colors it can represent). 3.1.2 Pixelization Pixelization is a video and image editing technique in which an image is blurred by displaying part or all of it at a markedly lower resolution. It is primarily used for censorship. The effect is a standard graphics filter, available in all but the most basic bitmap graphics editors. A familiar example of pixelization can be found in television news and documentary productions, in which vehicle license plates and faces of suspects at crime scenes are routinely obscured to maintain the presumption of innocence, as in the television series COPS. Bystanders and others who do not 32 CHAPTER 3. VECTOR GRAPHICS sign release forms are also customarily pixelized. Footage of nudity (including genitalia, buttocks, or breasts) is likewise obscured in some media: before the watershed in many countries, in newspapers or general magazines, or in places in which the public cannot avoid seeing the image (such as on billboards). Drug references, as well as gestures considered obscene (such as the finger) may also be censored in this manner. Pixelization is not usually used for this purpose in films, DVDs, subscription television services, pornography (except for countries in which the law requires it). When obscene language is censored by an audible bleep, the mouth of the speaker may be pixelized to prevent lip reading, often as in COPS graphic injuries and excess blood will be pixelized. Pixelization may also be used to avoid unintentional product placement, or to hide elements that would date a broadcast, such as date and time stamps on home video submissions. Censorship for such purposes is most common on reality television series. 3.1.3 Scan conversion Scan conversion is the process of converting basic, low level objects into their corresponding pixel map representations. This is often an approximation to the object, since the frame buffer is a discrete grid. The first problem to consider is whether or not to draw a pixel at all. For a pixel to be rendered, it must be within a triangle, and it must not be occluded, or blocked by another pixel. There are a number of algorithms to fill in pixels inside a triangle, the most popular of which is the scanline algorithm. Since it is difficult to know that the rasterization engine will draw all pixels from front to back, there must be some way of ensuring that pixels close to the viewer are not overwritten by pixels far away. A z buffer is the most common solution. 
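The depth test that the next paragraph describes can be sketched as follows; the class name, the use of floating-point depths, and the "infinitely far" initial value are assumptions made for the illustration.

#include <limits>
#include <vector>

// A z-buffer: one depth value per pixel, initialized to "infinitely far away".
class ZBuffer {
public:
    ZBuffer(int width, int height)
        : width_(width),
          depth_(width * height, std::numeric_limits<float>::infinity()) {}

    // Returns true (and records the new depth) if this fragment is closer than
    // whatever has been drawn at (x, y) so far; otherwise the caller should
    // discard the fragment.
    bool testAndSet(int x, int y, float depth) {
        float& stored = depth_[y * width_ + x];
        if (depth < stored) {        // smaller depth = closer to the viewer
            stored = depth;
            return true;
        }
        return false;
    }

private:
    int width_;
    std::vector<float> depth_;
};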
The z buffer is a 2D array corresponding to the image plane which stores a depth value for each pixel. Whenever a pixel is drawn, it updates the z buffer with its depth value. Any new pixel must check its depth value against the z buffer value before it is drawn. Closer pixels are drawn and farther pixels are disregarded.

To find out a pixel's color, texture and shading calculations must be applied. A texture map is a bitmap that is applied to a triangle to define its look. Each triangle vertex is also associated with a texture and a texture coordinate (u, v) for normal 2D textures, in addition to its position coordinate. Every time a pixel on a triangle is rendered, the corresponding texel (or texture element) in the texture must be found. This is done by interpolating between the triangle vertices' associated texture coordinates, weighted by the pixel's on-screen distance from the vertices. In perspective projections, interpolation is performed on the texture coordinates divided by the depth of the vertex to avoid a problem known as perspective foreshortening (a process known as perspective texturing).

Before the final color of the pixel can be decided, a lighting calculation must be performed to shade the pixels based on any lights which may be present in the scene. There are generally three light types commonly used in scenes. Directional lights are lights which come from a single direction and have the same intensity throughout the entire scene. In real life, sunlight comes close to being a directional light, as the sun is so far away that rays from the sun appear parallel to Earth observers and the falloff is negligible. Point lights are lights with a definite position in space that radiate light evenly in all directions. Point lights are usually subject to some form of attenuation, or fall-off in the intensity of light incident on objects farther away; real-life light sources experience quadratic falloff. Finally, spotlights are like real-life spotlights, with a definite point in space, a direction, and an angle defining the cone of the spotlight. There is also often an ambient light value that is added to all final lighting calculations to arbitrarily compensate for global illumination effects which rasterization cannot calculate correctly.

There are a number of shading algorithms for rasterizers. All shading algorithms need to account for distance from the light and the normal vector of the shaded object with respect to the incident direction of light. The fastest algorithms simply shade all pixels on any given triangle with a single lighting value, also known as flat shading. There is no way to create the illusion of smooth surfaces this way, except by subdividing into many small triangles. Algorithms can also separately shade vertices, and interpolate the lighting values of the vertices when drawing pixels; this is known as Gouraud shading. The slowest and most realistic approach is to calculate lighting separately for each pixel, also known as Phong shading. This performs bilinear interpolation of the normal vectors and uses the result to do a local lighting calculation.

3.2 Point drawing algorithm

3.3 Line drawing algorithm

Set the color of pixels to approximate the appearance of a line from (x0, y0) to (x1, y1). It should be

• straight and pass through the end points.
• independent of point order.
• uniformly bright, independent of slope.

The explicit equation of a line is y = mx + b.
• Given two points (x0, y0) and (x1, y1) that lie on a line, we can solve for m and b for the line.
• Consider y0 = m x0 + b and y1 = m x1 + b.
• Subtract the first equation from the second to solve for m = (y1 - y0) / (x1 - x0) and b = y0 - m x0.
• Substituting in the value for b, this equation can be written as y = m (x - x0) + y0.

Consider this simple line drawing algorithm:

int x;
float m, y;

m = (y1 - y0) / (x1 - x0);
for (x = x0; x <= x1; ++x) {
    y = m * (x - x0) + y0;
    setpixel(x, round(y), linecolor);
}

3.4 Midpoint circle algorithm

x = h + r cos(theta)
y = k + r sin(theta)

where r is the radius of the circle, and (h, k) are the coordinates of the center. What these equations do is generate the x, y coordinates of a point on the circle for a given angle theta. The algorithm starts with theta at zero, and then loops, adding an increment to theta each time round the loop. It draws straight line segments between these successive points on the circle. The circle is thus drawn as a series of straight lines. If the increment is small enough, the result looks like a circle to the eye, even though in strict mathematical terms it is not. (Despite the section title, what follows is the simple parametric, trigonometric method; the integer midpoint algorithm avoids the cos and sin calls, but the idea of approximating the circle by short segments is the same.) Below is the algorithm in pseudocode showing the basic idea.

theta = 0    // angle that will be increased each loop
h = 12       // x coordinate of circle center
k = 10       // y coordinate of circle center
step = 15    // amount to add to theta each time (degrees)

repeat until theta >= 360:
    x = h + r * cos(theta)
    y = k + r * sin(theta)
    draw a line to (x, y)
    add step to theta

The decision about how big to make the step size is a tradeoff. If it is very small, many lines will be drawn for a smooth circle, but more computer time will be used to do it. If it is too large, the circle will not be smooth and will be visually ugly.

3.5 Ellipse drawing algorithm

The circles can be made into ellipses by simply "squashing" them in one direction or the other. For an ellipse that is wider than it is tall, we divide the y coordinate by a number (here 2) to reduce its height. The two inner calculations would then look like:

var x = h + r * Math.cos(theta);
var y = k - 0.5 * r * Math.sin(theta);

Changing the 0.5 will alter how much the vertical size is squashed. Scaling the x term in this way instead will make an ellipse that is tall and narrow.

Chapter 4 Clipping

In this chapter, you are introduced to the concepts of handling pictures that are larger than the available display screen size, since any part of the picture that lies beyond the confines of the screen cannot be displayed. We compare the screen to a window, which allows us to view only that portion of the scene outside as the limits of the window permit; any portion beyond that gets simply blocked out. But in graphics, this blocking out is to be done by algorithms that decide the point beyond which the picture should not be shown. This concept is called clipping. Thus, we are clipping a picture so that it becomes viewable on a window.

The other related concept is the windowing transformation. It is not always necessary that you clip off the larger parts of the picture. You may instead choose to zoom it down to a lower size and still present the whole picture. This concept is called windowing. Here you are not cutting off the parts beyond the screen size, but are scaling them to a size at which they become displayable on the screen. Again, such a prepared picture need not occupy the complete window.
In fact, it may be possible for you to divide the screen into 2 or more windows, each showing a different picture. Then, the pictures will be prepared to be fitted not to the entire screen, but to their respective windows. Since all these operations are done at run time, it is necessary that the algorithms will have to be very fast and efficient. Cliping is basically extraction. Once triangle vertices are transformed to their proper 2D picture locations, some of these locations may be outside the viewing window, or the area on the screen to which pixels will actually be written. Clipping is the process of truncating triangles to fit them inside the 37 38 viewing area. CHAPTER 4. CLIPPING The most common technique is the Sutherland-Hodgeman clipping algorithm. In this approach, each of the 4 edges of the image plane is tested at a time. For each edge, test all points to be rendered. If the point is outside the edge, the point is removed. For each triangle edge that is intersected by the image planes edge, that is, one vertex of the edge is inside the image and another is outside, a point is inserted at the intersection and the outside point is removed. 4.1 Need for Clipping and Windowing? The size of a monitor terminal on which the pictures are displayed is limited both the physical dimensions and its resolution. The physical dimensions limit the maximum size of the picture that can be displayed on the screen and the resolution (no. of pixels/inch) limits the amount of district details that can be shown. Suppose the size of the picture to be shown is bigger than the size of the screen, then obviously only a portion of the picture can be displayed. The context is similar to that of viewing a scene outside the window. While the scene outside is quite large, a window will allow you to see only that portion of the scene as can be visible from the window the latter is limited by the size of the window. Similarly if we presume that the screen, which allows us to see the pictures as a window, then any picture whose parts lie outside the limits of the window cannot be shown and for algorithmic purposes, they have to be clipped. Note that clipping does not become necessary only when we have a picture larger than the window size. Even if we have a smaller picture, because it is lying in one corner of the window, parts of it may tend to lie outside or a picture within the limits of the screen may go (partly or fully) outside the window limits, because of transformation done on them. And what is normally not appreciated is that as result of transformation, parts, which were previously outside the window limits, may come within limits as well. Hence, in most cases, after each operation an the pictures, it becomes necessary to check whether the picture lies within the limits of the screen and if not, too decide as to where exactly does it reach the limits of the window and clip it at that point. Further, since it is a regular operation in interactive graphics, the algorithms to do this will have to be pretty fast and efficient. The other related concept is 4.2. MID-POINT CLIPPING 39 windowing. It is not always that we cut down the invisible parts of the picture to fit it into the window. The alternate option is to scale down the entire picture to fit it into the window size i.e. instead of showing only a part of the picture, its dimensions can be zoomed down. 
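The scaling just described is, in effect, a window-to-viewport mapping. The following is a rough sketch of that idea together with the point-inside-window test that clipping relies on; the Rect and Point2 names and the function signatures are invented for the example.

// Map a point from a rectangular "window" in world coordinates to a
// rectangular "viewport" in screen coordinates by translating and scaling.
struct Rect   { double xmin, ymin, xmax, ymax; };
struct Point2 { double x, y; };

Point2 windowToViewport(Point2 p, Rect window, Rect viewport) {
    double sx = (viewport.xmax - viewport.xmin) / (window.xmax - window.xmin);
    double sy = (viewport.ymax - viewport.ymin) / (window.ymax - window.ymin);
    return Point2{ viewport.xmin + (p.x - window.xmin) * sx,
                   viewport.ymin + (p.y - window.ymin) * sy };
}

// A point survives clipping against the window only if it lies inside it.
bool insideWindow(Point2 p, Rect window) {
    return p.x >= window.xmin && p.x <= window.xmax &&
           p.y >= window.ymin && p.y <= window.ymax;
}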
In fact, the window can be conceptually divided into more than one window and a different picture can be displayed in each window, each of them prepared to fit into the window. In a most general case, one may partly clip a picture and partly transform it by windowing operation. Also, since the clipped out parts cannot be discarded by the algorithm, the system should be able to keep track of every window and the status of every picture in each of them and keep making changes as required all in real time. Having seen what clipping and windowing is all about; we straightaway introduce you to a few clipping and windowing algorithms. 4.2 Mid-Point clipping While mathematical formulae exist to compute the intersection of two straight lines (in this case, the edge under consideration and the straight line to be clipped) it comes computationally intensive and hence another efficient method has been developed. As you know, multiplication and division are most time consuming and hence an algorithm that minimizes on multiplications and divisions is always considered superior. The algorithm in question can be seen to be very efficient in minimizing the divisions that occur, when the intersection of two straight line are computed directly. Consider a straight line P1 P2. We have decided, (based on the earlier suggested algorithm) that the point P11 is visible while the point P2 is not. The point is that we should find a point P1 which is the point farthest from P1 and still visible. Any point beyond P11 becomes invisible. The question is to find P11. The algorithm processes by dividing the line P1 P2 at the middle, hence the name mid-point division. Let us say the point is P11 . This point, in this ease is visible. That means the farthest visible point away from P11 . So divide the segment P11 p2 at the middle. In this case, the new mid point P21 is invisible. So the farthest visible point is between P11 P21 . So divide the segment into two and so on, until you end up at a point that cannot be further divided. The segment P1 to this point is to be visible on the screen. Now we formally suggest the mid point division algorithm. 40 CHAPTER 4. CLIPPING 4.3 Line clipping Look at the following set of lines. screen in which they are to be displayed. The rectangle indicates the How to decide which of the lines, or more precisely which part of every line is to be displayed. The concept is simple. Since we know the coordinates of the screen, i Any line whose end points lie within the screen coordinate limits will have to display fully (because we cannot have a straight line whose end points are within the screen and any other middle point in outside). ii Any line whose end points lie totally outside the screen coordinates will have to examined to see if any intermediate point is inside the screen boundary. iii Any line whose one end point lies inside the boundary will have to be identified. In case f (ii) and (iii), we should decide up to what point, the line segment can be displayed. Simply finding the intersection of the line with the screen boundary can do this. Though on paper the concept appears simple, the actual implementation poses sufficient problems. We now see how to find out whether the respective end points lie inside the screen or not. The subsequent sections will inform us about how to go about getting the intersection points. 4.4 Polygon clipping A polygon is a closed figure bounded by line segments. 
4.4 Polygon clipping

A polygon is a closed figure bounded by line segments. While common sense tells us that the figure can be broken into individual lines, each clipped individually, in certain applications this method does not work. Look at the following example. A solid arrow is being displayed, and suppose the screen edge is as shown by the dotted lines. After clipping, the polygon becomes opened out at the points A and B. But to ensure that the look of solidity is retained, we should close the polygon along the line A-B. This is possible only if we treat the arrow as a polygon and not as several individual lines. Hence we make use of special polygon clipping algorithms, the most celebrated of which was proposed by Sutherland and Hodgman.
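The following is a minimal sketch of the Sutherland-Hodgman approach for a rectangular clipping window, with the polygon given as a list of (x, y) vertices; the helper names are illustrative, and the sketch ignores degenerate cases such as edges lying exactly on the window boundary.

    def clip_polygon(polygon, xmin, ymin, xmax, ymax):
        """Clip a polygon (list of (x, y) vertices) against a rectangular
        window by testing it against each of the four window edges in turn."""
        # One 'inside' predicate per window edge: left, right, bottom, top.
        edges = [
            lambda p: p[0] >= xmin,
            lambda p: p[0] <= xmax,
            lambda p: p[1] >= ymin,
            lambda p: p[1] <= ymax,
        ]

        def intersect(p, q, edge_index):
            """Intersection of segment p-q with the given window edge line."""
            (x1, y1), (x2, y2) = p, q
            if edge_index in (0, 1):                 # vertical edge x = const
                x = xmin if edge_index == 0 else xmax
                t = (x - x1) / (x2 - x1)
                return (x, y1 + t * (y2 - y1))
            y = ymin if edge_index == 2 else ymax    # horizontal edge y = const
            t = (y - y1) / (y2 - y1)
            return (x1 + t * (x2 - x1), y)

        output = list(polygon)
        for i, inside in enumerate(edges):
            input_list, output = output, []
            if not input_list:
                break                                # polygon entirely clipped away
            prev = input_list[-1]
            for curr in input_list:
                if inside(curr):
                    if not inside(prev):             # entering: insert intersection
                        output.append(intersect(prev, curr, i))
                    output.append(curr)
                elif inside(prev):                   # leaving: keep only intersection
                    output.append(intersect(prev, curr, i))
                prev = curr
        return output

    # A triangle partly outside the window is truncated to a quadrilateral.
    print(clip_polygon([(10, 10), (150, 20), (50, 90)], 0, 0, 100, 100))

Because the output of each edge pass is itself a closed polygon, the clipped figure stays closed, which is exactly the property the arrow example above requires.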
Chapter 5 Geometric Transformation

5.1 What is this?

We are introduced here to the basics of picture transformations. Graphics is as much about the concept of creating pictures as about making modifications to them. Most often it is not a matter of changing the pictures altogether, but of making "transformations" in them: shifting the same picture to some other place on the screen, increasing or decreasing its size (in one or two directions), or rotating the picture through various angles. The rotation can be either with respect to the original x, y coordinates or about any other axis. All these are essentially mathematical operations. We view points (and hence pictures, which are nothing but collections of points) as matrices and transform them by doing mathematical operations on them. These operations yield the new pixel values which, when displayed on the monitor, give the transformed picture.

We have seen the concept of producing pictures, given their equations. Though we talked of generating only straight lines and circles, similar procedures can be adopted for other, more complex figures; in many cases a complex picture can be treated as a combination of straight lines, circles, ellipses etc., and if we are able to generate these basic figures, we can also generate combinations of them. Once we have drawn these pictures, the need arises to transform them. We are not essentially modifying the pictures: a picture in the center of the screen needs to be shifted to the top left-hand corner, say, or a picture needs to be increased to twice its size, or a picture is to be turned through 90°. In all these cases it is possible to view the new picture as really a new one and use algorithms to draw it, but a better method is, given its present form, to obtain its new counterpart by operating on the existing data. This concept is called transformation.

When talking about geometric transformations, we have to be very careful about the object being transformed. We have two alternatives: either the geometric objects are transformed or the coordinate system is transformed. The two are very closely related, but the formulae that carry out the job are different. We only cover transforming geometric objects here. We shall start with the traditional Euclidean transformations, which do not change lengths and angle measures, followed by affine transformations. Finally, we shall talk about projective transformations.

5.2 Euclidean transformation

The Euclidean transformations are the most commonly used transformations. A Euclidean transformation is either a translation, a rotation, or a reflection. We shall discuss translations and rotations only.

Translation on 2D (Plane)

In these notes, we consider the problem of representing 2D graphics images which may be drawn as a sequence of connected line segments. Such images may be represented as a matrix of 2D points (xi, yi). The idea behind this representation is that the first point is the starting point of the first line segment drawn, while the second point is the end of the first line segment and the starting point of the second line segment. The drawing of line segments continues in this fashion until all line segments have been drawn. A matrix having n + 1 points describes a figure consisting of n line segments. It is sometimes useful to think of each pair of consecutive points in this matrix representation,

    [ xi−1  yi−1 ]
    [ xi    yi   ]

as one line segment of the figure.

Translation on 3D (Space)

Rotation on 2D (Plane)

Suppose we wish to rotate a figure around the origin of our 2D coordinate system. Figure 5.1 shows the point (x, y) being rotated θ degrees (by convention, the counter-clockwise direction is positive) about the origin.

Figure 5.1: Rotating a point about the origin

The equations for the changes in the x and y coordinates are:

    x′ = x × cos(θ) − y × sin(θ)
    y′ = x × sin(θ) + y × cos(θ)        (1)

If we consider the coordinates of the point (x, y) as a one-row, two-column matrix [x  y] and form the matrix

    [  cos(θ)   sin(θ) ]
    [ −sin(θ)   cos(θ) ]

then equations (1) say that the rotated point [x′  y′] is the product of the point matrix and this rotation matrix. We can define a J monad, rotate, which produces the rotation matrix; this monad is applied to an angle expressed in degrees. Positive angles are measured in the counter-clockwise direction by convention.

Rotation on 3D (Space)

Let us consider the same idea for rotations in R3. One way to think of a rotation is to move a set of points (x, y, z) in the world relative to some fixed coordinate frame. For example, you might rotate your head, or rotate your chair (and your body, if it is sitting on the chair). In this interpretation, we define a 3D axis of rotation and an angle of rotation. We also define the origin (0, 0, 0) of the coordinate system, and we rotate about the origin so that the origin does not move. For the example of rotating your head, the origin could be some point in the center of your head. We can rotate about many different axes. For example, to rotate about the x axis, we apply the matrix

    [ 1      0        0     ]
    [ 0   cos(θ)   sin(θ)   ]
    [ 0  −sin(θ)   cos(θ)   ]

We could also rotate about the y axis:

    [ cos(θ)   0  −sin(θ) ]
    [   0      1     0    ]
    [ sin(θ)   0   cos(θ) ]

We could also rotate about the z axis, using the same matrix as in the 2D case extended with an unchanged z coordinate:

    [  cos(θ)   sin(θ)  0 ]
    [ −sin(θ)   cos(θ)  0 ]
    [    0        0     1 ]
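The following is a minimal sketch that applies the rotation equations (1) to individual points, with angles in degrees and the counter-clockwise direction taken as positive, as in the text; the function names are illustrative.

    import math

    def rotate2d(point, theta_degrees):
        """Rotate a 2D point counter-clockwise about the origin, using
        x' = x cos(θ) − y sin(θ), y' = x sin(θ) + y cos(θ)."""
        theta = math.radians(theta_degrees)
        x, y = point
        return (x * math.cos(theta) - y * math.sin(theta),
                x * math.sin(theta) + y * math.cos(theta))

    def rotate_z(point, theta_degrees):
        """Rotate a 3D point about the z axis: x and y transform as in 2D,
        while z is left unchanged."""
        x, y = rotate2d(point[:2], theta_degrees)
        return (x, y, point[2])

    print(rotate2d((1.0, 0.0), 90))        # approximately (0.0, 1.0)
    print(rotate_z((1.0, 0.0, 5.0), 90))   # approximately (0.0, 1.0, 5.0)

Rotations about the x and y axes follow the same pattern, with the corresponding pair of coordinates transformed and the third left fixed.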
5.3 Affine transformation

Euclidean transformations preserve length and angle measure. Moreover, the shape of a geometric object does not change: lines transform to lines, planes transform to planes, circles transform to circles, and ellipsoids transform to ellipsoids. Only the position and orientation of the object change. Affine transformations are generalizations of Euclidean transformations. Under affine transformations, lines still transform to lines, but circles become ellipses; length and angle are not preserved. In this section, we shall discuss scaling, shear and general affine transformations.

Scaling

Scaling transformations stretch or shrink a given object and, as a result, change lengths and angles; so scaling is not a Euclidean transformation. The meaning of scaling is making the new scale of a coordinate direction p times larger; in other words, the x coordinate is "enlarged" p times. This requirement satisfies x′ = p x, and therefore x = x′/p. Scaling can be applied to all axes, each with a different scaling factor. For example, if the x-, y- and z-axes are scaled with scaling factors p, q and r, respectively, the transformation matrix is:

    [ p  0  0 ]
    [ 0  q  0 ]
    [ 0  0  r ]

Shear

The effect of a shear transformation looks like "pushing" a geometric object in a direction parallel to a coordinate plane (3D) or a coordinate axis (2D); in the accompanying figure, the red cylinder is the result of applying a shear transformation to the yellow cylinder. How far a direction is pushed is determined by a shearing factor. On the xy-plane, one can push in the x-direction, positive or negative, and keep the y-direction unchanged; or one can push in the y-direction and keep the x-direction fixed. The following is a shear transformation in the x-direction with shearing factor a, so that x′ = x + a y and y′ = y:

    [ 1  0 ]
    [ a  1 ]

5.4 Projective transformation

5.5 Matrix Multiplication and Transformation

Chapter 6 Curves

6.1 Shape description requirements

Generally, shape representations have two different uses, an analytic use and a synthetic use. Representations are used analytically to describe shapes that can be measured; just as a curve can be fitted to a set of data points, a surface can be fitted to the measured properties of some real object. The requirements of the user and the computer combine to suggest a number of properties that our representations must have. The following are some of the important properties used to design curves; similar properties can also be used for designing surfaces, as surfaces are made up of curves.

a) Control points: The shape of the curve can be controlled easily with the help of a set of control points; the points are marked first, and a curve is drawn that passes through each of these points one by one in a particular sequence. Using more control points makes the curve smoother. The following figure shows a curve with its control points.

b) Multiple values: In general a curve is not the graph of a single-valued function of a coordinate, irrespective of the choice of coordinate system. Single-valued functions of a coordinate produce curves or graphs that depend on the axis chosen. The following figure shows a curve that is multi-valued with respect to all coordinate systems.

c) Axis independence: The shape of an object should not change when the control points are measured in different coordinate systems; that is, when an object is rotated through some angle in either direction (clockwise or anti-clockwise), the shape of the curve should not be affected.

d) Global or local control: The control points of a curve may be controlled globally, from any function of the same program, or locally, by the particular function used to design that curve, by calculating the desired control points.

e) Variation-diminishing property: Some mathematical functions may diminish the curve at particular points and amplify it at others. This leads to problems in the curve's appearance, for example during animation (just as a vehicle looks curved when it is taking a turn). This effect must be avoided by selecting proper mathematical equations with multiple-valued functions.

f) Versatility: The functions that define the shape of the curve should not be limited to only a few varieties of shapes; instead they must provide a wide variety, so that designers can make curves according to their interest.
g) Order of continuity: For any complex shape, curve or surface, it is essential to maintain continuity when calculating the control points. When proper continuity of the control points is not maintained, the pieces do not join smoothly and the resulting curve or complex object looks broken.

6.2 Parametric curves

The dominant form used to model curves and surfaces is the parametric, or vector-valued, function. A point on a curve is represented as a vector:

    P(u) = [x(u)  y(u)  z(u)]

For surfaces, two parameters are required:

    P(u, v) = [x(u, v)  y(u, v)  z(u, v)]

As the parameters u and v take on values in a specified range, usually 0 to 1, the parametric functions x, y and z trace out the location of the curve or surface. The parametric functions can themselves take many forms. A single curve can be represented in several different ways; for example, the unit circle admits the parametrizations given below:

    P(u) = [cos u   sin u]
    P(u) = [(1 − u²)/(1 + u²)   2u/(1 + u²)]
    P(u) = [u   (1 − u²)^(1/2)]

Using simple parametric functions like these, we cannot expect the designer to achieve a desirable curve merely by changing the coefficients of parametric polynomial functions or of any other functional form. Instead, we must find ways to determine the parametric function from the location of control points that are manipulated by the designer.
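The following is a minimal sketch that evaluates the three parametrizations above at a sample parameter value and checks that each resulting point lies on the unit circle; the function names are illustrative.

    import math

    # Three parametric forms of (part of) the unit circle from Section 6.2.
    def trig(u):      return (math.cos(u), math.sin(u))
    def rational(u):  return ((1 - u**2) / (1 + u**2), 2 * u / (1 + u**2))
    def explicit(u):  return (u, math.sqrt(1 - u**2))       # upper half only

    for f in (trig, rational, explicit):
        x, y = f(0.7)
        print(f"{f.__name__:8s} x^2 + y^2 = {x * x + y * y:.6f}")   # ~1.0 each time

The same shape is traced out in each case, but the points are distributed along the curve quite differently as u varies, which is one reason the choice of parametrization matters in practice.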
Chapter 7 Fractals

7.1 Introduction
7.2 Generation of fractals
7.3 Geometric fractals
7.4 Recapitulation
7.5 Random fractals
7.6 Applications

Chapter 8 Computer Animation

8.1 Introduction

Making pictures move on the graphical screen is called animation. Animation really makes the use of computers and computer graphics interesting, and it brought computers pretty close to the average individual. It is the well-known principle of moving pictures that a succession of related pictures, when flashed with sufficient speed, appears to be moving. In movies, a sequence of such pictures is shot and displayed fast enough to make them appear to move. Computers can do it another way: the properties of the picture can be modified at a fairly fast rate to make it appear to move. For example, if a hand is to be moved, the successive positions of the hand at different moments of time can be computed, and pictures showing the hand at these positions can be flashed on the screen. This led to the concept of animation, or moving pictures. In the initial stages, animation was mainly used in computer games; however, it opened up a host of other possibilities. As we see later in this course, computers not only allow you to display figures but also offer facilities to manipulate them in various ways: you can enlarge, reduce, rotate, twist and morph them (make one picture gradually change into another, like an advertisement showing a cheetah change into a motorbike), and do a whole lot of other things. Thus many films use computers to generate tricks; in fact, several advertisement films and cartoon strips are built with no actors at all, only computer-generated pictures. Animation also plays a very important role in training through computer graphics. If you are given a bicycle, you can probably learn to ride it with little effort; but for learning to fly an aircraft, animated images are needed to study the whole scenario: how the aircraft takes off and lands, how it is handled during flight, and how the pilot contacts and gets help from the control room. All of this is better explained using computer animation techniques.

8.2 Conventional Animation
8.3 Real time vs image
8.4 Animation technique
8.5 Rotoscopy
8.6 Key framing
8.7 Spline driven animation
8.8 Morphing
8.9 Particle systems

Chapter 9 Rendering

9.1 Illumination model

9.2 Visibility

If the scene contains more than one object, occlusions may occur; that is, some objects may be hidden by others. Only visible objects should be represented, and visibility techniques deal with this issue. One classical algorithm that solves the visibility problem is the so-called painter's algorithm. It consists of sorting the objects or polygons from back to front and rasterizing them in that order. This way, front-most polygons cover the more distant polygons that they hide.

The ray-tracing algorithm does not use a rasterization phase. It sends one ray from the eye through each pixel of the image. The intersection between this ray and the objects of the scene is computed, and only the closest intersection is considered.

Figure 9.1: Ray-tracing. A ray is sent from the eye and through the pixel. Since the intersection with 2 is closer than the intersection with 1, the pixel is black.

The z-buffer method is the most common nowadays (e.g. in graphics cards). It stores the depth (z) of each pixel. When a new polygon is rasterized, for each pixel the algorithm compares the depth of the current polygon with the depth stored for the pixel. If the new polygon has a closer depth, the color and depth of the pixel are updated; otherwise, a formerly drawn polygon hides the current polygon at this pixel. A small sketch of this depth test is given at the end of the chapter.

9.3 Polygon shading

9.4 Solid models
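As a closing illustration of the z-buffer method described in Section 9.2, the following is a minimal sketch of the per-pixel depth test, assuming smaller z means closer to the eye; the buffer sizes, colors and names are illustrative only.

    # Color and depth buffers, one entry per pixel; depth starts "infinitely far".
    WIDTH, HEIGHT = 4, 3
    color_buffer = [["white"] * WIDTH for _ in range(HEIGHT)]
    depth_buffer = [[float("inf")] * WIDTH for _ in range(HEIGHT)]

    def write_pixel(x, y, z, color):
        """Update the pixel only if this fragment is closer than what is stored."""
        if z < depth_buffer[y][x]:
            depth_buffer[y][x] = z
            color_buffer[y][x] = color
        # otherwise a formerly drawn polygon hides the current one at this pixel

    # Polygons rasterized in arbitrary order still resolve correctly per pixel:
    write_pixel(1, 1, z=5.0, color="red")    # distant polygon
    write_pixel(1, 1, z=2.0, color="blue")   # closer polygon overwrites it
    write_pixel(1, 1, z=9.0, color="green")  # hidden, leaves the pixel blue
    print(color_buffer[1][1])                # -> blue

Unlike the painter's algorithm, no global back-to-front sorting of the polygons is needed; the comparison is made independently at every pixel.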