15 Visual Image Interpretation
Dr. Sandeep Gupta
Introduction
Electromagnetic radiation interacts with a wide range of target features lying on the Earth's surface. Different spatial objects have different reflectance behaviour in different wavelength (spectral) regions. These behaviours are recorded by sensors sensitive to those particular spectral regions and are finally reproduced digitally in image or pixel form. A digital image is like a container in which pixels, like small square boxes, are arranged in rows and columns. Each pixel carries an assigned numeric value, called a 'digital number' (DN), which indexes in numeric form the amount of reflected or emitted energy received by the sensor. Depending on the number of sensor bands used, the sensor-derived information is preferably visualized in colour, within the range of the three primary colours of human vision: Red, Green and Blue (RGB). The human vision system, communicating between eye and brain after visual observation, is capable of and constantly performing interpretation: drawing conclusions by spontaneous recognition based on knowledge of the object; by logical inference through an orderly built reasoning process; or, sometimes, by field observation. Thus, the human vision system is well trained in 'object understanding'. However, given i) the complexity of the reflectance behaviour of objects, ii) overlapping spectral regions and iii) the spectral limitations of the human eye, a need was felt to evolve a formal interpretation technique for recognizing the objects in an image obtained through remote sensing. In this context, it is worth noting that our eye-brain system is also trained, over a period of time, in interpreting the objects of any thematic or specialized area in a given spatial context.
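To make these ideas concrete, the short Python sketch below (using the NumPy library and synthetic digital numbers rather than a real sensor product) builds a single-band array of DNs arranged in rows and columns and stacks three such bands into an RGB colour composite.

```python
import numpy as np

# A single-band "image": an 8-bit array of digital numbers (DNs)
# arranged in rows and columns, one value per pixel.
rows, cols = 4, 5
band = np.random.randint(0, 256, size=(rows, cols), dtype=np.uint8)
print("Digital numbers of one band:\n", band)

# Three bands (assigned to Red, Green and Blue) stacked into a colour
# composite: each pixel now carries a triplet of DNs for display.
red   = np.random.randint(0, 256, size=(rows, cols), dtype=np.uint8)
green = np.random.randint(0, 256, size=(rows, cols), dtype=np.uint8)
blue  = np.random.randint(0, 256, size=(rows, cols), dtype=np.uint8)
rgb_composite = np.dstack([red, green, blue])   # shape: (rows, cols, 3)
print("Composite shape:", rgb_composite.shape)
```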
Visual image interpretation has many applications. The technique is important in creating maps of features of interest as points, lines and polygons (areas), together with their labelling. On an image of an urban area available in hardcopy form, we can create a map by overlaying a transparent sheet and tracing the features of interest such as post offices, temples, churches, historical monuments, lakes, inhabited clusters and the road and rail networks connecting them. This may be useful in finding the shortest route that minimizes transit time to a target location, such as a post office, in the most economical way. Alternatively, where the image is available in digital form on a computer, a digital map can be created by direct on-screen 'digitization' using the same interpretation technique.
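As a rough illustration of what on-screen digitization produces, the sketch below stores a few labelled features as GeoJSON-like point, line and polygon records; the coordinates and labels are invented for the example, not taken from any real map.

```python
# Hypothetical digitized features stored as GeoJSON-like dictionaries:
# a point (post office), a line (road) and a polygon (lake), each labelled.
features = [
    {"type": "Point",
     "coordinates": [77.21, 28.61],
     "properties": {"label": "post office"}},
    {"type": "LineString",
     "coordinates": [[77.20, 28.60], [77.22, 28.62], [77.25, 28.63]],
     "properties": {"label": "road"}},
    {"type": "Polygon",
     "coordinates": [[[77.30, 28.55], [77.32, 28.55],
                      [77.32, 28.57], [77.30, 28.57], [77.30, 28.55]]],
     "properties": {"label": "lake"}},
]

for f in features:
    print(f["properties"]["label"], "->", f["type"])
```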
Learning outcomes
This chapter presents the approach used to add feature and attribute information to a space-borne or aerial image through manual interpretation. The first topics considered are manual delineation and conventional cartographic techniques based on image interpretation. After the image interpretation elements are introduced, a relationship can be understood, or built, between each element and the spatial extent over which it applies. For example, tone is meaningful at the level of a single pixel, whereas texture can only be seen over a group of pixels. The interpretation elements often work in combination to identify an object or phenomenon and help in image understanding. All of this will help students develop their image interpretation skills, draw direct or inferential information from images and aerial photographs, and enhance their ability to learn in a complex world of objects. The limitations lie in the scope of the interpreter's practical experience and the diversity of applications.
1. Visual Image Interpretation – Fundamentals
Recognition of objects or phenomena develops as experience grows through visually inspecting a number of aerial photographs and space-borne images of different resolutions, together with ancillary information such as field-inventoried maps, reports, etc. Further, the complexity of a piece of information, whether direct or derived, also shapes the growth of image understanding. However, understanding cannot be fully developed if the interpreter does not use the brain's imagination and analytical power to their full potential. It is also important to mention here that an interpreter's skills can be application specific, depending on how often and for how long interpretation has been carried out in a given field of application.
1.1 Interpretation Elements
Interpretation of an image or aerial photograph differs from that of a conventional photograph in three ways: the former i) is taken from overhead and presents a panoramic view, ii) captures features in multiple wavelength regions beyond the visible spectrum, and iii) presents the features at different scales and resolutions (Campbell and Wynne, 2011). The characteristics used for feature identification are specific to the particular field of application in which they are applied. However, the basic elements considered for image interpretation are tone, texture, shape, size, pattern, association, shadow, aspect, etc. These elements are often used together, in combination.
Tone
Tone is considered a basic element for all image interpretation tasks. Tone refers to relative brightness, which is influenced by intensity (total brightness) and angle of illumination. Brightness results from the distribution and amount of light in a given wavelength region falling on an object; the degree of brightness is directly related to the amount of energy reflected or emitted. Light, or its absence, i.e. lightness or darkness, leads to a grey-scale image whose tonal variation runs from black to white. The human eye can distinguish roughly 40-50 grey tones. The tone changes when we enhance an image or when we visualize the features in different bands. In day-to-day life tone is informally referred to as 'colour', which can be arranged from shortest to longest wavelength in the order V (violet), I (indigo), B (blue), G (green), Y (yellow), O (orange) and R (red), or VIBGYOR as a memory aid. It is also worth noting that tonal variation is analytically more noticeable when features are recorded and reproduced digitally by sensors than when conventionally printed on paper or plastic. For example, each band of an 8-bit image can take 256 possible digital number values, so a colour image displayed as a combination of the three primary colours (RGB) offers 256 × 256 × 256 possible colour combinations (Lillesand et al., 2008). Alternatively, the RGB components of a colour can be described by the intensity (I), hue (H) and saturation (S) system, which represent, respectively, the total brightness, the dominant or average wavelength of light, and the purity of a colour. An RGB-to-IHS transformation can therefore lead to tonal or colour enhancement for better interpretation of colour or colour-infrared (CIR) imagery or aerial photographs.
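The RGB-to-IHS transformation can be sketched in code as below. Several formulations exist in the literature; this one uses the common arccos-based model (intensity as the mean of the components, saturation from the minimum component, hue as an angle), which may differ in detail from the formulation in the cited texts.

```python
import numpy as np

def rgb_to_ihs(r, g, b):
    """Convert normalized RGB values (0-1) to intensity, hue, saturation.

    Uses the classical arccos-based HSI model; hue is returned in degrees.
    """
    r, g, b = float(r), float(g), float(b)
    intensity = (r + g + b) / 3.0
    # Saturation: 0 for pure greys, approaching 1 for pure colours.
    saturation = 0.0 if intensity == 0 else 1.0 - min(r, g, b) / intensity
    # Hue: angle around the colour circle (undefined for pure greys).
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        hue = 0.0
    else:
        theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
        hue = theta if b <= g else 360.0 - theta
    return intensity, hue, saturation

# A reddish pixel: strong red, some green, little blue.
print(rgb_to_ihs(0.8, 0.3, 0.1))   # low hue angle (red), moderate saturation
```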
Texture
Texture is the frequency of tonal change on an image or aerial photograph, which determines how smooth or rough a feature's surface appears. Texture can be coarse or fine, smooth or rough, even or uneven, etc. When an irregular surface is illuminated from an oblique angle, a pattern of highlighted and shadowed areas is created that gives the feature's surface a textural appearance (Campbell and Wynne, 2011). Texture is strongly related to the spatial resolution of the image. As the scale of the image is reduced, the texture of any object or area becomes progressively finer and ultimately disappears. Textural differences make it easy for an interpreter to discern between objects of similar tone, such as metalled and unmetalled roads.
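Texture can be approximated numerically as the local variability of digital numbers within a small moving window. The sketch below, using NumPy and SciPy on a synthetic band, computes a per-pixel local standard deviation; the window size and test data are arbitrary choices for illustration.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_std(band, window=3):
    """Per-pixel standard deviation of DNs within a square moving window.

    High values indicate rough (coarse) texture, low values smooth texture.
    """
    band = band.astype(float)
    mean = uniform_filter(band, size=window)           # local mean
    mean_sq = uniform_filter(band ** 2, size=window)   # local mean of squares
    variance = np.maximum(mean_sq - mean ** 2, 0.0)    # guard against round-off
    return np.sqrt(variance)

# Synthetic band: a smooth gradient with a noisy (rough-textured) patch.
band = np.tile(np.linspace(0, 255, 64), (64, 1))
band[20:40, 20:40] += np.random.normal(0, 30, (20, 20))
texture = local_std(band, window=5)
print("smooth area std ~", texture[5, 5].round(1),
      "| rough patch std ~", texture[30, 30].round(1))
```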
Shape
Shape is the general form, configuration or outline of an individual object. In the case of stereoscopic images, the object's height also contributes to its shape. Shape is an important clue for the interpreter. For example, sprinkler-irrigated fields look circular when seen in an image. Similarly, a road, canal or river appears linear in shape.
Size
The size of an object in an image is an important indication for the interpreter, both for feature discrimination and for estimating its approximate dimensions. The size of an object relative to its neighbours gives the interpreter an immediate impression of the scale and resolution. Where the interpreter already knows one object visually, it is not very difficult to identify unfamiliar neighbouring objects and estimate their approximate size. For example, a flourishing cropland area is easily identifiable when a canal passing nearby supplies water for irrigation. Size also acts as a valuable interpretation aid where the dimensions relate directly to an object's identity or serve as a definite criterion for identifying it; for example, residential blocks and commercial complexes can easily be distinguished based on their size characteristics if other factors are also taken into consideration.
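A small worked example of the link between image size and ground size: under the assumption of square pixels, the ground area of an object is approximately the number of pixels it occupies multiplied by the square of the spatial resolution. The figures below are hypothetical.

```python
# Hypothetical example: a building footprint occupying 120 pixels
# on an image with 10 m spatial resolution (pixels assumed square).
pixel_count = 120
resolution_m = 10.0                        # ground length of one pixel side
area_m2 = pixel_count * resolution_m ** 2  # approximate ground area
print(f"Approximate ground area: {area_m2:.0f} m^2 ({area_m2 / 1e4:.2f} ha)")
```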
Pattern
Pattern refers to the spatial arrangement of individual objects in an image or aerial photograph into visibly distinct, repetitive forms. This orderly spatial repetition, for both natural and man-made objects, helps the interpreter recognize them. For example, an orchard, where trees are distinctly arranged at regular spatial intervals, can easily be discriminated from natural forest stands.
Association
Association refers to the occurrence of certain features in relation to others, or more specifically the relationship between the object of interest and other recognizable objects or features in its neighbourhood. For example, water can be associated with inhabited or irrigated areas. Similarly, mining activities can be associated with transportation routes. It is important to note that association does not necessarily involve size or pattern.
Shadow
Shadow is important to interpreters in two opposing respects: i) the shape or outline of a shadow affords an impression of the profile view of an object (which helps interpretation), and ii) objects within shadow reflect little light and are difficult to differentiate on an image (which hinders interpretation) (Lillesand et al., 2008). For example, shadows cast by various tree species or cultural features (bridges, towers) can help in their identification and area estimation. However, the shadow of a tall building may hinder the delineation of the object on which it falls. For an investigator working in forestry, it is important to know which side of a hilly region is sunlit and which is in shadow on an image, in order to know the spatial distribution of the dominant tree species on both sides.
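Shadows can also be used quantitatively. Assuming flat terrain and a known sun elevation angle, an object's height can be estimated from its shadow length as height ≈ shadow length × tan(sun elevation); the sketch below illustrates this with hypothetical numbers.

```python
import math

def height_from_shadow(shadow_length_m, sun_elevation_deg):
    """Estimate object height from its shadow length on flat terrain."""
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))

# Hypothetical tower casting a 25 m shadow with the sun 40 degrees above horizon.
print(round(height_from_shadow(25.0, 40.0), 1), "m")   # ~21.0 m
```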
Aspect
Aspect, or aspect ratio, is the ratio of the width of a shape to its height, i.e. an estimate of how long an object is compared with its width. When the width is larger than the height, the shape of the object is 'landscape' rather than 'portrait'. A further advantage of aspect is that continuous, long, thin features remain easily discernible even when they are narrower than the spatial resolution of the image, for example roads, streams, etc.
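A minimal sketch of the aspect-ratio idea: given a (hypothetical) binary mask of a feature, the ratio of its bounding-box width to its height is large for elongated features such as roads or streams.

```python
import numpy as np

def aspect_ratio(mask):
    """Width-to-height ratio of the bounding box of a binary feature mask."""
    rows, cols = np.nonzero(mask)
    height = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    return width / height

# A thin horizontal strip (e.g. a road segment) in a 20 x 100 scene.
mask = np.zeros((20, 100), dtype=bool)
mask[9:11, 5:95] = True
print("aspect ratio:", aspect_ratio(mask))   # 90 / 2 = 45.0
```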
2. Image Interpretation Strategies
As mentioned earlier, image interpretation is not a simple task if knowledge, skills and experience are lacking, because a variety of features are present on an image or aerial photograph. The level of difficulty also depends on the kind of information sought, i.e. whether it is direct or derived. For example, delineating the path of a river course visible on the surface is very easy compared with identifying and delineating a historical river valley. Therefore, rational thinking and imagination, along with subject and collateral knowledge, are necessary for better image understanding. Sometimes the information is available in pieces that need to be joined together to obtain complete information about an object or phenomenon from space imagery or aerial photographs. For example, attempting to identify a particular crop type from imagery without knowing its seasonal period and duration may lead to misinterpretation. It is therefore necessary for an interpreter to know all these concepts before beginning the interpretation task and to apply the relevant strategies during interpretation.
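As an illustration of joining such pieces of collateral information, an interpreter might first check whether the image acquisition date even falls within a candidate crop's growing season before attributing a field to that crop. The crop calendar in the sketch below is entirely hypothetical.

```python
from datetime import date

# Hypothetical crop calendar: (sowing month, harvesting month) per crop.
crop_calendar = {
    "wheat (rabi)": (11, 4),    # Nov - Apr (season spans the new year)
    "rice (kharif)": (6, 11),   # Jun - Nov
}

def possible_crops(acquisition: date):
    """Return crops whose growing season includes the acquisition month."""
    month = acquisition.month
    hits = []
    for crop, (start, end) in crop_calendar.items():
        in_season = (start <= month <= end) if start <= end \
                    else (month >= start or month <= end)
        if in_season:
            hits.append(crop)
    return hits

print(possible_crops(date(2016, 2, 15)))   # ['wheat (rabi)']
print(possible_crops(date(2016, 8, 15)))   # ['rice (kharif)']
```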
The table below gives an example of how land cover classes appear in various colours depending on the band combination used in a satellite image.
Table 1. Appearance of land cover classes in different colours under various band combinations (SWAC, 2016)
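In practice, such band combinations are realized by choosing which spectral bands are routed to the red, green and blue display channels. The sketch below assumes a hypothetical four-band scene ordered Blue, Green, Red, Near-infrared; in the standard false-colour combination (NIR, Red, Green) healthy vegetation typically appears bright red.

```python
import numpy as np

def composite(bands, r, g, b):
    """Stack the chosen bands into an (rows, cols, 3) RGB display array."""
    return np.dstack([bands[r], bands[g], bands[b]])

# Hypothetical 4-band scene: 0 = Blue, 1 = Green, 2 = Red, 3 = Near-infrared.
rows, cols = 100, 100
bands = np.random.randint(0, 256, size=(4, rows, cols), dtype=np.uint8)

true_colour  = composite(bands, r=2, g=1, b=0)   # natural colour display
false_colour = composite(bands, r=3, g=2, b=1)   # NIR shown as red
print(true_colour.shape, false_colour.shape)
```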
3. Techniques of image interpretation
The technique of image interpretation basically involves space images and/or aerial photographs together with collateral materials. The selection of input images or photographs rests entirely with the user and is largely based on the season, month or even date, and on the resolution, once other factors are taken into account. Collateral material, or ancillary data, contains existing information about an area, process, type of facility or object that an interpreter may use as a supporting resource during the interpretation process. Ancillary information in the form of text, tables, maps, graphs or even image metadata, such as spatial and radiometric resolution or date of acquisition, provides a better definition of the scope, objectives and problems of the given task. Examples are socio-economic data, forest boundaries, tree species diversity, land use maps or weather reports. The supporting materials can be divided into two broad categories: interpretation keys and field verification.
3.1 Image Interpretation Keys
Different interpretation classes can be described in terms of the interpretation elements. After confirming which features are present on the ground, interpretation keys can be constructed, on the basis of which object interpretation can be done (Tempfli et al., 2009). The process of image interpretation is thus tuned with these keys, which essentially summarize the complex information stored in image form. Keys are therefore useful in two ways: first, a key acts as a training tool and, second, it provides a reference guide that allows the interpreter to identify information correctly, even for unknown objects, in a planned and steady manner. A key generally consists of two parts: (a) a collection of annotated or captioned images or stereograms representing the object to be identified, and (b) a graphic or word description, possibly including sketches or diagrams, presenting the image recognition characteristics of the object of interest (Lillesand et al., 2008; Campbell and Wynne, 2011). Depending upon the way in which the features are organized, two types of keys are generally recognized:
i) Selective keys and
ii) Elimination keys
i) Selective keys
Selective keys consist of many example images and/or aerial photographs with supporting text, arranged in such a way that the interpreter simply selects the example that most closely corresponds to the object they are trying to identify, e.g. agriculture, forest, industries, lakes, etc.
ii) Elimination Keys
Elimination keys are arranged in such a way that the interpreter follows a precise, step-wise process from the general to the particular, which eliminates all items except the one(s) the interpreter is trying to identify. The elimination key is the most commonly used key type because it can provide more affirmative answers for a given image object than a selective key. However, if the interpreter is unfamiliar with the objects or uncertain when choosing between two or more of them, an elimination key may lead to the wrong object being selected.
The selection of the type of key depends on the number of objects to be identified and the variability within each feature class covered by the key, for example variation in texture within an open area.
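A rough sketch of how an elimination (dichotomous) key might be encoded and walked through: each node asks a yes/no question about an interpretation element and branches until a single class remains. The questions and classes below are invented examples, not a published key.

```python
# Each node is (question, yes-branch, no-branch); leaves are class names.
elimination_key = (
    "Is the tone dark (low reflectance in all bands)?",
    "water body",
    ("Is the texture coarse and irregular?",
     ("Is the pattern regularly spaced?", "orchard", "natural forest"),
     ("Is the shape linear with a high aspect ratio?", "road", "cropland")),
)

def classify(node, answer_fn):
    """Walk the key, asking the interpreter (answer_fn) yes/no questions."""
    while isinstance(node, tuple):
        question, yes_branch, no_branch = node
        node = yes_branch if answer_fn(question) else no_branch
    return node

# Example: an interpreter answering "no, yes, yes" reaches 'orchard'.
answers = iter([False, True, True])
print(classify(elimination_key, lambda q: next(answers)))   # orchard
```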
3.2 Field verification
Ground verification is treated as a type of collateral material since it is normally conducted to assist the interpreter in interpreting, classifying and analysing the image information. Basically, ground verification helps the interpreter get to know the study area or feature class. This kind of confirmation is done before interpretation in order to develop, in the human vision system, a visual perception of how an object of interest appears in the field. Further ground truthing can be done after interpretation to assess the accuracy of the interpreted information. It is important for the investigator to work out a proper plan before going to the field, covering the season, the time to be spent, the extent of the study area, the quantity of information to be collected and the method of data collection.
The amount and type of field work required for a given project generally depend upon: the type of analysis involved; the image quality, including scale, resolution and the information to be interpreted; the accuracy requirements for both classification and boundary delineation; the experience of the interpreter and knowledge of the sensor, area and subject; the terrain conditions and accessibility of the study area; the availability of personnel and access to ancillary material; and cost considerations (Estes, 2016).
Summary
Visual image interpretation is an important first step in obtaining the desired information from remotely sensed data. The information may be readable directly from the imagery or may be hidden and need to be derived indirectly. For this purpose, images or aerial photographs in digital or hardcopy form are visualized. Since the spectral behaviour of an object differs across the spectrum, it is very important for the interpreter to overcome the limitations of the human vision system by visualizing the information present in the images using colour composite techniques. The interpretation elements are basically a set of guidelines that help the interpreter look at features from different viewpoints and draw conclusions. Image interpretation involves tone, texture, shape, size, pattern, association, shadow and aspect as basic elements. These elements are often used together, in combination, to extract the desired information and assist in overall image understanding. Collateral or ancillary information collected from different sources, such as the field, written documents or even local knowledge passed on verbally, contributes to a quality outcome of the interpretation task. These elements help not only in immediate feature recognition and delineation but also in classifying the entire image into a set of feature classes as an input strategy. A good interpreter manages the quality and timely delivery of information derived from remotely sensed images and/or aerial photographs to clients without spending much time on post-delivery error correction based on client feedback. Finally, visual image interpretation ability increases with growing experience, the skills of the interpreter, and the quality of the input imagery and aerial photographs.
References
- Campbell, J. B. and Wynne, R. H. (2011). Introduction to remote sensing, 5th ed., The Guilford Press, USA.
- Lillesand, T. M., Kiefer, R. W. and Chipman, J. W. (2008). Remote sensing and image interpretation, 6th ed., John Wiley & Sons, USA.
- Tempfli, K., Kerle, N., Huurneman, G. C., and Janssen, L. L. F. (Eds.) (2009). Principles of remote sensing, ITC Educational Textbook Series 2, ITC, 4th ed., The Netherlands.
- Estes, J. E. Aids to and Techniques of Image Interpretation, http://userpages.umbc.edu/~tbenja1/umbc7/santabar/vol1/lec2/2-4.html (accessed on 29 September, 2016).
- SWAC (Satellites, Weather and Climate) project, www.uvm.edu/~swac/docs/mod4/land_features.ppt (accessed on 29 September, 2016)
- www.hosting.soonet.ca (accessed on 31 May, 2017)