11 Digitization Part – II

H G Hosmani and Yatrik Patel

 

I.  Objectives

 

•   To know the different types of digitization equipment

•   To know the different software used in the digitization process

•   To learn the digitization process for audio-video content

•   To learn the digitization process for images

 

 

 

II.   Learning Outcomes 

 

After going through this lesson, learners would attain knowledge about different types of scanners and scanning software as important components of the scanning system. Learners would gain knowledge of the process of digitization of audio and video content as well as the process of organizing digital objects.

 

 

 

III.   Structure 

 

1.  Introduction

2.  Equipment for digitization and Implementation

2.1  Scanners

2.1.1  How a Scanner Works

2.1.1.1  Flatbed Scanners

2.1.1.2  Sheet feed Scanners

2.1.1.3  Drum Scanners

2.1.1.4  Digital Cameras

2.1.1.5  Slide Scanner

2.1.1.6  Microfilm Scanner

2.1.1.7  Video Frame Grabber or Video Digitizer

2.1.1.8  Hand-held Scanners

2.2  Scanning Software

2.3  Image Editing Applications

3.  Digitization of Audio and Video

4.  Organizing Digital Images

4.1  Organize

4.2  Name

4.3  Describe

5.  Planning Digitizations

6.  Summary

7.  References

 

 

 

 

 

1.  Introduction 

 

In the previous session, Digitization Part I, you learnt the basics and concepts of digitization and the steps involved in the process, with a glimpse of digitization technology, various content compression tools and techniques, OCR, and the organizing and integrating of digital images with digital content.

 

Digitization is the process of converting content into digital format. In most library applications, digitization results in documents that are accessible over a network, preferably the Internet. The actual process of digitization therefore plays a pivotal role. It chiefly involves different types of scanners and other digitization equipment for capturing, encoding and transcoding audio and video content, along with the software and tools needed to achieve the full functionality of that equipment (i.e. scanners and converters). Let us have a look at the equipment, tools and software used in the digitization process.

 

2.  Equipment for digitization and Implementation 

 

2.1  Scanners

 

Digital scanners are used to capture a digital image from an analogue medium, such as a printed page or a microfiche / microfilm, at a predefined resolution and dynamic range (bit depth). There are two types of image scanners: vector scanners and raster scanners. Vector scanners scan an image as a complex set of x,y coordinates. Vector images are generally used in geographical information systems (GIS). The display software for a vector image interprets the image as a function of coordinates and other included information to produce an electronic replica of the original drawing or photograph. Portions of a vector image can be zoomed in to display minute details of a drawing or a map. Maps, engineering drawings and architectural blueprints are often scanned as vector images. Raster images are captured by raster scanners by passing light (laser in some cases) down the page and digitally encoding it row by row as a set of bits known as a bit map. Multiple passes of light may be required to capture the basic colours in a coloured image. Raster scanners are used in libraries to convert printed publications into electronic form. The majority of electronic imaging systems generate raster images. The scanners used for converting analogue images into digital images come in a variety of shapes and sizes.
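
To make the distinction between the two image types concrete, the following minimal Python sketch stores the same diagonal line in both ways. The names `vector_line` and `rasterize_line` and the 5×5 grid are illustrative assumptions only; they do not correspond to the output format of any particular scanner.

```python
# Vector form: the line is described only by the x,y coordinates
# of its two end points (illustrative values).
vector_line = [(0, 0), (4, 4)]


def rasterize_line(start, end, size):
    """Convert a diagonal vector line into a bit map, row by row.

    This simple sketch only handles 45-degree lines running from
    top-left to bottom-right, which is enough for illustration.
    """
    (x0, y0), (x1, y1) = start, end
    bitmap = []
    for y in range(size):
        row = []
        for x in range(size):
            on_line = (
                x0 <= x <= x1
                and y0 <= y <= y1
                and (x - x0) == (y - y0)
            )
            row.append(1 if on_line else 0)
        bitmap.append(row)
    return bitmap


# Raster form: the same line as rows of bits (a bit map).
raster_line = rasterize_line(vector_line[0], vector_line[1], size=5)
for row in raster_line:
    print("".join(str(bit) for bit in row))
```

The vector form remains two coordinate pairs however far the image is zoomed, whereas the raster form needs more bits as the grid becomes finer.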

 

2.1.1  How a Scanner Works

 

Scanners are equipped with a lamp that moves with the scan head to light up the object being scanned. Most scanners use a cold cathode fluorescent lamp or a xenon lamp. The scan head is made up of mirrors, a lens, a filter and a charge-coupled device (CCD) array. A belt connected to a stepper motor moves the scan head, and a stabilizing bar prevents wobbling during the pass. The mirrors reflect the image of the object being scanned into the lens, which focuses it through a filter onto the CCD array. The lens makes three smaller images of the original. Each of these images passes through a colour filter (red, green or blue) onto a separate section of the CCD array. The data is then combined into a single image.

 

While selecting a scanner, one should consider resolution, sharpness and rate of image transfer. Resolution is measured in dots per inch (dpi). The average scanner offers a resolution of at least 300×300 dpi. The number of sensors in a row of the CCD array determines a scanner's dpi. Sharpness depends on the brightness of the lamp and the quality of the lens. The rate of image transfer depends on the connection between the scanner and the computer. The slowest connection is the parallel port. Universal Serial Bus (USB) scanners are affordable, easy to use and have decent speed.
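
The effect of resolution on the size of a scanned image can be estimated with simple arithmetic. The sketch below is illustrative only: the page size (8.5 × 11 inches) and the 24-bit colour depth are assumptions chosen for the example, and the figures refer to uncompressed data before any file format or compression is applied.

```python
def scan_dimensions(width_in, height_in, dpi):
    """Return (width_px, height_px) for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw (uncompressed) size of a raster scan in bytes."""
    width_px, height_px = scan_dimensions(width_in, height_in, dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return total_bits // 8


# Assumed example: an 8.5 x 11 inch page at 300 dpi in 24-bit colour.
width_px, height_px = scan_dimensions(8.5, 11, 300)
size_bytes = uncompressed_size_bytes(8.5, 11, 300, 24)
print(width_px, height_px)   # 2550 3300
print(size_bytes)            # 25245000 (about 25 MB)
```

Doubling the resolution to 600 dpi quadruples the number of pixels, and therefore the uncompressed size, because both the width and the height double.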

 

The hardware required for a scanner is a connector, such as a USB port. The software required is a driver, which is needed to communicate with the scanner. TWAIN is the standard "language" spoken by most scanners: any program that supports TWAIN can acquire a scanned image.

 

The following types of scanners are available:

 

2.1.1.1  Flatbed Scanners – right angle, prism and overhead flatbed

2.1.1.2  Sheet-Feed Scanners

2.1.1.3  Drum Scanners

2.1.1.4  Digital Cameras

2.1.1.5  Slide Scanners

2.1.1.6  Microfilm Scanner

2.1.1.7  Video Frame Grabber

2.1.1.8  Hand-held scanners

 

The type of scanner selected for an imaging project would be influenced by the type, size and source of documents to be scanned. Many scanners can handle only transparent material, whereas others can handle only reflective materials.

 

2.1.1.1  Flatbed Scanners 

 

Flatbed scanners are the most common and widely used scanners. They look like a photocopier and are used in much the same way. Source material is placed face down on the scanner. The light source and charge-coupled devices (CCDs) move beneath the platen while the document remains stationary, as in a photocopying machine. Flatbed scanners come in various models, such as right-angle, prism and planetary/overhead, to handle bound volumes and books. A flatbed scanner can typically scan a document at 600 dpi, and many flatbed scanners offer higher resolution.

 

Fig. 1: Flatbed Scanner

 

2.1.1.2  Sheet feed Scanners

 

In a sheet-feed scanner, as the name indicates, the document is fed over a stationary CCD and light source via a roller, belt, drum or vacuum transport. In contrast to flatbed scanners, sheet-feed scanners have an optional attachment to auto-feed uniformly sized stacks of documents to be scanned.

 

Fig.2: Sheet feed Scanners

 

2.1.1.3  Drum Scanners

 

Source material in a drum scanner is wrapped on a drum, which is then rotated past a high-intensity light source to capture the image. Drum scanners offer superior image quality, but require flexible source material of limited size that can be wrapped around the drum. Drum scanners are specially targeted at the graphic arts market. They offer the highest resolution for grey scale and colour scanning. Drum scanners use photomultiplier (vacuum) tubes (PMTs) instead of CCDs, which offer a greater bit depth (12 to 16 bits).

Fig.3: Drum Scanners
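
The practical meaning of the bit depth quoted for drum scanners can be worked out directly: each additional bit doubles the number of tonal levels that can be recorded per colour channel. In the short sketch below, the 8-bit value is included only as an assumed point of comparison; it is not a figure taken from this module.

```python
def tonal_levels(bits_per_channel):
    """Number of distinct tonal levels recordable at a given bit depth."""
    return 2 ** bits_per_channel


# 8 bits is an assumed comparison value; 12 and 16 bits are the
# range quoted for drum scanners.
for bits in (8, 12, 16):
    print(bits, "bits:", tonal_levels(bits), "levels")
# 8 bits: 256 levels
# 12 bits: 4096 levels
# 16 bits: 65536 levels
```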

 

2.1.1.4  Digital Cameras

 

Digital cameras mounted on a copy cradle resemble a microfilming stand. Source material is placed on the stand and the camera is cranked up or down to bring the material into focus within the field of view. Digital cameras are the most promising scanner development for library and archival applications.

 

Fig.4: Digital Camera

 

 

2.1.1.5  Slide Scanner

 

Slide scanners have a slot in the side to accommodate a 35mm slide. Inside the box, the light passes through the slide to hit a CCD array behind the slide. Slide scanners can generally scan only 35mm transparent source materials.

 

Fig.5: Slide Scanner

 

2.1.1.6  Microfilm Scanner

 

Specially targeted at library and archival applications, microfilm scanners have adapters that allow roll film, fiche and aperture cards to be converted on the same model.

Fig. 6: Microfilm Scanner

 

2.1.1.7  Video Frame Grabber or Video Digitizer

 

Video digitizers are circuit boards placed inside a computer and attached to a standard video camera. Anything that is filmed by the video camera is digitized by the video digitizer.

 

Fig. 7: Video Frame Grabber

 

2.1.1.8  Hand-held Scanners

 

Hand-held scanners are used for scanning selected sections of data. Multiple passes may be required to capture a large area. Moreover, the user needs a steady hand while moving the scanner over the document to be scanned.

 

2.2  Scanning Software 

 

Scanning software is used for scanning the image and capturing it on the computer. This software, which includes the scanner drivers, is provided to buyers by the manufacturer of the scanner. The drivers translate the user's instructions into commands that the scanner understands.

 

2.3  Image Editing Applications 

 

Image editing applications are used once the process of scanning is over and the image is available on the computer for further manipulation. Most image editing software offers features such as image editing, sharpening, filters, cropping, colour adjustment, format conversion and resizing. Most image editing software can also be used for capturing images.

 

3.  Digitization of Audio and Video 

 

The songs or speeches that we generally listen to on a tape recorder or radio are in analogue form. Analogue sound tracks can be digitized by attaching an audio player to a computer through an audio capture card, so that the sound can be recorded on the computer. The audio files can be saved in formats such as .wav and .mp3. The MP3 format is highly compact because it uses lossy compression, at the cost of some sound quality, whereas uncompressed formats such as WAV preserve the full quality of the recording. MIDI files, by contrast, store musical instructions rather than recorded sound. Audio files can be further processed using noise reduction software.
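
The properties stored in a digitized audio file can be inspected with the `wave` module of the Python standard library. The sketch below is a minimal illustration, not a capture program: since an audio capture card cannot be assumed here, it writes a synthetic test tone in place of a recorded track. The sample rate (44,100 samples per second), the 16-bit sample size and the function names are assumptions made for the example.

```python
import math
import struct
import wave


def write_test_tone(path, seconds=1.0, sample_rate=44100, frequency=440.0):
    """Write a mono, 16-bit PCM sine tone to a .wav file.

    The tone stands in for sound recorded from an analogue source.
    """
    n_frames = int(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_frames):
        sample = math.sin(2 * math.pi * frequency * i / sample_rate)
        # Scale to half of the 16-bit range to avoid clipping.
        frames += struct.pack("<h", int(sample * 32767 * 0.5))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


def describe_wav(path):
    """Read back the basic technical properties of a .wav file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sample_rate": wav.getframerate(),
            "duration_seconds": wav.getnframes() / wav.getframerate(),
        }
```

Calling `write_test_tone("tone.wav")` followed by `describe_wav("tone.wav")` returns the number of channels, the sample size, the sample rate and the duration, which are the kind of technical details worth recording for each digitized track.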

 

Like audio capture, video capture requires a capture card (a video capture card), with input from a video cassette player (VCP / VCR), TV antenna, cable or movie camera. The digitized files can be saved in the .mov, .avi or .mpg file formats.

 

4.  Organizing Digital Images 

 

A disc full of digital images without any organization or browse and search options may have no meaning except to the person who created it. Scanned images need to be organized in order to be useful. Moreover, images need to be linked to the associated metadata to facilitate browsing and searching. The following three steps describe the process of organizing digital images:

 

4.1  Organize 

 

Organize the scanned image files into a disc hierarchy that logically maps the physical organization of the document. For example, in a project on scanning of journals, create a folder for each journal, which, in turn, may have a folder for each volume scanned. Each volume, in turn, may have a subfolder for each issue. The folder for each issue may contain the scanned articles that appeared in the issue, along with a contents page composed in HTML that provides links to the articles in that issue.
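
The journal / volume / issue hierarchy described above can be created programmatically. The following is a minimal sketch using the Python standard library; the folder naming pattern (`v5`, `n1`) and the function names are assumptions chosen for illustration, and a project may adopt any consistent pattern.

```python
from pathlib import Path


def issue_folder(root, journal, volume, issue):
    """Return the folder path for one issue of a journal.

    Layout (assumed): <root>/<journal>/v<volume>/n<issue>
    """
    return Path(root) / journal / f"v{volume}" / f"n{issue}"


def create_issue_folder(root, journal, volume, issue):
    """Create the folder for an issue, including any missing parents."""
    folder = issue_folder(root, journal, volume, issue)
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

For example, `create_issue_folder("scans", "library-review", 5, 1)` would create `scans/library-review/v5/n1`, ready to receive the scanned articles and the HTML contents page for that issue.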

 

4.2  Name 

 

Name the scanned image files in a strictly controlled manner that reflects their logical relationship. For example, each article may be named after the surname and initials of the first author, followed by the volume number and the issue number. The file name “smithrkv5n1.pdf” thus conveys that the article is by “R.K. Smith” and appeared in volume 5, issue no. 1. The file name of each article would, therefore, convey the logical and hierarchical organization of the journal.
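
A naming rule of this kind is easiest to apply consistently when it is generated and checked by a small program. The sketch below builds and parses names of the form “smithrkv5n1.pdf”. The function names are hypothetical, and the rule of lower-casing the name and dropping punctuation is an assumption inferred from the example above.

```python
import re


def article_file_name(surname, initials, volume, issue, extension="pdf"):
    """Build a controlled file name such as 'smithrkv5n1.pdf'."""
    # Keep only the letters a-z of the author's surname and initials.
    author = re.sub(r"[^a-z]", "", (surname + initials).lower())
    return f"{author}v{volume}n{issue}.{extension}"


def parse_article_file_name(file_name):
    """Split a controlled file name back into its parts."""
    match = re.fullmatch(r"([a-z]+)v(\d+)n(\d+)\.(\w+)", file_name)
    if match is None:
        raise ValueError(f"Not a controlled file name: {file_name!r}")
    author, volume, issue, extension = match.groups()
    return {
        "author": author,
        "volume": int(volume),
        "issue": int(issue),
        "extension": extension,
    }


print(article_file_name("Smith", "R.K.", 5, 1))   # smithrkv5n1.pdf
```

Parsing the names back also gives a simple way to detect files that do not follow the convention before they are added to the collection.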

 

4.3  Describe 

 

Describe the scanned image files internally using the image header and externally using linked descriptive metadata files. The following three types of metadata are associated with digital objects:

 

•  Descriptive Metadata: Includes content or bibliographic description consisting of keywords and subject descriptors.

 

•  Administrative or Technical Metadata: Incorporates details on the original source, date of creation, version of the digital object, file format used, compression technology used, object relationships, etc. Administrative data may reside within or outside the digital object and is required for long-term collection management to ensure the longevity of the digital collection.

 

•  Structural Metadata: Elements within digital objects that facilitate navigation, e.g. table of contents, index at issue level or volume level, page turning in an electronic book, etc.
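
One simple way of linking external metadata to a scanned file is a "sidecar" file with the same base name. The sketch below writes such a record as JSON, grouped under the three metadata types listed above. All field names and values are hypothetical placeholders, and JSON is an assumption made for illustration; a real project would normally follow an established metadata schema.

```python
import json
from pathlib import Path


def build_metadata(file_name):
    """Return a sample metadata record for one scanned article.

    All field names and values are hypothetical placeholders.
    """
    return {
        "file": file_name,
        "descriptive": {
            "title": "Sample article title",
            "author": "R.K. Smith",
            "keywords": ["digitization", "scanners"],
        },
        "administrative": {
            "original_source": "Printed journal, volume 5, issue 1",
            "date_created": "2024-01-15",
            "file_format": "PDF",
            "compression": "JPEG",
            "version": 1,
        },
        "structural": {
            "volume": 5,
            "issue": 1,
            "page_count": 12,
        },
    }


def write_sidecar(folder, file_name):
    """Write the record as a JSON file next to the scanned file."""
    record = build_metadata(file_name)
    sidecar = Path(folder) / (Path(file_name).stem + ".json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

For the file “smithrkv5n1.pdf”, `write_sidecar` produces “smithrkv5n1.json” in the same folder, so the image and its description stay together when the folder is copied or moved.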

 

The simplest, though least effective, method of providing access is a table of contents that links each item to its respective object / image. Contents pages of journal issues prepared in HTML would offer a browsing facility. Full-text search of HTML pages or OCRed pages can be achieved with one of the free Internet search engines, such as Google (http://www.google.com).
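
An HTML contents page of this kind can be generated from a list of article titles and file names. The following is a minimal sketch; the function name and the page layout are assumptions, and the titles used in the example are placeholders.

```python
from html import escape


def contents_page(issue_title, articles):
    """Build an HTML contents page for one journal issue.

    `articles` is a list of (article_title, file_name) pairs.
    """
    items = "\n".join(
        f'    <li><a href="{escape(file_name)}">{escape(title)}</a></li>'
        for title, file_name in articles
    )
    return (
        "<html>\n"
        f"<head><title>{escape(issue_title)}</title></head>\n"
        "<body>\n"
        f"  <h1>{escape(issue_title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


page = contents_page(
    "Volume 5, Issue 1",
    [("Sample article title", "smithrkv5n1.pdf")],
)
print(page)
```

The `escape` function ensures that characters such as `&` or `<` in a title do not break the generated page.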

 

Large scanning projects would, however, require a back-end database that stores the images, or links to the images, together with their metadata (descriptive / administrative). The back-end databases used by most document management systems provide the functionality required by most web applications. Important document management systems such as FileNet have now integrated their databases with HTML conversion tools. Further, some document management systems have also signed up with Adobe to incorporate Acrobat and Acrobat Capture into their web-based document management systems. These databases accept queries from users through “HTML forms” and generate search results on the fly. Several digital library packages are now available as “open source” or “freeware” that can be used not only for organizing digital objects but also for their search and retrieval.
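
The idea of a back-end database holding links to images together with their metadata can be sketched with SQLite, which is included in the Python standard library. This is a minimal illustration under stated assumptions: the table layout, column names and function names are hypothetical, and a production system would use the database of its document management or digital library package.

```python
import sqlite3


def create_catalogue(connection):
    """Create a table holding links to scanned files and their metadata."""
    connection.execute(
        """
        CREATE TABLE IF NOT EXISTS objects (
            id INTEGER PRIMARY KEY,
            file_path TEXT NOT NULL,
            title TEXT,
            author TEXT,
            keywords TEXT,
            file_format TEXT
        )
        """
    )


def add_object(connection, file_path, title, author, keywords, file_format):
    """Register one digital object in the catalogue."""
    connection.execute(
        "INSERT INTO objects (file_path, title, author, keywords, file_format) "
        "VALUES (?, ?, ?, ?, ?)",
        (file_path, title, author, keywords, file_format),
    )


def search(connection, term):
    """Return (file_path, title) rows whose title or keywords contain term."""
    pattern = f"%{term}%"
    rows = connection.execute(
        "SELECT file_path, title FROM objects "
        "WHERE title LIKE ? OR keywords LIKE ? ORDER BY title",
        (pattern, pattern),
    )
    return rows.fetchall()


# Example with an in-memory database and placeholder data.
db = sqlite3.connect(":memory:")
create_catalogue(db)
add_object(db, "library-review/v5/n1/smithrkv5n1.pdf",
           "Sample article title", "R.K. Smith", "digitization; scanners", "PDF")
print(search(db, "scanners"))
```

A web application would pass the term entered in an HTML form to a function such as `search` and format the returned rows as a results page.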

 

5.  Planning Digitizations 

 

Digitization is the first step towards building a digital library. It is a highly specialized and cost-intensive activity that requires inputs from diverse branches of knowledge. It is important that the objectives, needs and purpose of digitization are established clearly and beyond doubt. The digitization proposal should define its goals, scope, benefits, costs, time required in the developmental phase, feasibility, implementation issues, deliverables and target users. Instead of undertaking a digitization project, it may be desirable to continue with the traditional library while i) acquiring collections in digital media; ii) buying access to electronic resources; and iii) developing subject gateways or library portals. This may be studied in more detail in the module named “Digital Library Planning and Implementation”.

 

6.  Summary

 

Digitization is the process of converting the content of physical media (e.g., periodical articles, books, manuscripts, cards, photographs, vinyl disks, etc.) into digital format. In most library applications, digitization results in documents that are accessible from the web site of a library and are thus available on the Internet. Optical scanners and digital cameras are used to digitize images by translating them into bit maps. It is also possible to digitize sound, video, graphics, animations, etc.

 

An image scanning system may consist of a stand-alone workstation, where most or all of the work is done on the same workstation, or it may be part of a network of workstations, with the imaging work distributed and shared amongst them. The requirement usually includes a scanning station, a server and one or more editing and retrieval stations.

 

This module describes scanners and scanning software as the important components of the scanning system.

 

 

 

7.  References

  1. Coyle, K. (2006). Mass digitization of books. The Journal of Academic Librarianship, 32(6), 641-645.
  2. Hughes, L. M. (2004). Digitizing collections: strategic issues for the information manager.
  3. Lee, T. H., & Sheng, T. (2009). U.S. Patent No. 7,538,915. Washington, DC: U.S. Patent and Trademark Office.
  4. Rangan, P. V., & Vin, H. M. (1991). Designing file systems for digital video and audio (Vol. 25, No. 5, pp. 81-94). ACM.
  5. Vannier, M. W., Pilgram, T., Bhatia, G., Brunsden, B., & Commean, P. (1991). Facial surface scanner. IEEE Computer Graphics and Applications, 11(6), 72-80.