3D measurement of objects in the scene is a useful feature in ROV video inspection. In this paper, we introduce an accurate 3D-measurement system built upon a stereoscopic camera system. Based on recent research results in Correction of Misaligned Stereoscopic Images by Means of Digital Image Processing, the proposed system can be implemented as an additional feature of an existing ROV video inspection system.
This paper focuses on the high-level concepts behind the system components, including stereoscopic video, human factors, and digital image processing. Technical details are available through the references provided. The discussion centers on providing a robust, low-cost system for industrial applications.
ROV inspection and operation have been broadly used in subsea applications for decades, and there is now a growing market for inland applications of ROVs. Video cameras have been the primary inspection tool in most cases because of their simplicity and intuitiveness. Visual images provide rich information to an operator, a client or anybody else, and very little prior knowledge is required to understand them.
However, in some cases it is hard to estimate the actual size of an object from an image. For example, when inspecting a water supply pipe, an ROV operator will have trouble interpreting the size of a crack displayed on a screen if the scene contains no other objects for reference. The same crack may appear smaller or larger in the image because of the perspective nature of a camera. In such cases, sonar and laser are often used to retrieve 3D information about an object, at an additional cost in both the manufacture and the operation of the ROV.
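The perspective effect can be illustrated with a minimal sketch under an ideal pinhole camera model. The function name and the focal length of 800 pixels are assumptions made for this illustration only; they are not parameters of the system described in this paper.

```python
def apparent_size_px(true_size_m, distance_m, focal_px):
    """Image size in pixels of an object facing an ideal pinhole camera.

    true_size_m: physical size of the object (metres)
    distance_m:  distance from the camera along the optical axis (metres)
    focal_px:    focal length expressed in pixels (assumed value)
    """
    if distance_m <= 0:
        raise ValueError("object must be in front of the camera")
    return focal_px * true_size_m / distance_m


# The same 10 mm crack viewed from 0.5 m and from 2 m.
near = apparent_size_px(0.010, 0.5, focal_px=800)
far = apparent_size_px(0.010, 2.0, focal_px=800)
```

With these assumed numbers the crack spans roughly 16 pixels at 0.5 m but only about 4 pixels at 2 m, so its image size alone does not determine its true size.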
In this paper, we present a video system that provides accurate 3D measurements. This system is based on a patterned technology, and employs digital image processing and augmented reality techniques by combining stereoscopic video images with stereoscopic computer graphics. The idea is that an ROV operator, viewing a remote scene through a carefully calibrated stereoscopic video (SV) system, is able to make accurate 3D measurements of dimensions, depths, clearances and object separations within the SV image.
Figure 1 shows an overview of the presented system. Its biggest advantage is its consistency with existing video inspection systems. The presented system consists of cameras, communication cables, a display device and possibly a recording device, most of which already exist in a traditional video inspection system. An ROV operator can switch from the stereoscopic (3D) operation mode to the monoscopic (2D) mode, or vice versa. In the monoscopic mode, the presented system behaves as a simple video inspection system, where everything works the same way as in traditional video inspection. In the stereoscopic mode, 3D measurements can be carried out.
The authors have worked on the relevant topics for many years, including underwater imaging, image-based measurement, stereoscopic video, ergonomics and underwater robotics. The system presented here is based on a sequence of research results and inspired by the most recent achievements. The rest of the paper is organized as follows. Section 2, Graphical Pointer and Virtual Tape Measure (VTM), introduces the concept of stereo graphics superimposed on stereoscopic video and the VTM technology. Section 3, Image Processing to Compensate Camera Misalignment, introduces the concept of processing the stereoscopic images with digital image processing algorithms so that they are more comfortable for an observer to view. Section 4 is the conclusion.
2. GRAPHICAL POINTER AND VIRTUAL TAPE MEASURE (VTM)
By means of a stereoscopic display, and based on the principle of triangulation, an observer is able to obtain 3D information from the SV images. However, estimating the coordinates of a 3D point directly is an absolute estimation task for the operator and hence is very difficult. This 3D-estimation task may become practically impossible if all the objects in the scene are unfamiliar to the observer. According to human factors principles, an observer needs a reference to perform this task.
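The triangulation principle can be sketched as follows, assuming an idealized parallel-axis rig: two identical cameras with parallel optical axes, a known horizontal separation (baseline), and image coordinates measured from each principal point. The function name, parameter names and numeric values are illustrative assumptions, not the calibration model of the system described here.

```python
def triangulate(x_left, x_right, y, focal_px, baseline_m):
    """Recover (X, Y, Z) in metres, in the left-camera frame, from a matched pixel pair.

    Assumes an ideal parallel-axis rig with no vertical disparity, so the
    point has the same row coordinate y in both images.
    """
    disparity = x_left - x_right  # pixels; larger for nearer points
    if disparity <= 0:
        raise ValueError("a point in front of both cameras has positive disparity")
    z = focal_px * baseline_m / disparity  # depth by similar triangles
    return (x_left * z / focal_px, y * z / focal_px, z)


# Assumed values: 800 px focal length, 0.1 m baseline.
point = triangulate(x_left=50.0, x_right=10.0, y=20.0, focal_px=800.0, baseline_m=0.1)
```

The depth follows from similar triangles: it is inversely proportional to the horizontal disparity, which is why small disparity errors matter more for distant points.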
If the specifications of the cameras used for taking the images are known, it is possible to generate an image of a graphical pointer. In terms of computer graphics, a graphical pointer is rendered from a given viewpoint with a given viewing angle. Superimposing such a graphical pointer onto the video image provides a reference object to the observer, and hence simplifies the 3D-measurement task. A stereographical pointer superimposed onto the SV image can thus help the observer make 3D measurements.
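As a minimal sketch of how a stereographical pointer could be placed, the function below projects one 3D pointer position into the left and right images. It assumes ideal pinhole cameras with parallel optical axes, the right camera displaced by the baseline along the X axis of the left camera, and pixel coordinates measured from each principal point. All names and values are hypothetical and serve only to illustrate the idea.

```python
def project_pointer(point, focal_px, baseline_m):
    """Project a 3D pointer position (left-camera frame, metres) into both images.

    Returns ((x_left, y_left), (x_right, y_right)) in pixels.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("pointer must be in front of the cameras")
    left = (focal_px * x / z, focal_px * y / z)
    # In the right camera's frame the point is shifted by the baseline.
    right = (focal_px * (x - baseline_m) / z, focal_px * y / z)
    return left, right


# Assumed values: pointer 2 m in front of the rig, 800 px focal length, 0.1 m baseline.
left_px, right_px = project_pointer((0.125, 0.05, 2.0), focal_px=800.0, baseline_m=0.1)
```

Drawing the pointer at these two positions gives it the horizontal disparity of a real object at the same depth, which is what lets the observer judge it against the scene.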
Based on the above graphical pointer, a Virtual Tape Measure (VTM) has been developed to facilitate 3D measurements. The fundamental principle behind the VTM is that a human operator, viewing a remote scene using a stereoscopic video (SV) system, is able to make accurate 3D measurements within the SV image. This is accomplished by interactively aligning each end of the VTM, in three dimensions, with the points between which a measurement is to be made, in the same fashion as one would use a real tape measure. A monoscopic, or 2D, illustration of the VTM concept is shown in Figure 2, which is a mocked-up underwater scene with low turbidity. VTM technology has also been tested with an operating microscope in medical applications.
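Once both ends of the tape have been aligned, the measurement itself reduces to simple geometry on the two 3D end positions. The sketch below assumes the end positions are already known in a common metric coordinate frame with Z along the viewing direction; the function names are illustrative and are not taken from the VTM implementation.

```python
import math


def vtm_length(end_a, end_b):
    """Straight-line distance in metres between the two tape ends."""
    return math.dist(end_a, end_b)


def vtm_depth_difference(end_a, end_b):
    """Separation of the two tape ends along the viewing (Z) axis, in metres."""
    return abs(end_b[2] - end_a[2])


# Assumed end positions (X, Y, Z) in metres.
length = vtm_length((0.0, 0.0, 2.0), (0.3, 0.4, 2.0))
depth_gap = vtm_depth_difference((0.0, 0.0, 2.0), (0.3, 0.4, 2.5))
```

The first quantity corresponds to a dimension or clearance, the second to a depth separation between two objects.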
3. IMAGE PROCESSING TO COMPENSATE CAMERA MISALIGNMENT
With the technologies presented above, an operator is able to make 3D measurements in an intuitive fashion. In this section, we discuss camera misalignment in a stereoscopic video system, and then describe our solution to this problem: digital image processing.
Fig. 2 Low turbidity underwater scene, with superimposed Virtual Tape Measure.
The fundamental principle of human stereoscopic (binocular) vision is that the two eyes observe two slightly different images. The brain fuses these two images and extracts 3D information from them. When we look at something, our eyes are always configured in such a way that the two observed images are suitable for the brain to interpret; in other words, the brain has been trained to work with our eyes. The situation is different with stereoscopic video. In this case, the two images observed by the eyes are taken by two cameras and displayed on a screen. A high-quality stereoscopic video system must meet certain camera alignment requirements, such as alignment of the optical axes and the same focal length for each of the two cameras. In fact, the configuration of a stereoscopic video system is a sophisticated topic of study.
It is well known that when some of the alignment requirements are not satisfied, the resulting stereoscopic display is difficult or uncomfortable to view. Departures from the desirable alignment may leave an observer unable to fuse the left and right images. (The binocular capabilities of the observer/user also influence fusion and the related perception.)
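One simple symptom of such departures is vertical disparity: in an ideally aligned parallel-axis rig, the two images of a scene point fall on the same image row. The sketch below summarizes the vertical disparity over a set of matched points. The function name and the input format are assumptions for illustration, and no fusibility threshold is implied, since none is given here.

```python
def vertical_disparity_stats(matches):
    """Mean and maximum absolute vertical disparity, in pixels.

    matches: list of ((x_left, y_left), (x_right, y_right)) pixel pairs,
    each pair being the two images of the same scene point.
    """
    if not matches:
        raise ValueError("at least one matched point is required")
    gaps = [abs(left[1] - right[1]) for left, right in matches]
    return sum(gaps) / len(gaps), max(gaps)


# Hypothetical matched points from a slightly misaligned camera pair.
mean_gap, max_gap = vertical_disparity_stats([
    ((10.0, 5.0), (4.0, 5.0)),
    ((20.0, 8.0), (15.0, 11.0)),
    ((0.0, -2.0), (-6.0, -5.0)),
])
```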
In practical applications, a stereoscopic camera system is often built from a pair of monoscopic cameras whose mountings offer some options for adjusting the direction (of the optical axis) of one or both cameras, their horizontal separation, their vertical alignment, the convergence angle and so on. This approach facilitates adjustment of the various parameters, but it comes at an additional cost associated with the involvement of a technical specialist who can carry out the camera alignment and calibration. Inevitably, this is a time-consuming procedure. In essence, it is not trivial to construct a high-quality stereoscopic imaging system that offers both flexibility and precision in the adjustments.
It is interesting to observe the stereoscopic systems that nature has "developed". Human eyes, for example, may have misalignments. However, the brain is capable of compensating for these problems and fuses the images seen by the eyes. It is therefore conceivable to think of an analogous intelligent system that could similarly correct images from misaligned stereoscopic cameras, so that they would be consistent with each other and thereby fusible.
Our recent research has resulted in a new framework, Correction of Misaligned Stereoscopic Images by Means of Digital Image Processing, which provides the means to solve the camera misalignment problem in the digital world. Fig. 3 shows an overview of how the new framework is used in 3D measurement.
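To make the idea of correcting misalignment digitally concrete, the following is a deliberately simple sketch and not the framework of the referenced report, whose details are not reproduced here. It assumes the right image differs from an ideally aligned one only by a uniform scale s (for example from a focal length mismatch) and a vertical shift t, with pixel coordinates measured from each image centre. Both values are estimated by least squares from the row coordinates of matched points, because the column coordinates also contain the legitimate horizontal disparity. All names are hypothetical.

```python
def fit_vertical_correction(matches):
    """Least-squares fit of y_left ~ s * y_right + t over matched points.

    matches: list of ((x_left, y_left), (x_right, y_right)) pixel pairs.
    Returns (s, t).
    """
    n = len(matches)
    if n < 2:
        raise ValueError("need at least two matched points")
    mean_l = sum(left[1] for left, _ in matches) / n
    mean_r = sum(right[1] for _, right in matches) / n
    var_r = sum((right[1] - mean_r) ** 2 for _, right in matches)
    if var_r == 0:
        raise ValueError("matched points must span more than one image row")
    cov = sum((right[1] - mean_r) * (left[1] - mean_l) for left, right in matches)
    s = cov / var_r
    return s, mean_l - s * mean_r


def correct_right_point(point, s, t):
    """Map a right-image point into the corrected right image."""
    x, y = point
    return (s * x, s * y + t)


# Hypothetical matches generated with s = 1.05 and t = 3 pixels.
pairs = [((10.0, -102.0), (5.0, -100.0)),
         ((0.0, 3.0), (-4.0, 0.0)),
         ((20.0, 108.0), (12.0, 100.0))]
s, t = fit_vertical_correction(pairs)
corrected = correct_right_point((12.0, 100.0), s, t)
```

After the correction, matched points share the same row in both images, which is the property an ideally aligned rig would have provided mechanically. Real misalignments also include rotations, which this two-parameter model does not capture.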
4. CONCLUSION
The proposed system can be implemented as an additional feature of an existing ROV video inspection system. It consists of cameras, communication cables, a display device and possibly a recording device, most of which already exist in a traditional video inspection system. An ROV operator can switch from the stereoscopic (3D) operation mode to the monoscopic (2D) mode, or vice versa, according to her/his needs. This feature provides a friendly operating environment for the ROV operator.
The requirements for implementing the proposed 3D system on top of an existing video inspection system are (1) an additional camera mounted on the ROV, (2) a PC with operator to provide digital video data, (3) a stereoscopic display device for the operator, and (4) the associated software.
A prototype system with in-air cameras and the most recent associated software has been tested. The results are very promising.
Drascic, D., Milgram, P., Grodski, J., Wong, P., Zhai, S., and Ruffo, K., "ARGOS: A display system for augmenting reality", Video Proc. INTERCHI'93, Amsterdam, April 1993.
Kim, M., Drake, J., and Milgram, P., "Virtual Tape Measure for the Operating Microscope: System Specifications and Performance Evaluation", submitted to Computer Aided Surgery.
Diner, D. B., and Fender, D. H., Human Engineering in Stereoscopic Viewing Devices, Plenum Press, New York, 1993.
Yin, S., and Grodski, J., Correction of Misaligned Stereoscopic Images by Means of Digital Image Processing, Internal report, November 2000.