Real time off-axis quantitative phase imaging using CUDA
In the past decade, quantitative phase imaging (QPI) has attracted increasing scientific interest in cell and tissue imaging because it can probe structure and dynamics with nanoscale sensitivity and without exogenous contrast agents. Typically, QPI relies on off-line post-processing to obtain the pathlength map from an acquired interferogram. In particular, off-axis methods require an unwrapping algorithm to remove the high-frequency spatial modulation. Phase unwrapping is the process of reconstructing the true phase from the measured wrapped values, which lie between –π and +π. High-throughput, high-speed, real-time phase unwrapping is highly desirable in many applications, including applied physics and biomedicine. However, to the best of our knowledge, there are currently no phase unwrapping algorithms that allow QPI operation at video rates (i.e., ~30 frames/s).
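To make the wrapping problem concrete, the short Python sketch below wraps a smooth phase ramp into the interval from –π to +π and recovers it with Itoh's one-dimensional method (summing wrapped differences between neighbouring samples). It is only an illustration of the concept: the test ramp and the function names are assumptions made for this example, and the two-dimensional, noise-tolerant unwrapping discussed on this page is considerably more involved.

```python
import math

def wrap(phase):
    """Wrap a phase value into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi

def unwrap_1d(wrapped):
    """Itoh's 1D unwrapping: integrate the wrapped differences of neighbouring samples."""
    unwrapped = [wrapped[0]]
    for previous, current in zip(wrapped, wrapped[1:]):
        unwrapped.append(unwrapped[-1] + wrap(current - previous))
    return unwrapped

# Hypothetical test signal: a smooth ramp spanning roughly 4*pi radians.
true_phase = [0.2 * n for n in range(64)]
measured = [wrap(p) for p in true_phase]   # what an instrument would report
recovered = unwrap_1d(measured)            # the ramp reconstructed from wrapped values
max_error = max(abs(r - t) for r, t in zip(recovered, true_phase))
```

The method works here because neighbouring samples differ by less than π; noise or undersampling breaks that assumption, which is why 2D algorithms have to detect and route around inconsistent pixels.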
Off-axis interferometry takes advantage of the spatial phase modulation introduced by the angularly shifted (tilted) reference plane wave and the spatially resolved measurement allowed by a 2D detector array such as a CCD. Essentially, off-axis interferometry is the spatial equivalent of heterodyne detection in the time domain. Compared to phase-shifting methods, off-axis interferometry allows single-shot measurements and, thus, fast acquisition rates.
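As an illustration of how a tilted reference encodes the phase on a spatial carrier, the following Python sketch simulates one line of an off-axis interferogram and recovers the phase by isolating the +1 order in the Fourier domain and removing the carrier. This is a generic Fourier-domain demodulation run on the CPU, not necessarily the extraction method used in the CUDA implementation; `N`, `CARRIER`, `HALF_WIDTH`, and the Gaussian test phase are assumed values chosen for the example.

```python
import cmath
import math

N = 128          # pixels along one detector line (assumed)
CARRIER = 32     # fringe frequency in DFT bins, set by the reference tilt (assumed)
HALF_WIDTH = 16  # half-width of the band-pass window around the carrier (assumed)

def dft(samples, sign):
    """Naive O(N^2) DFT; sign=-1 is the forward transform, sign=+1 the unnormalised inverse."""
    n = len(samples)
    return [sum(samples[idx] * cmath.exp(sign * 2j * math.pi * idx * k / n)
                for idx in range(n))
            for k in range(n)]

def extract_phase(interferogram):
    """Return the wrapped phase encoded on the spatial carrier of one interferogram line."""
    spectrum = dft(interferogram, -1)
    # Keep only the +1 order around the carrier; this drops the DC term and the -1 order.
    filtered = [value if abs(k - CARRIER) <= HALF_WIDTH else 0.0
                for k, value in enumerate(spectrum)]
    analytic = dft(filtered, +1)
    # Multiply by the conjugate carrier so only the sample-induced phase remains.
    return [cmath.phase(analytic[idx] * cmath.exp(-2j * math.pi * CARRIER * idx / N))
            for idx in range(N)]

# Hypothetical object: a smooth Gaussian phase bump with a 1 rad peak.
true_phase = [math.exp(-((idx - N / 2) / 16.0) ** 2) for idx in range(N)]
interferogram = [2.0 + math.cos(2 * math.pi * CARRIER * idx / N + true_phase[idx])
                 for idx in range(N)]
wrapped_phase = extract_phase(interferogram)
max_error = max(abs(w - t) for w, t in zip(wrapped_phase, true_phase))
```

Because a single intensity frame is enough to recover the phase, the measurement is single-shot. A practical implementation would use a fast Fourier transform in two dimensions rather than the naive transform shown here.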
We demonstrate real-time off-axis QPI using a phase reconstruction algorithm based on NVIDIA’s CUDA programming model. The phase unwrapping component is based on Goldstein’s algorithm. Fig. 1 illustrates the phase reconstruction procedure in a QPI system.
Fig. 1. Phase reconstruction in a QPI system
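Goldstein's algorithm begins by locating residues, the pixels around which the wrapped phase is inconsistent, before placing branch cuts between them and unwrapping around the cuts. The Python sketch below shows only this first stage in its textbook form: it sums the wrapped phase differences around every 2×2 pixel loop and reports a charge of +1 or –1 where the sum is ±2π. The function names and the small vortex test pattern are assumptions made for illustration; this is not the CUDA kernel described here.

```python
import math

def wrap(value):
    """Wrap a phase difference into the interval [-pi, pi)."""
    return (value + math.pi) % (2 * math.pi) - math.pi

def find_residues(phase):
    """Return a (rows-1) x (cols-1) grid of residue charges (-1, 0 or +1).

    Entry [r][c] is the charge of the 2x2 loop whose top-left pixel is (r, c).
    """
    rows, cols = len(phase), len(phase[0])
    charges = []
    for r in range(rows - 1):
        row = []
        for c in range(cols - 1):
            # Walk the closed loop (r,c) -> (r,c+1) -> (r+1,c+1) -> (r+1,c) -> (r,c).
            loop_sum = (wrap(phase[r][c + 1] - phase[r][c])
                        + wrap(phase[r + 1][c + 1] - phase[r][c + 1])
                        + wrap(phase[r + 1][c] - phase[r + 1][c + 1])
                        + wrap(phase[r][c] - phase[r + 1][c]))
            row.append(int(round(loop_sum / (2 * math.pi))))
        charges.append(row)
    return charges

# Hypothetical test pattern: a phase vortex centred between the four middle pixels.
vortex = [[math.atan2(r - 1.5, c - 1.5) for c in range(4)] for r in range(4)]
charges = find_residues(vortex)

# A smooth tilted plane, by contrast, contains no residues.
ramp = [[0.3 * r + 0.5 * c for c in range(4)] for r in range(4)]
ramp_charges = find_residues([[wrap(p) for p in row] for row in ramp])
```

Each 2×2 loop is evaluated independently of the others, which is one reason this stage lends itself to a one-thread-per-pixel GPU mapping.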
By mapping phase extraction and unwrapping to the GPU, we speed up the whole procedure by a factor of about 18.8 for a single 1024×1024 frame relative to CPU processing (and by more than 40 when ten frames are processed together), ultimately achieving video rate for megapixel images. Table 1 compares the run times of the two implementations. The results shown were averaged over 20 images for each image size. Our CUDA implementation also supports processing multiple images simultaneously. This enables our imaging system to support high-speed, high-throughput, real-time image acquisition and visualization.
Table 1: CUDA implementation versus C-based sequential implementation
| Image size | CPU/GPU | Phase extraction (ms) | Residue identification (ms) | Branch cut placement (ms) | Unwrap (ms) | Total (ms) |
|---|---|---|---|---|---|---|
| 1024×1024, 1 frame | CPU | 317.42 | 43.42 | 6.74 | 89.32 | 460.7 |
| 1024×1024, 1 frame | GPU | 5.05 | 0.58 | 1.125 | 10.014 | 24.55 |
| 1024×1024, 1 frame | Speedup factor | 62.86 | 74.19 | 5.99 | 8.92 | 18.77 |
| 1024×1024, 10 frames | CPU | 3174.2 | 434.2 | 67.4 | 893.2 | 4607.4 |
| 1024×1024, 10 frames | GPU | 40.486 | 5.55 | 1.128 | 45.285 | 111.1 |
| 1024×1024, 10 frames | Speedup factor | 78.4 | 78.19 | 59.71 | 19.72 | 41.47 |
| 512×512, 1 frame | CPU | 71 | 11 | 5 | 16 | 105 |
| 512×512, 1 frame | GPU | 2.18 | 0.2 | 0.02 | 1.87 | 8 |
| 512×512, 1 frame | Speedup factor | 32.61 | 55.84 | 250 | 8.55 | 13.13 |
| 512×512, 10 frames | CPU | 710 | 110 | 50 | 160 | 1050 |
| 512×512, 10 frames | GPU | 11.57 | 1.4 | 0.02 | 6.722 | 26 |
| 512×512, 10 frames | Speedup factor | 61.37 | 78.57 | 2500 | 23.8 | 40.38 |
Clearly, the GPU implementation substantially improves run-time performance. The total run time for a single 1024×1024 image fell from an average of 460.7 milliseconds for the sequential C implementation to 24.55 milliseconds on the GPU (about 40 frames/s), which is suitable for video rate. The total run time for a single lower-resolution (512×512) image is 8 milliseconds (about 125 frames/s), allowing much higher image acquisition rates.
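The speedup factors and frame rates follow directly from the total run times in Table 1. The Python sketch below reproduces that arithmetic; the timing values are copied from the table, while the helper names and the treatment of a ten-frame batch as a throughput figure are assumptions made for this example.

```python
# Total run times in milliseconds from Table 1: (image size, frames) -> (CPU, GPU).
TOTAL_MS = {
    ("1024x1024", 1): (460.7, 24.55),
    ("1024x1024", 10): (4607.4, 111.1),
    ("512x512", 1): (105.0, 8.0),
    ("512x512", 10): (1050.0, 26.0),
}

VIDEO_RATE_FPS = 30.0  # video-rate threshold quoted in the text

def speedup(cpu_ms, gpu_ms):
    """Ratio of sequential CPU run time to GPU run time."""
    return cpu_ms / gpu_ms

def frames_per_second(total_ms, frames):
    """Throughput implied by processing `frames` images in `total_ms` milliseconds."""
    return 1000.0 * frames / total_ms

summary = {}
for (size, frames), (cpu_ms, gpu_ms) in TOTAL_MS.items():
    factor = speedup(cpu_ms, gpu_ms)
    fps = frames_per_second(gpu_ms, frames)
    # Store the speedup, the GPU throughput, and whether it meets video rate.
    summary[(size, frames)] = (factor, fps, fps >= VIDEO_RATE_FPS)
    print(f"{size} x{frames}: speedup {factor:.2f}, {fps:.1f} frames/s on the GPU")
```

Note that the batch figures describe throughput rather than latency: ten frames processed together finish sooner per frame, but each result is available only once the whole batch completes.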
We anticipate that, in the near future, CUDA-based modules will compute quantitative parameters of the imaged objects from the unwrapped phase images in real time, e.g., cell volumes, refractive indices, and tissue morphological parameters, which are useful for both basic biological studies and medical diagnosis.
Related Publication
H. Pham, H. Ding, N. Sobh, M. Do, S. Patel, and G. Popescu, "Off-axis quantitative phase imaging processing using CUDA: toward real-time applications," Biomed. Opt. Express 2(7) (2011).
