# 16.7M pixel 8000fps sparse binarized scientific image sensor

Peng Gao, Sampsa Veijalainen, Jente Basteleus, Gaozhan Cai, Bert Luyssaert, Bart Dierickx

Caeleste, Hendrik Consciencestraat 1 b, 2800 Mechelen, Belgium

peng.gao@caeleste.be, +32488580623

## Introduction

A series of binarized images can be used to construct low-noise or noise-free, high dynamic range images at high (>1000fps) frame rates. This approach benefits scientific and medical applications, as well as SPAD or QIS [3,4,5] sensors, where speed is needed to avoid multiple hits on the same pixel in a single frame, especially for large array sizes. In this paper, a large format (36.1x40.2 mm²), 4k x 4k CMOS binarized image sensor is presented. To further increase the frame rate for sparse imaging (which is often the case, especially at higher frame rates), on-chip data reduction has been implemented. Frame rates up to 8000fps can be reached when only the pixel kernels with hits and their corresponding addresses are read out. In brute force mode, where all pixels are read out, the frame rate reaches 2800fps. This sensor is an evolution of [1] with increased pixel count, a modified sparse readout algorithm, an optimized priority encoder, an optimized sense amplifier (1-bit ADC) and high-speed digital IOs.

## Architecture and signal chain

The sensor architecture is shown in Figure 1. In the middle, the pixel array is realized by 2x2 stitching to obtain 4k resolution with 8µm pixels. Next to the pixel array, sense amplifiers in the top and bottom periphery digitize the selected pixels. The digital data can be sent out in two ways. The first is a conventional horizontal scan, where all pixel data is scanned out one after the other. The second is a sparse algorithm with a priority encoder (smart scanning), where only the kernels with pixel hits pass their data to the outside. In this latter case, the sensor's frame rate is no longer limited by IO speed, but rather by the number of kernels read out per row or purely by circuit speed, specifically column or bus settling.

Figure 1 Sensor architecture

To the left and right of the pixel array, a vertical shift register is implemented for row selection. The biasing and configuration blocks are placed at the corners. Within a stitching block, the readout circuit is divided into two identical sections, each reading out 1024 columns. A simplified signal chain diagram of a section is shown in Figure 2. For sparse data reduction, a kernel of 64 pixels (16 x 4) is analysed together. Data for all 64 pixels is sent out when one or more pixels are hit. 8 bits are used to encode the position of every group of 4 columns. The signal chain of a section consists of kernels of 64 4T pixels, 16384 current-based sense amplifiers with kernel event detection, a 256-step address and priority encoder, a 72-bit data bus, and 8 LVDS channels that each output the digital signal at 750Mbps.
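The kernel-based data reduction described above can be sketched in a few lines of Python. This is a minimal illustration, not the on-chip logic: the function names, the bit ordering inside the 64-bit word, and the placement of the address in the upper 8 bits of the 72-bit word are assumptions made for the example. Only the kernel size (16 x 4), the 1024 columns per section, and the 64 + 8 bit split come from the text.

```python
KERNEL_ROWS = 16      # rows per kernel (16 x 4 kernel, from the text)
KERNEL_COLS = 4       # columns per kernel (one 8-bit address per 4 columns)
SECTION_COLS = 1024   # columns read out by one section


def encode_row_group(bits):
    """Return (address, data) pairs for the kernels that contain a hit.

    bits: 16 rows x 1024 columns of 0/1 values for one section.
    address: kernel index 0..255, which fits in 8 bits.
    data: 64-bit integer; the bit order within the word is hypothetical.
    """
    assert len(bits) == KERNEL_ROWS
    assert all(len(row) == SECTION_COLS for row in bits)
    words = []
    for address in range(SECTION_COLS // KERNEL_COLS):
        data = 0
        for r in range(KERNEL_ROWS):
            for c in range(KERNEL_COLS):
                bit = bits[r][address * KERNEL_COLS + c]
                data |= bit << (r * KERNEL_COLS + c)
        if data:  # kernel event flag: at least one pixel in the kernel is hit
            words.append((address, data))
    return words


def to_bus_word(address, data):
    """Pack 8 address bits and 64 data bits into one 72-bit integer.

    Putting the address in the top bits is an assumption of this sketch.
    """
    return (address << 64) | data


# Usage: two hits in an otherwise empty row group yield two bus words.
group = [[0] * SECTION_COLS for _ in range(KERNEL_ROWS)]
group[0][5] = 1        # falls in kernel 1
group[15][1023] = 1    # falls in kernel 255
sparse_words = encode_row_group(group)
```

With only two of the 256 kernels flagged, two 72-bit words are produced instead of the full 16384 bits of the row group, which is the source of the speed-up in sparse mode.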

## Key building blocks

#### • Sense amplifier

The purpose of the sense amplifier is to determine as fast as possible whether a pixel is hit. As shown in Figure 2, the conventional column buffer is replaced by a TIA and a comparator to improve the column settling speed. In this way the row time in simulation can be smaller than 1µs. Although the pixel is a classic 4T type, the SF (source follower) is no longer a voltage buffer, but an FD (floating diffusion) controlled current source. Its charge modulated output current ($\Delta I_{out} = \Delta e^{-} \cdot CVF_{FD} \cdot g_m$) is nonlinear due to the nonlinearity of $g_m$, which is acceptable for a binary output pixel. At the TIA output, a $CVF_{TIA}$ ($= g_m \cdot R_{fb} \cdot CVF_{FD}$) of 520µV/e- is obtained. The higher the $CVF_{TIA}$, the less variable the detection threshold. One drawback of this implementation is the SF offset, specifically the Vth mismatch. This offset is amplified at the TIA output and can cause large pixel sensitivity variation and even dead pixels. To reduce the sensitivity variation, offset cancelation is used, as shown in Figure 3. The reset level of each pixel is sampled on C1 and used as the TIA reference during the signal phase. The row time increases due to voltage domain settling, but the pixel sensitivity uniformity is significantly improved. Another feature of the TIA is that its reset crosstalk is sampled on both C2 and C3. Before the clocked comparator acts, the bottom plate of C2 changes from DC1 to DC2, resulting in a fixed offset of $(DC1 - DC2) \cdot C3/(C2 + C3)$ applied to the comparator, which can be used to fine tune the detection threshold with a step smaller than 5e-.

Figure 2 Signal chain of the sensor: from pixel to LVDS IO
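As a rough numerical illustration of the two relations in this paragraph, the sketch below evaluates the gain implied by $CVF_{TIA} = g_m \cdot R_{fb} \cdot CVF_{FD}$ and converts the comparator offset $(DC1 - DC2) \cdot C3/(C2 + C3)$ into electrons. The function names are hypothetical, and the capacitor values and DC step in the usage lines are made-up numbers chosen only to show the arithmetic. The 520µV/e- figure is the one quoted in the text, and 66µV/e- is the CVF at FD listed in Table 1.

```python
def tia_gain(cvf_tia_uv_per_e, cvf_fd_uv_per_e):
    """Dimensionless gain g_m * R_fb implied by CVF_TIA = g_m * R_fb * CVF_FD."""
    return cvf_tia_uv_per_e / cvf_fd_uv_per_e


def threshold_offset_electrons(dc1_v, dc2_v, c2_f, c3_f, cvf_tia_uv_per_e):
    """Comparator offset (DC1 - DC2) * C3 / (C2 + C3), expressed in electrons."""
    offset_v = (dc1_v - dc2_v) * c3_f / (c2_f + c3_f)
    return offset_v / (cvf_tia_uv_per_e * 1e-6)


# CVF values taken from the text (520 uV/e-) and Table 1 (66 uV/e-).
gain = tia_gain(520.0, 66.0)

# Hypothetical example: a 10 mV change from DC1 to DC2 with C2 = 30 fF and
# C3 = 10 fF. These component values are assumptions, not design data.
step_e = threshold_offset_electrons(0.010, 0.0, 30e-15, 10e-15, 520.0)
```

With these assumed values the capacitive divider scales the 10 mV step to 2.5 mV, which corresponds to a threshold shift of a little under 5 electrons at the TIA output.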

#### • Priority encoder and high-speed data bus

The principle of the priority encoder and high-speed data bus is similar to [1]. The encoder samples the kernel flags, and only a flagged kernel can propagate its data and address. Each kernel has lookahead logic to check whether any flag is raised ahead of it; if so, its own flagged data is held on standby. The first flagged kernel puts its 64-bit data and 8-bit coded address on the high-speed data bus and disables its flag simultaneously. The second leftmost flagged kernel can then pass its data in the next clock period, because it now holds the first flag. In this way all flagged kernels output their data one after the other. In this work the lookahead logic is optimized: the segmentation is chosen to balance the fanout between logic stages, and the longest flag propagation delay, which jumps from the first to the last kernel (8mm apart on chip), is within 1ns. Once the kernel data is on the 72-bit, 8mm long parallel data bus, the delay towards the serializer is 2ns when current mode is used. This is more than a factor of 5 improvement compared to voltage mode.
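The sequencing behaviour of the priority encoder can be mimicked by a small behavioural model. This is only a sketch of the ordering described above, assuming that index 0 is the leftmost kernel and that exactly one kernel drives the bus per clock period. The function name is hypothetical, and the model does not represent the segmented lookahead logic or any timing.

```python
def priority_readout(flags):
    """Yield (clock_cycle, kernel_address) in the order kernels reach the bus.

    flags: sequence of booleans, one per kernel; index 0 is the leftmost kernel.
    """
    flags = list(flags)  # work on a copy of the sampled kernel flags
    cycle = 0
    while any(flags):
        # Lookahead: a kernel may drive the bus only when no kernel ahead of
        # it still has its flag raised, i.e. it is the first flagged kernel.
        first = flags.index(True)
        yield cycle, first
        flags[first] = False  # the kernel disables its own flag
        cycle += 1


# Usage: three flagged kernels out of 256 need three clock periods.
sampled = [False] * 256
for address in (7, 42, 200):
    sampled[address] = True
order = list(priority_readout(sampled))
```

The number of clock periods spent on a row equals the number of flagged kernels, which is why the row time in sparse mode depends on the number of kernels read out.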



Figure 3 Sense amplifier block diagram and its control clocks

## Measurement results

The number of false positives as a function of the threshold is shown in Figure 4(a). When the threshold is set larger than 130e-, the false positive rate drops to 3e-7, which is sufficient for many applications. The number of pixels hit at a given input charge level is shown in Figure 4(b). The mean threshold is 130e- and the variability is 28e-, measured with the highest SF $g_m$ setting.





Figure 4 (a): % of false hits as function of threshold. (b): Number of pixels hit at a certain input charge level.

The SF's $g_m$ is swept. The higher the $g_m$, the lower the variability of the detection threshold.
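The relation between the mean threshold, its variability and the number of pixels hit at a given input charge can be illustrated with a simple model. The sketch below assumes that the per-pixel thresholds follow a Gaussian distribution with the reported mean (130e-) and standard deviation (28e-). The Gaussian shape is an assumption of this example, not a statement from the measurement, and the function name is hypothetical.

```python
import math


def hit_fraction(charge_e, mean_threshold_e=130.0, sigma_e=28.0):
    """Fraction of pixels whose threshold lies below the injected charge.

    Assumes Gaussian-distributed thresholds; temporal noise is ignored.
    """
    z = (charge_e - mean_threshold_e) / (sigma_e * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Usage: expected fraction of responding pixels at a few charge levels.
curve = {q: hit_fraction(q) for q in (74, 102, 130, 158, 186)}
```

Under this assumption, half of the pixels respond at 130e-, and a smaller sigma (as obtained with a higher $g_m$) makes the transition from no hits to all hits steeper.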

An Fe55 source measurement is shown in Figure 5. A single X-ray photon generates a 1620e- [2] charge packet that is spread between 1, 2, 3 or 4 pixels, possibly resulting in single, double, triple or quadruple hits. All four types of hits are detected for a threshold smaller than 340e-. Between 400e- and 480e-, no quadruple hits are found. Above 550e-, only single and double hits can still be detected. In Figure 8, a series of five binary images (panels 1-5) is taken with detection thresholds 40e- apart; the sixth is a 3-bit image that averages 8 threshold steps. Figure 6 shows the chip photo and Figure 7 shows the test slice result. Key sensor performance is summarized in Table 1.
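The threshold ranges quoted for the Fe55 measurement are consistent with a simple charge-sharing bound. If a 1620e- packet is split over n pixels and every one of them has to exceed the threshold, the threshold must be below 1620/n, since the smallest share can never be larger than the equal share. The sketch below, with hypothetical function names and an idealized lossless split, computes that bound and counts hits for a given split.

```python
FE55_CHARGE_E = 1620  # electrons per X-ray photon, as quoted from [2]


def max_threshold_for_multiplicity(n_pixels, total_e=FE55_CHARGE_E):
    """Upper bound on the threshold at which an n-pixel hit can still occur."""
    return total_e / n_pixels


def multiplicity(shares_e, threshold_e):
    """Number of pixels whose share of the charge packet exceeds the threshold."""
    return sum(1 for q in shares_e if q > threshold_e)


# Usage: bounds for double, triple and quadruple hits (ideal, lossless split).
bounds = {n: max_threshold_for_multiplicity(n) for n in (2, 3, 4)}

# An equal four-way split is seen as a quadruple hit at 340e- but not at 410e-.
equal_split = [FE55_CHARGE_E / 4] * 4
hits_low = multiplicity(equal_split, 340)
hits_high = multiplicity(equal_split, 410)
```

The bounds of 405e- for quadruple hits and 540e- for triple hits agree with the observation that quadruple hits disappear between 400e- and 480e- and that only single and double hits remain above 550e-.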

## Conclusion

This 4k by 4k, 8µm pixel sensor is designed for high-speed, sparse imaging. In order to reach high frame rates, the priority encoder analyzes the kernel data and only sends out the kernels with information. This mode allows for a trade-off between the number of read-out kernels and increased frame rate. The internal settling time is reduced by using current mode readout in both the column offset cancelation readout and the long data bus readout. With both methods, the frame rate can be increased from 2800fps in brute force mode to 8000fps in sparse mode. The detection threshold can be programmed from 130e- to 600e- with variation as small as 28e-.

**Table 1 Summary of performance**

| Parameter           | Value                  |
|---------------------|------------------------|
| Array size          | 4096*4096              |
| Pixel pitch         | 8 μm 4T pixel          |
| Output interface    | 48Gbps (64*750Mbps/ch) |
| Shutter type        | Rolling                |
| ADC                 | 1 bit                  |
| CVF at FD (TIA out) | 66 (580)μV/e-          |
| Technology          | CIS 0.18μm             |
| Frame rate sparse   | 8000 fps               |
| Min threshold (σ)   | 130e- (28e-)           |
| Supply              | 3.3V, 1.8V             |
| Power               | <5W                    |
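The two frame rates can be cross-checked against the numbers in Table 1 with some back-of-the-envelope arithmetic. The sketch below is an illustration under stated assumptions, not the authors' timing budget. The brute force bound assumes one output bit per pixel and no protocol overhead. The row-group timing assumes that the top and bottom peripheries each read half of the rows in parallel, 16 rows at a time. That split is inferred from the architecture description and is not stated explicitly. All function names are hypothetical.

```python
ROWS = COLS = 4096       # array size from Table 1
LVDS_CHANNELS = 64       # output interface from Table 1
LVDS_RATE_BPS = 750e6    # bit rate per channel from Table 1


def brute_force_fps_bound():
    """Upper bound on the frame rate when every pixel is sent as one bit."""
    return LVDS_CHANNELS * LVDS_RATE_BPS / (ROWS * COLS)


def row_group_time_s(fps, rows_per_readout=2048, kernel_rows=16):
    """Time available per 16-row group at a given frame rate.

    rows_per_readout=2048 assumes the top and bottom readout circuits each
    handle half of the rows in parallel; this is an assumption of the sketch.
    """
    groups_per_frame = rows_per_readout / kernel_rows
    return 1.0 / (fps * groups_per_frame)


io_bound_fps = brute_force_fps_bound()
sparse_group_time = row_group_time_s(8000)
```

The output bandwidth allows roughly 2860 frames per second when all pixels are sent, close to the reported 2800fps. Under the stated assumption, 8000fps leaves just under 1µs per row group, which is in line with the simulated row time mentioned for the sense amplifier.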





Figure 5 Upper right: Histogram of the 4 types of Fe55 hits at different charge thresholds. Bottom right: Possible shapes of single, double, triple and quadruple hits.

Figure 6 Chip photograph



Figure 8 Images with different detection thresholds. Panels 1-5: threshold increased by 40e- between successive images. Panel 6: the average of 8 different images.



Figure 7 Layout of the optical test pattern (right), electrical (middle), optical (left) measurement result.

#### **References:**

- [1] G. Cai, et al., "Imaging sparse events at high speed," in IISW, 2015
- [2] J. Janesick, et al., "Fundamental performance differences between CMOS and CCD imagers: Part II," Proc. SPIE, 2007
- [3] P. Chandramouli, et al., "A 'Little Bit' Too Much? High Speed Imaging from Sparse Photon Counts," arXiv:1811.02396
- [4] A. Gnanasambandam, et al., "Megapixel Photon-Counting Color Imaging using Quanta Image Sensor," arXiv:1903.09036v1
- [5] N. Dutton, et al., "A SPAD-Based QVGA Image Sensor for Single-Photon Counting and Quanta Imaging," in IEEE TED, Jan. 2016