Delve into OpenCV

TODOs

Geometric Image Transformations
findCirclesGrid
reprojectImageTo3D
A random pattern can be generated with cv::randpattern::RandomPatternGenerator; see the OpenCV doc: Multi-camera calibration
background reduction
computeSaliency
__m128i

namespaces and so on

hal: Hardware Acceleration Layer. This separate module is used to accelerate OpenCV on different platforms. The HAL can be developed independently from OpenCV. A HAL doesn't have to implement all operations, and it doesn't depend on the OpenCV API.
SSE2: Streaming SIMD Extensions 2 (a SIMD instruction set used for hardware acceleration)
ocl: Open Computing Language (OpenCL) is an open standard for writing code that runs across heterogeneous platforms, including CPUs, GPUs, and DSPs.
IPP: Intel Integrated Performance Primitives
TBB: Intel Threading Building Blocks. Parallelization on Android is done via Intel TBB. For a simple usage tutorial, see here.

main modules

highgui

setWindowProperty

core

core-array

copyMakeBorder and its various border types
perspectiveTransform
gemm: generalized matrix multiplication. With the flags GEMM_1_T + GEMM_3_T, for example, it computes
$$\texttt{dst} = \texttt{alpha} \cdot \texttt{src1} ^T \cdot \texttt{src2} + \texttt{beta} \cdot \texttt{src3} ^T$$
mulTransposed: computes (with aTa set to true)
$$\texttt{dst} = \texttt{scale} ( \texttt{src} - \texttt{delta} )^T ( \texttt{src} - \texttt{delta} )$$

Used, for example, in calcCovarMatrix.
convertPointsToHomogeneous and convertPointsFromHomogeneous
reduce: reduces a matrix to a vector, row-wise or column-wise, by taking the maximum/minimum (CV_REDUCE_MAX/MIN), the mean (CV_REDUCE_AVG), or the sum (CV_REDUCE_SUM)

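calcCovarMatrix: computes the covariance matrix. When the result is not normalized (no scaling flag is passed), it has to be divided by $n-1$ manually, for example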

cv::Mat data = (cv::Mat_<double>(2, 6) << -2.5, -1.5, -0.5, 0.5, 1.5, 2.5,
                                        -0.8, -1.2, 0.4, -0.5, 0.3, 1.8);
cv::Mat mean, covar;
cv::calcCovarMatrix(data, covar, mean, cv::COVAR_NORMAL + cv::COVAR_COLS);
// The resulting covar is [[17.5, 8.3], [8.3, 5.82]]; each entry still has to be divided by 6 - 1 = 5
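The unnormalized result and the division by $n-1$ can be checked by hand. The following is a minimal pure-Python sketch of the arithmetic, not OpenCV's implementation; the function name `scatter_and_covariance` is made up for illustration, and each sample is one column of the `data` matrix above.

```python
def scatter_and_covariance(samples):
    """samples: list of observations, each a list of feature values."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(dim)]
    # Unnormalized sum of outer products of the centered samples.
    scatter = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples)
                for j in range(dim)] for i in range(dim)]
    # Sample covariance: divide every entry by n - 1.
    covariance = [[v / (n - 1) for v in row] for row in scatter]
    return mean, scatter, covariance


xs = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
ys = [-0.8, -1.2, 0.4, -0.5, 0.3, 1.8]
mean, scatter, cov = scatter_and_covariance([list(p) for p in zip(xs, ys)])
print(scatter)  # approximately [[17.5, 8.3], [8.3, 5.82]]
print(cov)      # approximately [[3.5, 1.66], [1.66, 1.164]]
```

The `scatter` values reproduce the numbers quoted in the comment above, and `cov` is the same matrix divided by 5.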

SVD::compute
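LDA: the number of "projection axes" used for dimensionality reduction/classification depends on the number of classes in the labeled input data (unlike PCA, where it depends on the number of feature dimensions of the input data)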
solve
eigen
norm
dft
- Meaning of cv::idft() with DFT_REAL_OUTPUT option
LUT

core-optim

solveLP solves (non-integer) linear programming problems (an implementation of the Simplex algorithm from Introduction to Algorithms (Third Edition).2009.Cormen)

core-utils

See: opencvdoc: Optimization Algorithms
setUseOptimized and useOptimized

getTickCount and getTickFrequency: measure program execution time

int64 e1 = cv::getTickCount();
// do some work
int64 e2 = cv::getTickCount();
std::cout << (e2 - e1) / cv::getTickFrequency() << std::endl;
std::cout << std::boolalpha << cv::useOptimized() << std::endl;

features2d

imgproc

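getAffineTransform: computes a planar (2D) affine transformation matrix from 3 pairs of corresponding points; for an example see here. According to the source code, it accepts exactly three point pairs as input, builds a system of linear equations, and solves it with the solve function using LU decomposition.
getPerspectiveTransform: computes a 3×3 perspective (projective) transformation matrix from 4 pairs of corresponding points; for an example see here. According to the source code, it accepts exactly four point pairs as input, builds a system of linear equations, and solves it with the solve function using SVD.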
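To make the three-point affine case concrete, here is a minimal pure-Python sketch. It is not OpenCV's code: the helper names `solve3` and `affine_from_three_points` are invented for illustration, and the sketch solves two 3×3 systems with hand-written Gaussian elimination instead of assembling one combined system and calling `solve`.

```python
def solve3(a, b):
    """Solve the 3x3 system a @ x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def affine_from_three_points(src, dst):
    """Return the 2x3 matrix M with [x', y'] = M @ [x, y, 1] for the three point pairs."""
    a = [[x, y, 1.0] for x, y in src]
    row_x = solve3(a, [p[0] for p in dst])  # a11, a12, b1
    row_y = solve3(a, [p[1] for p in dst])  # a21, a22, b2
    return [row_x, row_y]


src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(2.0, 3.0), (4.0, 3.0), (2.0, 6.0)]
M = affine_from_three_points(src, dst)
print(M)  # [[2.0, 0.0, 2.0], [0.0, 3.0, 3.0]]
```

Three collinear source points make the system singular, and this sketch would then fail with a division by zero.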
warpAffine
warpPerspective
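grabCut: starting from a user-specified bounding box around the object to be segmented, the algorithm estimates the color distributions of the target object and the background with Gaussian mixture models. These are used to build a Markov random field over the pixel labels, with an energy function that prefers connected regions sharing the same label, and a graph-cut-based optimization is run to infer the label values. Because this estimate is likely to be more accurate than the initial one taken from the bounding box, the two-step procedure is repeated until convergence. Since segmentation requires fitting the Gaussian mixture models, it is slow on large images. See zhihu: opencv实战从0到n （15）- grabcut分割（抠图）
matchShapes
matchTemplate (for related discussion see "opencv. how to find correlation coefficient for two images"; this function can be used to compute the correlation coefficient of two images); it is implemented in templmatch.cpp
cv::integral: for example, the integral image $\texttt{sum}$ gives the sum of the image $\texttt{image}$ over the region $y_1\leq y<y_2, x_1\leq x<x_2$:
$$\sum_{y_1\leq y<y_2}\sum_{x_1\leq x<x_2}\texttt{image}(x,y)=\texttt{sum}(x_2,y_2)-\texttt{sum}(x_1,y_2)-\texttt{sum}(x_2,y_1)+\texttt{sum}(x_1,y_1)$$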
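The four-corner lookup can be tried on a small array. This is a pure-Python sketch of the idea rather than a call to `cv::integral`; the helper names are made up, and the table is indexed as `s[row][col]`, i.e. `s[y][x]`, with an extra leading row and column of zeros.

```python
def integral_image(image):
    """Return a (h+1) x (w+1) table where s[y][x] is the sum of image[0:y][0:x]."""
    h, w = len(image), len(image[0])
    s = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            s[y + 1][x + 1] = s[y][x + 1] + row_sum
    return s


def region_sum(s, x1, y1, x2, y2):
    """Sum of the pixels with y1 <= y < y2 and x1 <= x < x2, using four lookups."""
    return s[y2][x2] - s[y2][x1] - s[y1][x2] + s[y1][x1]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
s = integral_image(image)
print(region_sum(s, 1, 1, 3, 3))  # 5 + 6 + 8 + 9 = 28
```

Once the table is built, any rectangular sum costs four lookups regardless of the rectangle size.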

structural analysis and shape descriptors

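findContours¹ accepts a grayscale image, but a binary image is generally recommended; each returned contour is a sequence of points

The computed center coordinates once came out as integers, for the following reason: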

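img.setTo(2, 255 - mask); // note: the "2" here is a typo, and it is what makes the coordinates below come out as integers
cv::findContours(img, contours, cv::RETR_LIST, cv::CHAIN_APPROX_NONE);
// contour centers subsequently obtained with norm or fitEllipse are all integers or end in .5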

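contourArea computes the contour integral using Green's formula; it does not automatically subtract the area of interior holes (regardless of whether oriented is true), and oriented is meaningful when the targets to detect are white objects on a black background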
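For a polygonal contour, Green's formula reduces to the shoelace sum. The sketch below is a pure-Python illustration of that formula, not OpenCV's implementation; `contour_area` is a hypothetical name, and the sign convention of the oriented result here need not match OpenCV's.

```python
def contour_area(points, oriented=False):
    """Polygon area from the shoelace form of Green's formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    area = 0.5 * twice_area
    # The signed value depends on the traversal direction of the contour.
    return area if oriented else abs(area)


outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]

print(contour_area(outer))                        # 100.0, the hole is not subtracted
print(contour_area(outer) - contour_area(hole))   # 96.0, subtracting the hole by hand
print(contour_area(outer[::-1], oriented=True))   # -100.0, reversed traversal flips the sign
```

Each contour is integrated on its own, which is why the area of a hole has to be subtracted explicitly from the area of its parent contour.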
convexHull, moments: see the note Mathematics/Computer - Graphics

calib3d

initCameraMatrix2D/cvInitIntrinsicParams2D initializes the camera intrinsic matrix (used, for example, in the stereo_calib example in the source samples)

// extract vanishing points in order to obtain initial value of the focal length

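cvFindHomography: following Fixed Single-Camera 3D Laser Scanning.2020.Zausa, given points with known physical spacing on a plane and their image coordinates, compute $H$ with this function, then obtain the transformation from the plane to the camera coordinate system from
$$\frac{K^{-1}H}{|K^{-1}H|}=[r_1,r_2,t]$$

The directly computed result can of course be refined with an SVD so that $r_1$ and $r_2$ are orthogonal; the plane equation can also be computed from it.

image warp: use the homography to warp the calibration-board pattern to a frontal view, then detect corners/circle centers to improve accuracy?? For the procedure see the appendix code of Fixed Single-Camera 3D Laser Scanning.2020.Zausa; roughly: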

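graph LR
a[`findContours`<br>find contours] --> b[`approxPolyDP`<br>keep rectangles] --> c[`erode` to shrink and keep rectangular<br>regions enclosed by a wide black border] --> d[compute H with `findHomography`<br>from the known pattern aspect ratio] --> e[`warpPerspective`<br>warp to frontal view] --> g[detect corners] --> f[`perspectiveTransform` with H_inv<br>maps the detected corners back to the original image]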

Another application example

import cv2
import numpy as np

# Assumed to be defined by the caller: K (3x3 camera intrinsic matrix),
# imgPoints (the 4 image corners), rectWidth and rectHeight (metric rectangle size).
K_inv = np.linalg.inv(K)
metricPoints: np.ndarray = np.array(
    [[[0, rectHeight]], [[0, 0]], [[rectWidth, 0]], [[rectWidth, rectHeight]]],
    dtype=np.float32)
# Note: K^-1 H = [r0, r1, t] holds when H maps plane (metric) coordinates to image
# coordinates, so check the argument order below against that convention.
H = cv2.findHomography(imgPoints, metricPoints)[0]
# First, apply the inverse intrinsics to the homography to remove the camera
result = np.matmul(K_inv, H)
# We need to normalize our matrix to remove the scale factor
result /= cv2.norm(result[:, 1])
# Split homography columns to get two 3D rotation basis vectors and the translation
r0, r1, t = np.hsplit(result, 3)
# To get the third rotation basis vector simply take the cross product
r2 = np.cross(r0.T, r1.T).T
# Since r0 and r1 may not be orthogonal, we use the Zhang approximation:
# we keep only the u and vt parts of the SVD of the new rotation basis matrix
# to minimize the Frobenius norm of the difference
_, u, vt = cv2.SVDecomp(np.hstack([r0, r1, r2]))
R = np.matmul(u, vt)
# We finally have our origin center and normal vector
origin = t[:, 0]
normal = R[:, 2]

cvRodrigues2
cvProjectPoints2
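matMulDeriv: computes the partial derivatives of a matrix product with respect to each of the two multiplied matrices; used, for example, in composeRT. See the note Mathematics - Calculus
composeRT/cvComposeRT combines two transformations $[R_1(\text{om}_1)|T_1]$ and $[R_2(\text{om}_2)|T_2]$ into a single $[R_3(\text{om}_3)|T_3]$ (rotations are expressed as rotation vectors $\text{om}$ in both the input and output parameters, so the differentiation includes a Rodrigues conversion step)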
$$\begin{bmatrix}R_3&T_3\cr0&1\end{bmatrix}=\begin{bmatrix}R_2&T_2\cr0&1\end{bmatrix}\begin{bmatrix}R_1&T_1\cr 0&1\end{bmatrix}=\begin{bmatrix}R_2R_1&R_2T_1+T_2\cr0&1\end{bmatrix}$$

and computes the derivatives
$$ \frac{\partial \text{om}_3}{\partial \text{om}_1}, \frac{\partial \text{om}_3}{\partial \text{om}_2}, \frac{\partial \text{om}_3}{\partial T_1}, \frac{\partial \text{om}_3}{\partial T_2}, \frac{\partial T_3}{\partial \text{om}_1}, \frac{\partial T_3}{\partial \text{om}_2}, \frac{\partial T_3}{\partial T_1}, \frac{\partial T_3}{\partial T_2} $$
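The composition rule $R_3=R_2R_1$, $T_3=R_2T_1+T_2$ can be checked numerically. The following is a pure-Python sketch under simplifying assumptions: it works on rotation matrices directly (no rotation vectors, no Rodrigues conversion) and computes none of the derivatives; the helper names are invented for illustration.

```python
def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec3(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def compose(r1, t1, r2, t2):
    """Apply (r1, t1) first, then (r2, t2): R3 = R2 R1, T3 = R2 T1 + T2."""
    r3 = matmul3(r2, r1)
    rotated_t1 = matvec3(r2, t1)
    t3 = [rotated_t1[i] + t2[i] for i in range(3)]
    return r3, t3


def apply(r, t, p):
    rotated = matvec3(r, p)
    return [rotated[i] + t[i] for i in range(3)]


rz90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # rotation by 90 degrees about z
r3, t3 = compose(rz90, [1, 0, 0], rz90, [0, 0, 5])

p = [1, 2, 3]
step_by_step = apply(rz90, [0, 0, 5], apply(rz90, [1, 0, 0], p))
print(r3, t3)                          # [[-1, 0, 0], [0, -1, 0], [0, 0, 1]] [0, 1, 5]
print(apply(r3, t3, p), step_by_step)  # both are [-1, -1, 8]
```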
stereoCalibrate/cvStereoCalibrate (see stereoCalib.md)
stereoRectify
$Q$ matrix: look into its zero entries
reprojectImageTo3D: maps the (dense) disparity map computed from stereo-rectified images to a 3D point cloud ($z=bf/d=T_xf/(c_x-c_x'-d)=Z/W$). The main principle:
$$Q\begin{bmatrix} u\cr v \cr d \cr 1 \end{bmatrix}= \begin{bmatrix}1&0&0&-c_x\cr0&1&0&-c_y\cr0&0&0&f\cr0&0&-\frac{1}{T_x}&\frac{c_x-c_x'}{T_x}\end{bmatrix}\begin{bmatrix} u\cr v \cr d \cr 1 \end{bmatrix}=\begin{bmatrix} u-c_x\cr v-c_y \cr f \cr \frac{c_x-c_x'-d}{T_x} \end{bmatrix}= \begin{bmatrix} X\cr Y \cr Z \cr W \end{bmatrix}$$

See OpenCV: derivation for perspective transformation matrix (Q).2018 and csdn: reprojectImageTo3D函数.2019; the function can also be implemented by hand, as in stackoverflow: reprojectImageTo3D() in OpenCV.2014. ==Why does OpenCV's implementation of this function take the zero entries of $Q$ into account?==
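A single pixel can be pushed through the $Q$ matrix by hand. This is a minimal pure-Python sketch of the formula above, not the OpenCV function; `reproject_pixel` is a hypothetical name and all numeric values are made up. The negative $T_x$ is an assumption chosen so that a positive disparity gives a positive depth with this form of $Q$.

```python
def reproject_pixel(u, v, d, f, cx, cy, cx_prime, tx):
    """Map pixel (u, v) with disparity d to a 3D point (X/W, Y/W, Z/W)."""
    X = u - cx
    Y = v - cy
    Z = f
    W = (cx - cx_prime - d) / tx
    return X / W, Y / W, Z / W


# Illustrative values only: f in pixels, tx in metres.
x, y, z = reproject_pixel(u=420, v=240, d=25, f=500,
                          cx=320, cy=240, cx_prime=320, tx=-0.125)
print(x, y, z)  # 0.5 0.0 2.5
```

With $c_x=c_x'$ the depth reduces to $|T_x|f/d = 0.125\cdot500/25 = 2.5$, which matches the $z=bf/d$ form quoted above.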
estimateAffine2D
$$\begin{bmatrix} x\cr y\cr \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12}\cr a_{21} & a_{22}\cr \end{bmatrix} \begin{bmatrix} X\cr Y\cr \end{bmatrix} + \begin{bmatrix} b_1\cr b_2\cr \end{bmatrix}$$

How does it differ from getAffineTransform?
getOptimalNewCameraMatrix
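stereoBM
- pre-filter: a process similar to edge detection, "intensity normalization"
- cost: SAD
- post-filter
  - uniqueness ratio: the margin, in percent, by which the cost of the best-matching disparity $d^{*}$ must be smaller than the cost of any other candidate disparity $d$; the match is kept when
    $$\texttt{SAD}(d)\geq\texttt{SAD}(d^*)\cdot(1+\frac{\texttt{uniquenessRatio}}{100})$$
  - texture: the sum of the texture (the pre-filter output, similar to edge texture) over all pixels in the window
  - speckle: pixels whose disparity (mind whether it needs to be multiplied by 16) differs from that of their 4-neighbors by no more than speckleRange are connected into a region (blob); regions with at least speckleWindowSize pixels are kept, the rest are filtered out
- on acceleration:
  - parallel computation over blocks of rows
  - avoiding repeated computation within the window (?), in the manner of an integral image (?)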
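The uniqueness-ratio test can be written out for a single pixel. The sketch below is a simplified pure-Python illustration, not the stereoBM implementation: `unique_best_disparity` is a made-up name, the SAD values are invented, and the best cost is compared against every other candidate disparity.

```python
def unique_best_disparity(sad, uniqueness_ratio):
    """Return the best disparity index, or None if the match is not unique enough.

    sad[d] is the matching cost at disparity d; uniqueness_ratio is in percent.
    """
    best = min(range(len(sad)), key=lambda d: sad[d])
    threshold = sad[best] * (1 + uniqueness_ratio / 100)
    for d, cost in enumerate(sad):
        if d != best and cost < threshold:
            return None  # another candidate is too close to the best cost
    return best


print(unique_best_disparity([90, 40, 95, 100], 15))  # 1: every other cost is >= 46
print(unique_best_disparity([90, 40, 44, 100], 15))  # None: 44 is within 15% of 40
```

A larger ratio rejects more ambiguous matches, at the price of leaving more pixels without a disparity.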
Others

ml

references
- Additive logistic regression: a statistical view of boosting.2000.Friedman

gpu

CUDA-accelerated Computer vision

legacy

objdetect

photo

stitching

extra modules

ximgproc

Extended Image Processing; for the features it contains, see here
Namespace: ximgproc
Sparse match interpolators
Disparity map filters
- Adaptive Manifold filter
- Weighted Least Squares filter
- Domain Transform filter
- Fast Bilateral Solver
- Fast Global Smoother filter
- Guided filter
- Edge Aware Interpolator
- Robust Interpolation method of Correspondences (RIC)
- Ridge Detection filter
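Some other functions
- getDisparityVis: visualizes a disparity map
- computeBadPixelPercent: computes the error rate relative to the ground truth
- computeMSE: computes the mean square error relative to the ground truth
- readGT: reads the ground-truth disparity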

phase_unwrapping

structured_light²

No acceleration or optimization has been done, so it is slow

See the decode gray code pattern tutorial in OpenCV: Structured light tutorials for an explanation

graph LR
a["stereoRectify"] --> b["initUndistortRectifyMap"] --> c["remap"] --> d["decode"] --"disparity=Δx"--> e["reprojectImageTo3D"]

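graycode:
- stereo or multi-camera;
- Gray codes in both directions;
- computes a disparity map at the projector resolution;
- the images are already stereo-rectified, so the disparity is the difference between the x-coordinates, in the two cameras, of the pixels corresponding to the same projector pixel
- the point cloud is computed from the disparity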
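The Gray-code part of this pipeline can be sketched in a few lines. This is a pure-Python illustration of binary-reflected Gray codes and of a per-row disparity lookup, not the module's decoder: all function names are hypothetical, and the one-to-one matching of decoded projector columns is a simplifying assumption.

```python
def to_gray(n):
    """Binary-reflected Gray code of a projector column index."""
    return n ^ (n >> 1)


def from_gray(g):
    """Inverse of to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_to_column(bits):
    """bits: thresholded pattern images at one pixel, most significant bit first."""
    g = 0
    for b in bits:
        g = (g << 1) | b
    return from_gray(g)


def row_disparities(left_cols, right_cols):
    """For one rectified row, map each left x to (left x - right x) of the same projector column."""
    right_lookup = {col: x for x, col in enumerate(right_cols)}
    return {x: x - right_lookup[col]
            for x, col in enumerate(left_cols) if col in right_lookup}


print(to_gray(5), bits_to_column([1, 1, 1]))          # 7 5
print(row_disparities([3, 4, 5, 6], [5, 6, 7, 8]))    # {2: 2, 3: 2}
```

Neighbouring column indices differ in exactly one Gray-code bit, which limits the damage a single misclassified pixel can do at stripe boundaries.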
sinusoidal
- three methods for computing the relative (wrapped) phase
  - PSP (Phase Shifting Profilometry; requires three images)
    $$\phi=\arctan{\frac{(1-\cos\psi)\cdot(I_3-I_2)}{\sin\psi\cdot(2I_1-I_2-I_3)}}$$
  - FTP³ (Fourier Transform Profilometry; a single image suffices, but three are still needed to compute the mask by averaging)
    $$ \phi=\operatorname{FTP}(I_1)=\arctan{\frac{\operatorname{re}(\mathcal{F}^{-1}(\mathcal{F}(I_1)))}{\operatorname{im}(\mathcal{F}^{-1}(\mathcal{F}(I_1)))}} $$
  - FAPS³ (Fourier-Assisted Phase Shifting)
    - the basic phase retrieval still applies FTP to a single image; three consecutive images are used to compensate for target motion
      $$\phi=\arctan{\frac{1-\cos\theta_2+(1-\cos\theta_1)h}{\sin\theta_1 h-\sin\theta_2}},h=\frac{I_2-I_3}{I_1-I_2},\\ \theta_1=\operatorname{FTP}(I_2)-\operatorname{FTP}(I_1),\theta_2=\operatorname{FTP}(I_3)-\operatorname{FTP}(I_2)$$
    - the projected pattern carries a regular grid of small white dots for spatial phase unwrapping; see the original paper for details
- phase unwrap
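The PSP formula can be verified on synthetic intensities. The sketch below is pure Python and rests on an assumed intensity model that is not stated in the notes: $I_1=a+b\cos\phi$, $I_2=a+b\cos(\phi+\psi)$, $I_3=a+b\cos(\phi-\psi)$, with $0<\psi<\pi$. Under that model the PSP expression recovers $\phi$, and `atan2` resolves the quadrant.

```python
import math


def psp_phase(i1, i2, i3, psi):
    """Wrapped phase in (-pi, pi] from three phase-shifted intensities."""
    num = (1 - math.cos(psi)) * (i3 - i2)
    den = math.sin(psi) * (2 * i1 - i2 - i3)
    return math.atan2(num, den)


def synthetic_intensities(phi, psi, a=0.5, b=0.4):
    """Assumed model: offset a, modulation b, shifts 0, +psi, -psi."""
    return (a + b * math.cos(phi),
            a + b * math.cos(phi + psi),
            a + b * math.cos(phi - psi))


psi = 2 * math.pi / 3
for phi in (-2.5, -1.0, 0.3, 2.0):
    recovered = psp_phase(*synthetic_intensities(phi, psi), psi)
    print(phi, round(recovered, 6))
```

The recovered phase is wrapped, so a true phase outside $(-\pi,\pi]$ comes back modulo $2\pi$; that is what the phase unwrap step addresses.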

The mathematics principles behind the OpenCV function Calibration: OpenCV forum, 2020-02-04, not answered
Stereo Calibration - what is optimised?: OpenCV forum, 2016-02-25, answer not accepted
