TODOs
- Geometric Image Transformations
- findCirclesGrid
- reprojectImageTo3D
- Random patterns can be generated with cv::randpattern::RandomPatternGenerator; see the OpenCV doc: Multi-camera calibration - background reduction
- computeSaliency
- __m128i namespaces and so on
- hal: Hardware Acceleration Layer. This separate module is used to accelerate OpenCV on different platforms. The HAL can be developed independently from OpenCV. A HAL doesn't have to implement all operations, and it doesn't depend on the OpenCV API.
- SSE2: Streaming SIMD Extensions 2 (an instruction set related to hardware acceleration)
- ocl: Open Computing Language (OpenCL) is an open standard for writing code that runs across heterogeneous platforms including CPUs, GPUs, DSPs, etc.
- IPP: Intel Integrated Performance Primitives
- TBB: Intel Threading Building Blocks. Parallelization on Android is done via Intel TBB. For a short usage tutorial see here. A sketch for querying these acceleration paths at runtime follows this list.
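A minimal sketch (only standard core API: cv::checkHardwareSupport, cv::ocl::haveOpenCL, cv::useOptimized) that queries what the current build and machine support:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/core/ocl.hpp>      // cv::ocl::haveOpenCL
#include <opencv2/core/utility.hpp>  // cv::checkHardwareSupport, cv::useOptimized
#include <iostream>

int main() {
    std::cout << std::boolalpha
              << "SSE2 supported: " << cv::checkHardwareSupport(CV_CPU_SSE2) << "\n"
              << "OpenCL present: " << cv::ocl::haveOpenCL() << "\n"
              << "optimized code: " << cv::useOptimized() << std::endl;
    return 0;
}
```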
main modules
highgui
- setWindowProperty
core
core-array
- copyMakeBorder and the various BorderTypes
- perspectiveTransform
- gemm: generalized matrix multiplication; with the transpose flags GEMM_1_T and GEMM_3_T set it computes
$$\texttt{dst} = \texttt{alpha} \cdot \texttt{src1} ^T \cdot \texttt{src2} + \texttt{beta} \cdot \texttt{src3} ^T$$
- mulTransposed: computes
$$\texttt{dst} = \texttt{scale} ( \texttt{src} - \texttt{delta} )^T ( \texttt{src} - \texttt{delta} )$$
used e.g. inside calcCovarMatrix
- convertPointsToHomogeneous and convertPointsFromHomogeneous
- reduce: "reduces" a matrix to a vector along rows or columns; the reduction can take the maximum/minimum (CV_REDUCE_MAX/MIN), the average (CV_REDUCE_AVG), or the sum (CV_REDUCE_SUM); see the sketch after this list
- calcCovarMatrix: computes a covariance matrix; the result is not normalized, so divide by $n-1$ manually, for example:
```cpp
cv::Mat data = (cv::Mat_<double>(2, 6) << -2.5, -1.5, -0.5, 0.5, 1.5, 2.5,
                                          -0.8, -1.2, 0.4, -0.5, 0.3, 1.8);
cv::Mat mean, covar;
cv::calcCovarMatrix(data, covar, mean, cv::COVAR_NORMAL + cv::COVAR_COLS);
// covar is [[17.5, 8.3], [8.3, 5.82]]; both should still be divided by 6-1 = 5
```
- SVD::compute
- LDA: the number of "projection axes" for dimensionality reduction/classification is tied to the number of classes in the labeled input data (unlike PCA, where it is tied to the feature dimensionality of the input)
- solve
- eigen
- norm
- dft
- LUT
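A minimal cv::reduce sketch (the matrix values are arbitrary assumptions), collapsing a 2x3 matrix column-wise into a row of sums:

```cpp
#include <opencv2/core.hpp>
#include <iostream>

int main() {
    cv::Mat m = (cv::Mat_<double>(2, 3) << 1, 2, 3,
                                           4, 5, 6);
    cv::Mat colSums;
    // dim = 0: the matrix is reduced to a single row (each column is summed)
    cv::reduce(m, colSums, 0, cv::REDUCE_SUM);
    std::cout << colSums << std::endl;  // [5, 7, 9]
    return 0;
}
```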
core-optim
- solveLP: solves (non-integer) linear programming problems (an implementation of the Simplex method from Introduction to Algorithms (Third Edition).2009.Cormen); see the sketch below
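A hedged sketch using the textbook example from the CLRS Simplex chapter (the problem data is an assumption, not from OpenCV): maximize $3x_1+x_2+2x_3$ subject to $x_1+x_2+3x_3\leq 30$, $2x_1+2x_2+5x_3\leq 24$, $4x_1+x_2+2x_3\leq 36$, $x\geq 0$:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/core/optim.hpp>  // cv::solveLP
#include <iostream>

int main() {
    // Objective coefficients c; solveLP maximizes c.x
    cv::Mat func = (cv::Mat_<double>(1, 3) << 3, 1, 2);
    // Constraints: each row is [a1 a2 a3 b], meaning a.x <= b; x >= 0 is implicit
    cv::Mat constr = (cv::Mat_<double>(3, 4) << 1, 1, 3, 30,
                                                2, 2, 5, 24,
                                                4, 1, 2, 36);
    cv::Mat z;
    int res = cv::solveLP(func, constr, z);  // cv::SOLVELP_SINGLE on a unique optimum
    std::cout << "status " << res << ", x = " << z.t() << std::endl;  // expected x = [8, 4, 0]
    return 0;
}
```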
core-utils
- setUseOptimized and useOptimized
- getTickCount and getTickFrequency: measure execution time
```cpp
int64 e1 = cv::getTickCount();
// do some work
int64 e2 = cv::getTickCount();
std::cout << (e2 - e1) / cv::getTickFrequency() << std::endl;
std::cout << std::boolalpha << cv::useOptimized() << std::endl;
```
features2d
imgproc
- getAffineTransform: computes a planar/2D coordinate transformation matrix from 3 pairs of corresponding points; for an example see here; per the source it accepts exactly three point pairs; it builds a linear system and solves it with solve using LU decomposition
- getPerspectiveTransform: computes a spatial/3D coordinate transformation matrix from 4 pairs of corresponding points; for an example see here; per the source it accepts exactly four point pairs; it builds a linear system and solves it with solve using SVD decomposition
- warpAffine
- warpPerspective
- grabCut: starting from a user-specified bounding box around the object to be segmented, the algorithm estimates the color distributions of the target object and of the background with Gaussian mixture models. These are used to build a Markov random field over the pixel labels, with an energy function that prefers connected regions sharing the same label, and a graph-cut based optimization is run to infer their values. Since this estimate is likely more accurate than the initial one taken from the bounding box, the two-step procedure is repeated until convergence. Because segmentation requires training Gaussian mixture models, large images take noticeable time. See zhihu: opencv实战从0到n (15)- grabcut分割(抠图)
- matchShapes
- matchTemplate (for related discussion see opencv. how to find correlation coefficient for two images; this function can be used to compute the correlation coefficient of two images); implemented in templmatch.cpp
- cv::integral, e.g. using the integral image $\texttt{sum}$ to compute the sum of image $\texttt{image}$ over the region $y_1\leq y<y_2, x_1\leq x<x_2$ (see the sketch after this list):
$$\sum_{y_1\leq y<y_2}\sum_{x_1\leq x<x_2}\texttt{image}(x,y)=\texttt{sum}(x_2,y_2)-\texttt{sum}(x_1,y_2)-\texttt{sum}(x_2,y_1)+\texttt{sum}(x_1,y_1)$$
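A minimal sketch of the formula above (the image values are arbitrary assumptions); note that the integral image has one extra row and column:

```cpp
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    cv::Mat image = (cv::Mat_<uchar>(3, 3) << 1, 2, 3,
                                              4, 5, 6,
                                              7, 8, 9);
    cv::Mat sum;
    cv::integral(image, sum, CV_32S);     // sum is (rows+1) x (cols+1)
    int x1 = 0, y1 = 0, x2 = 2, y2 = 2;   // half-open box [x1,x2) x [y1,y2)
    int boxSum = sum.at<int>(y2, x2) - sum.at<int>(y1, x2)
               - sum.at<int>(y2, x1) + sum.at<int>(y1, x1);
    std::cout << boxSum << std::endl;     // 1+2+4+5 = 12
    return 0;
}
```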
structural analysis and shape descriptors
- findContours [1]: a grayscale image works, but a binary image is generally recommended; the return value is sequences of points
  - A case where the computed contour centers were all integers; the cause, once encountered:
```cpp
img.setTo(2, 255 - mask);  // note: the "2" here was written wrong, which made the later center coordinates integers
cv::findContours(img, contours, cv::RETR_LIST, cv::CHAIN_APPROX_NONE);
// contour centers obtained afterwards via norm or fitEllipse are all integers, or .5
```
- contourArea: computes the contour integral via Green's formula; it does not automatically subtract the area of interior holes (regardless of whether oriented is true), and oriented is meaningful when the detection target is a white object on a black background; see the sketch after this list
- convexHull, moments: see 数学/计算机 - 图形学Graphics
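A minimal sketch of the Green's-formula behavior of contourArea (the image content is an assumption): the polygonal contour of a filled 10x10 square runs through pixel centers, so the reported area is 81, not 100:

```cpp
#include <opencv2/imgproc.hpp>
#include <iostream>
#include <vector>

int main() {
    // a filled 10x10 white square on a black background
    cv::Mat img = cv::Mat::zeros(20, 20, CV_8UC1);
    cv::rectangle(img, cv::Point(5, 5), cv::Point(14, 14), cv::Scalar(255), cv::FILLED);
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(img, contours, cv::RETR_LIST, cv::CHAIN_APPROX_NONE);
    std::cout << cv::contourArea(contours[0]) << std::endl;  // 81 = 9*9
    return 0;
}
```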
calib3d
- initCameraMatrix2D/cvInitIntrinsicParams2D: initializes the camera intrinsic matrix (used e.g. in the stereo_calib sample shipped with the source) // extract vanishing points in order to obtain an initial value of the focal length
- cvFindHomography: following Fixed Single-Camera 3D Laser Scanning.2020.Zausa, given planar points with known physical spacing and their image coordinates, compute $H$ with this function and then obtain the plane-to-camera transform from
$$\frac{K^{-1}H}{|K^{-1}H|}=[r_1,r_2,t]$$
The direct result can additionally be refined with an SVD so that the two rotation columns become orthogonal; from this the plane equation can also be computed.
- image warp: warp the calibration pattern to a frontal view via a homography, then detect corners/circle centers to improve precision (?); for the pipeline see the appendix code of Fixed Single-Camera 3D Laser Scanning.2020.Zausa, roughly:
```mermaid
graph LR
  a[`findContours`<br>find contours] --> b[`approxPolyDP`<br>select rectangles] --> c[`erode` to shrink and select the rectangular<br>region bounded by a wide black border] --> d[compute H with `findHomography`<br>from the known pattern aspect ratio] --> e[`warpPerspective`<br>warp to a frontal view] --> f[detect corners] --> g[`perspectiveTransform` with H_inv maps the<br>detected corners back to the original image]
```
- Another application example:
```python
K_inv = np.linalg.inv(K)  # K: camera intrinsic matrix
metricPoints: np.ndarray = np.array(
    [[[0, rectHeight]], [[0, 0]], [[rectWidth, 0]], [[rectWidth, rectHeight]]])
H = cv2.findHomography(imgPoints, metricPoints)[0]
# First, apply the inverse intrinsics to the homography to remove the camera
result = np.matmul(K_inv, H)
# We need to normalize our matrix to remove the scale factor
result /= cv2.norm(result[:, 1])
# Split homography columns to get two 2D rotation basis vectors and translation
r0, r1, t = np.hsplit(result, 3)
# To get the third rotation basis vector simply make the cross product
r2 = np.cross(r0.T, r1.T).T
# Since r0 and r1 may not be orthogonal, we use the Zhang approximation:
# we keep only the u and vt parts of the SVD of the new rotation basis matrix
# to minimize the Frobenius norm of the difference
_, u, vt = cv2.SVDecomp(np.hstack([r0, r1, r2]))
R = np.matmul(u, vt)
# We finally have our origin center and normal vector
origin = t[:, 0]
normal = R[:, 2]
```
- cvRodrigues2
- cvProjectPoints2
- matMulDeriv: computes the partial derivatives of a matrix product with respect to each of the two factors; used e.g. by composeRT; see 数学 - 微积分Calculus
- composeRT/cvComposeRT: composes two transforms $[R_1(\text{om}_1)|T_1]$ and $[R_2(\text{om}_2)|T_2]$ into one $[R_3(\text{om}_3)|T_3]$ (rotations in the input and output parameters are represented as rotation vectors $\text{om}$, so the derivation includes a Rodrigues conversion step)
$$\begin{bmatrix}R_3&T_3\cr0&1\end{bmatrix}=\begin{bmatrix}R_2&T_2\cr0&1\end{bmatrix}\begin{bmatrix}R_1&T_1\cr 0&1\end{bmatrix}=\begin{bmatrix}R_2R_1&R_2T_1+T_2\cr0&1\end{bmatrix}$$
and computes the derivatives
$$ \frac{\partial \text{om}_3}{\partial \text{om}_1}, \frac{\partial \text{om}_3}{\partial \text{om}_2}, \frac{\partial \text{om}_3}{\partial T_1}, \frac{\partial \text{om}_3}{\partial T_2}, \frac{\partial T_3}{\partial \text{om}_1}, \frac{\partial T_3}{\partial \text{om}_2}, \frac{\partial T_3}{\partial T_1}, \frac{\partial T_3}{\partial T_2} $$
- stereoCalibrate/cvStereoCalibrate (see stereoCalib.md)
- stereoRectify: for the zero entries of the $Q$ matrix, see the reprojectImageTo3D item below
- reprojectImageTo3D: maps the (dense) disparity map computed from rectified stereo images to a 3D point cloud ($z=bf/d=T_xf/(c_x-c_x'-d)=Z/W$). The core idea:
$$Q\begin{bmatrix} u\cr v \cr d \cr 1 \end{bmatrix}= \begin{bmatrix}1&0&0&-c_x\cr0&1&0&-c_y\cr0&0&0&f\cr0&0&-\frac{1}{T_x}&\frac{c_x-c_x'}{T_x}\end{bmatrix}\begin{bmatrix} u\cr v \cr d \cr 1 \end{bmatrix}=\begin{bmatrix} u-c_x\cr v-c_y \cr f \cr \frac{c_x-c_x'-d}{T_x} \end{bmatrix}= \begin{bmatrix} X\cr Y \cr Z \cr W \end{bmatrix}$$
See OpenCV: derivation for perspective transformation matrix (Q).2018 and csdn: reprojectImageTo3D函数.2019; the function can also be reimplemented by hand, e.g. stackoverflow: reprojectImageTo3D() in OpenCV.2014. OpenCV's implementation ==takes the zero entries of the $Q$ matrix into account — why==?
- estimateAffine2D
$$\begin{bmatrix} x\cr y \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12}\cr a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} X\cr Y \end{bmatrix} + \begin{bmatrix} b_1\cr b_2 \end{bmatrix}$$
How does it differ from getAffineTransform?
- getOptimalNewCameraMatrix
- stereoBM (a minimal usage sketch follows this list)
  - pre-filter: a process similar to edge detection, "intensity normalization"
  - cost: SAD
  - post-filter
    - uniqueness ratio: the cost of the best-match disparity $d$ must be "smaller" than the cost of the second-best match $d^{*}$ by this ratio:
    $$\texttt{SAD}(d^*)\geq\texttt{SAD}(d)\cdot(1+\frac{\texttt{uniquenessRatio}}{100})$$
    - texture: the sum over all pixels in the window of the texture (the pre-filter result; similar to an edge response)
    - speckle: pixels whose four-neighborhood disparity differences (mind whether the values need to be multiplied by 16) are no larger than speckleRange are connected into a blob; blobs with at least speckleWindowSize pixels are kept, the rest are filtered out
  - on acceleration:
    - split into row blocks and compute in parallel
    - avoid repeated computation inside the window (?) via integral images (?)
- others
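A hedged StereoBM usage sketch (all parameter values are arbitrary assumptions; left/right must be rectified 8-bit grayscale images):

```cpp
#include <opencv2/calib3d.hpp>  // cv::StereoBM

cv::Mat computeDisparity(const cv::Mat& left, const cv::Mat& right) {
    // numDisparities must be a multiple of 16; blockSize must be odd
    cv::Ptr<cv::StereoBM> bm = cv::StereoBM::create(/*numDisparities=*/64, /*blockSize=*/21);
    bm->setUniquenessRatio(10);      // post-filter: uniqueness ratio in percent
    bm->setTextureThreshold(10);     // post-filter: minimum window texture sum
    bm->setSpeckleRange(4);          // post-filter: max disparity variation inside a blob
    bm->setSpeckleWindowSize(100);   // post-filter: minimum blob size to keep
    cv::Mat disp16, disp;
    bm->compute(left, right, disp16);        // fixed-point disparity, scaled by 16
    disp16.convertTo(disp, CV_32F, 1.0/16);  // back to real disparity values
    return disp;
}
```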
ml
- references
  - Additive logistic regression: a statistical view of boosting.2000.Friedman
gpu
CUDA-accelerated Computer vision
legacy
objdetect
photo
stitching
extra modules
ximgproc
- Extended Image Processing; for the included functionality see here
- namespace: ximgproc
- Sparse match interpolators
- Disparity map filters 视差图滤波
- Adaptive Manifold filter
- Weighted Least Squares filter
- Domain Transform filter
- Fast Bilateral Solver
- Fast Global Smoother filter
- Guided filter (see the sketch at the end of this section)
- Edge Aware Interpolator
- Robust Interpolation method of Correspondences (RIC)
- Ridge Detection filter
- some other functions
  - getDisparityVis: visualize a disparity map
  - computeBadPixelPercent: compute the bad-pixel percentage against the ground truth
  - computeMSE: compute the mean squared error against the ground truth
  - readGT: read the ground-truth disparity
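A hedged sketch of the Guided filter mentioned above (parameter values are assumptions):

```cpp
#include <opencv2/ximgproc.hpp>  // cv::ximgproc::guidedFilter

// Edge-preserving smoothing of src, steered by the (possibly identical) guide image
cv::Mat guidedSmooth(const cv::Mat& guide, const cv::Mat& src) {
    cv::Mat dst;
    // radius: local window radius; eps: regularization (squared intensity scale)
    cv::ximgproc::guidedFilter(guide, src, dst, /*radius=*/8, /*eps=*/500.0);
    return dst;
}
```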
phase_unwrapping
structured_light [2]
- no acceleration optimizations yet, so it is slow
- to understand the module, see the decode Gray code pattern tutorial among OpenCV: Structured light tutorials
```mermaid
graph LR
  a["stereoRectify"] --> b["initUndistortRectifyMap"] --> c["remap"] --> d["decode"] --"disparity=Δx"--> e["reprojectImageTo3D"]
```
- graycode (a pattern-generation sketch follows at the end of this section):
  - two or more cameras;
  - Gray codes in both directions;
  - computes a disparity map at the projector resolution;
  - the images are already stereo-rectified, so the disparity is the difference between the two cameras' x-coordinates corresponding to the same projector pixel
  - the point cloud is computed from the disparity
- sinusoidal
  - three methods for computing the relative (wrapped) phase
    - PSP (Phase Shifting Profilometry; needs three images)
    $$\phi=\arctan{\frac{(1-\cos\psi)\cdot(I_3-I_2)}{\sin\psi\cdot(2I_1-I_2-I_3)}}$$
    - FTP [3] (Fourier Transform Profilometry; needs only one image, but three are still used to compute an averaged mask)
    $$ \phi=\operatorname{FTP}(I_1)=\arctan{\frac{\operatorname{re}(\mathcal{F}^{-1}(\mathcal{F}(I_1)))}{\operatorname{im}(\mathcal{F}^{-1}(\mathcal{F}(I_1)))}} $$
    - FAPS [3] (Fourier-Assisted Phase Shifting)
      - the basic phase computation still applies FTP to a single image, but three adjacent images are used to compensate for target motion
      $$\phi=\arctan{\frac{1-\cos\theta_2+(1-\cos\theta_1)h}{\sin\theta_1 h-\sin\theta_2}},\quad h=\frac{I_2-I_3}{I_1-I_2},\\ \theta_1=\operatorname{FTP}(I_2)-\operatorname{FTP}(I_1),\quad \theta_2=\operatorname{FTP}(I_3)-\operatorname{FTP}(I_2)$$
      - the small white dots regularly distributed on the projected pattern are there for spatial phase unwrapping; see the original paper for details
  - phase unwrap
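A hedged sketch of generating the Gray-code patterns with the structured_light module (the projector resolution is an assumption):

```cpp
#include <opencv2/structured_light.hpp>
#include <vector>

int main() {
    int projWidth = 1024, projHeight = 768;  // assumed projector resolution
    cv::Ptr<cv::structured_light::GrayCodePattern> graycode =
        cv::structured_light::GrayCodePattern::create(projWidth, projHeight);
    std::vector<cv::Mat> patterns;
    graycode->generate(patterns);  // horizontal and vertical codes plus their inverses
    cv::Mat black, white;
    graycode->getImagesForShadowMasks(black, white);  // extra all-black/all-white frames for masking
    return 0;
}
```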
Related links
- The mathematics principles behind the OpenCV function Calibration: OpenCV forum, 2020-02-04, not answered
- Stereo Calibration - what is optimised?: OpenCV forum, 2016-02-25, answer not accepted
References
1. A Fast Operator for Detection and Precise Location of Distinct Points, Corners and Centres of Circular Features.1987.Forstner; for a Chinese-language walkthrough see cnblogs: 图像特征点检测相关
2. Source reference for the structured-light module: 3DUNDERWORLD-SLS An Open-Source Structured-Light Scanning System for Rapid Geometry Acquisition.2014.Gu
3. Accurate Dynamic 3D Sensing with Fourier-Assisted Phase Shifting.2015.Cong