Multiple shape matching – Hölder’s inequality


Theory


Given two density functions f and g, we can measure the similarity between them using the Cauchy-Schwarz divergence (\mathcal{D_{CS}}), derived from the Cauchy-Schwarz inequality \left(\int{f(x)g(x)dx}\right)^2 \leq \int{f^2(x)dx}\int{g^2(x)dx}, where equality holds if and only if f(x)=Cg(x); for pdfs, normalization forces C=1, so equality holds exactly when f = g.
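As a quick sanity check, here is a minimal NumPy sketch evaluated on a grid, assuming the standard construction \mathcal{D_{CS}}(f,g) = -\log\frac{\left(\int{fg}\right)^2}{\int{f^2}\int{g^2}} (the negative log of the ratio of the two sides of the inequality); the test densities are illustrative Gaussians:

```python
import numpy as np

# Grid approximation of the Cauchy-Schwarz divergence
# D_CS(f, g) = -log( (integral f g dx)^2 / (integral f^2 dx * integral g^2 dx) ).
x = np.linspace(-5.0, 5.0, 1001)
dx = x[1] - x[0]

f = np.exp(-x ** 2 / 2)
f /= f.sum() * dx                  # normalize to a pdf on the grid
g = np.exp(-(x - 1.0) ** 2 / 2)
g /= g.sum() * dx                  # shifted copy of f

def d_cs(f, g, dx):
    num = ((f * g).sum() * dx) ** 2
    den = ((f ** 2).sum() * dx) * ((g ** 2).sum() * dx)
    return -np.log(num / den)

print(d_cs(f, f, dx))  # ~0: identical pdfs
print(d_cs(f, g, dx))  # > 0: the shifted copy is penalized
```

The divergence vanishes only when the two pdfs coincide, matching the equality condition above.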

The Cauchy-Schwarz inequality can be extended to three or more functions: \left(\int{f(x)g(x)h(x)dx}\right)^2 \leq \int{f^2(x)dx}\int{g^2(x)dx}\int{h^2(x)dx}. However, we cannot characterize when equality holds in this form: even identical pdfs do not guarantee equality. As a result, we provide the extension to more than two pdfs using Hölder’s inequality:

\left(\int{f(x)g(x)h(x)dx}\right)^3 \leq \int{f^3(x)dx}\int{g^3(x)dx}\int{h^3(x)dx},

which for more than three terms is written as:

\left(\int{\prod^{n}_{k=1}f_k(x)dx}\right)^n \leq \prod^{n}_{k=1} \int{f_k^n(x)dx}.
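A quick numerical check of the generalized inequality on a grid (the three test densities below are arbitrary Gaussians chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-6.0, 6.0, 1201)
dx = x[1] - x[0]

n = 3
pdfs = []
for _ in range(n):
    mu = rng.uniform(-2.0, 2.0)        # illustrative random mean
    s = rng.uniform(0.5, 1.5)          # illustrative random std
    f = np.exp(-(x - mu) ** 2 / (2 * s ** 2))
    pdfs.append(f / (f.sum() * dx))    # normalize to a pdf

lhs = (np.prod(pdfs, axis=0).sum() * dx) ** n        # (integral of prod f_k)^n
rhs = np.prod([(f ** n).sum() * dx for f in pdfs])   # prod of integrals of f_k^n
print(lhs <= rhs)  # True: Hölder's inequality holds
```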

The Hölder divergence then becomes:

\mathcal{D_{H}}(f_k) = -\log\frac{ \left(\int{\prod^{n}_{k=1}f_k(x)dx}\right)^n } { \prod^{n}_{k=1} \int{f_k^n(x)dx} }.

\mathcal{D_{H}}(f_k) can be expanded and rewritten as:

\mathcal{D_{H}}(f_k) = -n\log\int{\prod^{n}_{k=1}f_k(x)dx} + \sum^{n}_{k=1} \log\int{f_k^n(x)dx}.

This is interesting because the two terms act as the geometric and arithmetic means of the functions f_k, and equality (\mathcal{D_{H}} = 0) holds when all the pdfs are identical.
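The decomposition above translates directly into a few lines of NumPy; a minimal grid-based sketch (the pdf construction is illustrative):

```python
import numpy as np

def holder_divergence(pdfs, dx):
    # D_H = -n log(integral of prod_k f_k) + sum_k log(integral of f_k^n),
    # computed as a grid approximation of the integrals.
    n = len(pdfs)
    cross = np.prod(pdfs, axis=0).sum() * dx
    return -n * np.log(cross) + sum(np.log((f ** n).sum() * dx) for f in pdfs)

x = np.linspace(-5.0, 5.0, 1001)
dx = x[1] - x[0]
f = np.exp(-x ** 2 / 2)
f /= f.sum() * dx                  # standard normal pdf on the grid
g = np.exp(-(x - 1.0) ** 2 / 2)
g /= g.sum() * dx                  # shifted copy

print(holder_divergence([f, f, f], dx))  # ~0: identical pdfs
print(holder_divergence([f, f, g], dx))  # > 0: one mismatched pdf
```

As expected, the divergence is zero when all pdfs coincide and strictly positive otherwise.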

Suppose we have a point set {\bf X} = ({\bf x}_1,\dots,{\bf x}_N)^T, where {\bf x}_i \in \mathbb{R}^d. The kernel (Parzen) estimate of the pdf at an arbitrary point {\bf x}, using an arbitrary kernel function \kappa(\cdot), is given by \hat{f}_{\bf X}({\bf x}) = \frac{1}{N}\sum^N_{i=1}{\kappa\left(\frac{{\bf x}-{\bf x}_i}{\sigma}\right)}, where \sigma is the bandwidth parameter. The Gaussian kernel, G_\sigma(x,y) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{\left\|x-y\right\|^2}{2\sigma^2}\right), is the kernel of choice for its properties: it is symmetric and positive definite, it approaches zero as points move away from the center, and its decay can be controlled via the kernel bandwidth. Substituting the Gaussian kernel and performing some manipulations yields the following estimator for the argument of the first term:

\int{\prod^{n}_{k=1}f_k(x)\,dx} \\ \cong \int{\prod^{n}_{k=1}\left(\frac{1}{N_k}\sum^{N_k}_{i_k=1}{G_\sigma(x - x_{i_k})}\right)dx} \\ = \frac{1}{\prod^{n}_{k=1}{N_k}} \sum^{N_1}_{i_1=1}\cdots\sum^{N_n}_{i_n=1}{\int{\prod^{n}_{k=1}{G_\sigma(x - x_{i_k})}\,dx}} \\ \propto \sum^{N_1}_{i_1=1}\cdots\sum^{N_n}_{i_n=1}{\prod^{n-1}_{p=1}\prod^{n}_{q=p+1}{G_{\sqrt{n}\sigma}(x_{i_p} - x_{i_q})}}

The argument of the second term can be estimated in a similar way. Although these estimators provide a closed-form solution, their cost grows as O\left(n^2\prod^{n}_{k=1}{N_k}\right) — one term per tuple of indices — making them feasible only for a few shapes with a limited number of data points.
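As a concrete (and deliberately brute-force) illustration, the estimator can be evaluated by summing the pairwise Gaussians over every tuple of indices. The sketch below is 1-D for readability, uses small synthetic point sets, and applies the 1/N_k Parzen normalizations explicitly; the names and data are illustrative:

```python
import numpy as np
from itertools import product

def gauss(d, sigma):
    # 1-D Gaussian kernel G_sigma evaluated at a difference d
    return np.exp(-d ** 2 / (2 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

def cross_term(point_sets, sigma):
    # Brute-force estimate of the cross term (up to a constant): for every
    # tuple of indices, multiply the pairwise kernels G_{sqrt(n) sigma}.
    # Cost is O(n^2 * prod_k N_k) -- the complexity bottleneck noted above.
    n = len(point_sets)
    total = 0.0
    for combo in product(*point_sets):   # one point drawn from each set
        val = 1.0
        for p in range(n - 1):
            for q in range(p + 1, n):
                val *= gauss(combo[p] - combo[q], np.sqrt(n) * sigma)
        total += val
    # apply the 1/N_k normalization from each Parzen estimate
    return total / np.prod([float(len(s)) for s in point_sets])

sets = [np.random.default_rng(k).normal(size=20) for k in range(3)]
near = cross_term(sets, sigma=0.5)                          # overlapping sets
far = cross_term([sets[0], sets[1], sets[2] + 10.0], sigma=0.5)
print(near > far)  # True: moving one set away shrinks the cross term
```

Even at only 20 points per set, the triple loop already visits 20^3 = 8000 tuples, which makes the quoted complexity limitation tangible.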


Examples


Below is a simple example of the shape of a bat, which is replicated four times and then rotated and scaled to various degrees and sizes. The left figure shows the starting orientation and size of the four shapes, and the right figure shows the final result.


(more results coming soon)