Embedding

Comparison between Ng Laplacian and ShiMalik Laplacian

Given a weight matrix $W$, the ShiMalikLaplacian embedding method obtains $V_1$, the $k$ eigenvectors corresponding to the $k$ smallest eigenvalues different from 0 of the matrix $NL = D^{-\frac{1}{2}}(D-W)D^{-\frac{1}{2}}$. Then computes $X_1 = D^{\frac{1}{2}} V_1$ to obtain the solution of the generalized eigenvalue problem $(D-W)X_1 = \lambda D X_1$.

The NgLaplacian embedding method computes the $k$ eigenvectors corresponding to the $k$ largest eigenvalues of $L = D^{-\frac{1}{2}}WD^{-\frac{1}{2}}$.

The eigenvectors obtained with both metods should be the same. The eigenvalues are shifted: the eigenvectors corresponding to the eigenvalue $0$ of $NL$ are the eigenvectors corresponding to the eigenvalue $1$ of $L$.

In [1]:
using RDatasets
using SpectralClustering
using Distances
using GraphRecipes

function weight(i::Integer, neigh, v, m, l_scale)
    return exp.(-Distances.colwise(SqEuclidean(),m,v)./(l_scale[i].*l_scale[neigh]))
end
function local_scale_fun(v, m)
    return Distances.colwise(Euclidean(),m,v)
end
d = dataset("datasets", "iris")
d = convert(Matrix,d[:,[2,4]])'
knnconfig = KNNNeighborhood(d,10)
l_scale = local_scale(knnconfig, local_scale_fun, d  ) .+ eps()
weight_oracle = (i,neigh,v,m)->weight(i,neigh,v,m,l_scale)
G = create(knnconfig,weight_oracle,d);
ng_vec = embedding(NgLaplacian(3), G)
sm_vec = embedding(ShiMalikLaplacian(3),G);

using Plots
 
fig1 = Plots.plot(ng_vec[:,1:3], layout = (@layout grid(3,1)), legend=false)
title!(fig1.subplots[1], "Ng Laplacian largest eigenvectors")
fig2 = Plots.plot(sm_vec[:,1:3], layout = (@layout grid(3,1)), legend=false) 
title!(fig2.subplots[1], "Shi-Malik laplacian smallest eigenvectors")
plt = graphplot(G, markersize=5, func=(source;kw...)->fixed_layout(d, source)) 
title!(plt, "Simmilarity Graph")
Plots.plot(plt, fig1,fig2,   layout =( @layout [a{1w}; grid(1,2){1w}]), size=(800,800)  )
Out[1]:
Simmilarity Graph 0 50 100 150 -1.0 -0.5 0.0 0.5 Ng Laplacian largest eigenvectors 0 50 100 150 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 0 50 100 150 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 0 50 100 150 -0.5 0.0 0.5 1.0 Shi-Malik laplacian smallest eigenvectors 0 50 100 150 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 0 50 100 150 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75

Segmentation Given Partial Grouping Constraints

If partial information of the clustering is available, in Segmentation Given Partial Grouping Constraints the authors showed that it is possible to incorporate that information to the optimization problem that spectral clustering algorithms solve.

Let $X \in R^{N \times K}$ a partition matrix where $X=\left[ X_1,...,X_K \right]$ and $X(i,l) = 1$ if $i \in V_l$ and 0 otherwise. Given that every $X_t$ is a binary indicator for partition $V_t$ and since envery pattern is asigned to one partition: $X \mathbf{1}_K = \mathbf{1}_N$.

Let $C=\left\{ c_1, c_2, ...,c_n \right\}$ be the partial information of the clustering. $c_i$ is a set of the patterns belonging to the same cluster. Each set produces |c_i| - 1 constraints. Each contraint can be represented by an $N \times 1$ vector $U_k$ with only two non zero elements $U_k(i) = 1$ and $U_k(j)=-1$ if $i,j \in c_k$. Let $U$ be the matrix forme by every contraint vector located in the columns. Then the constraints satisfy $U^T X = 0$.

The constrained clustering problem for the normalized cuts criterion is the following: $$ \begin{array}{cc} \max & \frac{1}{K} \sum\limits_{l=1}^K \dfrac{X_l^T W X_l}{X_l^T D X_l} \\ s.t. & X \in \left\{ 0, 1\right\}^{N \times K}, \\ & X\mathbf{1}_K =\mathbf{1}_N, \\ & U^T X = \mathbf{0} \end{array} $$ where $D=\mbox{diag}(W \mathbf{1}_N)$.

In [2]:
using TestImages, Images, Plots, SpectralClustering, Distances, LinearAlgebra, Statistics
function weight(i::Integer,j::Vector{<:Integer},pixel_i, neighbors_data)
    dist_sim =  exp.(-Distances.colwise(SqEuclidean(), pixel_i[1:2],neighbors_data[1:2,:])./(2*5^2))
    col_sim =  exp.(-Distances.colwise(SqEuclidean(), pixel_i[3:5],neighbors_data[3:5,:])./(2*0.05^2))
    return dist_sim.*col_sim
end
img = testimage("fabio_color_256");
nconfig = PixelNeighborhood(3);
G = create(Float64, nconfig, weight, img);
ng_vec = embedding(NgLaplacian(3), G);
fabio_pixels = ([100;100;230], [115;150;150])
back_pixels = ([140;200;50;20;150], [20;250;240;20;230])
fabio_linear_ind = LinearIndices(size(img))[[CartesianIndex(i,j) for (i,j) in zip(fabio_pixels...)]]
back_linear_ind = LinearIndices(size(img))[[CartesianIndex(i,j) for (i,j) in zip(back_pixels...)]];
constraints = Vector{Integer}[fabio_linear_ind, back_linear_ind ];


cg_vec = embedding(PartialGroupingConstraints(3, smooth=false, normalize=true), G, constraints)
cg_vec = cg_vec[:, 1:3]
ng_vec = ng_vec[:, 1:3]
img_ng = reshape(colorview(RGB, imadjustintensity(ng_vec)'), size(img)) 
img_cg = reshape(colorview(RGB, imadjustintensity(cg_vec)'), size(img))



using Measures
plt1 = Plots.plot(img, title="Original image", legend=false,top_margin=10.0mm); 
Plots.scatter!(plt1,fabio_pixels[2], fabio_pixels[1], c=:red, legend=false);
Plots.scatter!(plt1,back_pixels[2], back_pixels[1], c=:blue, legend=false);


plt2 = Plots.plot(img_cg, title="Embedding given\nthe constraints",top_margin=10.0mm, legend=false); 
Plots.scatter!(plt2, fabio_pixels[2], fabio_pixels[1],c=:red, legend=false);
Plots.scatter!(plt2, back_pixels[2], back_pixels[1], c=:blue, legend=false);

plt3 = Plots.plot(img_ng, title="Embedding without\nconstraints",top_margin=10.0mm);

Plots.plot(plt1, plt2,plt3,   layout =( @layout grid(1,3)), size=(size(img,1)*3,size(img,2))  )
Out[2]:
0 50 100 150 200 250 0 50 100 150 200 250 Original image 0 50 100 150 200 250 0 50 100 150 200 250 Embedding given the constraints 50 100 150 200 250 50 100 150 200 250 Embedding without constraints

Negative weights

In the model proposed by ref there are two non-negative weight matrices, $A$ and $R$. The matrix $A$ encodes the pairwise attraction, whereas, the matrix $R$ encodes the repulsion information. "The attraction measures the degree of association by feature similarity, the repulsion measures the segregation by feature dissimilarity". The idea is "having tight attraction within clusters and loose attraction between clusters at the same ime, strong repulsion between clusters and weak repulsion within clusters at the same time".

Let $$\begin{array}{cc} W_{eq} &= A - R + D_R \\ D_{eq} &= D_A + D_R \end{array}$$ where $D_R = \text{diag}(R \mathbf{1})$ and $D_A = \text{diag}(A \mathbf{1})$ and

In [11]:
using Distributions, Distances, SpectralClustering, Images, Plots, Measures
function popout_weight(i::Integer, ineigh, vi, vneigh)
    intensity_dist = Distances.colwise(Euclidean(), vi[3:end], vneigh[3:end, :])
    xy_dist = Distances.colwise(Euclidean(), vi[1:2], vneigh[1:2, :])
    a = 5
    b = 0.05
    return (pdf.(Normal(0, a*15), xy_dist) - pdf.(Normal(0, a), xy_dist)) .*
           (pdf.(Normal(0, b*100), intensity_dist) - pdf.(Normal(0, b), intensity_dist))
end
function attraction(i::Integer, ineigh, vi, vneigh)
    diff = popout_weight(i, ineigh, vi, vneigh)
    diff[diff.<0] .= 0
    return diff
end
function repulsion(i::Integer, ineigh, vi, vneigh)
    diff = popout_weight(i, ineigh, vi, vneigh)
    diff[diff.>0] .= 0
    return abs.(diff)
end
img = zeros(31,31)
img[8:25, 3:12] .= 0.9
img[3:12, 5:28] .= 0.2
img[8:25, 25:30] .= 0.6
img = Gray.(img + randn(31, 31)*0.03)

labels = zeros(Integer, 31, 31)
labels[8:25, 3:12] .= 1
labels[3:12, 5:28] .= 2
labels[8:25, 25:30] .= 3

nconfig = PixelNeighborhood(4)
graph_attraction = create(nconfig, attraction, img);
graph_repulsion = create(nconfig, repulsion, img);

emb_config = YuShiPopout(3,  false)
emb = embedding(emb_config, graph_attraction, graph_repulsion)

embedding_image = reshape(colorview(RGB, hcat([imadjustintensity(emb[:,i]) for i=1:3]...)'), (31,31))

plt1 = Plots.plot(img, title="Original image", legend=false,top_margin=10.0mm); 
plt2 = Plots.plot(embedding_image, title="Embedding image", legend=false, left_margin=10.0mm, top_margin=10.0mm); 
Plots.plot(plt1, plt2, layout =( @layout grid(1,2)))
Out[11]:
5 10 15 20 25 30 5 10 15 20 25 30 Original image 5 10 15 20 25 30 5 10 15 20 25 30 Embedding image

References

using DocUtils display("text/html",bibliography(["ng2002spectral","shi2000normalized","yu2001understanding"]))