Landmark Selection
To avoid constructing a complete similarity matrix, some spectral clustering methods compute the similarity function only between a subset of patterns (the landmarks). This module provides an interface for sampling points from different data structures.
Methods available:
- Random. This selection method samples $k$ random points from a dataset.
- EvenlySpaced. This selection method samples points evenly spaced according to their index.
Detailed Description
Random Landmark Selection
using SpectralClustering
number_of_points = 20
dimension = 5
data = rand(dimension,number_of_points)
selector = RandomLandmarkSelection()
number_of_landmarks = 7
select_landmarks(selector, number_of_landmarks, data )
7-element Array{Int64,1}:
13
7
4
14
17
12
15
Evenly Spaced Landmark Selection
using SpectralClustering
number_of_points = 20
dimension = 5
data = rand(dimension,number_of_points)
selector = EvenlySpacedLandmarkSelection()
number_of_landmarks = 5
select_landmarks(selector, number_of_landmarks, data )
5-element Array{Int64,1}:
1
5
9
13
17
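The output above is consistent with a fixed integer stride of `div(number_of_points, number_of_landmarks)` between landmarks. The following is a minimal sketch of that idea, not the package's implementation: `evenly_spaced_indices` is a hypothetical helper, and it assumes the patterns are stored as the columns of `data`.

```julia
# Hypothetical helper, not part of SpectralClustering.
# Assumption: one pattern per column of `data`.
function evenly_spaced_indices(n_landmarks::Integer, data::AbstractMatrix)
    n_points = size(data, 2)
    1 <= n_landmarks <= n_points ||
        throw(ArgumentError("n_landmarks must be between 1 and the number of points"))
    step = div(n_points, n_landmarks)   # integer stride between consecutive landmarks
    # 1:step:n_points always holds at least n_landmarks indices; keep the first ones.
    return collect(1:step:n_points)[1:n_landmarks]
end

indices = evenly_spaced_indices(5, rand(5, 20))   # stride 4 gives [1, 5, 9, 13, 17]
```

With 20 points and 5 landmarks the stride is 4, which reproduces the indices shown in the example.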
Content
abstract type AbstractLandmarkSelection end
Abstract type that defines how to sample data points. Types that inherit from AbstractLandmarkSelection have to implement the following interface:
select_landmarks{L<:AbstractLandmarkSelection}(c::L, X)
The select_landmarks function returns an array with the indices of the sampled points.
Arguments
- c::T<:AbstractLandmarkSelection. The landmark selection type.
- d::D<:DataAccessor. The DataAccessor type.
- X. The data to be sampled.
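As a rough illustration of this interface, the sketch below defines a custom selector. To run without the package it uses stand-in names: `SketchLandmarkSelection`, `FirstPointsLandmarkSelection` and `sketch_select_landmarks` are all hypothetical. With the package loaded, one would presumably subtype AbstractLandmarkSelection and add a method to select_landmarks instead. The three-argument form mirrors the calls in the examples above.

```julia
# Stand-in for AbstractLandmarkSelection so the sketch runs on its own;
# this is NOT the package's definition.
abstract type SketchLandmarkSelection end

# Hypothetical selector that always picks the first n points.
struct FirstPointsLandmarkSelection <: SketchLandmarkSelection end

# Hypothetical counterpart of select_landmarks.
# Assumption: one pattern per column of X; returns column indices into X.
function sketch_select_landmarks(c::FirstPointsLandmarkSelection, n::Integer, X::AbstractMatrix)
    1 <= n <= size(X, 2) ||
        throw(ArgumentError("n must be between 1 and the number of points"))
    return collect(1:n)
end

data = rand(5, 20)                   # 20 patterns of dimension 5, one per column
landmarks = sketch_select_landmarks(FirstPointsLandmarkSelection(), 3, data)
landmark_data = data[:, landmarks]   # 5×3 matrix holding the selected patterns
```

Because the function returns indices rather than copies of the data, the caller decides how to access the selected patterns.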
struct EvenlySpacedLandmarkSelection <: AbstractLandmarkSelection
The EvenlySpacedLandmarkSelection selection method selects n evenly spaced points from a dataset.
SpectralClustering.select_landmarks (Method)
select_landmarks(c::EvenlySpacedLandmarkSelection, n::Integer, X)
SpectralClustering.select_landmarks (Method)
select_landmarks(c::RandomLandmarkSelection, d::T, n::Integer, X)
The function returns the indices of n randomly sampled points, according to RandomLandmarkSelection.
Arguments
- c::RandomLandmarkSelection.
- n::Integer. The number of data points to sample.
- X. The data to be sampled.
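One plausible way to draw such indices is to take the first n entries of a random permutation, which samples without replacement. This is a minimal sketch under that assumption, not the package's implementation: `random_landmark_indices` and its `rng` keyword are hypothetical, and the patterns are assumed to be the columns of X.

```julia
using Random

# Hypothetical helper, not part of SpectralClustering.
# Assumptions: one pattern per column of X; sampling is without replacement.
function random_landmark_indices(n::Integer, X::AbstractMatrix;
                                 rng::AbstractRNG = Random.default_rng())
    n_points = size(X, 2)
    1 <= n <= n_points ||
        throw(ArgumentError("n must be between 1 and the number of points"))
    # A random permutation of 1:n_points, truncated to its first n entries.
    return randperm(rng, n_points)[1:n]
end

idx = random_landmark_indices(7, rand(5, 20))   # 7 distinct indices in 1:20
```

Passing a seeded generator, for example `rng = MersenneTwister(1)`, makes the selection reproducible.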