Persistence Clustering 0

Persistence Clustering 0 screenshot

Pipeline description

This example performs a persistence-driven clustering of a 2D toy data set taken from the scikit-learn examples. See the Karhunen-Love Digits 64-Dimensions example for an application of this pipeline to a real-life, high-dimensional data set.

The pipeline starts by estimating the density of the input point cloud with a Gaussian kernel, using the GaussianResampling filter. A Slice filter then restricts the estimation to a 2D plane.
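The density estimation step can be illustrated outside of ParaView with a minimal, pure-Python sketch. It sums one isotropic Gaussian per input point on a regular 2D grid. The function name `gaussian_density`, the bandwidth `sigma`, and the grid bounds are assumptions made for illustration; the sketch does not reproduce the exact kernel or parameters of the GaussianResampling filter.

```python
import math

def gaussian_density(points, nx, ny, bounds, sigma):
    """Sum isotropic Gaussian kernels, one per point, on an nx-by-ny grid.

    Hypothetical helper: 'sigma' and 'bounds' are illustrative parameters,
    not properties of ParaView's GaussianResampling filter.
    """
    xmin, xmax, ymin, ymax = bounds
    grid = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        y = ymin + (ymax - ymin) * j / (ny - 1)
        for i in range(nx):
            x = xmin + (xmax - xmin) * i / (nx - 1)
            # Each input point contributes exp(-d^2 / (2 sigma^2)).
            grid[j][i] = sum(
                math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * sigma ** 2))
                for px, py in points
            )
    return grid

# Two nearby points and one isolated point.
points = [(0.2, 0.2), (0.25, 0.2), (0.8, 0.8)]
density = gaussian_density(points, nx=11, ny=11,
                           bounds=(0.0, 1.0, 0.0, 1.0), sigma=0.1)
```

In this toy input, the grid vertex near the two close points receives a higher density than the vertex at the isolated point, and both are higher than the empty region in between. These bumps are the density maxima that the rest of the pipeline analyzes.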

Next, the PersistenceDiagram of the density field is computed and only the 2 most persistent density maxima are selected. These correspond to the 2 desired output clusters (bottom left view in the above screenshot).
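To make the notion of "most persistent maxima" concrete, here is a small sketch that computes the persistence of the maxima of a 1D scalar field with a union-find structure, then keeps the pairs above a threshold. This is a textbook-style illustration, not TTK's algorithm: the function `maxima_persistence`, the sample values, and the convention of pairing the global maximum with the global minimum are all assumptions made for this example.

```python
def maxima_persistence(values):
    """Return (index_of_maximum, persistence) pairs, most persistent first.

    Vertices are swept by decreasing value. Each local maximum starts a
    component; when two components meet, the one with the lower maximum
    dies, and its persistence is (its maximum) - (the merge value).
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    parent = {}  # union-find forest over the vertices processed so far

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        for j in (i - 1, i + 1):
            if j not in parent:
                continue
            a, b = find(i), find(j)
            if a == b:
                continue
            # The root of a component is always its highest vertex.
            hi, lo = (b, a) if values[b] >= values[a] else (a, b)
            if lo != i:  # a genuine maximum is merged into a higher one
                pairs.append((lo, values[lo] - values[i]))
            parent[lo] = hi
    # Assumed convention: the global maximum is paired with the global minimum.
    top = find(order[0])
    pairs.append((top, values[top] - min(values)))
    return sorted(pairs, key=lambda p: p[1], reverse=True)

values = [0, 2, 1, 6, 2, 5, 0]
pairs = maxima_persistence(values)
# Keep only the maxima whose persistence reaches the threshold.
kept = [p for p in pairs if p[1] >= 2]
```

With these values, the maxima at indices 3 and 5 survive the threshold, while the shallow bump at index 1 is discarded. The pipeline's Threshold filter on the Persistence array plays the same role.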

Next, the simplified persistence diagram is used as a constraint for the TopologicalSimplification of the density field (top right view in the above screenshot). The simplified density field then contains only 2 maxima. It is used as the input to the Morse-Smale complex computation, which separates the 2D space into the output clusters (background color in the bottom right view of the above screenshot).

Finally, the ResampleWithDataset filter assigns to each input point the identifier of the ascending manifold of the Morse-Smale complex (AscendingManifold) that contains it. This identifier is the cluster identifier of the point.
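The idea behind this last step can be sketched on a small grid: every grid vertex is labeled by the density maximum it reaches when repeatedly moving to its highest neighbor. This is only an illustration of the principle, assuming an 8-neighbor grid and a hypothetical helper named `label_by_steepest_ascent`; TTK computes the segmentation from a discrete Morse-Smale complex, not with this procedure.

```python
def label_by_steepest_ascent(grid):
    """Label each vertex of a 2D grid by the local maximum it flows to.

    Hypothetical helper for illustration. Ties are broken by scan order,
    and every vertex of a plateau counts as its own maximum.
    """
    ny, nx = len(grid), len(grid[0])

    def highest_neighbor(j, i):
        best = (j, i)
        for dj in (-1, 0, 1):
            for di in (-1, 0, 1):
                jj, ii = j + dj, i + di
                if (0 <= jj < ny and 0 <= ii < nx
                        and grid[jj][ii] > grid[best[0]][best[1]]):
                    best = (jj, ii)
        return best

    labels = [[None] * nx for _ in range(ny)]
    maxima = {}  # maps the (row, col) of a maximum to its cluster identifier
    for j in range(ny):
        for i in range(nx):
            cur = (j, i)
            while True:
                nxt = highest_neighbor(*cur)
                if nxt == cur:  # no strictly higher neighbor: a maximum
                    break
                cur = nxt
            labels[j][i] = maxima.setdefault(cur, len(maxima))
    return labels

# A toy density with exactly two maxima (values 9 and 8).
grid = [
    [1, 2, 1, 0, 1, 2, 1],
    [2, 9, 2, 0, 2, 8, 2],
    [1, 2, 1, 0, 1, 2, 1],
]
labels = label_by_steepest_ascent(grid)
```

An input point would then take the label of the grid vertex it falls on, which is the role ResampleWithDataset plays in the pipeline.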

ParaView

To reproduce the above screenshot, go to your ttk-data directory and enter the following command:

$ paraview --state=states/persistenceClustering0.pvsm

Python code

#!/usr/bin/env python

from paraview.simple import *

# paraview 5.9 VS 5.10 compatibility ===========================================
def ThresholdBetween(threshold, lower, upper):
    try:
        # paraview 5.9
        threshold.ThresholdRange = [lower, upper]
    except Exception:
        # paraview 5.10: 'ThresholdRange' was replaced by the properties below
        threshold.ThresholdMethod = "Between"
        threshold.LowerThreshold = lower
        threshold.UpperThreshold = upper
# end of compatibility =========================================================

# create a new 'CSV Reader'
clustering0csv = CSVReader(FileName=['clustering0.csv'])

# create a new 'Table To Points'
tableToPoints1 = TableToPoints(Input=clustering0csv)
tableToPoints1.XColumn = 'X'
tableToPoints1.YColumn = 'Y'
tableToPoints1.a2DPoints = 1
tableToPoints1.KeepAllDataArrays = 1

# create a new 'Gaussian Resampling'
gaussianResampling1 = GaussianResampling(Input=tableToPoints1)
gaussianResampling1.ResampleField = ['POINTS', 'ignore arrays']
gaussianResampling1.ResamplingGrid = [256, 256, 3]
gaussianResampling1.SplatAccumulationMode = 'Sum'

# create a new 'Slice'
slice1 = Slice(Input=gaussianResampling1)
slice1.SliceType = 'Plane'

# init the 'Plane' selected for 'SliceType'
slice1.SliceType.Normal = [0.0, 0.0, 1.0]

# create a new 'TTK PersistenceDiagram'
tTKPersistenceDiagram1 = TTKPersistenceDiagram(Input=slice1)
tTKPersistenceDiagram1.ScalarField = ['POINTS', 'SplatterValues']

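# create a new 'Threshold'
# keep only the persistence pairs with a persistence of at least 10
# (the upper bound is an arbitrarily large value)
persistenceThreshold0 = Threshold(Input=tTKPersistenceDiagram1)
persistenceThreshold0.Scalars = ['CELLS', 'Persistence']
ThresholdBetween(persistenceThreshold0, 10.0, 999999999)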

# create a new 'TTK TopologicalSimplification'
tTKTopologicalSimplification1 = TTKTopologicalSimplification(Domain=slice1,
    Constraints=persistenceThreshold0)
tTKTopologicalSimplification1.ScalarField = ['POINTS', 'SplatterValues']

# create a new 'TTK MorseSmaleComplex'
tTKMorseSmaleComplex1 = TTKMorseSmaleComplex(Input=tTKTopologicalSimplification1)
tTKMorseSmaleComplex1.ScalarField = ['POINTS', 'SplatterValues']

# create a new 'Resample With Dataset'
resampleWithDataset1 = ResampleWithDataset(SourceDataArrays=OutputPort(tTKMorseSmaleComplex1,3),
    DestinationMesh=tableToPoints1)
resampleWithDataset1.CellLocator = 'Static Cell Locator'

SaveData("OutputClustering.csv", resampleWithDataset1)

Inputs

Outputs

  • OutputClustering.csv: the output clustering of the input point cloud (output cluster identifier: AscendingManifold column)
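As a quick check of the result, the number of points per cluster can be counted from the AscendingManifold column with the Python standard library. The sketch below is illustrative: the helper `cluster_sizes` and the inline sample data are hypothetical, and the other column names of the real file may differ from those shown here.

```python
import csv
import io
from collections import Counter

def cluster_sizes(csv_file, column="AscendingManifold"):
    """Count the rows per cluster identifier in an open CSV file."""
    reader = csv.DictReader(csv_file)
    # Identifiers may be written as floats (e.g. "1.0"), hence int(float(...)).
    return Counter(int(float(row[column])) for row in reader)

# Hypothetical sample standing in for OutputClustering.csv.
sample = io.StringIO(
    '"X","Y","AscendingManifold"\n'
    "0.1,0.2,0\n"
    "0.3,0.1,0\n"
    "0.9,0.8,1\n"
)
sizes = cluster_sizes(sample)

# On the real output, one would instead write:
#   with open("OutputClustering.csv", newline="") as f:
#       sizes = cluster_sizes(f)
```

For the pipeline above, one would expect two distinct identifiers, one per selected density maximum.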

C++/Python API

Morse-Smale complex

PersistenceDiagram

TopologicalSimplification