
Algorithms

Carrot2: open source framework for building search clustering engines

Below is a summary of clustering algorithms that work within the Carrot2 framework.

Algorithm    Author               Speed [s]* by snippet count  Hierarchical  Other features            Papers  Example
                                  100      200      400        clustering                                      results
-----------  -------------------  -------  -------  -------    ------------  ------------------------  ------  -------
Lingo***     Stanislaw Osinski    0.48     0.34     0.74       no            -                         [2][3]  london
                                  0.16**   0.17**   0.31**
STC          Oren Zamir           0.01     0.02     0.06       no            -                         [5]     london
             (impl: Dawid Weiss)
Lingo3G****  Stanisław Osiński    0.01     0.03     0.05       yes           multilingual clustering,  -       london
             (Carrot Search)                                                 synonyms, advanced
                                                                             tuning, scalability
                                                                             (10,000 snippets in
                                                                             530ms*)

*) Clustering speed was measured for 100, 200, and 400 snippets downloaded from Yahoo! for the query 'lucene'. Benchmark environment: Intel Core2 Duo E8400 3GHz, 3 GB RAM, Windows XP. Java Virtual Machine: Sun JDK 1.6.0, JVM switches: -server -Xmx512m. Each time in the table is an average of 100 runs; for each algorithm, the timed runs were preceded by 100 untimed warm-up runs.

**) Clustering time when native matrix computations are enabled.

***) Lingo is the default clustering algorithm used in the Carrot2 live demos.

****) Lingo3G is a commercial document clustering engine and is not available in the Open Source part of Carrot2. Please contact Carrot Search for details.
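
The measurement procedure described in footnote *) (100 untimed warm-up runs followed by an average over 100 timed runs) can be sketched roughly as follows. This is only an illustration and not the benchmark code actually used for the table: the class name ClusteringBenchmark, the method averageSeconds, and the placeholder workload are all hypothetical, and a real measurement would call a clustering algorithm on a fixed set of snippets instead.

```java
public final class ClusteringBenchmark {
    // Written by the placeholder workload so the JIT cannot discard its result.
    private static volatile double sink;

    /**
     * Runs the task warmupRuns times without timing it, then runs it
     * timedRuns times and returns the mean wall-clock time in seconds.
     */
    public static double averageSeconds(Runnable task, int warmupRuns, int timedRuns) {
        if (timedRuns <= 0) {
            throw new IllegalArgumentException("timedRuns must be positive");
        }
        // Untimed warm-up runs give the JVM a chance to compile hot code.
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long totalNanos = 0L;
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) timedRuns / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for "cluster N snippets"; replace with a real call.
        Runnable placeholder = () -> {
            double sum = 0.0;
            for (int i = 1; i <= 100_000; i++) {
                sum += Math.sqrt(i);
            }
            sink = sum;
        };
        double seconds = averageSeconds(placeholder, 100, 100);
        System.out.printf("average time: %.6f s%n", seconds);
    }
}
```

To approximate the environment from the footnote, such a class would be started with the same JVM switches, for example java -server -Xmx512m ClusteringBenchmark. Absolute numbers will differ on other hardware and JVM versions.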

...