Below is a summary of clustering algorithms that work within the Carrot2 framework.
| Algorithm | Author | Speed [s]*, 100 snippets | Speed [s]*, 200 snippets | Speed [s]*, 400 snippets | Hierarchical clustering | Other features | Papers | Example results |
|---|---|---|---|---|---|---|---|---|
| FuzzyAnts | Steven Schockaert | 2.17 | 8.70 | 16.93 | yes | | [1] | london |
| HAOG-STC | Karol Gołembniak | 0.04 | 0.11 | 0.28 | yes | | | london |
| Lingo** | Stanisław Osiński | 0.34 | 0.52 | 0.84 | no | multilingual clustering | [2][3] | london |
| Rough k-means | Ngo Chi Lang | 1.38 | 6.76 | 27.73 | no | | [4] | london |
| STC | Oren Zamir (impl: Dawid Weiss) | 0.04 | 0.10 | 0.23 | no | | [5] | london |
| Lingo3G*** | Stanisław Osiński (Carrot Search) | 0.03 | 0.06 | 0.13 | yes | multilingual clustering, synonyms, advanced tuning, scalability (5000 snippets in 1.3s*) | | london |
*) Clustering speed was measured for 100, 200 and 400 snippets
downloaded from Yahoo! for the query 'london', using the Carrot2
standalone GUI application. Benchmark environment: Pentium M 1.3 GHz, 768
MB RAM, Windows XP. Java Virtual Machine: Sun JDK 1.4.2, JVM switches:
-Xmx512m -Xms128m -XX:NewRatio=1 -server. Each time in the table is
an average of 75 runs; for each algorithm, the timed runs were preceded
by 25 untimed warm-up runs.
**) Lingo is the default clustering algorithm used in the
Carrot2 live demos.
***) Lingo3G is a commercial document clustering engine and is not
available in the Open Source part of Carrot2. Please contact Carrot Search
for details.
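
The measurement procedure in the benchmark note (25 untimed warm-up runs and an average over 75 timed runs per algorithm) can be sketched roughly as below. This is only an illustrative harness, not the code used to produce the table: the class name `ClusteringBenchmark`, the method `averageSeconds` and the placeholder workload are hypothetical, it assumes the warm-up runs come before the timed ones, and it uses a modern Java runtime (lambdas, `System.nanoTime`) rather than the JDK 1.4.2 named in the note. No Carrot2 API is called; a real measurement would put the clustering call inside the `Runnable`.

```java
public class ClusteringBenchmark {

    /**
     * Runs the task warmUpRuns times without timing it, then runs it
     * timedRuns times and returns the mean wall-clock time in seconds.
     */
    static double averageSeconds(Runnable task, int warmUpRuns, int timedRuns) {
        if (timedRuns <= 0) {
            throw new IllegalArgumentException("timedRuns must be positive");
        }
        // Untimed warm-up: gives the JIT compiler a chance to optimize hot code.
        for (int i = 0; i < warmUpRuns; i++) {
            task.run();
        }
        // Timed runs: accumulate elapsed nanoseconds over all runs.
        long totalNanos = 0L;
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) timedRuns / 1e9;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in workload; replace with the clustering call to measure.
        Runnable placeholder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i % 7);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        double avg = averageSeconds(placeholder, 25, 75);
        System.out.printf("average: %.6f s%n", avg);
    }
}
```

Averaging after a warm-up phase mainly reduces the influence of class loading and JIT compilation on the reported times; it does not remove noise from garbage collection or other processes on the machine.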