

Carrot2 Project
Carrot2: open source framework for building search clustering engines
Please see the User and Developer Manual for a number of typical Carrot2 testing scenarios.
No. Carrot2 can add clustering of search results to an existing search engine. You can use an open source project called Nutch to crawl your website. Nutch has a Carrot2-based search clustering plugin, so you'll get crawling, searching and clustering in one package. If you need help with any of these, please contact us.
Absolutely. Carrot2 came about as a framework for building search results clustering engines but its algorithms should successfully cluster up to about a thousand text documents, a few paragraphs each.
No. Assigning documents to a set of predefined categories is a problem called text classification / categorization and Carrot2 was not designed to solve it. For text classification components you may want to see the LingPipe project.
The most important characteristic of Carrot2 algorithms to keep in mind is that they perform in-memory clustering. For this reason, as a rule of thumb, Carrot2 should successfully deal with up to about a thousand documents, a few paragraphs each. For algorithms designed to process millions of documents, you may want to check out the Mahout project.
Yes. While the query is usually very helpful to get rid of the obvious meanings related to the documents in the search results set, it is not obligatory -- the clustering algorithms will cope without the query.
Please see the User and Developer Manual for more information on compiling Carrot2 source code.
Yes. The only requirement is that you include the license text in your binary distribution. It'd be great if you let us know about your project and/or acknowledged the use of Carrot2 on your project's website or documentation. It's optional, but keeps us motivated :-)
Please put a statement equivalent to "This product includes software developed by the Carrot2 Project" on your site and link it to Carrot2's website (http://www.carrot2.org). Additionally, you can use some of our powered-by logos if you like.
The source code of the visualization is not publicly available. For a fully brandable version, please see the Circles interactive clusters visualization from Carrot Search.
The focus of the Carrot2 project is on clustering algorithms. We provide several higher-level applications: the web application hosted at http://search.carrot2.org, an RCP-based Workbench desktop application for tuning purposes, and DCS, a simple REST service server that runs as a command-line application. All these applications are extensible to some extent, but they are not the core concern of the developers. Before you ask a question on the mailing list, it's best to check out the project, see how these applications work (in particular, look at the build files that collect data for these applications) and try to modify them on your own. For generic questions such as "how can I tune/modify the web application" we have a generic answer: "by modifying the source code". Ask specific questions and you'll get specific answers.
Microsoft provides a search API for Bing with a free monthly limit of 5000 requests. Once this free limit is used up, Bing-powered tabs on Carrot2 will stop working. If you'd like to use Bing, obtain your own API key, download the Carrot2 applications and run them locally with your key.

We provide the search interface as a demo of the technology, and we rely on a partnership with a company called Comcepta (eTools) to provide a limited number of free search requests. Unfortunately, some people have been abusing this free service and we had to introduce per-IP limits.

If you wish to extend your query limits, please install Carrot2 locally and contact Comcepta for custom query limit arrangements. Alternatively, use Microsoft Bing's search API: obtain your own API key, then paste that key into the "Application API key" form box of the Bing tab (shown after you click on "More advanced options").

Apologies for the inconvenience.

This is typically a Linux/Unix issue: we cannot test on all the environments and packages out there. For Workbench to work, the WebKit browser libraries must be present on your system. On Ubuntu 12.04 this means installing:
sudo apt-get install libwebkitgtk-1.0-0
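
Before installing, it may help to check whether the library is already present. The following is a minimal sketch, assuming a Debian-based system where dpkg-query is available. The helper name pkg_installed is hypothetical and not part of Carrot2; the package name is the one from the command above.

```shell
#!/bin/sh
# Package name taken from the install command above (Ubuntu 12.04).
PKG="libwebkitgtk-1.0-0"

# Hypothetical helper: succeeds only if dpkg reports the package as
# fully installed. Fails if the package is unknown, removed, or if
# dpkg-query itself is not available on this system.
pkg_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q "install ok installed"
}

if pkg_installed "$PKG"; then
    echo "$PKG is already installed"
else
    echo "$PKG is missing; install it with: sudo apt-get install $PKG"
fi
```

The script only reports the package state and prints the install command; it does not run sudo itself. On other distributions the package name and the package manager will differ.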