The Dragon Toolkit


The Dragon Toolkit is a Java-based development package for academic use in information retrieval (IR) and text mining (TM, including text classification, text clustering, text summarization, and topic modeling). It is tailored for researchers who work on large-scale IR and TM and prefer Java programming.

Category: Classification
Provider: Open Source and Academic
Support: unsupported
Price: free

This component is not supported.


http://dragon.ischool.drexel....

Unlike Lucene and Lemur, the Dragon Toolkit provides built-in support for semantic-based IR and TM. It seamlessly integrates a set of NLP tools that enable it to index text collections under various representation schemes, including words, phrases, ontology-based concepts, and relationships. However, to minimize learning time, we intentionally keep the package small and simple. The toolkit omits some features, such as distributed IR and cross-language IR, that are part of the Lemur toolkit.

Another important feature of the toolkit is its scalability. Unlike many text mining tools such as Weka, the Dragon Toolkit is specifically designed for large-scale applications. It uses sparse matrices to implement text representations and does not have to load all data into memory at run time. It can therefore handle hundreds of thousands of documents with very limited memory.
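
As a rough illustration of why a sparse representation keeps memory use low, the following Java sketch stores only the nonzero term counts for each document. It is not the Dragon Toolkit's API: the class `SparseDocTermSketch` and all of its methods are hypothetical names chosen for this example, the tokenizer is deliberately naive, and the whole structure lives in memory, so it shows only the storage idea and not how the toolkit avoids loading all data at run time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of a sparse document-term matrix; not Dragon Toolkit code. */
class SparseDocTermSketch {
    // Maps each distinct term to a column index.
    private final Map<String, Integer> termIndex = new HashMap<>();
    // One sparse row per document: column index -> term frequency.
    private final List<Map<Integer, Integer>> rows = new ArrayList<>();

    /** Adds a document, storing only nonzero counts, and returns its row index. */
    int addDocument(String text) {
        Map<Integer, Integer> row = new HashMap<>();
        // Naive tokenizer (an assumption): lowercase, split on non-letters.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty()) {
                continue;
            }
            Integer col = termIndex.get(token);
            if (col == null) {
                col = termIndex.size();
                termIndex.put(token, col);
            }
            row.merge(col, 1, Integer::sum);
        }
        rows.add(row);
        return rows.size() - 1;
    }

    /** Returns the frequency of a term in a document, or 0 if it is absent. */
    int frequency(int doc, String term) {
        Integer col = termIndex.get(term);
        if (col == null) {
            return 0;
        }
        return rows.get(doc).getOrDefault(col, 0);
    }

    /** Number of cells actually stored (nonzero entries only). */
    int storedCells() {
        int total = 0;
        for (Map<Integer, Integer> row : rows) {
            total += row.size();
        }
        return total;
    }

    /** Number of cells a dense documents-by-terms matrix would need. */
    long denseCells() {
        return (long) rows.size() * termIndex.size();
    }

    public static void main(String[] args) {
        SparseDocTermSketch matrix = new SparseDocTermSketch();
        int first = matrix.addDocument("Text mining and text retrieval");
        matrix.addDocument("Sparse matrix storage");
        System.out.println("tf(text, doc 0) = " + matrix.frequency(first, "text"));
        System.out.println("stored cells = " + matrix.storedCells()
                + ", dense cells = " + matrix.denseCells());
    }
}
```

With real collections the vocabulary grows far faster than the number of distinct terms in any single document, so the gap between stored and dense cells widens as more documents are added.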

Features:
  1. Implemented in Java
  2. Sparse matrix representation for computational efficiency
  3. Highly scalable to large data sets
  4. Well-designed programming API and XML-based interface
  5. Various document representations, including words, multiword phrases, ontology-based concepts, and concept pairs
  6. Various text retrieval models
  7. Text classification, clustering, summarization, and topic modeling

Open Source and Academic

Components listed with this provider are either open source or from academic institutions.
Look at the component detail page for project websites and related links.

Copyright by eccenca, 1998-2009.