This paper presents textual document clustering
using two approaches, namely cosine similarity and term
frequency-inverse document frequency (TF-IDF). Combining
these approaches yields similarity values between keywords
in the documents and between the documents themselves.
The most closely related documents can then be identified
with a clustering method called correlation preserving
index, in which related documents are stored in an index
format.
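
As a rough illustration of the two measures named above, the following minimal sketch weights each term by TF-IDF and compares documents by cosine similarity. It is not the paper's implementation and does not cover the correlation preserving index step. The function names (tf_idf_vectors, cosine_similarity, most_similar), the whitespace tokenizer, and the unsmoothed logarithmic IDF are all assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} dict per document (hypothetical helper)."""
    # Assumption: lowercase whitespace tokenization is enough for the example.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents in which each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # tf = relative frequency in the document, idf = log(N / df).
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(docs, query_index):
    """Index of the document most similar to docs[query_index], excluding itself."""
    vectors = tf_idf_vectors(docs)
    scores = [
        (cosine_similarity(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    return max(scores)[1]


if __name__ == "__main__":
    sample_docs = [
        "cat sat mat",
        "cat sat hat",
        "stock market fell",
    ]
    print(most_similar(sample_docs, 0))
```

With unsmoothed IDF, a term that occurs in every document receives weight zero, so only terms that distinguish documents contribute to the similarity score. A clustering step could then group documents whose pairwise similarity is high.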
Author Name: B Sindhuja, Mrs. Veena Trivedi
Keywords: Document Clustering, Cosine similarity, Tf-idf, Correlation preserving index.
ISSN: 2347-5552
EISSN: 2347-5552