Tuesday, May 28, 2019

What to use for document vector

Jaccard Similarity


For news articles, it has been shown that shingles made of a stop word followed by the next two words can improve an application that finds similar news articles on the web. This feature proved effective under those specific conditions, but in general, for document similarity, we should remove stop words before building the bag of words for each document.
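
To make the two options concrete, here is a minimal Python sketch of Jaccard similarity over both representations. The stop-word list, the tokenizer, and the function names (`word_set`, `stop_word_shingles`, `jaccard`) are assumptions made for illustration, not something taken from the book.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "that", "is", "on"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_set(text):
    """General case: the set of words in the document with stop words removed."""
    return {w for w in tokenize(text) if w not in STOP_WORDS}


def stop_word_shingles(text):
    """News-article case: each shingle is a stop word plus the next two words."""
    words = tokenize(text)
    return {
        tuple(words[i:i + 3])
        for i in range(len(words) - 2)
        if words[i] in STOP_WORDS
    }


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


doc1 = "The cat sat on the mat"
doc2 = "A cat sat on a mat"

print(jaccard(word_set(doc1), word_set(doc2)))
print(jaccard(stop_word_shingles(doc1), stop_word_shingles(doc2)))
```

With stop words removed, the two toy sentences have identical word sets. The stop-word shingles, by contrast, are sensitive to which stop words appear and what follows them, so the two representations can give very different scores for the same pair of documents.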

Mining of Massive Datasets




This book is recommended for the data mining course at Stanford University. You can download the book from Stanford University.

