Document Similarity Social Network
Flow
Calculate TF-IDF features.
TF-IDF stands for “Term Frequency, Inverse Document Frequency.” It scores how important a word (or “term”) is to a document: the score rises with how often the term appears in that document and falls with how many documents in the collection contain it.
Calculate cosine similarity matrix.
Calculate node relationships with threshold-based neighbors: two documents are treated as neighbors when their cosine similarity reaches a chosen threshold.
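The three steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the whitespace tokenizer, the weighting (tf = count / document length, idf = log(N / df)), the 0.1 threshold, the sample documents, and all function names are assumptions made for the example. Other TF-IDF variants (for instance smoothed idf) would give different numbers.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarity as a list of lists."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def threshold_edges(sim, threshold):
    """Link documents i < j whose similarity is at least `threshold`."""
    n = len(sim)
    return [
        (i, j, sim[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if sim[i][j] >= threshold
    ]


# Hypothetical sample documents.
docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock markets fell sharply today",
]
vectors = tfidf_vectors(docs)
sim = similarity_matrix(vectors)
edges = threshold_edges(sim, threshold=0.1)
print(edges)
```

With these sample documents, the first two share several terms and end up linked, while the third shares no terms with the others and stays isolated. Raising the threshold yields a sparser network; lowering it yields a denser one.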
What is Vis.js
Vis.js is a dynamic, browser-based visualization library. The library is designed to be easy to use, to handle large amounts of dynamic data, and to enable manipulation of and interaction with the data. The library consists of the components DataSet, Timeline, Network, Graph2d and Graph3d.
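The Network component draws a graph from a list of nodes (each with an `id` and a `label`) and a list of edges (each with `from` and `to`, plus an optional `value` that scales the edge width). As a rough sketch of how document similarities might be handed to it, the Python below serializes a set of edges into that shape. The function name `to_vis_network`, the document labels, and the edge triples are hypothetical and not taken from this project.

```python
import json


def to_vis_network(labels, edges):
    """Build the nodes/edges lists in the shape a vis.js Network reads."""
    nodes = [{"id": i, "label": label} for i, label in enumerate(labels)]
    vis_edges = [
        {"from": i, "to": j, "value": round(weight, 3)}
        for i, j, weight in edges
    ]
    return {"nodes": nodes, "edges": vis_edges}


labels = ["doc-a", "doc-b", "doc-c"]  # hypothetical document names
edges = [(0, 1, 0.29)]                # hypothetical (i, j, similarity) triples
payload = json.dumps(to_vis_network(labels, edges), indent=2)
print(payload)
```

On the browser side, the resulting `nodes` and `edges` arrays can be passed as the data argument when constructing a `vis.Network` for a container element.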
Document Social Network Result
Practical application
Methodology references:
- The TF-IDF algorithm explained.
- Calculate cosine similarity matrix.
- Calculate node relationships with threshold-based neighbors.
Tool references:
- Social Network Visualization.