
A Generic Information Search… With a Mind of its Own!


Gismo is an NLP tool that ranks and organizes a corpus of documents according to a query.

Gismo stands for Generic Information Search… with a Mind of its Own.

Features

Gismo combines three main ideas:

  • TF-IDTF: a symmetric version of the TF-IDF embedding.
  • DIteration: a fast, push-based variant of the PageRank algorithm.
  • Fuzzy dendrogram: a variant of the Louvain clustering algorithm.
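
To give an intuition of what "push-based" means in the second item, here is a minimal, generic sketch of a push-style PageRank computation. It is not Gismo's implementation: the function name push_pagerank, its parameters, and the toy graph are hypothetical and chosen for illustration only. Each node holds some "fluid"; the node with the most fluid is picked, its fluid is added to its score, and a damped share is pushed to its successors until almost no fluid remains.

```python
def push_pagerank(out_links, damping=0.85, tol=1e-10):
    """Illustrative push-style PageRank (hypothetical helper, not the Gismo API).

    out_links maps each node to the list of its successors.
    Assumes every node has at least one successor (no dangling nodes).
    """
    n = len(out_links)
    # Fluid still to be diffused; starts as the uniform teleportation mass.
    fluid = {node: (1.0 - damping) / n for node in out_links}
    # Fluid already absorbed by each node; converges to its PageRank score.
    score = {node: 0.0 for node in out_links}
    while True:
        # Greedy choice: process the node holding the most fluid.
        node = max(fluid, key=fluid.get)
        amount = fluid[node]
        if amount < tol:
            break
        score[node] += amount
        fluid[node] = 0.0
        # Push a damped, equal share of the fluid to every successor.
        share = damping * amount / len(out_links[node])
        for successor in out_links[node]:
            fluid[successor] += share
    return score


# Toy graph: "c" is pointed to by both "a" and "b".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = push_pagerank(graph)
print(sorted(scores, key=scores.get, reverse=True))
```

Unlike the classical power iteration, which updates every node at each round, this scheme only touches one node and its successors per step, which is what makes push-based variants attractive on sparse data.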

Quickstart

Install gismo:

$ pip install gismo

Import gismo in a Python project:

import gismo as gs

To get the hang of a typical Gismo workflow, check the Toy Example notebook. For more advanced uses, look at the other tutorials or go directly to the reference section.

Credits

Thomas Bonald, Anne Bouillard, Marc-Olivier Buob, Dohy Hong.

This package was created with Cookiecutter and the francois-durand/package_helper project template.