lisc.analysis.counts.compute_similarity

lisc.analysis.counts.compute_similarity(data, dim='A')

Calculate the similarity across the co-occurrence data.

Parameters:

data : 2d array
    Counts of co-occurrence of terms.
dim : {'A', 'B'}, optional
    Which set of terms to compute similarity across. 'A' is equivalent to across rows, 'B' to across columns.

Returns:

Notes

This function computes the cosine similarity.
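To illustrate what cosine similarity across rows or columns means, here is a minimal NumPy sketch. The helper name `cosine_similarity_sketch` is hypothetical and this is not the library's implementation; it also assumes that no term has an all-zero count vector.

```python
import numpy as np

def cosine_similarity_sketch(data, dim='A'):
    """Hypothetical helper: pairwise cosine similarity of rows ('A') or columns ('B')."""
    data = np.asarray(data, dtype=float)
    if dim == 'B':
        data = data.T  # compare columns by treating them as rows
    elif dim != 'A':
        raise ValueError("dim must be 'A' or 'B'")
    # Pairwise dot products between all row vectors
    dots = data @ data.T
    # Euclidean length of each row vector (assumed non-zero)
    norms = np.sqrt(np.diag(dots))
    # Divide each dot product by the product of the two vector lengths
    return dots / np.outer(norms, norms)

# Made-up counts: 3 'A' terms (rows) by 2 'B' terms (columns)
counts = np.array([[1, 0], [2, 0], [0, 3]])
sim_a = cosine_similarity_sketch(counts, dim='A')  # shape (3, 3)
sim_b = cosine_similarity_sketch(counts, dim='B')  # shape (2, 2)
```

In this sketch the result is a square, symmetric matrix with one row and column per term in the chosen dimension, and ones on the diagonal.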

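Cosine similarity normalizes each vector by its length, so this function gives the same result on raw counts as on normalized data, provided the normalization rescales each term's vector (each row for 'A', each column for 'B') by a positive constant.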
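The claim above can be checked numerically. The sketch below uses a hypothetical helper, `row_cosine`, written in plain NumPy rather than the library function, and it assumes that normalization means dividing each row by its own positive total.

```python
import numpy as np

def row_cosine(data):
    """Hypothetical helper: pairwise cosine similarity across rows."""
    # Scale every row to unit Euclidean length, then take dot products
    unit = data / np.linalg.norm(data, axis=1, keepdims=True)
    return unit @ unit.T

# Made-up raw co-occurrence counts
raw = np.array([[4.0, 2.0, 0.0],
                [1.0, 3.0, 5.0]])

# Assumed normalization: divide each row by its own total count
normalized = raw / raw.sum(axis=1, keepdims=True)

# Compare the similarity matrices from raw and normalized data
same = np.allclose(row_cosine(raw), row_cosine(normalized))
print(same)
```

Note that this holds only because every row is rescaled by a single constant. Normalizing along the other dimension changes the direction of each row vector, so the similarity values would generally differ.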

The implementation is adapted from here: https://stackoverflow.com/a/20687984