lisc.analysis.counts.compute_similarity

lisc.analysis.counts.compute_similarity(data, dim='A')

Calculate the similarity across the co-occurrence data.

Parameters
data : 2d array

Counts of co-occurrence of terms.

dim : {‘A’, ‘B’}, optional

Which set of terms to compute similarity across. ‘A’ compares the rows of the data with each other, ‘B’ compares the columns.

Returns
cosine : 2d array

The cosine similarity of the co-occurrence data.

Notes

This function computes the cosine similarity.

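Cosine similarity depends only on the direction of each vector, not its magnitude, so this function gives the same result whether it is computed on raw counts or on normalized data in which each term's vector has been rescaled by a positive constant.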
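The scale invariance can be checked with a minimal sketch. The helper `cosine` and the example counts below are hypothetical and are not part of LISC. The sketch assumes that normalization divides each term's whole vector by a single positive constant.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1d vectors (illustrative helper, not a LISC function)."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))


# Hypothetical co-occurrence counts for two terms
raw_a = np.array([4.0, 0.0, 2.0])
raw_b = np.array([1.0, 3.0, 1.0])

# Assumed normalization: each vector is divided by its own positive constant
scaled_a = raw_a / 50.0
scaled_b = raw_b / 8.0

raw_sim = cosine(raw_a, raw_b)
scaled_sim = cosine(scaled_a, scaled_b)

# The two values agree up to floating point error
print(np.isclose(raw_sim, scaled_sim))
```

The invariance holds only when each vector is rescaled as a whole. Dividing individual elements of a vector by different constants changes its direction and would generally change the similarity.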

The implementation is adapted from here: https://stackoverflow.com/a/20687984
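For orientation, one common way to compute pairwise cosine similarity for a 2d array is sketched below. This is an illustration under stated assumptions, not the LISC source code and not necessarily the approach in the linked answer. The function name `cosine_similarity_sketch` is hypothetical, and returning a similarity of 0 for all-zero vectors is an assumption made here to avoid division by zero.

```python
import numpy as np


def cosine_similarity_sketch(data, dim='A'):
    """Pairwise cosine similarity across rows ('A') or columns ('B').

    Illustrative only; this is not the LISC implementation.
    """
    if dim not in ('A', 'B'):
        raise ValueError("dim must be 'A' or 'B'")

    data = np.asarray(data, dtype=float)

    # For 'B', transpose so that the vectors being compared are always rows
    if dim == 'B':
        data = data.T

    # Dot products between every pair of rows
    gram = data @ data.T

    # The diagonal holds each row's squared magnitude
    sq_norms = np.diag(gram)

    # Inverse magnitudes; all-zero rows keep 0 (assumption: similarity 0)
    inv_norms = np.zeros_like(sq_norms)
    nonzero = sq_norms > 0
    inv_norms[nonzero] = 1.0 / np.sqrt(sq_norms[nonzero])

    # Divide each dot product by the product of the two magnitudes
    return gram * inv_norms[:, None] * inv_norms[None, :]


# Hypothetical counts: 2 'A' terms (rows) by 3 'B' terms (columns)
counts = np.array([[4, 0, 2],
                   [1, 3, 1]])

sim_a = cosine_similarity_sketch(counts, dim='A')  # shape (2, 2)
sim_b = cosine_similarity_sketch(counts, dim='B')  # shape (3, 3)
```

In this sketch the output is square and symmetric, with one row and column per term in the chosen dimension, and each non-zero vector has a similarity of 1 with itself.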