lisc.Counts

class lisc.Counts[source]

A class for collecting and analyzing co-occurrence data for specified lists of search terms.

Attributes
terms : dict

Search terms to use.

counts : 2d array

The number of articles found for each combination of terms.

score : 2d array

A transformed ‘score’ of co-occurrence data. This may be normalized count data, or a similarity or association measure.

score_info : dict

Information about the computed score data.

square : bool

Whether the count data matrix is symmetrical.

meta_data : MetaData

Meta data information about the data collection.

__init__()[source]

Initialize LISC Counts object.

Methods

__init__()

Initialize LISC Counts object.

add_labels(labels[, directory, dim])

Add labels for terms to the object.

add_terms(terms[, term_type, directory, dim])

Add search terms to the object.

check_counts([dim])

Check how many articles were found for each term.

check_data([data_type, dim])

Print the highest count or score value for each term.

check_top([dim])

Check the terms with the most articles.

compute_score([score_type, dim, return_result])

Compute a score, such as an index or normalization, of the co-occurrence data.

copy()

Return a copy of the current object.

drop_data(n_articles[, dim])

Drop terms based on number of article results.

run_collection([db, field, api_key, ...])

Collect co-occurrence data.

Attributes

has_data

Whether the object has collected data.

add_labels(labels, directory=None, dim='A')[source]

Add labels for terms to the object.

Parameters
labels : list of str or str

Labels for each term to add to the object. If a list, it is assumed to contain the labels. If a str, it is assumed to be a file name to load from.

directory : SCDB or str, optional

Folder or database object specifying the file location, if loading from file.

dim : {‘A’, ‘B’}, optional

Which set of labels to add.
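
Examples

Add labels for a previously added set of terms (a sketch; the label strings here are illustrative abbreviations, assumed to match the number of terms):

>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])
>>> counts.add_labels(['FL', 'TL', 'PL', 'OL'])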

add_terms(terms, term_type='terms', directory=None, dim='A')[source]

Add search terms to the object.

Parameters
terms : list or dict or str

Terms to add to the object. If a list, it is assumed to contain terms, which can be a list of str or a list of list of str. If a dict, each key should reflect a term_type, and the values the corresponding terms. If a str, it is assumed to be a file name to load from.

term_type : {‘terms’, ‘inclusions’, ‘exclusions’}, optional

Which type of terms are being added.

directory : SCDB or str, optional

A string or object containing a file path.

dim : {‘A’, ‘B’}, optional

Which set of terms to add.

Examples

Add one set of terms, from a list:

>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])

Add a second set of terms, from a list:

>>> counts.add_terms(['attention', 'perception'], dim='B')

Add some exclusion words, for the second set of terms, from a list:

>>> counts.add_terms(['', 'extrasensory'], term_type='exclusions', dim='B')
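
Terms of different types can also be added together from a dict, keyed by term type (a hedged sketch based on the parameter description above; value formats follow the list forms accepted for each type):

>>> counts.add_terms({'terms': ['attention', 'perception'], 'exclusions': ['', 'extrasensory']}, dim='B')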

check_counts(dim='A')[source]

Check how many articles were found for each term.

Parameters
dim : {‘A’, ‘B’}

Which set of terms to check.

Examples

Print the number of articles found for each term (assuming counts already has data):

>>> counts.check_counts() 

check_data(data_type='counts', dim='A')[source]

Print the highest count or score value for each term.

Parameters
data_type : {‘counts’, ‘score’}

Which data type to use.

dim : {‘A’, ‘B’}, optional

Which set of terms to check.

Examples

Print the highest count for each term (assuming counts already has data):

>>> counts.check_data() 

Print the highest score value for each term (assuming counts already has data):

>>> counts.check_data(data_type='score') 

check_top(dim='A')[source]

Check the terms with the most articles.

Parameters
dim : {‘A’, ‘B’}, optional

Which set of terms to check.

Examples

Print which term has the most articles (assuming counts already has data):

>>> counts.check_top() 

compute_score(score_type='association', dim='A', return_result=False)[source]

Compute a score, such as an index or normalization, of the co-occurrence data.

Parameters
score_type : {‘association’, ‘normalize’, ‘similarity’}, optional

The type of score to apply to the co-occurrence data.

dim : {‘A’, ‘B’}, optional

Which dimension of counts to use to normalize by or compute similarity across. Only used if score_type is ‘normalize’ or ‘similarity’.

return_result : bool, optional, default: False

Whether to return the computed result.

Examples

Compute association scores of co-occurrence data collected for two lists of terms:

>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])
>>> counts.add_terms(['attention', 'perception'], dim='B')
>>> counts.run_collection() 
>>> counts.compute_score() 
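
A different score type can be selected, and the computed result optionally returned (a sketch; assumes the collection above has been run):

>>> normed = counts.compute_score(score_type='normalize', return_result=True) 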

Once you have co-occurrence scores calculated, you might want to plot this data.

You can plot the results as a matrix:

>>> from lisc.plts.counts import plot_matrix  
>>> plot_matrix(counts)  

And/or as a clustermap:

>>> from lisc.plts.counts import plot_clustermap  
>>> plot_clustermap(counts)  

And/or as a dendrogram:

>>> from lisc.plts.counts import plot_dendrogram  
>>> plot_dendrogram(counts)  

copy()[source]

Return a copy of the current object.
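
Examples

Make an independent copy of an existing object (a minimal sketch; assumes counts is a Counts object with terms and/or data already added):

>>> counts_copy = counts.copy()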

drop_data(n_articles, dim='A')[source]

Drop terms based on number of article results.

Parameters
n_articles : int

Minimum number of articles required to keep each term.

dim : {‘A’, ‘B’}, optional

Which set of terms to drop.

Examples

Drop terms with fewer than 20 articles (assuming counts already has data):

>>> counts.drop_data(20) 

property has_data

Whether the object has collected data.
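
Examples

Check whether data has been collected before running analyses (a sketch; counts is assumed to be an existing Counts object):

>>> if counts.has_data:  
...     counts.check_top()  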

run_collection(db='pubmed', field='TIAB', api_key=None, logging=None, directory=None, verbose=False, **eutils_kwargs)[source]

Collect co-occurrence data.

Parameters
db : str, optional, default: ‘pubmed’

Which database to access from EUtils.

field : str, optional, default: ‘TIAB’

Field to search for terms in. Defaults to ‘TIAB’, which is Title/Abstract.

api_key : str, optional

An API key for an NCBI account.

logging : {None, ‘print’, ‘store’, ‘file’}, optional

What kind of logging, if any, to do for requested URLs.

directory : str or SCDB, optional

Folder or database object specifying the save location.

verbose : bool, optional, default: False

Whether to print out updates.

**eutils_kwargs

Additional settings for the EUtils API.

Examples

Collect co-occurrence data from added terms, across one set of terms:

>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])
>>> counts.run_collection() 

Collect co-occurrence data from added terms, across two sets of terms:

>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])
>>> counts.add_terms(['attention', 'perception', 'cognition'], dim='B')
>>> counts.run_collection() 
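
Optional settings, such as an NCBI API key and verbose printing of updates, can also be passed to the collection (a sketch; the key value here is a placeholder):

>>> counts.run_collection(api_key='YOUR_API_KEY', verbose=True) 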

Examples using lisc.Counts

Tutorial 03: Counts Collection

Tutorial 04: Counts Analysis

Tutorial 06: MetaData