lisc.Counts

class lisc.Counts[source]

A class for collecting and analyzing co-occurrence data for specified lists of search terms.

Attributes:
terms : dict

Search terms to use.

counts : 2d array

The number of articles found for each combination of terms.

score : 2d array

A transformed ‘score’ of co-occurrence data. This may be normalized count data, or a similarity or association measure.

score_info : dict

Information about the computed score data.

square : bool

Whether the count data matrix is symmetrical.

meta_data : MetaData

Meta data information about the data collection.

__init__()[source]

Initialize LISC Counts object.

Methods

__init__()

Initialize LISC Counts object.

add_labels(labels[, directory, dim])

Add labels for terms to the object.

add_terms(terms[, term_type, directory, dim])

Add search terms to the object.

check_counts([dim])

Check how many articles were found for each term.

check_data([data_type, dim])

Print out the highest count or score value for each term.

check_top([dim])

Check the terms with the most articles.

clear_score()

Clear any previously computed score.

compute_score([score_type, dim, return_result])

Compute a score, such as an index or normalization, of the co-occurrence data.

copy()

Return a copy of the current object.

drop_data(n_articles[, dim, value])

Drop terms based on number of article results.

run_collection([db, field, api_key, ...])

Collect co-occurrence data.

set_joiners([search, inclusions, ...])

Set joiners to use, specified for each term type.

Attributes

has_data

Indicator for whether the object has collected data.

add_labels(labels, directory=None, dim='A')[source]

Add labels for terms to the object.

Parameters:
labels : list of str or str

Labels for each term to add to the object. If list, assumed to be labels. If str, assumed to be a file name to load from.

directory : SCDB or str, optional

Folder or database object specifying the file location, if loading from file.

dim : {‘A’, ‘B’}, optional

Which set of labels to add.
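
Examples

Add labels for a set of terms, from a list (an illustrative sketch, assuming a Counts object with four terms already added; the label strings here are hypothetical):

>>> counts.add_labels(['FL', 'TL', 'PL', 'OL'])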

add_terms(terms, term_type='terms', directory=None, dim='A')[source]

Add search terms to the object.

Parameters:
terms : list or dict or str

Terms to add to the object. If list, assumed to be terms, which can be a list of str or a list of list of str. If dict, each key should reflect a term_type, and values the corresponding terms. If str, assumed to be a file name to load from.

term_type : {‘terms’, ‘inclusions’, ‘exclusions’}, optional

Which type of terms are being added.

directory : SCDB or str, optional

A string or object containing a file path.

dim : {‘A’, ‘B’}, optional

Which set of terms to add.

Examples

Add one set of terms, from a list:

>>> from lisc import Counts
>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])

Add a second set of terms, from a list:

>>> counts.add_terms(['attention', 'perception'], dim='B')

Add some exclusion words, for the second set of terms, from a list:

>>> counts.add_terms(['', 'extrasensory'], term_type='exclusions', dim='B')

check_counts(dim='A')[source]

Check how many articles were found for each term.

Parameters:
dim : {‘A’, ‘B’, ‘both’}, optional

Which set of terms to check.

Examples

Print the number of articles found for each term (assuming counts already has data):

>>> counts.check_counts() 

check_data(data_type='counts', dim='A')[source]

Print out the highest count or score value for each term.

Parameters:
data_type : {‘counts’, ‘score’}, optional

Which data type to use.

dim : {‘A’, ‘B’, ‘both’}, optional

Which set of terms to check.

Examples

Print the highest count for each term (assuming counts already has data):

>>> counts.check_data() 

Print the highest score value for each term (assuming counts already has data):

>>> counts.check_data(data_type='score') 

check_top(dim='A')[source]

Check the terms with the most articles.

Parameters:
dim : {‘A’, ‘B’, ‘both’}, optional

Which set of terms to check.

Examples

Print which term has the most articles (assuming counts already has data):

>>> counts.check_top() 

clear_score()[source]

Clear any previously computed score.
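
Examples

Clear a previously computed score (a minimal sketch, assuming a score has already been computed on the object):

>>> counts.clear_score()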

compute_score(score_type='association', dim='A', return_result=False)[source]

Compute a score, such as an index or normalization, of the co-occurrence data.

Parameters:
score_type : {‘association’, ‘normalize’, ‘similarity’}, optional

The type of score to apply to the co-occurrence data.

dim : {‘A’, ‘B’}, optional

Which dimension of counts to use to normalize by or compute similarity across. Only used if score_type is ‘normalize’ or ‘similarity’.

return_result : bool, optional, default: False

Whether to return the computed result.

Examples

Compute association scores of co-occurrence data collected for two lists of terms:

>>> from lisc import Counts
>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])
>>> counts.add_terms(['attention', 'perception'], dim='B')
>>> counts.run_collection() 
>>> counts.compute_score() 

Once you have co-occurrence scores calculated, you might want to plot this data.

You can plot the results as a matrix:

>>> from lisc.plts.counts import plot_matrix  
>>> plot_matrix(counts)  

And/or as a clustermap:

>>> from lisc.plts.counts import plot_clustermap  
>>> plot_clustermap(counts)  

And/or as a dendrogram:

>>> from lisc.plts.counts import plot_dendrogram  
>>> plot_dendrogram(counts)  
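
Other score types can be computed by setting score_type (a sketch, assuming co-occurrence data has already been collected, as above):

>>> counts.compute_score(score_type='normalize') 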

copy()[source]

Return a copy of the current object.
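
Examples

Create a copy of an existing object (a minimal sketch, assuming an existing Counts object counts):

>>> counts_copy = counts.copy()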

drop_data(n_articles, dim='A', value='count')[source]

Drop terms based on number of article results.

Parameters:
n_articles : int

Minimum number of articles required to keep each term.

dim : {‘A’, ‘B’}, optional

Which set of terms to drop.

value : {‘count’, ‘coocs’}, optional

Which data count to drop based on:

    ‘count’ : drops based on the total number of articles per term
    ‘coocs’ : drops based on the co-occurrences, if all values are below n_articles

Notes

This will drop any computed scores, as they may not be accurate after dropping data.

Examples

Drop terms with fewer than 20 articles (assuming counts already has data):

>>> counts.drop_data(20) 

property has_data

Indicator for whether the object has collected data.
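
For example, this can be used to guard analyses on collected data (a minimal sketch, assuming a Counts object counts):

>>> if counts.has_data:
...     counts.compute_score()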

run_collection(db='pubmed', field='TIAB', api_key=None, logging=None, directory=None, verbose=False, **eutils_kwargs)[source]

Collect co-occurrence data.

Parameters:
db : str, optional, default: ‘pubmed’

Which database to access from EUtils.

field : str, optional, default: ‘TIAB’

Field to search for term in. Defaults to ‘TIAB’, which is Title/Abstract.

api_key : str, optional

An API key for a NCBI account.

logging : {None, ‘print’, ‘store’, ‘file’}, optional

What kind of logging, if any, to do for requested URLs.

directory : str or SCDB, optional

Folder or database object specifying the save location.

verbose : bool, optional, default: False

Whether to print out updates.

**eutils_kwargs

Additional settings for the EUtils API.

Examples

Collect co-occurrence data from added terms, across one set of terms:

>>> from lisc import Counts
>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])
>>> counts.run_collection() 

Collect co-occurrence data from added terms, across two sets of terms:

>>> from lisc import Counts
>>> counts = Counts()
>>> counts.add_terms(['frontal lobe', 'temporal lobe', 'parietal lobe', 'occipital lobe'])
>>> counts.add_terms(['attention', 'perception', 'cognition'], dim='B')
>>> counts.run_collection() 
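
Additional EUtils settings can be passed through as keyword arguments (a sketch, under the assumption that they are forwarded to the search; ‘mindate’ and ‘maxdate’ are standard EUtils date-range parameters):

>>> counts.run_collection(mindate='2000/01/01', maxdate='2020/12/31') 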

set_joiners(search=None, inclusions=None, exclusions=None, dim='A')[source]

Set joiners to use, specified for each term type.

Parameters:
search, inclusions, exclusions : {‘OR’, ‘AND’, ‘NOT’}, optional

Joiner to use to combine terms, for search, inclusions, and exclusions terms.

dim : {‘A’, ‘B’}, optional

Which set of terms to set joiners for.
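
Examples

Set the joiner used for search terms (an illustrative sketch, using one of the supported joiner values):

>>> from lisc import Counts
>>> counts = Counts()
>>> counts.set_joiners(search='OR')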

Examples using lisc.Counts

Tutorial 04: Counts Collection

Tutorial 05: Counts Analysis

Tutorial 07: MetaData