lisc.collect_words
- lisc.collect_words(terms, inclusions=None, exclusions=None, labels=None, db='pubmed', retmax=100, field='TIAB', usehistory=False, api_key=None, save_and_clear=False, logging=None, directory=None, collect_info=True, verbose=False, **eutils_kwargs)
Collect text data and metadata from EUtils using specified search term(s).
- Parameters
- terms : list of list of str
Search terms.
- inclusions : list of list of str, optional
Inclusion words for search terms.
- exclusions : list of list of str, optional
Exclusion words for search terms.
- labels : list of str, optional
Labels for the search terms.
- db : str, optional, default: ‘pubmed’
Which database to access from EUtils.
- retmax : int, optional, default: 100
Maximum number of articles to return.
- field : str, optional, default: ‘TIAB’
Field to search for term within. Defaults to ‘TIAB’, which is Title/Abstract.
- usehistory : bool, optional, default: False
Whether to use EUtils history, storing results on their server.
- api_key : str, optional
An API key for an NCBI account.
- save_and_clear : bool, optional, default: False
Whether to save words data to disk after each term is collected, instead of holding it all in memory.
- logging : {None, ‘print’, ‘store’, ‘file’}, optional
What kind of logging, if any, to do for requested URLs.
- directory : str or SCDB, optional
Folder or database object specifying the save location.
- collect_info : bool, optional, default: True
Whether to collect database information, to be added to meta data.
- verbose : bool, optional, default: False
Whether to print out updates.
- **eutils_kwargs
Additional settings for the EUtils API.
- Returns
- results : list of Articles
Results from collecting data for each term.
- meta_data : MetaData
Meta data from the data collection.
Notes
The collection does an exact word search for the term given. It then loops through all the articles found for that term.
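As a rough illustration of how an exact-phrase search might be assembled from a term, its inclusion words, and its exclusion words, the sketch below builds a PubMed-style query string. The helper `build_query` is hypothetical and is not part of LISC; it assumes that each inner list of `terms` holds a term and its synonyms, that synonyms are joined with OR, and that inclusions and exclusions are attached with AND and NOT. The query LISC actually sends may be formatted differently.

```python
def build_query(synonyms, inclusions=(), exclusions=(), field="TIAB"):
    """Hypothetical helper: combine one term's words into a query string."""

    def group(words):
        # Quote each phrase for exact matching and tag it with the search field
        return "(" + " OR ".join(f'"{word}"[{field}]' for word in words) + ")"

    query = group(synonyms)
    if inclusions:
        # Articles must also match at least one inclusion word
        query += " AND " + group(inclusions)
    if exclusions:
        # Articles matching any exclusion word are dropped
        query += " NOT " + group(exclusions)
    return query


query = build_query(["frontal lobe", "frontal cortex"],
                    inclusions=["human"], exclusions=["rat"])
print(query)
```

Keeping the field tag as an argument mirrors the `field` parameter above, so the same sketch works for fields other than Title/Abstract.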
For each article, it pulls and saves out data (including title, abstract, authors, etc.), using the hierarchical tag structure that organizes the articles.
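To make the idea of a hierarchical tag structure concrete, the following sketch walks a small XML record with the standard library. The record is a hand-written placeholder that follows the general shape of PubMed article XML; it is not real data. The function `extract_articles` is hypothetical and does not reproduce how LISC parses or stores articles.

```python
import xml.etree.ElementTree as ET

# Hand-written placeholder record, shaped like PubMed article XML (not real data)
SAMPLE_XML = """
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <Article>
        <ArticleTitle>A placeholder title</ArticleTitle>
        <Abstract>
          <AbstractText>A placeholder abstract.</AbstractText>
        </Abstract>
        <AuthorList>
          <Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
          <Author><LastName>Roe</LastName><ForeName>Richard</ForeName></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""


def extract_articles(xml_text):
    """Hypothetical helper: pull title, abstract, and authors from each article."""
    root = ET.fromstring(xml_text.strip())
    records = []
    # Each PubmedArticle element is one article; its fields live in nested tags
    for article in root.iter("PubmedArticle"):
        records.append({
            "title": article.findtext(".//ArticleTitle"),
            "abstract": article.findtext(".//AbstractText"),
            "authors": [
                (author.findtext("LastName"), author.findtext("ForeName"))
                for author in article.iter("Author")
            ],
        })
    return records


records = extract_articles(SAMPLE_XML)
print(records[0]["title"], records[0]["authors"])
```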
Examples
Collect words data for two terms, limiting the results to 5 articles per term:
>>> from lisc import collect_words
>>> results, meta_data = collect_words([['frontal lobe'], ['temporal lobe']], retmax=5)
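Several of the parameters above (`db`, `retmax`, `field`, `usehistory`, `api_key`) share their names with settings of the NCBI E-utilities search endpoint. The sketch below shows what a search URL carrying those settings could look like, assuming they are passed through under the same names. The function `build_esearch_url` is hypothetical, makes no network request, and is not how LISC builds its requests internally.

```python
from urllib.parse import urlencode

# Base address of the E-utilities search endpoint
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="pubmed", retmax=100, field="TIAB",
                      usehistory=False, api_key=None):
    """Hypothetical helper: assemble a search URL without sending it."""
    params = {"db": db, "term": term, "retmax": retmax, "field": field}
    if usehistory:
        # E-utilities expects 'y' to store results on the history server
        params["usehistory"] = "y"
    if api_key is not None:
        params["api_key"] = api_key
    # urlencode percent-escapes quotes and encodes spaces in the term
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url('"frontal lobe"', retmax=5)
print(url)
```

Optional settings are only added when set, so the default call stays close to the minimal request.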