lisc.io.db.SCDB
- class lisc.io.db.SCDB(base=None, generate_paths=True, structure={1: {'base': ['terms', 'logs', 'data', 'figures']}, 2: {'data': ['counts', 'words']}, 3: {'words': ['raw', 'summary']}})
Database object for a SCANR project.
Notes
The default set of paths for SCDB is:
Level 1: Base
terms
Terms files.
logs
Log files.
figures
Figure files.
data
Data files.
Level 2: Data
counts
Counts data files.
words
Words data files.
Level 3: Words
raw
Raw words data files.
summary
Summary files for words data.
- Attributes
- paths : dict
Dictionary of all folder paths in the project.
- __init__(base=None, generate_paths=True, structure={1: {'base': ['terms', 'logs', 'data', 'figures']}, 2: {'data': ['counts', 'words']}, 3: {'words': ['raw', 'summary']}})
Initialize a SCDB object.
- Parameters
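- base : str
The base path where the database is located.
- generate_paths : bool, optional, default: True
Whether to automatically generate all the paths for the database folders.
- structure : dict, optional
Definition of the folder structure for the database.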
Examples
Initialize a SCDB object:
>>> db = SCDB('lisc_db')
Methods
__init__([base, generate_paths, structure]): Initialize a SCDB object.
Check the file structure of the database.
gen_paths([structure]): Generate all the full paths for the database object.
get_file_path(folder, file_name): Get a path to a file in a designated directory folder.
get_files(folder[, drop_ext, sort_files]): Get a list of available files in a folder in the database.
get_folder_path(folder): Get the path to a folder in the directory.
- gen_paths(structure={1: {'base': ['terms', 'logs', 'data', 'figures']}, 2: {'data': ['counts', 'words']}, 3: {'words': ['raw', 'summary']}})
Generate all the full paths for the database object.
- Parameters
- structure : dict, optional
Definition of the folder structure for the database.
Examples
Generate paths for a SCDB object:
>>> db = SCDB('lisc_db')
>>> db.gen_paths()
>>> db.paths
{'base': PosixPath('lisc_db'),
 'terms': PosixPath('lisc_db/terms'),
 'logs': PosixPath('lisc_db/logs'),
 'data': PosixPath('lisc_db/data'),
 'figures': PosixPath('lisc_db/figures'),
 'counts': PosixPath('lisc_db/data/counts'),
 'words': PosixPath('lisc_db/data/words'),
 'raw': PosixPath('lisc_db/data/words/raw'),
 'summary': PosixPath('lisc_db/data/words/summary')}
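The way a level-keyed structure dictionary can expand into full paths may be easier to see in isolation. The following is a minimal sketch using only pathlib, not the library's implementation; the names DEFAULT_STRUCTURE and build_paths are hypothetical and exist only for this illustration.

```python
from pathlib import Path

# Same shape as the default `structure` argument shown in the signature.
DEFAULT_STRUCTURE = {
    1: {'base': ['terms', 'logs', 'data', 'figures']},
    2: {'data': ['counts', 'words']},
    3: {'words': ['raw', 'summary']},
}


def build_paths(base, structure=DEFAULT_STRUCTURE):
    """Hypothetical helper: map each folder label to its full path."""
    paths = {'base': Path(base)}
    # Walk the levels in order so every parent is resolved before its children.
    for level in sorted(structure):
        for parent, children in structure[level].items():
            for child in children:
                paths[child] = paths[parent] / child
    return paths


paths = build_paths('lisc_db')
```

Because each level only refers to folders defined at an earlier level, processing the levels in ascending order is enough to resolve every nested path.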
- get_file_path(folder, file_name)
Get a path to a file in a designated directory folder.
- Parameters
- folder : str
Which folder path to get the file path from.
- file_name : str
The name of the file to create the full file path for.
- Returns
- pathlib.Path
The full file path to the requested file.
Examples
Get the path to a Counts file:
>>> db = SCDB('lisc_db')
>>> db.get_file_path('counts', 'tutorial_counts.p')
PosixPath('lisc_db/data/counts/tutorial_counts.p')
- get_files(folder, drop_ext=False, sort_files=True)
Get a list of available files in a folder in the database.
- Parameters
- folder : str
Which folder path to get the list of files from.
- drop_ext : bool, optional, default: False
Whether to drop the extensions from the list of file names.
- sort_files : bool, optional, default: True
Whether to sort the list of files before returning.
- Returns
- files : list of str
List of files available in specified folder.
Examples
Get a list of available terms files:
>>> db = SCDB('lisc_db')
>>> db.get_files('terms')
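To make the effect of drop_ext and sort_files concrete, here is a rough standalone sketch of the same options applied to a temporary folder. It is an assumption about the general behavior, not the library's code: list_files is a hypothetical stand-in, and details such as how hidden files or multi-part extensions are treated may differ in SCDB.

```python
import tempfile
from pathlib import Path


def list_files(folder, drop_ext=False, sort_files=True):
    """Hypothetical stand-in mirroring the documented get_files options."""
    names = [path.name for path in Path(folder).iterdir() if path.is_file()]
    if drop_ext:
        # Path.stem removes only the final suffix ('a.tar.gz' -> 'a.tar').
        names = [Path(name).stem for name in names]
    if sort_files:
        names.sort()
    return names


with tempfile.TemporaryDirectory() as tmp:
    # Create two empty files, deliberately out of alphabetical order.
    for name in ['b_terms.txt', 'a_terms.txt']:
        (Path(tmp) / name).touch()
    with_ext = list_files(tmp)
    without_ext = list_files(tmp, drop_ext=True)
```

With sort_files=False the order would be whatever the file system returns, which is not guaranteed to be stable across platforms.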
- get_folder_path(folder)
Get the path to a folder in the directory.
- Parameters
- folder : str
Which folder to get the path for.
- Returns
- pathlib.Path
The path to the requested directory folder.
Examples
Get the path to the folder containing Counts data:
>>> db = SCDB('lisc_db')
>>> db.get_folder_path('counts')
PosixPath('lisc_db/data/counts')
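The relationship between get_folder_path and get_file_path can be pictured as a lookup in the paths dictionary followed by a path join. The sketch below assumes that model; folder_path and file_path are hypothetical helpers, not SCDB methods, and the paths dictionary is written out by hand.

```python
from pathlib import Path

# Hand-written subset of a paths dictionary, for illustration only.
paths = {
    'base': Path('lisc_db'),
    'counts': Path('lisc_db') / 'data' / 'counts',
}


def folder_path(paths, folder):
    """Hypothetical helper: look up a folder label in the paths dictionary."""
    return paths[folder]


def file_path(paths, folder, file_name):
    """Hypothetical helper: join a file name onto the looked-up folder path."""
    return folder_path(paths, folder) / file_name


counts_folder = folder_path(paths, 'counts')
counts_file = file_path(paths, 'counts', 'tutorial_counts.p')
```

In this sketch an unknown folder label raises KeyError; how SCDB itself responds to an unknown label is not stated in this reference.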