pliers.extractors.SpaCyExtractor

class pliers.extractors.SpaCyExtractor(extractor_type='token', features=None, model='en_core_web_sm')

Bases: TextExtractor

A generic class for spaCy text extractors.

Uses spaCy to extract features from text. By default, extracts features for every word (token); set extractor_type to ‘doc’ to analyze an entire sentence/document instead.

Parameters

extractor_type (str) – The type of feature to extract. Must be one of ‘doc’ (analyze an entire sentence/document) or ‘token’ (analyze each word).
features (list) – A list of strings giving the names of spaCy features to extract. See the spaCy documentation for details. By default, returns all available features for the given extractor type.
model (str) – The name of the language model to use.
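
The parameters above can be combined as in the following minimal sketch. The helper name extract_token_features is hypothetical, and the feature names ‘lemma_’ and ‘pos_’ are spaCy token attributes chosen purely for illustration. The sketch assumes that pliers, spaCy, and the en_core_web_sm model are installed, and that the result object exposes to_df() as other pliers extractor results do.

```python
def extract_token_features(text, features=("lemma_", "pos_")):
    """Return per-token spaCy features for `text` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not itself
    # require pliers or spaCy to be installed.
    from pliers.extractors import SpaCyExtractor
    from pliers.stimuli import TextStim

    extractor = SpaCyExtractor(
        extractor_type="token",      # one row of features per word
        features=list(features),     # assumed spaCy token attribute names
        model="en_core_web_sm",      # the model must be installed separately
    )
    result = extractor.transform(TextStim(text=text))
    # Assumption: the extractor result can be converted to a DataFrame.
    return result.to_df()
```

Calling extract_token_features("The quick brown fox jumps.") should then yield one row per token. Passing extractor_type='doc' to the constructor instead would analyze the text as a whole.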

__init__(extractor_type='token', features=None, model='en_core_web_sm')

transform(stim, *args, **kwargs)

Executes the transformation on the passed stim(s).

Parameters

stims (str, Stim, list) –
One or more stimuli to process. Must be one of:
- A string giving the path to a file that can be read in as a Stim (e.g., a .txt file, .jpg image, etc.)
- A Stim instance of any type.
- An iterable of stims, where each element is either a string or a Stim.
validation (str) –
String specifying how validation errors should be handled. Must be one of:
- ‘strict’: Raise an exception on any validation error
- ‘warn’: Issue a warning for all validation errors
- ‘loose’: Silently ignore all validation errors
args – Optional positional arguments to pass on to the internal _transform call.
kwargs – Optional keyword arguments to pass on to the internal _transform call.
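
The three validation modes can be pictured with the following standalone sketch. It is not the pliers implementation: handle_validation_error is a hypothetical helper, and the choice of ValueError as the exception type is an assumption made only to illustrate the documented strict/warn/loose behavior.

```python
import warnings


def handle_validation_error(message, validation="strict"):
    """Illustrate the documented validation modes (hypothetical helper).

    Returns False when the error is tolerated, so a caller could skip
    the offending stim and carry on.
    """
    if validation == "strict":
        # Assumed exception type; the real library may raise something else.
        raise ValueError(message)
    if validation == "warn":
        warnings.warn(message)
        return False
    if validation == "loose":
        return False
    raise ValueError(f"Unknown validation mode: {validation!r}")
```

In this sketch, ‘warn’ and ‘loose’ differ only in whether the caller is notified; both let processing continue.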