pliers.extractors.DictionaryExtractor

class pliers.extractors.DictionaryExtractor(dictionary, variables=None, missing=nan)

Bases: TextExtractor

A generic dictionary-based extractor that supports extraction of arbitrary features contained in a lookup table.

Parameters
  • dictionary (str, DataFrame) – The dictionary containing the feature values. Either a string giving the path to the dictionary file, or a pandas DataFrame. A dictionary file must be tab-delimited, with the first column containing the text key used for lookup. Each subsequent column represents a single feature that can be used in extraction.

  • variables (list) – Optional subset of columns to keep from the dictionary.

  • missing – Value to insert if no lookup value is found for a text token. Defaults to numpy’s NaN.
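The lookup behavior described by these parameters can be sketched in plain Python. This is only an illustration of the idea, not the pliers implementation: the helper names load_dictionary and lookup, the sample data, and the conversion of feature values to floats are all assumptions made for the example.

```python
import csv
import io
import math


def load_dictionary(tsv_text, variables=None):
    """Parse tab-delimited text into {key: {feature: value}} (hypothetical helper)."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    features = header[1:]  # the first column holds the lookup key
    # Keep every feature column unless a subset was requested.
    keep = list(features) if variables is None else list(variables)
    table = {}
    for row in reader:
        if not row:  # skip blank lines
            continue
        values = dict(zip(features, row[1:]))
        # Assumption for this sketch: every feature value is numeric.
        table[row[0]] = {name: float(values[name]) for name in keep}
    return keep, table


def lookup(token, keep, table, missing=math.nan):
    """Return the feature values for a token, or `missing` for each feature."""
    row = table.get(token)
    if row is None:
        return {name: missing for name in keep}
    return dict(row)


# Made-up sample dictionary: key column first, then one column per feature.
SAMPLE = "word\tlength\tfrequency\napple\t5\t12.5\npear\t4\t3.0\n"

keep, table = load_dictionary(SAMPLE, variables=["frequency"])
hit = lookup("apple", keep, table)   # token present in the dictionary
miss = lookup("kiwi", keep, table)   # token absent, so `missing` is used
```

With pliers itself, the corresponding step would presumably be to construct DictionaryExtractor with the path to such a file (or an equivalent DataFrame) and pass text stimuli to transform.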

__init__(dictionary, variables=None, missing=nan)
transform(stim, *args, **kwargs)

Executes the transformation on the passed stim(s).

Parameters
  • stims (str, Stim, list) –

    One or more stimuli to process. Must be one of:

    • A string giving the path to a file that can be read in as a Stim (e.g., a .txt file, .jpg image, etc.)

    • A Stim instance of any type.

    • An iterable of stims, where each element is either a string or a Stim.

  • validation (str) –

    String specifying how validation errors should be handled. Must be one of:

    • 'strict': Raise an exception on any validation error

    • 'warn': Issue a warning for all validation errors

    • 'loose': Silently ignore all validation errors

  • args – Optional positional arguments to pass on to the internal _transform call.

  • kwargs – Optional keyword arguments to pass on to the internal _transform call.
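The three validation modes can be pictured as a small dispatch on the mode string. The sketch below is a hypothetical illustration of that pattern, not the code pliers uses: the function name handle_validation_error, the choice of TypeError as the exception, and the False return value are assumptions.

```python
import warnings


def handle_validation_error(message, validation):
    """Hypothetical helper: react to a validation failure according to the mode.

    Returns False when the failure is tolerated, signalling that the stimulus
    should be skipped rather than transformed.
    """
    if validation == "strict":
        # Assumption: the exception type is chosen for illustration only.
        raise TypeError(message)
    if validation == "warn":
        warnings.warn(message)
        return False
    if validation == "loose":
        return False
    raise ValueError(f"Unknown validation mode: {validation!r}")
```

In a pipeline that processes many stimuli, 'warn' or 'loose' lets the run continue past an incompatible input, while 'strict' stops at the first problem.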