MER-Factory

The first framework for automatically constructing Multimodal Emotion Recognition and Reasoning (MERR) datasets.

  • Action Unit (AU) Pipeline: Extracts facial Action Units (AUs) and translates them into descriptive natural language.
  • Audio Analysis Pipeline: Extracts audio, transcribes speech, and performs detailed tonal analysis.
  • Video Analysis Pipeline: Generates comprehensive descriptions of video content and context.
  • Image Analysis Pipeline: Provides end-to-end emotion recognition for static images, complete with visual descriptions and emotional synthesis.
  • Full MER Pipeline: An end-to-end multimodal pipeline that identifies peak emotional moments, analyzes all modalities (visual, audio, facial), and synthesizes a holistic emotional reasoning summary.
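
To make the "peak emotional moment" and AU-to-text ideas above concrete, here is a minimal sketch in Python. It is not MER-Factory's actual code or API: the function names (`find_peak_frame`, `describe_aus`), the input layout (one dict of AU intensities per frame), the summed-intensity peak heuristic, and the activation threshold are all assumptions made for illustration. Only the AU names follow the standard FACS labels.

```python
from typing import Dict, List

# Illustrative AU-to-phrase table. The AU names follow FACS; the table itself
# is a hypothetical stand-in for whatever mapping the real pipeline uses.
AU_DESCRIPTIONS: Dict[str, str] = {
    "AU04": "brow lowerer",
    "AU06": "cheek raiser",
    "AU12": "lip corner puller",
    "AU15": "lip corner depressor",
}


def find_peak_frame(frames: List[Dict[str, float]]) -> int:
    """Return the index of the frame with the highest summed AU intensity.

    Assumption: overall AU activation is a usable proxy for the
    "peak emotional moment"; the real framework may use another criterion.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    totals = [sum(frame.values()) for frame in frames]
    return max(range(len(frames)), key=totals.__getitem__)


def describe_aus(frame: Dict[str, float], threshold: float = 1.0) -> str:
    """Turn the active AUs of one frame into a short natural-language phrase.

    An AU counts as active when its intensity reaches `threshold`
    (an assumed cutoff, not a value taken from the project).
    """
    active = sorted(
        (
            (au, value)
            for au, value in frame.items()
            if value >= threshold and au in AU_DESCRIPTIONS
        ),
        key=lambda item: item[1],
        reverse=True,  # strongest AU first
    )
    if not active:
        return "neutral expression"
    return ", ".join(
        f"{AU_DESCRIPTIONS[au]} (intensity {value:.1f})" for au, value in active
    )


if __name__ == "__main__":
    # Made-up per-frame intensities, purely for demonstration.
    demo_frames = [
        {"AU06": 0.2, "AU12": 0.5},
        {"AU06": 2.0, "AU12": 3.1},
        {"AU06": 1.0, "AU12": 0.4},
    ]
    peak = find_peak_frame(demo_frames)
    print(peak, describe_aus(demo_frames[peak]))
```

In a full pipeline, the description produced for the peak frame would be one input among several (audio transcript, tonal analysis, video description) that a language model combines into the final reasoning summary.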

Check out the documentation for more details.