MER-Factory
The first framework for automatically constructing Multimodal Emotion Recognition and Reasoning (MERR) datasets.
- Action Unit (AU) Pipeline: Extracts facial Action Units (AUs) and translates them into descriptive natural language.
- Audio Analysis Pipeline: Extracts audio, transcribes speech, and performs detailed tonal analysis.
- Video Analysis Pipeline: Generates comprehensive descriptions of video content and context.
- Image Analysis Pipeline: Provides end-to-end emotion recognition for static images, complete with visual descriptions and emotional synthesis.
- Full MER Pipeline: An end-to-end multimodal pipeline that identifies peak emotional moments, analyzes all modalities (visual, audio, facial), and synthesizes a holistic emotional reasoning summary.
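As a rough illustration of two ideas from the list above (picking a peak emotional moment and turning Action Units into natural language), here is a minimal sketch. It is not MER-Factory's actual implementation: the function names `find_peak_frame` and `describe_aus`, the per-frame dictionary format, the "highest summed AU intensity" peak heuristic, and the threshold of 1.0 are all assumptions made for the example. Only the AU labels are standard FACS names.

```python
from typing import Dict, List

# Standard FACS labels for a few common Action Units.
AU_NAMES: Dict[str, str] = {
    "AU01": "inner brow raiser",
    "AU04": "brow lowerer",
    "AU06": "cheek raiser",
    "AU12": "lip corner puller",
    "AU15": "lip corner depressor",
}


def find_peak_frame(frames: List[Dict[str, float]]) -> int:
    """Return the index of the frame with the highest total AU intensity.

    Assumption: each frame is a dict mapping AU codes to intensities.
    Summed intensity is a simple stand-in for "peak emotional moment".
    """
    if not frames:
        raise ValueError("frames must not be empty")
    return max(range(len(frames)), key=lambda i: sum(frames[i].values()))


def describe_aus(frame: Dict[str, float], threshold: float = 1.0) -> str:
    """Turn the AUs at or above `threshold` into a short English description."""
    active = [
        (au, value)
        for au, value in frame.items()
        if value >= threshold and au in AU_NAMES
    ]
    if not active:
        return "No prominent facial action units detected."
    # Strongest AUs first.
    active.sort(key=lambda pair: -pair[1])
    parts = [f"{AU_NAMES[au]} (intensity {value:.1f})" for au, value in active]
    return "Active facial action units: " + ", ".join(parts) + "."


if __name__ == "__main__":
    # Hypothetical per-frame AU intensities, made up for this example.
    frames = [
        {"AU06": 0.2, "AU12": 0.5},
        {"AU06": 2.1, "AU12": 3.4},
        {"AU04": 1.0},
    ]
    peak = find_peak_frame(frames)
    print(peak, describe_aus(frames[peak]))
```

In a full pipeline, a description like this would presumably be combined with the audio and video analyses before the final reasoning summary is synthesized; the sketch covers only the facial part.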
Check out the documentation for more details.