Getting Started with MER-Factory
Get up and running with MER-Factory in just a few minutes. This guide will walk you through the installation process and your first emotion recognition pipeline.
System Overview
MER-Factory takes videos or images as input, extracts facial Action Units, audio, and visual features, and uses multimodal models (Gemini, ChatGPT, Ollama, or Hugging Face) to build annotated emotion recognition datasets.
Prerequisites
Before installing MER-Factory, ensure you have the following dependencies installed on your system:
1. FFmpeg Installation
FFmpeg is required for video and audio processing.
macOS
brew install ffmpeg
Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg
Windows
Download from ffmpeg.org
Verify installation:
ffmpeg -version
ffprobe -version
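If you'd rather verify from Python, the following minimal sketch (not part of MER-Factory) checks that both binaries are reachable on your PATH and prints their version banners:

```python
# Minimal sketch: confirm ffmpeg and ffprobe are installed and on PATH.
import shutil
import subprocess

for tool in ("ffmpeg", "ffprobe"):
    path = shutil.which(tool)
    if path is None:
        raise SystemExit(f"{tool} not found on PATH - install it before continuing")
    # Print the first line of the version banner, e.g. "ffmpeg version 6.1 ..."
    banner = subprocess.run([tool, "-version"], capture_output=True, text=True).stdout
    print(path, "->", banner.splitlines()[0])
```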
2. OpenFace Installation
OpenFace is needed for facial Action Unit extraction.
# Clone OpenFace repository
git clone https://github.com/TadasBaltrusaitis/OpenFace.git
cd OpenFace
# Follow platform-specific build instructions
# For a Windows install, make sure you run download_models.ps1 to download the models.
# See: https://github.com/TadasBaltrusaitis/OpenFace/wiki
After building, note the location of the `FeatureExtraction` executable (typically `build/bin/FeatureExtraction`). You'll need this path for configuration.
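To sanity-check the build before configuring MER-Factory, a small script like this can confirm the binary exists and is executable (the path below is a placeholder; use your actual build location):

```python
# Sketch: verify the OpenFace FeatureExtraction binary before configuration.
import os

openface_bin = "/absolute/path/to/OpenFace/build/bin/FeatureExtraction"  # placeholder path

if not os.path.isfile(openface_bin):
    raise SystemExit(f"Not found: {openface_bin}")
if not os.access(openface_bin, os.X_OK):
    raise SystemExit(f"Found but not executable: {openface_bin} (try chmod +x)")
print("OpenFace FeatureExtraction looks usable:", openface_bin)
```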
Installation
1. Clone the Repository
git clone https://github.com/Lum1104/MER-Factory.git
cd MER-Factory
2. Set Up Python Environment
# Create a new conda environment
conda create -n mer-factory python=3.12
conda activate mer-factory
# Install dependencies
pip install -r requirements.txt
3. Configure Environment
# Copy the example environment file
cp .env.example .env
Edit the `.env` file with your settings:
# API Keys (optional - choose based on your preferred models)
GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
# OpenFace Configuration (required for AU and MER pipelines)
OPENFACE_EXECUTABLE=/absolute/path/to/OpenFace/build/bin/FeatureExtraction
# Optional: Ollama configuration for local models
# OLLAMA_HOST=http://localhost:11434
The `OPENFACE_EXECUTABLE` path must be absolute and point to the actual executable file.
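Before kicking off a long run, you can validate this setting with a small sketch like the one below. It parses `.env` directly, so it makes no assumptions about which dotenv library is installed:

```python
# Sketch: read .env and verify OPENFACE_EXECUTABLE is absolute and exists.
import os

env = {}
with open(".env") as fh:
    for line in fh:
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()

openface = env.get("OPENFACE_EXECUTABLE", "")
assert os.path.isabs(openface), "OPENFACE_EXECUTABLE must be an absolute path"
assert os.path.isfile(openface), f"OPENFACE_EXECUTABLE does not exist: {openface}"
print("OPENFACE_EXECUTABLE looks valid:", openface)
```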
Your First Pipeline
Let’s run your first emotion recognition pipeline!
1. Prepare Your Media
Create a test directory with a video file:
mkdir test_input
# Copy your video file to test_input/your_video.mp4
2. Run MER Pipeline
# Basic MER pipeline with default Gemini model
python main.py test_input/ output/ --type MER --silent
# With threshold adjustment
python main.py test_input/ output/ --type MER --threshold 0.8 --silent
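If you prefer to launch the same run from Python (for example, inside a batch script or notebook), a thin wrapper around the CLI is enough; this is just a convenience sketch, not a MER-Factory API:

```python
# Convenience sketch: invoke the MER-Factory CLI from Python.
import subprocess
import sys

subprocess.run(
    [
        sys.executable, "main.py",
        "test_input/", "output/",
        "--type", "MER",
        "--threshold", "0.8",
        "--silent",
    ],
    check=True,  # raise CalledProcessError if the pipeline exits non-zero
)
```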
3. Check Results
# View generated files
ls output/{sample_id}/
# your_video_merr_data.json - Contains complete analysis
# your_video_au_data.csv - Facial Action Units data
# your_video.wav - Extracted audio
# your_video_peak_frame.jpg - Key emotional moment
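The exact JSON schema depends on the pipeline and model you used, so a safe first look is to load the file and list its top-level keys. The sketch below assumes the file naming pattern shown above:

```python
# Sketch: peek at the generated analysis for one sample.
import json
from pathlib import Path

sample_dir = Path("output/your_video")  # replace with your sample_id folder
merr_file = next(sample_dir.glob("*_merr_data.json"), None)
if merr_file is None:
    raise SystemExit(f"No *_merr_data.json found in {sample_dir}")

with merr_file.open() as fh:
    data = json.load(fh)

print("Loaded:", merr_file.name)
print("Top-level keys:", list(data.keys()))
```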
Export the Dataset
To export datasets for curation or training, use the following commands:
For Dataset Curation
python export.py --output_folder "{output_folder}" --file_type {file_type} --export_path "{export_path}" --export_csv
Here `{file_type}` is the lowercase pipeline name (e.g. mer, au, audio, video, image).
For Training
python export.py --input_csv path/to/csv_file.csv --export_format sharegpt
Model Options
MER-Factory supports multiple AI models. Choose based on your needs:
Google Gemini (Default)
python main.py input/ output/ --type MER
- Best for: High-quality multimodal analysis
- Requires: `GOOGLE_API_KEY` in `.env`
OpenAI ChatGPT
python main.py input/ output/ --type MER --chatgpt-model gpt-4o
- Best for: Advanced reasoning and video analysis
- Requires: `OPENAI_API_KEY` in `.env`
Ollama (Local Models)
# First, pull the models
ollama pull llava-llama3:latest
ollama pull llama3.2
# Run with Ollama
python main.py input/ output/ --type MER \
--ollama-vision-model llava-llama3:latest \
--ollama-text-model llama3.2
- Best for: Privacy, no API costs, async processing
- Requires: Local Ollama installation
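Before starting a long run, you can confirm that the local Ollama server is reachable and the required models have been pulled. The sketch below queries Ollama's standard `/api/tags` endpoint; adjust the host if you set `OLLAMA_HOST` to something else:

```python
# Sketch: confirm the local Ollama server is up and the needed models are pulled.
import json
import urllib.request

host = "http://localhost:11434"  # match OLLAMA_HOST in your .env if you changed it
with urllib.request.urlopen(f"{host}/api/tags") as resp:
    models = [m["name"] for m in json.load(resp)["models"]]

print("Locally available models:", models)
for needed in ("llava-llama3:latest", "llama3.2"):
    if not any(name.startswith(needed) for name in models):
        print(f"Missing {needed} - run: ollama pull {needed}")
```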
Hugging Face Models
python main.py input/ output/ --type MER --huggingface-model google/gemma-3n-E4B-it
- Best for: Latest research models, custom implementations
- Note: Automatic single-threaded processing
Pipeline Types
Quick Pipeline Comparison
| Pipeline | Input | Output | Use Case |
|---|---|---|---|
| MER | Video/Image | Complete emotion analysis | Full multimodal datasets |
| AU | Video | Facial Action Units | Facial expression research |
| Audio | Video | Speech + tone analysis | Audio emotion recognition |
| Video | Video | Visual description | Video understanding |
| Image | Images | Image emotion analysis | Static emotion recognition |
Example Commands
# Action Unit extraction only
python main.py video.mp4 output/ --type AU
# Audio analysis only
python main.py video.mp4 output/ --type audio
# Video description only
python main.py video.mp4 output/ --type video
# Image analysis (auto-detected for image inputs)
python main.py ./images/ output/ --type image
# Full MER with custom settings
python main.py videos/ output/ \
--type MER \
--threshold 0.9 \
--peak-dis 20 \
--concurrency 8 \
--silent
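If you want to run several pipeline types over the same input in one pass, a simple loop over the CLI works; this is a convenience sketch, not a built-in MER-Factory feature (the output folder names here are arbitrary):

```python
# Sketch: run multiple pipeline types over the same input directory.
import subprocess
import sys

for pipeline in ("AU", "audio", "video", "MER"):
    subprocess.run(
        [sys.executable, "main.py", "videos/", f"output_{pipeline}/",
         "--type", pipeline, "--silent"],
        check=True,
    )
```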
Testing Your Installation
Run the built-in tests to verify everything is working:
# Test FFmpeg integration
python test/test_ffmpeg.py your_video.mp4 test_output/
# Test OpenFace integration
python test/test_openface.py your_video.mp4 test_output/
Common Issues & Solutions
FFmpeg Not Found
Symptom: `FileNotFoundError` related to `ffmpeg`
Solution:
- Verify FFmpeg is installed: `ffmpeg -version`
- Check if it’s in your PATH
- On Windows, add FFmpeg to system PATH
OpenFace Executable Not Found
Symptom: Cannot find FeatureExtraction executable
Solution:
- Verify the path in `.env` is absolute
- Check file permissions: `chmod +x FeatureExtraction`
- Test manually: `/path/to/FeatureExtraction -help`
API Key Errors
Symptom: `401 Unauthorized` errors
Solution:
- Verify API keys are correct in `.env`
- Check for extra spaces or characters
- Ensure billing is enabled for your API account
Memory Issues
Symptom: Out of memory errors with large files
Solution:
- Reduce concurrency: `--concurrency 1`
- Use smaller video files for testing
- Close other memory-intensive applications
Next Steps
Now that you have MER-Factory running, explore these advanced features:
- API Reference - Detailed function documentation
- Examples - Real-world usage examples
- Technical Documentation - System architecture details
Need Help?
- 🐛 Report issues on GitHub Issues
- 💬 Join discussions on GitHub Discussions
- 📖 Read the Technical Documentation for deeper understanding