A speaker plays music at low volume, sounding quiet and thin. Then the volume knob is turned up gradually until the music fills the room.
Seedance 2.0 · ByteDance · proprietary
Kling 3.0 Omni · Kuaishou · proprietary
Veo 3.1 · Google · proprietary
LTX-2.3 · Lightricks · open
Ovi 1.1 · character.ai · open
JavisDiT++ · JavisDiT · open
MagiHuman · GAIR-NLP · open
Per-prompt rubric
Basic standards
Visual presence
- a speaker
- a volume knob
Event: the volume knob on a speaker is turned
Audio presence
- a speaker
Sound: music from a speaker
Key standards
Visual physical commonsense (V-PC)
- A speaker with a visible volume knob is shown, and the knob is visibly rotated during the clip.
Audio physical commonsense (A-PC)
- The music grows gradually louder as the volume knob is turned in the loudness-increasing direction.
- The audio is continuous music from the speaker.
Cross-modal physical commonsense (AV-PC)
- The music gets louder when the volume knob is visibly turned in the increase direction.
- The music gets quieter when the volume knob is visibly turned in the decrease direction.
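
The A-PC criterion about gradually increasing loudness could, in principle, be approximated automatically. The following is a minimal sketch under stated assumptions, not the method used to score this rubric: it takes mono audio samples as a list of floats, measures the RMS level of consecutive windows, and checks that the level rises overall without dropping or jumping abruptly. The function names and the thresholds (`min_gain`, `max_step_ratio`, `tolerance`) are hypothetical and chosen only for illustration.

```python
import math

def rms_envelope(samples, window):
    """RMS level of each non-overlapping window of `window` samples."""
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        levels.append(math.sqrt(sum(x * x for x in chunk) / window))
    return levels

def grows_gradually(levels, min_gain=2.0, max_step_ratio=1.5, tolerance=0.95):
    """Hypothetical check: the level ends well above where it started,
    never falls noticeably, and never jumps abruptly between windows."""
    if len(levels) < 2 or levels[0] <= 0:
        return False
    if levels[-1] / levels[0] < min_gain:
        return False  # not enough overall increase
    for prev, cur in zip(levels, levels[1:]):
        if cur < prev * tolerance:
            return False  # loudness dropped
        if cur > prev * max_step_ratio:
            return False  # loudness jumped instead of ramping
    return True

def tone(amplitude_at, seconds=2.0, rate=8000, freq=440.0):
    """Synthetic test signal: a sine tone whose amplitude follows
    amplitude_at(t), with t running from 0 to 1 over the clip."""
    n = int(seconds * rate)
    return [amplitude_at(i / n) * math.sin(2 * math.pi * freq * i / rate)
            for i in range(n)]

ramp = tone(lambda t: 0.1 + 0.9 * t)             # knob turned up smoothly
step = tone(lambda t: 0.1 if t < 0.5 else 1.0)   # sudden jump in volume
flat = tone(lambda t: 0.5)                       # volume never changes

for name, signal in [("ramp", ramp), ("step", step), ("flat", flat)]:
    print(name, grows_gradually(rms_envelope(signal, window=800)))
```

Only the smooth ramp satisfies the check; the step fails because of the abrupt jump, and the flat tone fails because its level never increases. A check for the AV-PC criteria would additionally need the knob's rotation direction from the video, which this sketch does not attempt.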