I'm passionate about integrating artist insights and data-driven methods to create animation tools that will power the next generation of media.
New York Flow Open champion 2022, vice-champion 2023
Singing and speaking are two fundamental forms of human communication. From a modeling perspective, however, speaking can be seen as a subset of singing. We present VOCAL, a system that automatically generates expressive, animator-centric lower-face animation from singing audio input. Articulatory phonetics and voice instruction ascribe additional roles to vowels (projecting melody and volume) and consonants (lyrical clarity and rhythmic emphasis) in song. Our approach directly uses these insights to define axes for Melodic-accent and Pitch-sensitivity (Ma-Ps), which together provide an abstract space to visually represent various singing styles. In our system, vowels are processed first. A lyrical vowel is often sung tonally as one or more different vowels. We perform any such vowel modifications using a neural network trained on input audio. These vowels are then dilated from their spoken behaviour to bleed into each other based on Melodic-accent (Ma), with Pitch-sensitivity (Ps) modeling visual vibrato. Consonant animation curves are then layered in, with viseme intensity modeling rhythmic emphasis (inverse to Ma). Our evaluation is fourfold: we show the impact of our design parameters; we compare our results to ground truth and prior art; we present compelling results on a variety of voices and singing styles; and we validate these results with professional singers and animators.
R & D Engineer @ JALI Research
Built an in-house signal processing pipeline to decrease latency. Designed and implemented a new procedural audio-based neck motion generator.
PhD in Computer Science @ University of Toronto
Master of Computer Science @ University of Toronto
Bachelor of Engineering, Engineering Science @ University of Toronto