The New Standard In Digital Interactors

Real-time digital humans have been a long time coming. Although not quite there yet, a very talented group has just put together the best attempt so far:

Mike Seymour, co-founder and interviewer for FXGuide, teamed up with companies such as Epic Games, Cubic Motion and 3Lateral to create this impressive showpiece. The demonstration was created for the SIGGRAPH 2017 conference, where Mike Seymour’s avatar interviewed, live on stage, leading industry figures from Pixar, Weta Digital, Magnopus, Disney Research Zurich, Epic Games, USC-ICT, Disney Research and Fox Studios.

Mike Seymour was scanned as part of the Wikihuman project at USC-ICT, with additional eye scanning done at Disney Research Zurich. The real-time graphics run on a custom build of Unreal Engine. The face tracking and solving are provided by Cubic Motion in Manchester. The state-of-the-art facial rig is made by 3Lateral in Serbia. The complex new skin shaders were developed in partnership with Tencent in China. The technology uses several deep-learning AI engines for tracking, solving, reconstructing and recreating the host and his guests. The research into the acceptance of the technology is being done by Sydney University, Indiana University and Iowa State University. The guests’ avatars are made from single still images by Loom.ai in San Francisco.
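
To make that division of labor concrete, here is a minimal, hypothetical sketch of the stage order in Python: capture from the head rig, track landmarks, solve to rig controls, and hand the pose to the engine for rendering. Every function name and the toy solver mapping below are my own illustrative assumptions, not code from the MEETMIKE pipeline.

```python
# A minimal, hypothetical sketch of the stage order described above (capture -> track ->
# solve -> drive the rig -> render). None of these functions are the actual MEETMIKE /
# Cubic Motion / 3Lateral code; the names and the toy solver mapping are illustrative only.

from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class RigPose:
    blendshape_weights: dict[str, float] = field(default_factory=dict)  # ~750 in MEETMIKE
    joint_rotations: dict[str, float] = field(default_factory=dict)     # jaw, eyes, tongue

def track_landmarks(stereo_frame: dict[str, float]) -> dict[str, float]:
    """Stand-in for the IR stereo tracker (one of the deep-learning engines)."""
    # The real system runs computer vision here; this sketch just passes measurements through.
    return stereo_frame

def solve_to_rig(landmarks: dict[str, float]) -> RigPose:
    """Stand-in for the solver that turns tracked landmarks into rig controls."""
    pose = RigPose()
    # Toy mapping: a normalized lip gap drives the jaw joint and a 'jawOpen' blendshape.
    lip_gap = landmarks.get("lip_gap", 0.0)
    pose.joint_rotations["jaw"] = 25.0 * lip_gap        # degrees, illustrative only
    pose.blendshape_weights["jawOpen"] = lip_gap
    return pose

def render(pose: RigPose) -> None:
    """Stand-in for the custom Unreal Engine build that renders the avatar in VR."""
    print(pose)

def run_frame(stereo_frame: dict[str, float]) -> None:
    # One iteration of the loop; in MEETMIKE the rendering alone must fit in roughly 9 ms.
    render(solve_to_rig(track_landmarks(stereo_frame)))

run_frame({"lip_gap": 0.4})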

While this experience plays at 30 fps, and at 90 fps with a wider camera angle, it does come at a cost: it was created with nine PCs, each with 32 GB of RAM and an Nvidia 1080 Ti card. Here are the other technical facts:

  • MEETMIKE has about 440,000 triangles being rendered in real time, which means rendering a VR stereo frame about every 9 milliseconds; of those triangles, 75% are used for the hair (see the quick arithmetic after this list).
  • Mike’s face rig uses about 80 joints, mostly for the movement of the hair and facial hair.
  • For the face mesh, only about 10 joints are used; these are for the jaw, eyes and tongue, in order to add a more arc-like motion.
  • These work in combination with around 750 blendshapes in the final version of the head mesh.
  • The system uses complex traditional software design and three deep learning AI engines.
  • MIKE’s face uses a state-of-the-art Technoprops stereo head rig with IR computer vision cameras.
  • The university research studies into acceptance aim to be published at future ACM conferences. The first publication can be found at the ACM Conference.
  • FACS real-time facial motion capture and solving [Ekman and Rosenberg 1997]
  • Models built with the Light Stage scanning at USC-ICT
  • Advanced real time VR rendering
  • Advanced eye scanning and reconstruction
  • New eye contact interaction / VR simulation
  • Interaction of multiple avatars
  • Character interaction in VR at suitably high frame rates
  • Shared VR environments
  • Lip Sync and unscripted conversational dialogue
  • Facial modeling and AI assisted expression analysis.
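
For context on those numbers, here is a quick back-of-the-envelope calculation; the arithmetic is my own, working only from the figures quoted in the list above. At 90 fps the frame budget is about 11.1 ms, the stated ~9 ms stereo render leaves roughly 2 ms of headroom, and 75% of 440,000 triangles puts the hair at around 330,000 triangles.

```python
# Back-of-the-envelope check on the figures above (my own arithmetic,
# using only the numbers quoted in the list).

TRIANGLES_TOTAL = 440_000      # triangles rendered in real time
HAIR_FRACTION = 0.75           # share of triangles used for hair
STEREO_FRAME_MS = 9.0          # stated time to render a VR stereo frame
TARGET_FPS_VR = 90             # VR refresh target

hair_triangles = TRIANGLES_TOTAL * HAIR_FRACTION   # ~330,000 triangles just for hair
frame_budget_ms = 1000 / TARGET_FPS_VR             # ~11.1 ms available per frame at 90 fps
headroom_ms = frame_budget_ms - STEREO_FRAME_MS    # ~2.1 ms before missing the 90 fps deadline

print(f"Hair triangles: {hair_triangles:,.0f}")
print(f"Frame budget:   {frame_budget_ms:.1f} ms at {TARGET_FPS_VR} fps")
print(f"Headroom:       {headroom_ms:.1f} ms after the {STEREO_FRAME_MS:.0f} ms stereo render")
```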

This is all very exciting for the evolution of immersive experiences and of digital interactors. As immersive experiences become more life-like, the need to create photo-real interactors becomes more demanding. So far, this is the best example of real-time facial MOCAP I have seen.

In the video above, the mouth animation seems to be the weakest component; the eyes, nose, hair and wrinkles really shine. The other videos I have seen show off the eyes more than the mouth. I will have to study their approach to understand why the mouth animation does not look as good. It is a real head-scratcher, especially since the system employs 750 blend targets. The animation is not quite photo-real, though animation such as this may be good enough for stylized characters. Hopefully there is time for the animation technology to catch up to the rendering performance.

Regardless of the mouth, the system was still put together with off-the-shelf software and components. They were working with a special build of the UE4 editor; how unique that build is remains to be seen. This is all very inspiring, since anyone could start getting similar results now. Companies such as iMyth need to be employing this technology today in order to keep up with the developmental curve. Once tutorials covering these concepts become commonplace, everyone and their uncle will be using them.

For a company such as iMyth, what needs to be developed is an animation system in which the models animate consistently regardless of which face drives them. Any unique character may have multiple interactors driving it at different times. MEETMIKE was created based on the real Mike Seymour, but immersive experience companies will not be able to create personalized versions of each character to correspond with each possible interactor. One character model will be built and rigged, while multiple interactors need to be able to drive it. This adds an entirely new level of complexity to the implementation.
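
One plausible way to approach that, sketched below purely as an assumption on my part, is to calibrate each interactor’s solver output into a shared, normalized control space that the single character rig consumes. The channel names and the calibration scheme are illustrative, not anything MEETMIKE actually does.

```python
# Hypothetical sketch of performer-agnostic retargeting: record each interactor's
# personal neutral and extreme values per expression channel, then normalize their raw
# solver output into a shared 0..1 control space for the one shared character rig.
# Channel names and the calibration scheme are illustrative assumptions.

CHANNELS = ["jawOpen", "browRaise", "smileLeft", "smileRight", "eyeBlinkLeft", "eyeBlinkRight"]

def calibrate(neutral: dict, extreme: dict) -> dict:
    """Store each performer's neutral value and maximum range per channel."""
    return {c: (neutral.get(c, 0.0), extreme.get(c, 1.0)) for c in CHANNELS}

def retarget(raw: dict, calibration: dict) -> dict:
    """Normalize a performer's raw channel values to 0..1 for the shared rig."""
    pose = {}
    for c, (lo, hi) in calibration.items():
        span = max(hi - lo, 1e-6)                     # avoid divide-by-zero on dead channels
        pose[c] = min(max((raw.get(c, lo) - lo) / span, 0.0), 1.0)
    return pose

# Two different interactors driving the same character rig:
actor_a = calibrate({"jawOpen": 0.05}, {"jawOpen": 0.70})
actor_b = calibrate({"jawOpen": 0.10}, {"jawOpen": 0.95})
print(retarget({"jawOpen": 0.40}, actor_a))   # same raw input, normalized per performer
print(retarget({"jawOpen": 0.40}, actor_b))
```

The real difficulty is making 750 blendshapes respond consistently to that normalized input, but a calibration layer along these lines is a common first step toward driving one rig with many faces.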

Along with finger tracking, facial MOCAP was one of the big hurdles immersive experiences had to overcome. Maybe with MEETMIKE that barrier has now been cleared.
