Actors in green suits with ping-pong balls.
That's the idea most people have when you say "motion capture," but we are way past that visual at this point. Technology moves (seemingly) so much faster now than it ever has... so much so that it's hard to keep up with the terminology, let alone the techniques. With Jon Favreau's The Jungle Book movie out this week, I thought it was about time to find out where "performance capture" falls in the landscape of visual storytelling. Let's explore!
Listen to this blog post as an iTunes podcast!
Or via SoundCloud if you prefer...
Preorder Stephen's Animation Tutorial Book:
The first question we have to address is what the distinction is between "live-action" and "animation" as terms. This always comes up around Oscar time so the best place to start is with the Academy's definition:
An animated feature film is defined as a motion picture with a running time of greater than 40 minutes, in which movement and characters’ performances are created using a frame-by-frame technique. Motion capture by itself is not an animation technique. In addition, a significant number of the major characters must be animated, and animation must figure in no less than 75 percent of the picture’s running time.
I need to point out one particular sentence: "Motion capture by itself is not an animation technique."
That was added in 2010 for the 83rd Academy Awards, right after Avatar's Oscar run started this modern conversation. Depending on how you judge certain factors, it was difficult to decide which category a movie like Avatar belonged in. Here are the major applicable factors listed out:
- Character animation must be created using a "frame-by-frame" technique
- A "significant number" of major characters must be animated
- "Motion capture" alone isn't considered an animation technique
- Animation has to be present on-screen for at least 75% of the movie's running time (meaning the length of the movie itself)
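If it helps to see those four factors as a checklist, here's a rough sketch in Python. Everything in it is my own illustration: the `Film` fields, the `academy_factors` function, and especially the `significant_share` cutoff are assumptions, because the Academy never puts a number on "significant." The only hard figure taken from the rule is the 75% share of running time.

```python
from dataclasses import dataclass


@dataclass
class Film:
    # Hypothetical fields, chosen only to mirror the four factors above.
    running_minutes: float
    animated_minutes: float          # screen time in which animation figures
    major_characters: int
    animated_major_characters: int
    frame_by_frame: bool             # performances created frame by frame
    mocap_only: bool                 # motion capture with no further animation


def academy_factors(film, significant_share=0.5):
    """Return each factor as True/False.

    significant_share is an assumption: the rule only says "significant number".
    """
    return {
        "frame_by_frame": film.frame_by_frame,
        "significant_characters": (
            film.animated_major_characters
            >= significant_share * film.major_characters
        ),
        "not_mocap_alone": not film.mocap_only,
        "animation_share": film.animated_minutes >= 0.75 * film.running_minutes,
    }


# A made-up film: 100 minutes long, with animation on screen for 60 of them.
example = Film(100, 60, 5, 4, frame_by_frame=True, mocap_only=False)
factors = academy_factors(example)
qualifies = all(factors.values())
```

With those made-up numbers the film passes three factors and misses the 75% one, so `qualifies` comes out `False`. The judgment calls (what counts as frame-by-frame, what counts as "significant") are exactly the parts a checklist like this can't settle.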
Around the release of Avatar, Gawker (which was quoting a Hollywood Reporter article I can't find) reported that "when completed, Cameron expects Avatar to be about 60% CG animation, based on characters created using a newly developed performance capture-based process, and 40% live action, with a lot of VFX in the imagery." Many people take this to mean that the movie is 60/40 in terms of animation/live-action, but that's only counting the character animation, not the scenery and vehicles, and I would assume it also does not include the alien animal animation. So far, Avatar fits 2 of the 4 major factors to call it animation. Now we have to look at the technique of the animation itself... the question is how much was "motion capture alone" (which isn't considered animation by the Academy) and how much was a "frame-by-frame technique"? Well, there were actually production notes released with Avatar, with a section called "IS IT ANIMATION," which says:
Ask the animators at WETA, and they’ll tell you that the avatars and Na’vi are animated. Ask Jim Cameron, and he’ll say the characters were performed by the actors. The truth is that both are right. It took great animation skill to ensure that the characters performed exactly as the actors did. But at the same time, no liberties were taken with those performances. They were not embellished or exaggerated. The animators sought to be utterly truthful to the actors’ work, doing no more and certainly no less than what Sam, Zoe or Sigourney had done in the Volume. Of course the animators added a little bit, with the movement of the tails and ears, which the actors could not do themselves. But even here, the goal was to stay consistent with the emotions created by the actors during the original capture. So when Neytiri’s tail lashes and her ears lower in fury, they are merely further expressing the anger created by Zoe Saldana in the moment of acting the scene.
OK so they clearly don't think it's animation, but is it solely motion capture? The question seems to be turning into who's making the acting choices: the performance-capture actor or the animator? But if that's the case, then we're talking about the medium itself and not the technique.
Just to confuse things even more, the year that Avatar came out saw the Academy Awards go bonkers trying to figure out what's "live-action," what's "animation," and who gets to also be called "visual effect." Behold:
- Alvin and the Chipmunks: The Squeakquel featured an all Live-Action cast except the Animated Chipmunks/Chipettes and was submitted for Best Animated Film
- District 9 featured an all Live-Action cast except the Animated aliens (made partly with "Motion Capture" techniques) and was nominated for Best Picture
- District 9 was nominated for Best Visual Effects
- A Christmas Carol was made using Performance Capture techniques and also submitted for Best Animated Film
- A Christmas Carol was not nominated for Best Visual Effects
- Avatar was made using Performance Capture techniques and submitted for Best Picture
- Avatar was nominated for Best Visual Effects
- Up was fully Animated and nominated for Best Animated Film and Best Picture
- Up was not nominated for Visual Effects
It was a crazy year. It's pretty clear that the lines between live-action, animation, and visual effects are complicated at best, and 2009 proved that well. Really, I think this will prove to be a watershed moment of sorts, in that it sparked the attempt to redefine classifications which have impacts on things like promotion, budgets, and of course award opportunities. The new thing for 2009 was "performance capture." Up until that point we were calling the technique "motion capture," and with that came certain comparisons that carry over to now, so let's back up and look at that real quick before moving on to any attempted conclusions.
Often when talking about motion capture with animators the comparison invariably leads to the technique of "rotoscoping." Right off the bat, let's look at the definitions:
Rotoscoping is an animation technique in which animators trace over footage, frame by frame, for use in live-action and animated films.
Motion capture (Mo-cap for short) is the process of recording the movement of objects or people... and using that information to animate digital character models in 2D or 3D computer animation.
So basically the thought is that both techniques record movement in some form and then transfer that movement to an animated character (either by hand, as with rotoscoping, or by computer, as with mo-cap), so they must be equal... only separated by the use of a computer.
The method of filming actors and then using a device called a "rotoscope" (Max Fleischer patented it way back in 1917, but he created it before that) to copy the movement more accurately onto paper was used first to lend realism, not to reduce the budget of an animated film or cartoon. For something like the Fleischer Studios Superman cartoons, it allowed them to add some really detailed movement to the characters. But it's important to note that they obviously couldn't film someone flying or catching a falling building, so the animators made those design choices themselves, without the rotoscoping technique. Rotoscoping assisted the animation process in this way all the way through the Disney Golden Era, starting with Disney's first animated feature film, Snow White and the Seven Dwarfs.
Later on, as the technology became more affordable, rotoscoping was seen more as a cost cutting technique. It allowed Ralph Bakshi to finish Wizards the way he wanted to when he was denied extra funding, and it (along with clever staging and use of loops) allowed He-Man to exist in such a rich world on a modest budget. More recently in our time, we've come full-circle back to filmmakers like Richard Linklater using it as a direct style choice like in Waking Life and A Scanner Darkly with the technique itself now using computer-assisted algorithms (note: don't read that in a condescending tone, but if you're feeling jaded the GIF below should help).
(... you're welcome)
The more the technology progresses, the more automated it can be, which either takes the burden off the artists or allows them to reach a level of detail they were previously prevented from achieving because of time and/or budget. Because it's "2D," rotoscoping has always been seen as a technique for animation and not a resource for style choices in live-action. Even so, if you ask most animators what they think of rotoscoping, most will answer with some form of "it's cheating," and I think what they're referring to is the GIF above. If animation is too real, it's often seen as unnecessary for the medium, because following reference footage that strictly takes the acting choices away from the animator and puts the result too squarely in the "live-action" realm... so the question becomes "what's the point?" Now everyone please welcome Motion Capture to the conversation.
Remember when I pointed out up top that the technology for the motion capture technique has progressed so far beyond the rubber suit + ping-pong balls getup? Well, let's go back to that for a second. These were the actors on the motion capture set for Final Fantasy: The Spirits Within.
The white, glowing balls are the reference points which the computer will assign to a model with similar reference points (i.e. the "elbow" dot's motion will be tracked and applied to the "elbow" point on a computer model). I think everyone reading this blog knows at least that much. The reason this setup is needed is the 3-dimensional environment. There's a lot of information in conscious movement. That's one of the reasons that, in rotoscoping, if you just trace the footage strictly it feels "off." So much of that information is being lost... think of it like it's being filtered through the pencil. In order to make the movement seem real again after all that loss, the animator needs to use the same pencil to add the lost data back in (adding speed to an impact so it feels more "real," for instance; many of you have probably done this with your own reference footage). There isn't any good way to "trace" reference footage in a 3D environment with a 3D model without the use of the computer. So here's where the comparison usually comes in. All those white dots in the image up there are reference points, the same as the ones you use to check that you have smooth arcs (it's one of the 12 Principles of Animation, so you should be checking your arcs). Every point on the body that isn't tracked is "lost" data. The computer fills in the blanks as well as it can, but for any "conscious movement" (meaning hand or face) the animators need to fill that in. This is done the same way as it is in rotoscoping: either using the filmed reference footage or creating it straight from their own imagination.
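For the code-curious, here's a tiny Python sketch of that "dot to point" idea, plus the most basic possible version of the computer "filling in the blanks." This is a simplification under my own assumptions, not how any studio's pipeline actually works: the names `fill_gaps`, `retarget`, `elbow_marker`, and `elbow_joint` are made up, a missing sample is just `None`, and the gap filling is plain linear interpolation between the two nearest tracked frames.

```python
def fill_gaps(samples):
    """Fill None entries that sit between two known (x, y, z) positions.

    Uses straight-line interpolation. Gaps at the very start or end of a
    track are left as None because there is nothing to interpolate from.
    """
    filled = list(samples)
    known = [i for i, pos in enumerate(filled) if pos is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)  # how far frame i sits between a and b
            filled[i] = tuple(
                pa + t * (pb - pa) for pa, pb in zip(filled[a], filled[b])
            )
    return filled


def retarget(marker_tracks, marker_to_joint):
    """Copy each tracked marker's motion onto the matching joint of the model."""
    return {
        marker_to_joint[name]: fill_gaps(track)
        for name, track in marker_tracks.items()
        if name in marker_to_joint
    }


# One marker, three frames, with the middle frame lost by the cameras.
tracks = {"elbow_marker": [(0, 0, 0), None, (2, 4, 6)]}
rig_motion = retarget(tracks, {"elbow_marker": "elbow_joint"})
```

Here the missing middle frame lands exactly halfway between its neighbors, which is the mechanical kind of guess a computer can make. It says nothing about intent, and that's why hands and faces still end up on the animator's desk.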
Final Fantasy: The Spirits Within was the first full-length photorealistic animated film. It was released in 2001, and at that point only the joints and major rotation points were tracked, so there was a lot of work involved in turning it into something cinematic.
The Lord of the Rings: The Two Towers (considered a live-action film), just one year later in 2002, saw Andy Serkis performing on set as the digital character Gollum alongside his co-stars, instead of performing alone and being composited in later as would have been done before.
By the time we get up to 2006, 2 out of the 3 nominees for Best Animated Film used motion capture: Monster House and the winner Happy Feet. Only Disney-Pixar's Cars was animated without mo-cap.
In 2009, Avatar was released (see above). The technology and techniques had moved far enough ahead that the amount of data which could be captured from an actor's movements meant that the refinement and recreation needs continued to lessen, and that's when people started to determine their own tipping points.
Better yet... why the question? Ask anyone in the business and they'll tell you that there are many functional differences in producing live-action and animation. The scripts are different, the storyboards are different, the budgeting and timelines are very different... even the marketing is different. The awards, of course, have also been different. They have always been fundamentally different mediums. Rotoscope started as an animation technique to increase realism in animation. Mo-cap's beginnings are virtually the same.
Performance Capture was created specifically to bring the worlds of digital creation and in-camera filmmaking together. A digital character like Colossus from Deadpool took the work of 5 separate actors (mo-cap, on-set eyeline, model for design, voice actor, and facial expressions), plus many animators, programmers, and compositors, to bring him to life. Deadpool itself took the work of director Tim Miller, an animator himself who co-founded Blur Studio, to bring the whole thing together and make everything work harmoniously in terms of budget, film times, VFX work, and so on. Movies like Avatar, the new Planet of the Apes series, and now Jon Favreau's The Jungle Book merge the two mediums of Animation and Live-Action so thoroughly, while using Performance Capture to increase the amount of actors' input, that the impact is felt throughout the process, changing the way budgets, planning, filming, and more are created and addressed. The line is blurry on purpose, and there doesn't seem to be any real right answer to the question of whether it's live-action or animation filmmaking. I believe that's because modern, heavily performance-capture-based movies are instead a new medium of filmmaking, one that creates films and characters in such a deeply collaborative way that it warrants a new categorization: Performance Capture Filmmaking.
Animators and Visual Effects artists have been undervalued for a long time now, and the trend looks set to continue rather than correct itself. Even with Animation and Visual Effects being a part of almost every film we see today, somehow the artists in these fields are overlooked. My reason for categorizing this as a new medium of filmmaking is to increase awareness of the intense collaborative effort on the part of everyone involved, including the animators and the actors. There should be no "it's all animation" or "it's all acting," because to make these characters come to life and exist in a purely digital form, everyone needs to play their very significant parts.
TL;DR: The technology, techniques, and workflow of in-depth performance capture filmmaking have changed enough to consider it a brand new medium.
And don't forget to VOTE FOR YOUR FAVORITE submission for April's #RubberOnionBattle right here on Newgrounds!
FGIJonC
This was a really really interesting read. Thanks for that write up. I can't remember if video games were mentioned (I am very tired) but they use similar tech for their cinematics. Uncharted for example, did motion capture for the bodies but the animators themselves keyed the facial expressions, which as you mentioned above for film(s) requires the use of imagination from the animators themselves, which is really awesome.
I'm going to link to this on my twitter.
rubberonion
I mentioned video games in the podcast as one of the first uses (and still a primary use in media) of mo-cap technology (i think the first-first was military simulation). Thanks for the comment, glad you enjoyed it! My twitter is @RubberOnion feel free to tag me! (=