A Review of iPi Soft’s Markerless Motion Capture System
Motion capture or mocap has made its place as part of a modern animator’s toolkit. For many styles of animation, going with mocap instead of traditional animation saves time and cuts budgets. However, until recently, only productions capable of investing in many thousands of dollars worth of cameras and software from companies such as Polhemus and Vicon Systems could even consider using this approach.
In the decades since mocap began as a tool for photogrammetric analysis in biomechanics research, though, advances in technology have steadily lowered the price of entry.
Harnessing Game Gear
Now, iPi Desktop Motion Capture software from Moscow-based iPiSoft combined with Microsoft’s Kinect interface has pushed mocap down to a price point well under $1000. But is this a usable combo for small producers considering a mocap-based project?
When I first heard about iPiSoft’s markerless motion capture technology, I was intrigued. When used with two standard Microsoft Kinect motion sensors, here was a product that promised to deliver accurate motion capture for a fraction of the cost of traditional mocap systems. The setup also promised to do mocap without the use of markers. While these reflective objects affixed to an actor’s suit have traditionally been used to track motion, the use of them raises the price of a setup. iPiSoft’s app and two Kinects, meanwhile, constitute a minimal package that also doesn’t require alternate gear such as inertial sensors, nor teams of technicians or even dedicated studio space.
Quite a claim. But how justified was it? After putting it through its paces, I came up with an interesting conclusion.
Motion Capture Chat
Before I put iPiSoft’s technology to the test, however, let’s first hear from the man behind this innovative software: Michael Nikonov, iPiSoft’s founder and chief technology architect. He was part of a team from Samsung that received a patent for an innovative markerless approach to motion capture this past year.
According to Nikonov, markerless motion capture is not only able to compete with established approaches to mocap, but it is poised to become the predominant system in the future as its accuracy is already very comparable to standard marker-based systems employed by Hollywood.
“I would compare older mocap solutions to antiquated mainframe computers,” says Nikonov. “Now, the use of mocap for animation won’t be limited by your budget but only by the amount of creativity you can offer.”
Let’s take a brief look at marker-based motion-capture technology, such as a setup from the well-established British company Vicon. Bonita, their latest entry-level product line, uses a combination of small IR (infrared) reflective markers on the actor’s suit along with cameras that employ IR LED lights. Multiple cameras ring the performance space so that, by employing real-time triangulation, the Vicon software can locate points on the actors within 3D space to reconstruct movement. Bonita systems start at $10,000.
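To make the idea of triangulation concrete, here is a minimal Python sketch. It is not Vicon's algorithm, only a generic textbook approach: each camera contributes a ray from its position toward the marker it sees, and the marker is estimated as the midpoint of the shortest segment between the two rays. The function name and the camera positions are hypothetical, chosen purely for illustration.

```python
def triangulate(origin_a, dir_a, origin_b, dir_b):
    """Estimate a 3D point from two camera rays.

    Each ray is given by a camera position (origin) and a direction
    toward the observed marker. Returns the midpoint of the shortest
    segment connecting the two rays.
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    # Vector between the two camera positions.
    w = [oa - ob for oa, ob in zip(origin_a, origin_b)]

    a = dot(dir_a, dir_a)
    b = dot(dir_a, dir_b)
    c = dot(dir_b, dir_b)
    d = dot(dir_a, w)
    e = dot(dir_b, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")

    # Parameters of the closest points along each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    closest_a = [o + s * k for o, k in zip(origin_a, dir_a)]
    closest_b = [o + t * k for o, k in zip(origin_b, dir_b)]
    return [(p + q) / 2.0 for p, q in zip(closest_a, closest_b)]


# Hypothetical setup: two cameras 4 m apart, both seeing one marker.
marker = triangulate([-2.0, 0.0, 0.0], [2.0, 1.0, 3.0],
                     [2.0, 0.0, 0.0], [-2.0, 1.0, 3.0])
print(marker)
```

With real cameras the two rays rarely intersect exactly because of noise, which is why the midpoint is used; adding more cameras gives more rays per marker and a better chance that at least two of them see it.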
Markerless systems such as iPiSoft’s don’t rely on established points such as reflective markers but use complex algorithms to parse the per-pixel depth information captured by Kinect’s cameras.
Kinect’s camera system also uses infrared projections with a webcam-like camera, as well as a processor to track the movement of objects and individuals in three dimensions. This 3D scanner system employs technology from companies including PrimeSense, GestureTek and 3DV, which Microsoft bought in 2009. (Some of you might remember 3DV for their real-time 3D camera system that turned up in JVC’s booth at NAB a few years ago.)
Nikonov says that the markerless approach is more reliable, especially when it comes to dealing with occlusion (a term to describe what happens when parts of an actor’s body are hidden by other parts, such as when an arm goes behind a torso).
The only way to fix occlusion problems in a marker based system, according to Nikonov, is by using more cameras, with the result that some setups can deploy up to sixteen or twenty devices. Due to all of those cameras, as well as the complexity of implementing the sensors and the suits the actors must wear, you’d better count on having a few technicians around to make sure it all works. Naturally, all of this makes a marker-based mocap shoot very expensive. In fact, as Nikonov points out, “everything related to marker-based is expensive. Just the suit alone is more expensive than an iPiSoft system”.
Add in all the cameras, the wires, the bother of applying the sensors to the actors and the space to set all of this up in, and the price for marker-based motion capture soars even higher. In contrast, the iPiSoft markerless software costs $595 for the basic version ($995 for the standard version), while you can pick up Kinects for about $120 each.
At the top of iPiSoft’s home page on the web, you’ll find the phrase “Motion Capture for the Masses”. I would agree with them. Even independent animators or small studios can afford it. However, the question still remained: How well does it actually work?
After my conversation with Michael Nikonov, it was time to find out for myself.
A pair of Kinects
As mentioned, iPiSoft’s system utilizes two low cost and readily available Kinect motion sensors. Manufactured by Microsoft, the Kinect motion sensor input device was originally developed for the Xbox 360 video game console and released in 2010. The depth sensor includes an infrared sensor which captures 3D data under just about any ambient light condition.
Previous versions of iPi Soft motion capture software included support for only one Kinect (or multiple PlayStation Eye cameras). Now with the ability to collect data from two Kinects, the resulting depth information delivers much more accuracy. As iPi Soft notes on its site, the setup can even capture 360 degree turns by a character, which is an impressive achievement considering the system has to figure this out by detecting subtle differences in a character’s position.
Along with the Kinects came two USB active extension cables; I used those to connect the Kinects to my computer, in this case an HP Z800 workstation. This is an extremely powerful computer that’s not a budget killer either (you can check out my review here). I’m also using an Nvidia Quadro 5000, a good match for 2D and 3D work. Needless to say, this system provided more than enough power for the job.
Before you can start capturing motion, you need to set up everything properly. Naturally, this isn’t as difficult as with marker based systems. However you do need to use a little care in order for it to work. The first thing you need to do is figure out where the performance will be done and then position the Kinects accordingly.
Obviously, the more space you have to move, the better off you are, since you must allow space for the action as well as a certain amount of room for the sensors. If you have an unused office or room, that would be ideal. I set it up in my home office, where I faced a bit of a tight squeeze, but there was sufficient space to get a wide range of motion. This is important since one of the things I find compelling about iPi Soft’s dual-Kinect motion capture system is that it is within the grasp of individual animators (both financially and from a technical standpoint). While studios that lease a large commercial space will have no problem, it is important, in my opinion, that it also works within more confined spaces.
iPi Soft recommends a 10-foot x 10-foot space to work within and the capture area is 7-foot x 7-foot. This is actually limited by the Kinect’s sensor range, not by the software. For more details see this page.
When it comes to positioning the Kinects, you’ve got two options. You can place them so they point at the staging area between 60 and 90 degrees from each other. The other option is to position them at an angle of around 180 degrees. I chose the former option (60 to 90 degrees) which fit in better in my space. I also mounted each Kinect on its own tripod so that they were both around waist height.
After deciding where the action would take place, as well as positioning and pointing the Kinects at the proper angle, the next step was to perform a calibration. This is necessary in order for the iPi Soft app to understand where each Kinect is located in relation to each other. Having two Kinects (as opposed to one) allows for wider coverage area for your motions as well as improved accuracy, but calibration is a must for the system.
Performing the calibration was straightforward. All you need is a relatively large plane. The online manual suggested you use a rectangular piece of plywood, cardboard or veneer around 30″ X 40″. I decided to use a large painting that was hanging on my wall; that turned out to work fine for the task.
iPi Soft comes with two software modules: iPi Recorder and iPi Desktop Motion Capture. iPi Recorder is a free download and you use it to record depth information from the Kinects. The iPi Desktop Motion Capture module requires a license. We’ll talk about that module later.
For now, to record the calibration video (as well as the motion capture itself) you must first use iPi Recorder. Upon launching the app, it immediately recognized the Kinects and began to stream their output to my computer monitor. Depth sensor data is represented by bright, saturated colors in iPi Recorder — vivid reds, blues, violets and yellows.
After pressing the record button and letting about two seconds go by, I entered the scene holding the calibration plane and, as instructed by the manual, recorded myself tilting it towards each camera, left to right, for a few seconds. It is recommended that during this process, as well as during the motion capture itself, you try to minimize the appearance of yellow pixels. They represent areas where the Kinect sensors are unable to determine the depth (note that you don’t need to remove all yellow pixels, just try to get as few as possible). Interestingly, I found that shining an intense light on the set increased the number of yellow pixels, so you don’t need to worry about brightly lighting your set when doing your mocap.
What follows is a video demonstrating what the calibration video looks like from both Kinects. It should be noted that iPi Recorder recorded the depth video at 30 fps, but the playback rate here is 15 fps due to the screen capture utility I used.
After recording the calibration video, I opened it up in the iPi Desktop Motion Capture software module. Upon loading it, I was able to gain insight into the depth information acquired by the Kinects. When looking at the brightly colored pixels in the depth video straight on, all of the pixels looked like they were on a flat X and Y plane. However, as I orbited the view in the viewport, the pixels revealed that they also had Z depth.
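For the technically curious, the way a flat depth image turns into points with X, Y and Z can be sketched in a few lines of Python. This is the standard pinhole-camera back-projection, not iPi Soft's own code, and the focal length and image center below are illustrative assumptions for a 640 x 480 depth image rather than measured Kinect values. The sketch also assumes a depth of zero marks a pixel with no reading (the yellow pixels in iPi Recorder).

```python
# Assumed, illustrative intrinsics for a 640 x 480 depth image.
ASSUMED_FX = 580.0   # focal length in pixels, horizontal
ASSUMED_FY = 580.0   # focal length in pixels, vertical
ASSUMED_CX = 320.0   # image center, horizontal
ASSUMED_CY = 240.0   # image center, vertical


def depth_pixel_to_point(u, v, depth_m, fx=ASSUMED_FX, fy=ASSUMED_FY,
                         cx=ASSUMED_CX, cy=ASSUMED_CY):
    """Convert pixel (u, v) with depth in meters into a 3D point (x, y, z)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def depth_image_to_points(depth_rows, fx=ASSUMED_FX, fy=ASSUMED_FY,
                          cx=ASSUMED_CX, cy=ASSUMED_CY):
    """Convert a 2D grid of depths into a list of 3D points.

    Pixels with a depth of zero or less are treated as "no reading"
    and skipped.
    """
    points = []
    for v, row in enumerate(depth_rows):
        for u, depth_m in enumerate(row):
            if depth_m > 0:
                points.append(depth_pixel_to_point(u, v, depth_m,
                                                   fx, fy, cx, cy))
    return points


# A pixel at the image center, 2 m away, lies straight ahead of the sensor.
print(depth_pixel_to_point(320, 240, 2.0))
```

Viewed head-on, such points line up with the original image; rotate the view and each one sits at its own distance from the sensor, which is the effect described above.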
The mocap software had created a set of 3D pixels based on the Kinect data whose collective position described the physical makeup of my space, including the walls, floor and other features such as furniture and light fixtures. It also represented my body. I deduced that the reason I had to wait a few moments before entering the scene was so iPi Soft had the data needed to separate me from the rest of the scene.
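My deduction about the empty frames can be illustrated with a simple background-subtraction sketch in Python. To be clear, I don't know how iPi Soft actually separates the actor from the room; this is only one common way to do it, and the function names and the 15 cm threshold are my own assumptions. The idea: average the depth of each pixel over the empty frames, then flag any pixel that later appears noticeably closer than that background.

```python
def build_background(empty_frames):
    """Average each pixel's depth over frames recorded with nobody in view.

    Depths of zero are treated as "no reading" and ignored. A pixel that
    never had a valid reading keeps a background depth of 0.0.
    """
    rows = len(empty_frames[0])
    cols = len(empty_frames[0][0])
    background = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            valid = [frame[r][c] for frame in empty_frames if frame[r][c] > 0]
            if valid:
                background[r][c] = sum(valid) / len(valid)
    return background


def foreground_mask(frame, background, min_gap_m=0.15):
    """Mark pixels that are closer to the sensor than the background.

    min_gap_m is an assumed threshold in meters that absorbs sensor noise.
    """
    return [
        [d > 0 and b > 0 and (b - d) > min_gap_m
         for d, b in zip(frame_row, background_row)]
        for frame_row, background_row in zip(frame, background)
    ]


# Tiny 2 x 2 example: a wall about 3 m away, then an actor at 1.5 m.
empty = [[[3.0, 3.0], [3.0, 0.0]],
         [[3.0, 3.1], [2.9, 0.0]]]
with_actor = [[1.5, 3.0], [2.9, 1.0]]
print(foreground_mask(with_actor, build_background(empty)))
```

Note that the bottom-right pixel is not flagged even though it shows a reading of 1 m: it never had a valid background depth, so this sketch has nothing to compare it against.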
To calibrate the data from the two cameras, I first trimmed the frames to include only the tilting motion of the calibration plane and pressed the calibration button. iPi Soft analyzed the corners of the plane from the two different views over the range of frames and then was able to understand precisely where the cameras were located within space. This alignment is critical in order to record the depth information accurately.
The final step in calibration was to save the calibration data as an XML file to my disk by pressing the Save Scene button under the Calibration tab. From then on, you must load the calibration file every time you work on a motion capture segment by pressing the Load Scene button. If you don’t do this, the system won’t know where your Kinects are located and will be unable to process the data from the two depth sensors.
If you move the Kinect motion sensors accidentally or reposition them to different locations, the calibration process must be done over again, so try not to bump into them. However, the calibration process does not take that long, so it’s no big deal to re-calibrate them if their positions shift inadvertently. And as long as you don’t move the Kinects, you can keep using the same XML calibration file you saved for every motion sequence you capture.
Time for some action
After calibration, it was time to record some motion. I launched iPi Recorder, put on my dancing shoes and, after waiting a couple of seconds (just as is required with the calibration video), I stepped into the staging area.
Before you can begin acting out your motion, the first thing you need to do is assume a T-pose. For those unfamiliar with the term, this means you stand upright with your arms extended perpendicular to your body. After a moment or two of holding this position, you’re ready to begin your motion.
While I’m no Fred Astaire, I managed to pull off a little number that included a few kicks, some side stepping and a couple of 360 degree full body turns to see how well iPi Soft would follow.
The right track
After my dance routine, I opened the depth video of the dance I had recorded with iPi Recorder in iPi Desktop Motion Capture. Included in the application is a completely rigged human form in, you guessed it, a T-pose. After locating a frame in the video where I stood in the T-pose, I maneuvered the included figure to more or less match mine using the move and scale tools.
Once both the iPi Soft figure and my figure were roughly occupying the same position, I hit the Refit Pose button which makes the included figure conform more precisely to my starting T-pose. After this, I trimmed the frame range to include only the motion I wished to capture and hit the Track button. At that point, iPi Soft began to analyze each frame captured by the motion sensors and refit the 3D character and its rig to follow my motion.
This tracking step is what actually creates the motion capture data for the skeletal rig. When the tracking was finished, it was possible to watch the provided iPi character do the same dance I had performed in the staging area.
iPi Soft also provides further post-processing tools to improve the track. Chief among them is jitter removal, which implements an advanced algorithm that removes any blips or glitches that might appear in the motion capture after performing the track. The jitter removal does a remarkable job, but you also have the option of going back to selected areas of your performance and retracking them.
Also available is the option to apply trajectory filtering which further irons out the motion capture data nondestructively. You can dial in a variable which sharpens or smooths out the motion depending on how high or low the number is.
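To give a feel for what a smoothing control like this does, here is a small Python sketch. It is not iPi Soft's trajectory filter, whose internals I don't know; it is a plain centered moving average applied to one joint channel, with a hypothetical radius parameter standing in for the dial. It returns a new list and leaves the input alone, which mirrors the nondestructive behavior described above.

```python
def smooth_channel(values, radius):
    """Return a smoothed copy of one animation channel.

    Each output sample is the average of the input samples within
    `radius` frames on either side. A radius of 0 returns an unchanged
    copy; larger values smooth more. The input list is not modified.
    """
    if radius < 0:
        raise ValueError("radius must be zero or positive")
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - radius)
        end = min(len(values), i + radius + 1)
        window = values[start:end]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A joint angle that is steady except for a one-frame glitch.
angles = [0.0, 0.0, 10.0, 0.0, 0.0]
print(smooth_channel(angles, radius=1))
print(angles)  # the original data is untouched
```

The trade-off is the usual one: a wider window flattens glitches more thoroughly but also softens quick, intentional moves such as kicks.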
Moment of Truth
After the tracking and post-processing actions were applied, it was time to watch what it looked like. I dragged the playback head to the beginning of the motion, hid the depth video and pressed play. I wasn’t sure what to expect since this was my first real attempt at using the software, and I thought that I would have to give it several tries before I would end up with something usable.
To my surprise, and sincere delight, the provided iPiSoft human character went through the motions of the dance fluidly and convincingly. I was impressed. As I said, this was the first real time I used it and the results were pretty great. I’m not saying I was skeptical at first, but I knew how much motion capture rigs cost compared to an iPi System and wondered how good it could really be. It was becoming clear that it was good. Really good.
Working with other software
Naturally, software like iPi’s desktop motion capture system is meant to work in tandem with other 3D software and, once again, it receives high marks in this regard. A quick way you can apply the motion capture data onto a character right inside of iPi Soft is by clicking on Import Target Character under the Export tab. When I did this with a DAE character (from Evolver) for a target, the motion capture data was immediately applied to it with virtually no other work on my part.
Next, rather than bringing a 3D model into the motion capture software, it was time to bring the motion capture into a fully featured 3D program. iPi Soft provides several different options for this and is compatible with the most popular software packages and formats, such as Cinema 4D, Maya, MotionBuilder, 3ds Max, FBX, COLLADA, BVH, LightWave, Softimage, Poser, DAZ 3D, iClone, Blender and more.
For the purposes of this test, I decided to use Maxon’s Cinema 4D R13 since it is a highly advanced 3D package with robust character animation tools. Cinema 4D is also part of the production pipelines at major studios such as Sony Pictures Imageworks and Rhythm and Hues. For more info on this very useful app, read my review of Cinema 4D R13 here. (Note: I also imported it into Maya and MotionBuilder for testing and it worked fine.)
I exported the motion capture data as a BVH file from iPi Soft and, as expected, I imported it into Cinema 4D with no problems at all. Immediately you could see the rig in the viewport, and its complete hierarchy of joints was contained in the project window. Glancing at the timeline, I could see keyframes for the position and rotation of each joint, one for every frame. Upon hitting the play button, the rig began to move just as it did in iPi Studio.
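If you have never looked inside a BVH file, it is plain text: a HIERARCHY section describing the joints and their channels, followed by a MOTION section with one row of numbers per frame. The Python sketch below reads a tiny hand-written sample (not an actual iPi Soft export) and reports the joints, channel count and frame rate. The function name is my own, and the reader only handles the basic keywords shown here.

```python
SAMPLE_BVH = """\
HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  JOINT Spine
  {
    OFFSET 0.0 10.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 10.0 0.0
    }
  }
}
MOTION
Frames: 2
Frame Time: 0.0333333
0.0 90.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 90.0 1.0 0.0 5.0 0.0 0.0 2.0 0.0
"""


def summarize_bvh(text):
    """Collect joint names, channel count, frame info and motion rows."""
    joints, channels, rows = [], 0, []
    frames, frame_time = 0, 0.0
    in_motion = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        tokens = line.split()
        if not in_motion:
            if tokens[0] in ("ROOT", "JOINT"):
                joints.append(tokens[1])
            elif tokens[0] == "CHANNELS":
                channels += int(tokens[1])
            elif tokens[0] == "MOTION":
                in_motion = True
        elif line.startswith("Frames:"):
            frames = int(tokens[1])
        elif line.startswith("Frame Time:"):
            frame_time = float(tokens[2])
        else:
            # One row per frame, one value per channel.
            rows.append([float(t) for t in tokens])
    return {"joints": joints, "channels": channels, "frames": frames,
            "frame_time": frame_time, "rows": rows}


info = summarize_bvh(SAMPLE_BVH)
print(info["joints"], info["channels"], info["frames"])
print("frames per second:", round(1.0 / info["frame_time"]))
```

Because every frame carries a value for every channel, an importer naturally ends up with a key on each joint at each frame, which matches what I saw in the Cinema 4D timeline.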
Finally, I modeled a simple character in Cinema 4D, bound it to the rig and quickly weighted the vertices at the joints. Then I added a light and rendered it out with Cinema 4D’s new physical renderer which, among other things, gave it some nice motion blur.
Here is the resulting animation created with iPi Soft’s markerless motion capture system and Cinema 4D:
I really enjoyed working with iPi Soft’s markerless motion capture system. I think it is a great product and would definitely recommend it. Not only was it able to do what it claimed to, but it did it well with a minimum of fuss. I also found it production ready with “off-the-shelf” usability.
If you require more capture area for things like battle sequences or athletic movements, you may wish to consider iPi’s Standard edition, which works with up to six Sony PS Eye cameras and allows capture areas up to 20 feet by 20 feet. That is enough for most kinds of motions and still much less expensive than products from Vicon, PhaseSpace and Animazoo.
But if you are a 3D animator who wants to try a mocap solution with minimum overhead, this might be all you need. If you are a production studio or game company, you have no excuse not to give it a try, especially for the price.
From where I stand, I know of no other software program quite like iPi Studio and after my conversation with Michael Nikonov, I get the feeling that we are going to see exciting new developments in the coming months as it continues to develop. For example, Nikonov mentioned that iPi Soft may develop their own cameras or depth sensors. I would imagine that they would offer more capability at a similarly low price.
If you are serious about animating with motion capture, or are just getting into it, do yourself a favor and check out iPi Studio. You’ll be glad you did.