The core gesture recognition is based on a DTW (dynamic time warping) implementation, initially provided by a community member called Rhemyst (respect for that!). However, to make it work with the rest of my design the way I needed, and because of some peculiarities I found during testing, I have partially changed it and (hopefully) improved it. The most important change is support for gesture-specific recognition thresholds, which turns out to be quite important if you need precise control over the recognition success rate. During testing I found that, because of noise and general similarities among gestures, without per-gesture thresholds it is hard to make sure that certain gestures get higher priority than others.
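To make the idea of gesture-specific thresholds more concrete, here is a minimal Python sketch. It is not the actual implementation: the function names, the template layout (a dict of name to sequence and threshold) and the choice to rank candidates by distance relative to their own threshold are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two sequences of points (tuples of floats)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(observed, templates):
    """Return the best matching gesture name, or None if nothing matches.

    templates (hypothetical layout): name -> (template sequence, threshold > 0).
    A gesture is only a candidate if its distance is within its OWN threshold.
    """
    best_name, best_score = None, float("inf")
    for name, (template, threshold) in templates.items():
        distance = dtw_distance(observed, template)
        if distance > threshold:
            continue  # too far away for this particular gesture
        # Assumption: rank by distance relative to the gesture's threshold,
        # so a generous threshold effectively gives a gesture higher priority.
        score = distance / threshold
        if score < best_score:
            best_name, best_score = name, score
    return best_name
```

Raising the threshold of one gesture makes it easier to trigger without loosening the others, which is the kind of control over the success rate described above.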

Other than the DTW itself, the component is essentially a finite state machine (FSM), with each state taking care of different things, such as determining the set of skeletons it tracks and handling stream data (depth, skeleton, image). Currently the implementation, being a week old, only really deals with the skeleton stream, but that might well change soon as I keep adding features. The gestures themselves are assigned to state transition events, which lets you navigate through the FSM. The FSM also bubbles up the recognized gestures via events, so that your application can simply subscribe to those and do what it needs.
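The interplay of gestures, state transitions and events could look roughly like the Python sketch below. The class name, method names and state names ("idle", "tracking", "recording") are hypothetical and only meant to illustrate the pattern, not to mirror the real code.

```python
class GestureStateMachine:
    """Tiny FSM: recognized gestures trigger transitions and are bubbled up as events."""

    def __init__(self, initial_state):
        self.state = initial_state
        self._transitions = {}  # (state, gesture) -> next state
        self._subscribers = []  # callbacks registered by the application

    def add_transition(self, state, gesture, next_state):
        self._transitions[(state, gesture)] = next_state

    def subscribe(self, callback):
        """Register callback(gesture, previous_state, new_state)."""
        self._subscribers.append(callback)

    def on_gesture(self, gesture):
        """Called by the recognizer whenever a gesture has been recognized."""
        previous = self.state
        # Stay in the current state if the gesture has no transition assigned here.
        self.state = self._transitions.get((previous, gesture), previous)
        for callback in self._subscribers:
            callback(gesture, previous, self.state)


# Usage: the application only subscribes and reacts to the events.
fsm = GestureStateMachine("idle")
fsm.add_transition("idle", "raise_hand", "tracking")
fsm.add_transition("tracking", "circle", "recording")

received = []
fsm.subscribe(lambda gesture, old, new: received.append((gesture, old, new)))
fsm.on_gesture("raise_hand")
fsm.on_gesture("circle")
```

In this sketch a gesture with no transition in the current state is still reported to subscribers; whether it should be is a design choice the application can make.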

The default tracking (as it is implemented right now) uses the concept of a ‘cue limb’, i.e. it’s always one particular limb that we’re tracking and using for recognition. A limb becomes the cue limb when you raise it above your head. From that point on you can use this limb for gesture recognition, gesture recording, etc. This is a restriction, but it’s easy to work around given the current design and will most likely be made more generic as I move forward.
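As a rough illustration of the cue limb check, the Python sketch below picks whichever hand is raised above the head. It assumes a skeleton frame given as a dict of joint name to (x, y, z) with the y axis pointing up; the joint names, the restriction to hands and the `margin` parameter are hypothetical.

```python
def find_cue_limb(joints, margin=0.0):
    """Return 'left' or 'right' for the hand raised above the head, else None.

    joints: dict of joint name -> (x, y, z), with y pointing up (assumption).
    margin: how far above the head the hand must be, to filter out noise.
    """
    head_y = joints["head"][1]
    # Height of each hand relative to the head.
    raised = {side: joints[side + "_hand"][1] - head_y for side in ("left", "right")}
    # If both hands are up, prefer the higher one.
    side, height = max(raised.items(), key=lambda item: item[1])
    return side if height > margin else None
```

A small positive `margin` keeps a hand that merely hovers at head height from being picked up as the cue limb.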

Last edited Aug 8, 2011 at 5:17 PM by minimalistic, version 2

