Gesture recognition

October 18, 2014 — November 12, 2018

machine learning
making things
music
real time
UI

I want to recognise gestures made with generic interface devices for artistic purposes, in real time. Is that so much to ask?

Related: synestizer, time warping, functional data analysis, controller mapping.


1 To Use

  • Reverse engineer Face the Music.

  • Gesture Variation Follower implements algorithms optimised for real-time music and video control using, AFAICT, a particle filter (Caramiaux et al. 2014). This is a different approach from the other tools here, which press off-the-shelf algorithms into the role, leading to some difficulties. (Source is C++; PureData and Max/MSP interfaces are available.)

  • GRT: the Gesture Recognition Toolkit, more software for gesture recognition; lower-level than Wekinator (the default API is raw C++) and with more powerful algorithms, although a less beguiling demo video. It now also includes a GUI and PureData/OpenSoundControl interfaces in addition to the original C++ API.

  • Eyesweb: An inscrutably under-explained GUI(?) for integrating UI stuff somehow or other.

  • Wekinator: software for using machine learning to build real-time interactive systems (Fiebrink and Cook 2010). (Which is to say, a workflow optimised for ad-hoc, slippery, artsy applications of cold, hard, calculating machine learning techniques.)

  • Beautifully simple “graffiti” letter recogniser: nearest-neighbour search on normalised characters. A neat hack, and a demonstration of why you should always start from the simplest thing; see the sketch after this list. (Via Chr15m.)

  • How the Kinect recognises gestures (spoiler: random forests; see Criminisi, Shotton, and Konukoglu 2012).
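To make the appeal of that graffiti-style recogniser concrete, here is a minimal sketch of the normalise-then-nearest-neighbour idea. This is not the linked code; the resampling length, the distance measure, and all names are my own choices.

```python
import numpy as np

N_POINTS = 32  # fixed length every stroke is resampled to

def normalise(stroke: np.ndarray) -> np.ndarray:
    """Resample a (k, 2) stroke to N_POINTS points, centre it, scale to unit size."""
    # cumulative arc length, so we can resample at evenly spaced distances
    seg = np.linalg.norm(np.diff(stroke, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    t = np.linspace(0.0, s[-1], N_POINTS)
    resampled = np.column_stack([
        np.interp(t, s, stroke[:, 0]),
        np.interp(t, s, stroke[:, 1]),
    ])
    resampled -= resampled.mean(axis=0)     # translation invariance
    scale = np.abs(resampled).max() or 1.0  # guard against degenerate strokes
    return resampled / scale                # scale invariance

def recognise(candidate, templates):
    """Return the label of the nearest template under mean point-to-point distance."""
    c = normalise(np.asarray(candidate, dtype=float))
    dists = {
        label: np.linalg.norm(c - normalise(np.asarray(t, dtype=float)), axis=1).mean()
        for label, t in templates.items()
    }
    return min(dists, key=dists.get)

# usage: one template per letter, then classify a fresh squiggle
templates = {
    "L": [(0, 0), (0, 2), (1, 2)],
    "I": [(0, 0), (0, 1), (0, 2)],
}
print(recognise([(0.1, 0.0), (0.0, 1.9), (0.9, 2.1)], templates))  # -> "L"
```

With one template per class this is pure nearest-neighbour matching; adding several templates per class buys robustness at linear cost.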

BTW, you can also roll your own with any machine learning library; it’s not clear how much you need all the fancy time-warping tricks.
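For concreteness, here is a sketch of that roll-your-own route using scikit-learn and synthetic data (the feature scheme, fixed resampling length, and all names are my own assumptions): linearly resampling every gesture to a fixed number of frames turns recognition into an ordinary fixed-dimension classification problem, with no time warping at all.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier  # roughly the Kinect's trick

N_FRAMES = 20  # fixed number of frames to resample each gesture to

def featurise(gesture: np.ndarray) -> np.ndarray:
    """Linearly resample a (k, d) gesture to N_FRAMES frames, then flatten."""
    k, d = gesture.shape
    t_old = np.linspace(0.0, 1.0, k)
    t_new = np.linspace(0.0, 1.0, N_FRAMES)
    return np.column_stack(
        [np.interp(t_new, t_old, gesture[:, j]) for j in range(d)]
    ).ravel()

rng = np.random.default_rng(0)

def fake_gesture(label: int, k: int) -> np.ndarray:
    """Synthetic one-sensor gesture: a 'wave' (sinusoid) or a 'swipe' (ramp)."""
    t = np.linspace(0.0, 1.0, k)[:, None]
    shape = np.sin(2 * np.pi * t) if label else t
    return shape + 0.05 * rng.standard_normal((k, 1))

# 50 examples per class, each a different (random) length before resampling
X = [featurise(fake_gesture(label, int(rng.integers(15, 60))))
     for label in (0, 1) for _ in range(50)]
y = [label for label in (0, 1) for _ in range(50)]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([featurise(fake_gesture(1, 33))]))  # expect [1]
```

Whether uniform resampling is good enough for your gestures, versus proper alignment, is exactly the open question above; it fails when the same gesture is performed with very uneven timing.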

Likely bottlenecks are constructing a training data set and getting the cursed thing to work in real time. I should make some notes on that theme.

Apropos that, Museplayer can record OpenSoundControl data.
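If Museplayer doesn’t suit, a serviceable OSC logger is only a few lines with the python-osc library; here is a minimal sketch (the port, file name, and CSV layout are arbitrary choices of mine) that timestamps every incoming message, so the log can later be sliced into labelled training examples.

```python
import csv
import time

from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer

LOG_PATH = "gesture_log.csv"

with open(LOG_PATH, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["time", "address", "args"])

    def log_message(address, *args):
        # one row per OSC message: wall-clock time, address pattern, payload
        writer.writerow([time.time(), address, list(args)])

    dispatcher = Dispatcher()
    dispatcher.set_default_handler(log_message)  # catch every address pattern

    server = BlockingOSCUDPServer(("127.0.0.1", 9000), dispatcher)
    print(f"logging OSC on port 9000 to {LOG_PATH}; Ctrl-C to stop")
    server.serve_forever()  # blocks until interrupted
```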

2 References

Arfib, Couturier, Kessous, et al. 2002. “Strategies of Mapping Between Gesture Data and Synthesis Model Parameters Using Perceptual Spaces.” Organised Sound.
Caramiaux, Montecchio, Tanaka, et al. 2014. “Adaptive Gesture Recognition with Variation Estimation for Interactive Systems.” ACM Trans. Interact. Intell. Syst.
Chen, Fu, and Huang. 2003. “Hand Gesture Recognition Using a Real-Time Tracking Method and Hidden Markov Models.” Image and Vision Computing.
Cresci, Di Pietro, Petrocchi, et al. 2017. “The Paradigm-Shift of Social Spambots: Evidence, Theories, and Tools for the Arms Race.” In Proc. 26th WWW.
Criminisi, Shotton, and Konukoglu. 2012. Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning.
Fiebrink, and Cook. 2010. “The Wekinator: A System for Real-Time, Interactive Machine Learning in Music.” In Proceedings of The Eleventh International Society for Music Information Retrieval Conference (ISMIR 2010). Utrecht.
Fiebrink, Cook, and Trueman. 2011. “Human Model Evaluation in Interactive Supervised Learning.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI ’11.
Fiebrink, Trueman, and Cook. 2009. “A Metainstrument for Interactive, on-the-Fly Machine Learning.” In Proceedings of NIME.
Françoise, Schnell, Borghesi, et al. 2014. “Probabilistic Models for Designing Motion and Sound Relationships.” In Proceedings of the 2014 International Conference on New Interfaces for Musical Expression.
Gillian, Knapp, and O’Modhrain. 2011a. “A Machine Learning Toolbox for Musician Computer Interaction.” In NIME11.
Gillian, Knapp, and O’Modhrain. 2011b. “Recognition of Multivariate Temporal Musical Gestures Using n-Dimensional Dynamic Time Warping.” In NIME11.
Hantrakul, and Kaczmarek. 2014. “Implementations of the Leap Motion Device in Sound Synthesis and Interactive Live Performance.” In Proceedings of the 2014 International Workshop on Movement and Computing. MOCO ’14.
Hong, Turk, and Huang. 2000. “Gesture Modeling and Recognition Using Finite State Machines.” In Fourth IEEE International Conference on Automatic Face and Gesture Recognition, 2000. Proceedings.
Hunt, and Wanderley. 2002. “Mapping Performer Parameters to Synthesis Engines.” Organised Sound.
King, Pan, and Roberts. 2017. “How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, Not Engaged Argument.” American Political Science Review.
Kratz, and Rohs. 2010. “A $3 Gesture Recognizer: Simple Gesture Recognition for Devices Equipped with 3D Acceleration Sensors.” In Proceedings of the 15th International Conference on Intelligent User Interfaces. IUI ’10.
Lee, and Kim. 1999. “An HMM-Based Threshold Model Approach for Gesture Recognition.” IEEE Transactions on Pattern Analysis and Machine Intelligence.
Marković, Valčić, and Malešević. 2016. “Body Movement to Sound Interface with Vector Autoregressive Hierarchical Hidden Markov Models.” arXiv:1610.08450 [cs, stat].
Mitra, and Acharya. 2007. “Gesture Recognition: A Survey.” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews.
Murakami, and Taguchi. 1991. “Gesture Recognition Using Recurrent Neural Networks.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
Paine. 2002. “Interactivity, Where to from Here?” Organised Sound.
Rocchesso, Lemaitre, Susini, et al. 2015. “Sketching Sound with Voice and Gesture.” Interactions.
Schacher. 2015. “Gestural Electronic Music Using Machine Learning as Generative Device.” In Proceedings of the International Conference on New Interfaces for Musical Expression, NIME’15.
Schlömer, Poppinga, Henze, et al. 2008. “Gesture Recognition with a Wii Controller.” In Proceedings of the 2nd International Conference on Tangible and Embedded Interaction. TEI ’08.
Wang, Quattoni, Morency, et al. 2006. “Hidden Conditional Random Fields for Gesture Recognition.” In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
Williamson, and Murray-Smith. 2002. “Audio Feedback for Gesture Recognition.”
Wilson, and Bobick. 1999. “Parametric Hidden Markov Models for Gesture Recognition.” IEEE Transactions on Pattern Analysis and Machine Intelligence.
Wright. 2005. “Open Sound Control: An Enabling Technology for Musical Networking.” Organised Sound.
Wu, and Huang. 1999. “Vision-Based Gesture Recognition: A Review.” In Gesture-Based Communication in Human-Computer Interaction. Lecture Notes in Computer Science 1739.
Yang, Ahuja, and Tabb. 2002. “Extraction of 2D Motion Trajectories and Its Application to Hand Gesture Recognition.” IEEE Transactions on Pattern Analysis and Machine Intelligence.
Yoon, Soh, Bae, et al. 2001. “Hand Gesture Recognition Using Combined Features of Location, Angle and Velocity.” Pattern Recognition.