A Suite of Tools Supporting Data Streams Annotation and Its Use in Experiments with Hand Gesture Recognition

Tomasz Kapuściński, Dawid Warchoł


In this paper, we present the concept and our implementation of a suite of tools supporting the annotation of sequential data. These tools are useful in experiments involving multimedia data sequences. We present two example usage scenarios of these tools in the process of building a gesture recognition system.


multimedia annotation; software supporting experiments; point cloud processing; finger alphabet recognition; human-computer interaction




Aubert O., Prié Y., Schmitt D.: Advene as a tailorable hypervideo authoring tool: a case study. Proceedings of the 2012 ACM Symposium on Document Engineering, 2012, pp. 79–82.

Aubert O., Prié Y., Canellas C.: Leveraging video annotations in video-based e-learning. Proceedings of International Conference on Computer Supported Education, 2014, pp. 479–485.

Bhat M., Olszewska I.-J.: DALES: Automated tool for detection, annotation, labelling, and segmentation of multiple objects in multi-camera video streams. Proceedings of the Third Workshop on Vision and Language, 2014, pp. 87–94.

Bradski G., Kaehler A.: Learning OpenCV: Computer vision with the OpenCV library. O'Reilly Media Inc., 2008.

Busjahn T., Schulte C., Sharif B., Simon, Begel A., Hansen M., Bednarik R., Orlov P., Ihantola P., Shchekotova G., Antropova M.: Eye tracking in computing education. Proceedings of the Tenth Annual Conference on International Computing Education Research, 2014, pp. 3–10.

Chang W.-L., Šabanović S., Huber L.: Situated analysis of interactions between cognitively impaired older adults and the therapeutic robot PARO. Social Robotics, Lecture Notes in Computer Science, 2013, vol. 8239, pp. 371–380.

Cooperrider K.: Body-directed gestures: Pointing to the self and beyond. Journal of Pragmatics, 2014, vol. 71, pp. 1–16.

Crasborn O., Sloetjes H.: Enhanced ELAN functionality for sign language corpora. Proceedings of Language Resources and Evaluation Conference (LREC'08), 2008, pp. 39–42.

Dasiopoulou S., Giannakidou E., Litos G., Malasioti P., Kompatsiaris Y.: A survey of semantic image and video annotation tools. Knowledge-Driven Multimedia Information Extraction and Ontology Evolution. Lecture Notes in Computer Science, 2011, vol. 6050, pp. 196–239.

Davidsen J., Vanderlinde R.: Researchers and teachers learning together and from each other using video-based multimodal analysis. British Journal of Educational Technology, 2014, vol. 35, no. 3, pp. 451–460.

Gamma E., Helm R., Johnson R., Vlissides J.: Design patterns: Elements of reusable object-oriented software. Addison-Wesley Longman Publishing Co., Inc., Boston 1995.

Heloir A., Neff M.: Exploiting motion capture for virtual human animation: Data collection and annotation visualization. Proceedings of the Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, 2010.

Hyde J., Kiesler S.-B., Hodgins J.-K., Carter E.-J.: Conversing with children: Cartoon and video people elicit similar conversational behaviors. Proceedings of Conference on Human Factors in Computing Systems, 2014, pp. 1787–1796.

Jongejan B.: Automatic annotation of face velocity and acceleration in Anvil. Proceedings of International Conference on Language Resources and Evaluation (LREC'12), 2012, pp. 201–208.

Kipp M.: ANVIL - A generic annotation tool for multimodal dialogue. Conference of the International Speech Communication Association, 2001, pp. 1367–1370.

Kipp M.: Gesture generation by imitation - from human behavior to computer character animation. Dissertation.com, Boca Raton 2004.

Kipp M.: Spatiotemporal coding in ANVIL. Proceedings of Language Resources and Evaluation Conference (LREC'08), 2008, pp. 2042–2045.

Kipp M., von Hollen L.-F., Hrstka M.-C., Zamponi F.: Single-person and multi-party 3D visualizations for nonverbal communication analysis. Proceedings of Language Resources and Evaluation Conference (LREC'14), 2014, pp. 3393–3397.

MESA Imaging SR4000, http://www.adept.net.au/cameras/Mesa/SR4000.shtml (accessed 2017.05.09).

Ng-Thow-Hing V., Pengcheng L., Okita S.: Synchronized gesture and speech production for humanoid robots. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2010, pp. 4617–4624.

Ooko R., Ishii R., Nakano Y.-I.: Estimating a user's conversational engagement based on head pose information. Intelligent Virtual Agents, Lecture Notes in Computer Science, 2011, vol. 6895, pp. 262–268.

Russell B.-C., Torralba A., Murphy K.-P., Freeman W.-T.: LabelMe: A database and web-based tool for image annotation. International Journal of Computer Vision, 2008, vol. 77, no. 1, pp. 157–173.

Rusu R.-B., Cousins S.: 3D is here: Point Cloud Library (PCL). IEEE International Conference on Robotics and Automation (ICRA), 2011, pp. 1–4.

Rusu R.-B., Bradski G., Thibaux R., Hsu J.: Fast 3D recognition and pose using the viewpoint feature histogram. Proceedings of the 23rd IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2010, pp. 2155–2162.

Sargent G., Hanna P., Nicolas H.: Segmentation of music video streams in music pieces through audio-visual analysis. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014, pp. 724–728.

Sloetjes H., Wittenburg P.: Annotation by category - ELAN and ISO DCR. Proceedings of Language Resources and Evaluation Conference (LREC'08), 2008, pp. 816–820.

Tseng B., Lin C.-Y., Smith J.: Video personalization and summarization system. IEEE Workshop on Multimedia Signal Processing, 2002, pp. 424–427.

Uebersax D., Gall J., Van den Bergh M., Van Gool L.: Real-time sign language letter and word recognition from depth data. IEEE International Conference on Computer Vision Workshops, 2011, pp. 383–390.

VIA - Video Image Annotation Tool, http://via-tool.sourceforge.net (accessed 2017.05.09).

Wolfe R., McDonald J., Berke L., Stumbo M.: Expanding n-gram analytics in ELAN and a case study for sign synthesis. Proceedings of Language Resources and Evaluation Conference (LREC'14), 2014, pp. 1880–1885.

DOI: http://dx.doi.org/10.21936/si2017_v38.n4.829