Gesture Recognition based on Deep Learning for Quadcopters Flight Control

Authors

  • Volodymyr Samotyy
  • Nikita Kiselov
  • Uliana Dzelendzyak
  • Oksana Shpak

DOI:

https://doi.org/10.47839/ijc.23.4.3757

Keywords:

Type gesture control, deep neural networks, computer vision, convolutional neural network, artificial intelligence, quadcopter, TensorFlow, TensorBoard

Abstract

This article presents a system for controlling quadcopters with gestures, which are recognized by a neural-network model. The proposed method is based on a combined deep learning model that provides real-time recognition with minimal consumption of computing power. The implementation offers two ways of controlling the quadcopter: by gestures or by keyboard. New gestures can be added for recognition through interactive code in the Jupyter Lab web application. To simplify data collection, a special mode creates a data set for a new test directly from the quadcopter camera. The operation of the control and recognition module is demonstrated by controlling a DJI Tello Edu drone, and the results of tests under real conditions are presented. The developed software speeds up gesture recognition and simplifies quadcopter control. Several directions for improving the system, together with their possible technical implementation, are proposed.
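The two control modes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gesture labels, key bindings, command names, the `Controller` class, and the rule of waiting for several identical consecutive predictions are all assumptions made for illustration, and no drone SDK is called.

```python
from collections import deque
from typing import Deque, Dict, Optional

# Hypothetical label-to-command tables; the article does not list its gesture set.
GESTURE_COMMANDS: Dict[str, str] = {
    "palm": "hover",
    "fist": "land",
    "point_up": "up",
    "point_down": "down",
}
# Hypothetical key bindings for the keyboard mode.
KEY_COMMANDS: Dict[str, str] = {
    "w": "forward",
    "s": "back",
    "t": "takeoff",
    "l": "land",
}


class Controller:
    """Turns gesture labels or key presses into command strings."""

    def __init__(self, mode: str = "gesture", stable_frames: int = 3) -> None:
        if mode not in ("gesture", "keyboard"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.stable_frames = stable_frames
        # Sliding window holding the most recent predicted labels.
        self._recent: Deque[str] = deque(maxlen=stable_frames)

    def on_gesture(self, label: str) -> Optional[str]:
        """Return a command only when the last stable_frames labels all agree."""
        if self.mode != "gesture":
            return None
        self._recent.append(label)
        window_full = len(self._recent) == self.stable_frames
        if window_full and len(set(self._recent)) == 1:
            return GESTURE_COMMANDS.get(label)
        return None

    def on_key(self, key: str) -> Optional[str]:
        """Return the command bound to a key, or None outside keyboard mode."""
        if self.mode != "keyboard":
            return None
        return KEY_COMMANDS.get(key)
```

In this sketch a command is emitted only after the classifier has produced the same label for `stable_frames` frames in a row, which is one simple way to keep a single misclassified frame from moving the drone. The returned command string would then be passed to whatever drone interface is in use.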

Published

2025-01-12

How to Cite

Samotyy, V., Kiselov, N., Dzelendzyak, U., & Shpak, O. (2025). Gesture Recognition based on Deep Learning for Quadcopters Flight Control. International Journal of Computing, 23(4), 583-591. https://doi.org/10.47839/ijc.23.4.3757
