Real-time Hand Gesture Pattern Recognition and Device Control through Inbuilt Webcam

Hisham Zargar (hzargar@edu.bme.hu) and Laraiba Shah (shahlaraiba@gmail.com)
Department of Telecommunications and Media Informatics
Budapest University of Technology and Economics, Budapest, Hungary

Abstract— Home automation techniques are used to design and implement remotely controlled, energy-efficient and highly scalable smart homes with basic features that safeguard the residents' comfort and security. Controlling electric home appliances and gadgets through physical switches is difficult for elderly and disabled people: everyday tasks such as turning lights and fans ON or OFF, or even locating the right switch, are frequently a struggle.

In this paper, we propose a real-time hand gesture-based recognition system, using a simple webcam and a microcontroller, that automates the control of electrical home appliances through simple hand gestures, without multiple sensors or any kind of special equipment. The proof of concept is demonstrated by controlling a set of differently colored electric bulbs that represent the appliances or switches. These are connected to an Arduino Uno microcontroller, which in turn is connected to the personal computer on which the gesture recognition is implemented with the help of a real-time optimized computer vision library.

Index Terms— computer vision, hand gesture, home automation, machine learning, pattern recognition.

I. INTRODUCTION

Smart home automation has become a major interest area in the Internet of Things (IoT) world. The Internet of Things describes a system in which things (i.e., home appliances such as fans, air conditioners, etc.) integrated with electronics, software and sensors exchange data with other connected devices, operators, users, etc., to provide services to the user [1].

Recently, one of the major research directions in IoT has been to improve the user experience and the simplicity of handling smart home devices. This emphasizes the interaction between the user and the device [2].

Gestures are a form of nonverbal communication conveyed with the help of body parts such as the hands, legs, face and arms. Gesture recognition is the process in which gestures made by the user are recognized and then used to actuate a process. Such gestures even form the basis of sign languages. Visual interpretation of hand gestures is a natural method of achieving Human-Computer Interaction (HCI).

Sign languages using hand gestures not only assist communication between people in special circumstances, such as "talking" in noisy environments, when the other person is too far away, or when the talkers face a language barrier, but also serve as the primary communication method of the deaf-mute community.

According to an analysis [3] provided by the World Health Organization (WHO), almost 360 million people suffer from hearing disability, and this number has risen by around 14% in the past decade. Besides, some 1.1 billion young people are at elevated risk of hearing disability because of the environment they live in.

Because sign languages provide such a valuable communication modality for an important segment of the human population, the development of hand gesture recognition-based HCIs has become a hot topic in computer vision over the past decades.

A. The Problem Statement

Much of the growing elderly population, in the US and the rest of the world, requires some degree of formal and/or informal care, either due to loss of function or failing health with aging. Three quarters of elders over the age of 65 suffer from one or more chronic diseases [3].

The cost and burden of caring for elders are steadily increasing. Given the choice, many elders would prefer to lead an independent life in a residential setting with minimal intervention from caregivers. At the same time, the role of informal caregivers in providing care to the elderly population has increased over the past two decades, shifting the responsibility for care during recuperation, rehabilitation and long-term disability from institutions to individuals and families in the community [3].

To be universally beneficial, to lessen the burden on caregivers, and to increase the quality of care and life for the elderly, the concept of the Smart Home evolved.

II. TRENDS IN HOME AUTOMATION APPLICATIONS

This section illustrates the research established in this area in the past and the ongoing trends in the application of home automation systems.

A. Early developments

Maria Eugenia Cabrera et al. [4] proposed a glove-based hand gesture recognition system. They presented a unique glove with accelerometer sensors attached to it. These sensors measure the degree of flexion of each finger, along with the orientation of the hand along the three axes. The data generated by the sensors is then fed into neural networks to identify the gestures.

Christopher Lee and Yangsheng Xu [5] developed another glove-based gesture recognition system. The glove has



embedded sensors on it that record orientation data. This provided enough information to develop a gesture recognition system that uses neural networks to classify up to 14 letters of the hand-signed alphabet, after being shown only one or two examples of each. The following points were noted in the paper:

1. Wearing this mechanical apparel, i.e., the glove, is a tedious task and does not feel natural at all.

2. The mechanical glove can recognize only a fixed number of gestures.

Etsuko Ueda and Yoshio Matsumoto [6] proposed a hand pose estimation technique. This technique is a form of vision-based gesture recognition in which multiple images of the hand region are taken using a multi-viewpoint camera system to construct a "voxel" model.

A 3-dimensional hand model is then fitted to the voxel model, and this 3-dimensional model is analyzed to extract the gesture information. The following points were understood from the paper:

1. A new model is created from images provided by a camera setup.

2. The creation of the hand model follows an overly complex fitting procedure.

B. Hand gesture recognition techniques

1) Gesture Recognition Using DWT and F-Ratio Based Feature Descriptor

The grey world algorithm is used to compensate for the illumination variation of hand gesture images, and a skin color detection-based algorithm is used to separate the hand region from the background. A DWT and F-ratio based feature descriptor is proposed for real-time hand gesture recognition. With this method, the feature vector length is reduced, which in turn decreases the time taken for gesture recognition. High mean accuracies of 98.64%, 95.42% and 99.08% are achieved on the evaluated datasets, providing better performance in terms of mean accuracy for hand gesture recognition. The recognition system is also applied to two real-life gesture-based applications to show the effectiveness of the technique [7].
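The exact descriptor of [7] is not reproduced here, but its core idea — shortening the feature vector with a one-level Haar wavelet transform and ranking the surviving coefficients by an F-ratio — can be sketched in plain Python. The function names and the F-ratio formulation below are our own illustrative choices, not code from [7]:

```python
import math

def haar_dwt_1d(signal):
    """One-level Haar discrete wavelet transform of an even-length sequence.

    Returns (approximation, detail) coefficients. The approximation half
    carries most of the energy, so it can serve as a feature vector of half
    the original length."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def f_ratio(class_a, class_b):
    """Fisher-style ratio of between-class to within-class variance for one
    feature, usable to rank which coefficients discriminate two gestures."""
    mean = lambda xs: sum(xs) / len(xs)
    var = lambda xs: sum((x - mean(xs)) ** 2 for x in xs) / len(xs)
    ma, mb, m = mean(class_a), mean(class_b), mean(class_a + class_b)
    between = (ma - m) ** 2 + (mb - m) ** 2
    within = var(class_a) + var(class_b)
    return between / within if within else float("inf")

# A row of pixel intensities is halved in length by one DWT level:
approx, detail = haar_dwt_1d([4, 6, 10, 12])
```

Repeating the transform on the approximation halves the feature length again at each level, which is what shortens the recognition time.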

2) Hand gesture recognition methods

a. HMM Based Method: One HMM model is trained per sign. Feature vectors reflecting the manual sign parameters are used as input in both the training and recognition processes [8].

b. Affine Projection Based Method: This method computes 3-dimensional structure and motion based on the affine projection model, which views an unknown configuration of points and lines [8].

c. Model Based Method: This method uses a finite state machine to model the different phases of generic gestures. To find the motion route, the method tracks fingertips across frames; that route is then used to determine the gesture [8].
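As a concrete illustration of the model-based approach, the generic gesture phases can be modeled with a minimal finite state machine. The states, events and transition table below are our own illustrative sketch, not taken from [8]:

```python
# Minimal FSM over generic gesture phases: a gesture starts from a resting
# hand, passes through a motion (stroke) phase, and ends when the hand rests.
TRANSITIONS = {
    ("idle", "move"): "stroke",    # fingertip displacement exceeds a threshold
    ("stroke", "move"): "stroke",  # motion continues: keep accumulating route
    ("stroke", "rest"): "done",    # hand comes to rest: gesture is complete
    ("idle", "rest"): "idle",      # still at rest, nothing to do
}

def classify_route(events):
    """Feed per-frame events ('move'/'rest') through the FSM and collect the
    frame indices that belong to the stroke, i.e. the motion route."""
    state, route = "idle", []
    for frame_no, event in enumerate(events):
        state = TRANSITIONS.get((state, event), state)
        if state == "stroke":
            route.append(frame_no)
        if state == "done":
            break
    return state, route

state, route = classify_route(["rest", "move", "move", "rest", "rest"])
```

In a real tracker the "move"/"rest" events would come from fingertip displacement between frames, and the collected route would be matched against stored gesture templates.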

3) Different input approaches in hand gesture recognition

a. Vision Based Approach: In this method the interaction between humans and computers uses cameras only, which capture the images. This approach looks trivial, but there are various complications such as variations in lighting, detection of skin color, velocity, recognition time, etc. [8].

b. Instrumented Glove Based Approach: This method captures the hand position and its movement using sensors, giving the exact location of the fingers and their orientation [8].

c. Colored Markers Based Approach: This method uses colored gloves for tracking the hand and its various parts, such as the palm and fingers; gloves with three distinct colors are used to represent the hand [8].

4) Current Trends in Home Automation Applications

Any paper needs to situate the proposed research among the numerous related publications appearing every day. In this section we identify the most relevant and recent developments in home automation applications. Although it is impossible to list every development, a few of the most interesting ones follow.

Vishwakarma et al. [11] proposed a home automation technique based on a multimodal application that can be operated by the user's voice commands through the Google Assistant or through a web-based application.

Stolojescu et al. [12] proposed a system called qToggle for interconnecting sensors, actuators and other data sources for multiple home automation purposes. They also developed a mobile application that allows users to control a series of home appliances and sensors. Al-Kuwari et al. [13] proposed an IoT-based sensing and monitoring system for smart home automation using the EmonCMS platform for collecting and visualizing monitored data and for remote control of home appliances and devices. Kodali et al. [14] demonstrated an IoT project focused on building a smart wireless home security system that sends alerts to the owner over the Internet in case of trespass and optionally raises an alarm. Rautaray et al. [15] analyzed hand gestures for research in gesture taxonomies, representations and recognition techniques, software platforms and frameworks, focusing on the three main phases of hand gesture recognition: detection, tracking and recognition. Cheok et al. [16] categorically reviewed hand gesture and sign language recognition across stages such as data acquisition, preprocessing, segmentation, feature extraction and classification, elaborating and comparing the algorithms at each stage. Oudah et al. [17] thoroughly reviewed the literature on hand gesture techniques based on deep learning, 3D models, depth, skeleton, motion, appearance and color, also introducing their merits and limitations under different circumstances. Mujahid et al. [18] proposed a real-time lightweight model based on YOLO (You Only Look Once) v3 and DarkNet-53 convolutional neural networks for hand gesture recognition, with an accuracy of 97.68%.

5) Conceptual basis of the system

In this section we highlight the theoretical overview of the hardware and software components used in the proposed implementation.

MediaPipe is a framework for building pipelines to perform inference over arbitrary sensory data. With MediaPipe, a perception pipeline can be built as a graph of modular components, including model inference, media processing algorithms and data transformations, etc.


MediaPipe allows a developer to prototype a pipeline incrementally. A pipeline is defined as a directed graph of components where each component is a Calculator. The graph is specified using a GraphConfig protocol buffer and then run using a Graph object.

In the graph, the calculators are connected by data streams. Each stream represents a time series of data packets. Together, the calculators and streams define a data-flow graph. The packets that flow across the graph are collated by their timestamps within the time series.

The pipeline can be refined incrementally by inserting or replacing calculators anywhere in the graph. Developers can also define custom calculators. While the graph executes calculators in parallel, each calculator executes on at most one thread at a time. This constraint, coupled with the immutability of data packets, ensures that custom calculators can be defined without specialized expertise in multi-threaded programming [9].
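The calculator-and-stream model can be illustrated with a toy, pure-Python analogue. This is our own sketch of the concept only; MediaPipe's actual API uses C++ calculators and a GraphConfig protocol buffer, not these names:

```python
# Toy analogue of a MediaPipe-style data-flow graph: each calculator is a
# function that consumes timestamped packets from an input stream and emits
# packets on an output stream. Packets are immutable (tuples).

def run_graph(packets, calculators):
    """Push (timestamp, value) packets through a chain of calculators,
    keeping the stream ordered by timestamp, as MediaPipe collates packets."""
    stream = sorted(packets)  # collate packets by timestamp
    for calc in calculators:
        stream = [(ts, calc(value)) for ts, value in stream]
    return stream

# Two illustrative calculators: a "media processing" step and a thresholding
# "model inference" step, composed like nodes in a graph configuration.
grayscale = lambda rgb: sum(rgb) // 3   # crush an RGB triple to one level
detector = lambda level: level > 100    # trivial stand-in for inference

frames = [(2, (200, 210, 190)), (1, (10, 20, 30))]
result = run_graph(frames, [grayscale, detector])
```

Refining the pipeline then amounts to editing the calculator list: inserting a new stage or swapping one out leaves the rest of the graph untouched.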

Figure 1. Object Detection using MediaPipe (from [9])

The hand tracking solution utilizes an ML pipeline consisting of two models working together:

a. A palm detector that operates on a full input image and locates palms via an oriented hand bounding box.

b. A hand landmark model that operates on the cropped hand bounding box provided by the palm detector and returns high fidelity 2.5D landmarks [10].

Figure 2. Architecture of MediaPipe hand landmark model (from [10])

The 21 3D Landmark points are depicted as following:

Figure 3. Twenty-one hand landmarks (from [19])

Arduino Uno is a microcontroller board based on the ATmega328P. It has 14 digital input/output pins (of which 6 can be used as PWM outputs), 6 analog inputs, a 16 MHz ceramic resonator, a USB connection, a power jack, an ICSP header and a reset button. It contains everything needed to support the microcontroller; simply connect it to a computer with a USB cable or power it with an AC-to-DC adapter or battery to get started [20].

Figure 4. Arduino Uno Pin Description (from [20])

III. IMPLEMENTATION TECHNIQUE

This work focuses on developing a hand gesture recognition system for home automation with computer vision. The camera built into or attached to the system recognizes hand gestures. These hand gestures are coded as numbers and sent to the Arduino Uno. The pre-programmed Arduino recognizes the input from the program and activates the corresponding relay output.


Figure 5. Conceptual View of the System

A. On the Software Side

At the very beginning of the Python code, we turn the webcam ON using the OpenCV module. While the webcam is open, we run the MediaPipe palm detection function to recognize the hand through the webcam.

Next, with the help of the landmark model, we obtain the 21 points on the detected hand. Using these points positioned on the palm and fingers, we then recognize distinct hand shapes, in our case: one, two and three.

In the background, these hand shapes are coded as a 1-D array of 0's and 1's, which is then normalized to a single digit and given as input to the Arduino Uno using the PySerial library in Python.
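The shape encoding described above can be sketched as follows. The landmark indices follow MediaPipe's 21-point numbering (fingertips at 4, 8, 12, 16, 20), but the raised-finger heuristic, the function names and the serial port name are illustrative assumptions, not the exact code of our implementation:

```python
# Encode a detected hand shape as a 1-D array of 0's and 1's (one entry per
# finger) and normalize it to the single digit sent to the Arduino Uno.

# MediaPipe landmark indices: (fingertip, lower joint) per finger, thumb first.
FINGERS = [(4, 3), (8, 6), (12, 10), (16, 14), (20, 18)]

def finger_states(landmarks):
    """landmarks: list of 21 (x, y) pairs in image coordinates (y grows
    downward). A finger counts as raised when its tip lies above its lower
    joint -- a simplification that ignores the thumb's sideways motion."""
    return [1 if landmarks[tip][1] < landmarks[joint][1] else 0
            for tip, joint in FINGERS]

def normalize(states):
    """Collapse the binary array to a single digit (raised-finger count)."""
    return sum(states)

# Hypothetical landmarks with only the index and middle fingertips raised:
points = [(0.5, 0.5)] * 21
points[8] = (0.5, 0.2)   # index tip above its joint
points[12] = (0.5, 0.2)  # middle tip above its joint
digit = normalize(finger_states(points))

# Sending the digit with PySerial (the port name is machine-specific):
# import serial
# with serial.Serial("/dev/ttyACM0", 9600, timeout=1) as ser:
#     ser.write(str(digit).encode())
```

Sending a plain ASCII digit keeps the Arduino side trivial: the sketch only has to read one byte and switch on its value.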

The Arduino Uno microcontroller is programmed in the default language of the Arduino IDE. We program it before running the Python script so that it reads a single normalized digit from the Python code and maps the input to the corresponding relay number, which turns the connected bulb ON or OFF.

B. On the Hardware Side

Figure 5 depicts the connection between the Arduino Uno and the 4-relay module, as well as the connections of the light bulbs to the 4-relay module.

IV. RESULTS

We input different hand gestures that the implementation recognizes, and a particular output is produced. In Figure 6 we can see that when the user shows the hand gesture "five", which is the default input in our implementation, all relays are switched off and all bulbs subsequently turn off.

Figure 6. Default State Recognition

In Figure 7, shown below, when the input gestures one and two are given, the system turns ON the first and second bulbs connected to the relay, respectively.

Figure 7. Gestures one and two

The general results, or rather the advantages and disadvantages of the proposed system depending on the actual implementation, can be summarized as follows:

1. Instantaneous real-time recognition of hand gesture patterns by the system.

2. Minimal complexity of implementation compared to other hand gesture pattern recognition methods.

3. Easy control of up to a few tens of devices.

4. Minimal delay when changing from one gesture to another.

5. Scaling the system is complex, as gestures could easily get mixed up if the number of devices is drastically increased.

6. The accuracy of the model decreases if the patterns are complex in nature, for example when recognizing sign language.

7. Smooth operation of the implementation directly depends on the hardware in use.


V. CONCLUSION AND FUTURE WORK

The results were acceptable, as we encountered no errors during our testing. Each gesture worked well for its appliance (bulb).

In the future, other home appliances could be used instead of bulbs. The implementation could also be extended to gestures from both hands, which would increase the number of gestures and consequently the number of appliances that could be controlled.

Apart from controlling appliances, hand gestures could be used in various other implementations.

On the downside, we used a system with low specifications, which resulted in a low frame rate. If the implementation were carried out on a system with higher hardware specifications (for example, a dedicated GPU), the frame rate would improve drastically, resulting in smoother operation.

ACKNOWLEDGMENT

The authors would like to thank Pal Varga, who participated in and helped with this research work.

REFERENCES

[1] Oppitz, M. and Tomsu, P., 2018. Internet of things. In Inventing the Cloud Century (pp. 435-469). Springer, Cham.

[2] Gonzalo, P.J. and Juan, A.H.T., 2015, September. Control of home devices based on hand gestures. In 2015 IEEE 5th International Conference on Consumer Electronics-Berlin (ICCE-Berlin) (pp. 510-514). IEEE.

[3] World Health Organization (WHO) [online] Available at https://www.who.int/disabilities/world_report/2011

[4] Cabrera, M.E., Bogado, J.M., Fermin, L., Acuna, R. and Ralev, D., 2012. Glove-based gesture recognition system. In Adaptive Mobile Robotics (pp. 747-753).

[5] Lee, C. and Xu, Y., 1996, April. Online, interactive learning of gestures for human/robot interfaces. In Proceedings of IEEE International Conference on Robotics and Automation (Vol. 4, pp. 2982-2987). IEEE.

[6] Ueda, E., Matsumoto, Y., Imai, M. and Ogasawara, T., 2003. A hand- pose estimation for vision-based human interfaces. IEEE Transactions on Industrial Electronics, 50(4), pp.676-684.

[7] Sahoo, J.P., Ari, S. and Ghosh, D.K., 2018. Hand gesture recognition using DWT and F-ratio based feature descriptor. IET Image Processing, 12(10), pp.1780-1787.

[8] Savitha, V., Nandhini, J., Kokilavani, S., Kalaiarasi, G. and Narmadha, A.S., Hand Gesture Recognition for Home Automation.

[9] Lugaresi, C., Tang, J., Nash, H., McClanahan, C., Uboweja, E., Hays, M., Zhang, F., Chang, C.L., Yong, M.G., Lee, J. and Chang, W.T., 2019. MediaPipe: A framework for building perception pipelines. arXiv preprint arXiv:1906.08172.

[10] Zhang, F., Bazarevsky, V., Vakunov, A., Tkachenka, A., Sung, G., Chang, C.L. and Grundmann, M., 2020. Mediapipe hands: On-device real-time hand tracking. arXiv preprint arXiv:2006.10214.

[11] Vishwakarma, S.K., Upadhyaya, P., Kumari, B. and Mishra, A.K., 2019, April. Smart energy efficient home automation system using IoT. In 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU) (pp. 1-4). IEEE.

[12] Stolojescu-Crisan, C., Crisan, C. and Butunoi, B.P., 2021. An IoT- based smart home automation system. Sensors, 21(11), p.3784.

[13] Al-Kuwari, M., Ramadan, A., Ismael, Y., Al-Sughair, L., Gastli, A. and Benammar, M., 2018, April. Smart-home automation using IoT-based sensing and monitoring platform. In 2018 IEEE 12th International Conference on Compatibility, Power Electronics and Power Engineering (CPE-POWERENG 2018) (pp. 1-6). IEEE.

[14] Kodali, R.K., Jain, V., Bose, S. and Boppana, L., 2016, April. IoT based smart security and home automation system. In 2016 international conference on computing, communication, and automation (ICCCA) (pp. 1286-1289). IEEE.

[15] Rautaray, S.S. and Agrawal, A., 2015. Vision based hand gesture recognition for human computer interaction: a survey. Artificial intelligence review, 43(1), pp.1-54.

[16] Cheok, M.J., Omar, Z. and Jaward, M.H., 2019. A review of hand gesture and sign language recognition techniques. International Journal of Machine Learning and Cybernetics, 10(1), pp.131-153.

[17] Oudah, M., Al-Naji, A. and Chahl, J., 2020. Hand gesture recognition based on computer vision: a review of techniques. journal of Imaging, 6(8), p.73.

[18] Mujahid, A., Awan, M.J., Yasin, A., Mohammed, M.A., Damaševičius, R., Maskeliūnas, R. and Abdulkareem, K.H., 2021. Real-time hand gesture recognition based on deep learning YOLOv3 model. Applied Sciences, 11(9), p.4164.

[19] MediaPipe [online] Available at: https://google.github.io/mediapipe/solutions/hands

[20] Arduino [online] Available at: https://www.arduino.cc/en/Guide/Introduction

Hisham Zargar is currently a PhD student at the Budapest University of Technology and Economics (BME). He received his MSc in Computer Engineering from BME in 2020. His research interests include Pattern and Image Recognition, Computer Vision, Machine Learning and Deep Learning.

Laraiba Shah received her MSc in Computer Engineering from Budapest University of Technology and Economics (BME) in 2021. She is currently working at Ericsson, Hungary in the R&D department. Her research interests include IOT, Smart City, Wireless sensor networks and Telecommunication systems.
