PROJECT: FINGER GESTURE-BASED CONTROL ALGORITHM FOR MOBILE ROBOTS

Project code: CNCDT0202561
Rating: 5.0
Project description

     The project is 290 MB in size and is presented in English. It includes all of the required files: the complete program files (PyCharm) for the finger gesture-based control algorithm for mobile robots, the explanatory report, the project assignment, the review sheet, a simulation video of the mobile robot, and the PowerPoint defense presentation. It also provides many specialized documents and reference materials that support the design of the project.

Price: 1,190,000 VND
Summary of contents

TABLE OF CONTENTS

ACKNOWLEDGMENTS…………………………………..……………………...........................….i

SUMMARY OF PROJECT CONTENT……………………………………….................................ii

CHAPTER 1. TOPIC OVERVIEW............................................................................................. 1

1.1  Overview............................................................................................................................. 1

1.2  Goal.................................................................................................................................... 1

1.2.1. Build a finger gesture recognition algorithm.................................................................... 1

1.2.2. Create a robot control system through gestures............................................................. 1

1.2.3. Integration of hardware components:.............................................................................. 1

1.2.4. Building an environmental map system........................................................................... 1

1.3  Practical application............................................................................................................ 1

1.3.1. Controlling the robot in a hazardous environment:.......................................................... 2

1.3.2. Application in Healthcare:................................................................................................ 2

1.3.3. Applications in robotics education and research:............................................................. 2

1.4  Related studies................................................................................................................... 2

1.4.1 Wu et al.'s Study (2012)................................................................................................. 2

1.4.2 Research by Athiya Marium et al. (2017).......................................................................... 2

1.4.3 Research by Malgireddy et al. (2010)............................................................................... 2

1.4.4 OpenCV and MediaPipe:.................................................................................................. 2

CHAPTER 2. THEORETICAL BASIS........................................................................................ 3

2.1  About image processing systems....................................................................................... 3

2.1.1 Steps in the image processing system............................................................................. 3

2.1.2 Components of the image processing system.................................................................. 3

2.1.3 Applications of image processing systems....................................................................... 4

2.2  Overview of hand gesture recognition................................................................................ 4

2.2.1 Hand gesture recognition concept.................................................................................... 4

2.2.2 Hand gesture recognition methods................................................................................... 5

2.2.3 Challenges in hand gesture recognition............................................................................ 5

2.2.4 Technologies used in hand gesture recognition................................................................ 5

2.3  About MediaPipe................................................................................................................. 6

2.3.1 How MediaPipe Works...................................................................................................... 6

2.3.2 Data Processing in MediaPipe Hand Gesture Recognition............................................... 6

2.3.3 How to use MediaPipe...................................................................................................... 7

2.4  OpenCV............................................................................................................................... 8

2.4.1 Key Features of OpenCV.................................................................................................. 8

2.4.2 OpenCV application in practice........................................................................................ 8

2.5  The Python Language........................................................................................................ 10

2.5.1 Python Features and Advantages................................................................................... 10

2.5.2 Python in image processing and computer vision........................................................... 11

2.6  ROS2................................................................................................................................ 12

2.6.1 Features and advantages of ROS2................................................................................ 12

2.6.2 Application of ROS2 in the project.................................................................................. 12

2.6.3 Practical applications of ROS2....................................................................................... 12

2.7  VNC (Virtual Network Computing).................................................................................... 13

2.7.1 Features and advantages of VNC.................................................................................. 13

2.7.2 Use VNC in the project................................................................................................... 13

2.8  Arduino IDE (Integrated Development Environment)....................................................... 14

2.8.1 Features and Advantages of Arduino IDE...................................................................... 14

2.8.2 Using the Arduino IDE in the project.............................................................................. 15

2.9  EasyEDA.......................................................................................................................... 15

2.9.1 Features and Advantages of EasyEDA.......................................................................... 16

2.9.2 Application of EasyEDA in the project............................................................................ 16

2.10   Visual Studio Code (VS Code)...................................................................................... 17

2.10.1 Features and Advantages of VS Code......................................................................... 17

2.10.2 Application of VS Code in the project........................................................................... 18

2.11   Devices used................................................................................................................. 19

2.11.1 Raspberry Pi 4 Model B................................................................................................ 21

2.11.2 ESP32-CAM................................................................................................................. 23

2.11.3 Module TB6612FNG..................................................................................................... 25

2.11.4 YDLIDAR X4 Pro.......................................................................................................... 27

2.11.5 MPU6050...................................................................................................................... 29

2.11.6 Servo Reducer Motor.................................................................................................... 31

2.11.7 Mecanum Wheel........................................................................................................... 31

2.12   Chapter 2 Review: Theoretical Basis............................................................................ 32

CHAPTER 3. SYSTEM ANALYSIS AND DESIGN................................................................. 33

3.1  System Block Diagram Design......................................................................................... 33

3.2  System Connection Diagram........................................................................................... 34

3.3  Communication Protocol.................................................................................................. 36

3.3.1 Communication between ESP32-CAM and Laptop....................................................... 37

3.3.2 Communication between Laptop and Raspberry Pi (Flask Server) ...............................37

3.3.3 Internal communication in the Raspberry Pi.................................................................. 37

CHAPTER 4. IMPLEMENTATION FACILITIES..................................................................... 38

4.1  Finger Gesture Recognition Program.............................................................................. 38

4.2  Robot Control Program.................................................................................................... 45

4.3  Robot Locator Program.................................................................................................... 47

4.4  Configure slam_toolbox................................................................................................... 52

4.5  Ydlidar x4 Pro Configuration............................................................................................ 53

4.6  EKF configuration combining IMU and ODOM................................................................ 54

4.7  User Interface.................................................................................................................. 55

4.8  Connect and transfer data............................................................................................... 57

4.9  Testing............................................................................................................................. 60

CHAPTER 5. RESULTS AND REVIEWS.............................................................................. 67

5.1  Results Achieved............................................................................................................. 67

5.2  Assessment....................................................................................................................... 67

5.2.1 Hand Cover Assessment............................................................................................... 68

5.2.2 Hand Gesture Recognition Accuracy Assessment........................................................ 69

5.2.3 Assessment of the actual recognition distance............................................................... 72

5.2.4 Hand Gesture Recognition Latency Assessment.......................................................... 73

5.2.5 Evaluation of system performance under different light intensities................................ 73

5.3  Limitations......................................................................................................................... 75

5.4  The development direction of the project......................................................................... 75

CONCLUSION…………………………………………………………………...........................……84

REFERENCES………………………………………………………………...............….........…..85

SUMMARY OF PROJECT CONTENT

The project "Finger Gesture-Based Control Algorithm for Mobile Robots" aims to build a mobile robot system capable of recognizing finger gestures and mapping the environment when moving. The system uses a Raspberry Pi 4 Model B as the main processor, combined with the ESP32-CAM to acquire hand gesture images. The robot control signal is transmitted via UDP protocol, from the Raspberry Pi to the ESP32-CAM module and the TB6612FNG motor control circuit, which controls the metal gear TT Servo Reducer Motor.

The robot is integrated with YDLidar X4 Pro to collect lidar data, thereby creating an environmental map on the move. The image processing algorithm uses Python, OpenCV, and MediaPipe to recognize hand gestures. The system controls the robot through gestures: forward, backward, left turn, right turn, speed control and stop.

In addition, the system is integrated with ROS2 to support communication between modules and manage the robot control process. Experimental results show that the robot is not only controlled accurately by hand gestures but is also able to map the environment efficiently thanks to the lidar, expanding its applicability to tasks such as surveying, rescue, terrain exploration, and assisting people with disabilities.

                                                                                                                                                    Hanoi, date … / … / 20…

                                                                                                                                                  Student Implementation

                                                                                                                                                                                            (Sign and write full name)

                                                                                                                                                   ................................

CHAPTER 1. TOPIC OVERVIEW

1.1 Overview

In the context of the strong development of automation technology, robots have been playing an important role in many fields such as manufacturing, healthcare, education, and entertainment. In particular, controlling robots through human gestures, especially finger gestures, is becoming a new trend, providing a natural and convenient communication experience. Hand gesture recognition technology has opened up the possibility of controlling robots through visual signals, replacing the traditional control method with a keyboard or mouse.

1.2 Goal

The main goal of the project is to develop a mobile robot control system through finger gesture recognition, with the following specific goals:

1.2.1. Build a finger gesture recognition algorithm:

Use image processing methods with Python, OpenCV, and MediaPipe to recognize and analyze hand gestures such as forward, backward, turn left, turn right, speed control, and stop.

1.2.2. Create a robot control system through gestures:

Develop an algorithm to control the robot to move based on recognized hand gestures.

1.2.4. Building an environmental map system:

Use the YDLidar X4 Pro to collect lidar data and build environmental maps as the robot moves.

1.3 Practical application

Robotic control systems via finger gestures have a wide range of practical applications, including:

1.3.1 Controlling the robot in a hazardous environment:

The system can be applied in jobs that require remote robot control, such as surveying hazardous environments (such as in rescue situations, waste treatment).

1.3.3 Applications in robotics education and research:

The system assists students and PhD students in learning and researching technologies related to mobile robotics, gesture control, and image processing.

1.4 Related studies

In recent years, many studies have focused on developing systems for controlling robots with hand gestures and image processing to recognize these gestures. Some case studies include:

1.4.1 Wu et al.'s Study (2012)

This study applied an adaptive self-organizing map (SASOM) structure to color recognition for hand tracking under unstable conditions. It demonstrates that machine learning methods can improve the accuracy of hand gesture recognition in unstable environments.

1.4.2 Research by Athiya Marium et al. (2017)

Use the webcam in combination with OpenCV to recognize hand gestures and control model robots. The system uses background subtraction and finger detection algorithms to classify basic gestures.

1.4.4 OpenCV and MediaPipe:

These are popular libraries in the field of image processing and gesture recognition, which are applied in many practical studies and projects to develop effective hand gesture recognition systems. The OpenCV library offers powerful tools for image processing, while Google's MediaPipe supports quick and accurate recognition of human hand and body gestures.

CHAPTER 2. THEORETICAL BASIS

2.1 About image processing systems

Image processing is an important science and technology in the field of computers and artificial intelligence. With the rapid development of technology and the need to use photos in practical applications, image processing has become one of the core areas of automated systems. Image processing systems are widely used in many fields such as facial recognition, video analysis, healthcare, transportation, and robot control.

2.1.1 Steps in the image processing system

An image processing system typically includes the following steps:

- Image Acquisition: Images are acquired from devices such as cameras or photo scanners. Images can be color or black-and-white images. The image acquisition process can affect the image quality and recognition in the next steps.

- Preprocessing: This step aims to improve the quality of the image by removing noise, increasing contrast, or adjusting brightness. Common preprocessing techniques include Gaussian filtering, image blurring, and conversion to grayscale.

- Image segmentation: Image segmentation is the process of dividing an image into regions with similar characteristics, making it easy to analyze and identify. These areas can be objects in the photo or areas with drastic changes in brightness or color.

2.1.2 Components of the image processing system

An image processing system typically consists of the following key components:

- Image Sensor: This is the part that acquires images from the surrounding environment. An image sensor can be a camera, scanner, or other device capable of recording images.

- Image Processor: This is the part that processes signals from the image sensor, implementing image processing algorithms such as filtering, segmentation, and recognition. Image processors can be microprocessors, FPGAs, or powerful computer systems.

- Storage: An image processing system needs storage to store the images obtained and the results of the processing. These storage units can be hard drives, SSD disks, or system storage.

2.2 Overview of hand gesture recognition

Hand gesture recognition is a sub-branch of image processing, which aims to analyze and identify gestures made by the human hand. This is an important area in the study of human-machine communication, especially in the development of methods for controlling robots, mobile devices, and other interactive systems without the use of traditional input devices such as mice or keyboards.

2.2.1 Hand gesture recognition concept

Hand gesture recognition is the process of using image processing algorithms to analyze hand movements and gestures, thereby converting this information into signals that can be used to control systems such as mobile robots, smart devices, or application software. Hand gestures typically include actions such as raising a hand, making a fist, waving, or more complex gestures such as swiping, pulling, or tapping.

The process of recognizing hand gestures can be divided into the following basic steps:

- Data Collection: Use cameras or other sensor devices to acquire images of the hand.

- Pre-processing: cleaning and preparing the image data (converting to grayscale, blurring to remove noise, segmenting the regions of interest).

- Gesture recognition: Identify important features in the photo (e.g., number of fingers, shape of the hand, movements).

2.2.3 Challenges in hand gesture recognition

Recognizing hand gestures is not a simple task and faces some major challenges:

- Uneven lighting conditions: hand recognition can be affected by factors such as low light or constantly changing illumination.

- Variation of gestures: Gestures can vary in shape or speed, and sometimes the hands can be obscured, making it difficult to identify accurately.

- Complex backgrounds: When the background of a photo has too many details or colors that are similar to the hand, the segmentation and recognition process is difficult.

2.4 OpenCV

OpenCV (Open Source Computer Vision Library) is a powerful open-source library developed by Intel, for computer vision and image processing applications. With more than 2,500 algorithms and functions related to image and video processing, OpenCV helps build systems for object recognition, motion tracking, image analysis, deep learning, and other AI applications.

2.4.1 Key Features of OpenCV

- Basic image processing: color-space conversion (RGB, HSV, Lab, and grayscale); image resizing; filtering with Gaussian blur, median blur, and other filters to smooth images and remove noise; brightness and contrast adjustment.

- Image analysis: Edge detection using algorithms such as Canny edge detection. Image segmentation using methods such as Thresholding, Watershed algorithm, and Region-based segmentation. Feature extraction using algorithms such as Harris Corner detection, SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features).

- Object recognition: Haar cascade classifiers for detecting objects in images, and integration with machine learning libraries such as TensorFlow and PyTorch to build object recognition systems based on deep learning models.
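
As a small illustration of the basic operations listed above, the following Python sketch converts an image to grayscale, applies a Gaussian blur to suppress noise, and runs Canny edge detection; the file name hand.jpg is only a placeholder.

import cv2

img = cv2.imread("hand.jpg")                    # load a BGR image from disk (placeholder name)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # color conversion to grayscale
blurred = cv2.GaussianBlur(gray, (5, 5), 0)     # Gaussian filter to reduce sensor noise
edges = cv2.Canny(blurred, 50, 150)             # Canny edge detection (low/high thresholds)
cv2.imwrite("hand_edges.jpg", edges)            # save the result for inspection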

2.4.2 OpenCV application in practice

Face Recognition is one of OpenCV's most popular applications. With algorithms like Haar Cascade Classifier, OpenCV can detect and recognize faces in photos and videos. Facial recognition systems can be used in applications such as:

- Security and surveillance: Facial recognition systems help authenticate users, such as unlocking phones or accessing secure areas.

- Automation and human-machine interaction: OpenCV can be used for facial recognition in interactive systems, helping computers recognize users and provide customized or personalized services.

2.5 The Python Language

Python is a popular, powerful, and easy-to-learn programming language that was developed by Guido van Rossum in 1991. Python is known for its simple, easy-to-read, and easy-to-write syntax, which helps programmers focus on solving problems instead of worrying about complex syntax. Python is a high-level programming language that supports many programming paradigms such as object-oriented programming (OOP), procedural programming, and functional programming.

2.5.1 Python Features and Advantages

- Easy-to-read and easy-to-write syntax: The syntax is simple and easy to understand, making it easy for beginners to access and program effectively. Data structures such as lists, dictionaries, and sets are available in Python and are easy to manipulate.

- Rich Libraries: Provides a very rich library ecosystem, including standard libraries and open-source libraries such as NumPy, Pandas, Matplotlib for data analysis, TensorFlow, Keras, PyTorch for machine learning and OpenCV, MediaPipe for image processing and computer vision.

- High compatibility: Runs on various operating systems such as Windows, Linux, macOS, and can be easily integrated with other languages such as C/C++, Java, and Fortran. This makes Python an ideal language for cross-platform applications.

2.5.2 Python in image processing and computer vision

Python is a popular language in the field of image processing and computer vision thanks to powerful and easy-to-use libraries such as OpenCV, Pillow, SciPy, and scikit-image. In addition, Python also supports machine learning libraries such as TensorFlow, Keras, and PyTorch, which help develop deep learning models for object recognition and image analysis.

- OpenCV with Python: OpenCV is the most popular library for image and video processing. Python provides an easy-to-use interface to OpenCV, which makes it possible to perform operations such as reading images, processing them, recognizing objects, and analyzing video.

- MediaPipe with Python: MediaPipe, through its Python interface, allows easy deployment of models that recognize hands, faces, bodies, and other objects in images and videos. Python helps integrate these models into real-world applications with high performance.
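
As a minimal sketch of the MediaPipe Hands interface in Python (a webcam at index 0 is assumed), the code below detects hands in one captured frame and prints the 21 normalized landmark coordinates of each detected hand.

import cv2
import mediapipe as mp

cap = cv2.VideoCapture(0)          # assumed webcam
ok, frame = cap.read()
cap.release()
if ok:
    hands = mp.solutions.hands.Hands(static_image_mode=True,
                                     max_num_hands=2,
                                     min_detection_confidence=0.5)
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))   # MediaPipe expects RGB input
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            print([(lm.x, lm.y) for lm in hand.landmark])             # 21 points, normalized to [0, 1]
    hands.close()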

2.7 VNC (Virtual Network Computing)

VNC (Virtual Network Computing) is a protocol that helps you control a computer remotely over a network, using a graphical interface (GUI). Connecting the Raspberry Pi 4 to VNC allows you to control the Raspberry Pi through a computer screen without having to directly connect a mouse, keyboard, or monitor.

2.7.1 Features and advantages of VNC

- Remote Control: You can access and control the Raspberry Pi from any device with a network connection, including a personal computer or smartphone.

- Graphical Interface (GUI): VNC allows you to use the graphical interface of the Raspberry Pi, making it easy to configure the system, develop software, and test applications like ROS2.

2.7.2 Use VNC in the project

Connecting VNC to the Raspberry Pi 4 allows you to access and use the graphical interface of the Raspberry Pi remotely.

- Develop and test GUI interfaces: You can easily develop and test graphical interfaces for robot control systems, such as displaying data from sensors or robot status information.

- Space and cost savings: The elimination of the need to connect a monitor, mouse, and keyboard to the Raspberry Pi saves space and money during the development process.

- Remote Access and Configuration: With VNC, you can easily configure and test the system remotely without having to move the Raspberry Pi, making it convenient to make changes in the robotic system.

2.9 EasyEDA

EasyEDA is an online PCB (Printed Circuit Board) and electronic circuit design tool that makes it easy for engineers and developers to create circuit diagrams, circuit simulations, and PCB designs without having to install complex software. EasyEDA provides an intuitive, easy-to-use user interface and powerful tools for electronic design, saving time and money during product development.

2.9.1 Features and Advantages of EasyEDA

- Electronic Circuit Design: EasyEDA allows users to design electronic circuit schematics easily and quickly. This tool supports a wide range of electronic components from the available libraries, helping you build electrical circuits without having to manually draw every detail.

- Circuit Simulation: One of the standout features of EasyEDA is its ability to simulate electrical circuits. This allows you to test electrical circuits before fabricating the actual PCB circuit, helping to identify defects in the circuit and optimize the design.

- PCB Circuit Design: EasyEDA supports PCB circuit design with tools such as autoplacement, autorouting, which automatically arranges components and connects them. You can create PCB circuits accurately and quickly, thereby producing circuit boards for practical applications.

2.9.2 Application of EasyEDA in the project

The main applications of EasyEDA in the project include:

- Control circuit design: EasyEDA helps design the connection circuit between lidar sensors (YDLIDAR X4 Pro), cameras (ESP32-CAM) and Raspberry Pi, TB6612FNG, MPU6050.

- Pre-production circuit simulation: EasyEDA allows for pre-production simulation of electrical circuits, helping to detect design errors, optimize circuits, and ensure that components will function properly in the robotic system.

- PCB Circuit Creation: After designing the schematic circuit, EasyEDA allows you to convert the schematic into a PCB design. This helps to create a neat printed circuit (PCB) that is easy to manufacture and integrate into the robot.

2.11 Devices used

2.11.1 Raspberry Pi 4 Model B

The Raspberry Pi 4 Model B is a compact, powerful, and versatile computer, developed by the Raspberry Pi Foundation. With its high performance and diverse connectivity capabilities, the Raspberry Pi 4 Model B serves as the main processor of the robotic system in the project.

Features and advantages of the Raspberry Pi 4 Model B:

- Powerful performance: equipped with a Broadcom BCM2711 processor with four ARM Cortex-A72 cores (64-bit) running at 1.5 GHz, for powerful performance, especially in handling complex tasks such as image and video recognition, making the Raspberry Pi 4 able to handle hand gesture recognition algorithms in real time.

- RAM and memory: available in versions with 2 GB, 4 GB, or 8 GB of LPDDR4-3200 memory; the 4 GB version was chosen to suit the requirements of this project.

- Network connectivity and interface: supports Gigabit Ethernet, Wi-Fi 802.11ac, and Bluetooth 5.0, allowing quick connections to other devices in the robotic system. USB 3.0 and USB 2.0 ports make it possible to connect peripheral devices such as keyboards, mice, and sensors.

Application of Raspberry Pi 4 Model B in the project:

- Image Processing and Hand Gesture Recognition: The Raspberry Pi 4 Model B will combine with the ESP32-CAM to capture hand gesture images and use libraries such as OpenCV and MediaPipe for image processing and gesture recognition. Image processing requires high performance to be able to perform recognition operations in real time.

- Control the robot and communicate with sensors: The Raspberry Pi 4 will control the robot to move based on recognized hand gestures, through transmitting signals to the motor control circuits via GPIO pins. The Raspberry Pi will receive signals from sensors such as the YDLidar X4 Pro to map the environment and adjust the robot's behavior.

2.11.3 Module TB6612FNG

TB6612FNG is a DC motor driver module manufactured by Toshiba that helps control motors in mobile robotics and automation applications. The module is designed to provide smooth and efficient motor control while protecting the system from faults such as overload or overheating. In this project, the TB6612FNG is used to control the robot's DC motors based on signals received from the Raspberry Pi and ESP32-CAM.

Features and advantages of TB6612FNG:

- DC motor control: support DC motor control and stepper motors, helping the robot move accurately according to hand gestures.

- Voltage and current: can drive motors with supply voltages from 2.5 V to 13.5 V, suitable for a variety of DC motors, and supports continuous currents up to 1.2 A per channel (3.2 A peak), which makes it well suited to small and medium-power motors.

- High efficiency and low temperature: the motor can be controlled without excessive temperature rise, which helps protect the components in the robot system from damage caused by overheating.

2.11.4 YDLIDAR X4 Pro

YDLIDAR X4 Pro is a 2D lidar sensor, developed by Yangdong Lidar (YDLIDAR), dedicated to environmental scanning and distance measurement applications in robotic and automation systems. This sensor uses laser technology to scan the environment and create a map of the environment around the robot. The YDLIDAR X4 Pro has high accuracy and long scanning range, making it ideal for mobile robots that need to build maps and move safely in 2D space environments.

Features and advantages of YDLIDAR X4 Pro:

- Scanning range and measurement range: it can scan the environment with a scanning range of up to 10 meters, with a 360-degree scanning angle. This allows the robot to collect environmental data in a wide and accurate space, helping the robot to identify obstacles and move safely in complex environments.

- High Accuracy: With its high resolution and fast scanning capabilities, the YDLIDAR X4 Pro provides accurate data on objects in the environment. The accuracy of the measurements can reach ± 2 cm in the scanning range, helping to create detailed and accurate environmental maps.

- Advanced laser technology: uses high-frequency laser technology to scan and measure distances. The laser allows the sensor to measure over long distances and makes it less sensitive to ambient interference than ultrasonic or infrared sensors.

2.12 Chapter 2 Review: Theoretical Basis

Chapter 2 presents the technologies and software platforms that support the development of mobile robot systems controlled through finger gestures. Key technologies include:

- Image Processing System: Uses OpenCV and MediaPipe to process and recognize hand gestures from images obtained from the ESP32-CAM.

- Hand gesture recognition: uses deep learning models (CNNs) to detect the joint landmarks of the hand, enabling recognition of gestures such as raising a hand or making a fist.

- ROS2: Provides a distributed communication and real-time computing environment to control robot modules such as lidar sensors and motors.

- Raspberry Pi 4 and ESP32-CAM: The Raspberry Pi 4 processes data and controls robots, while the ESP32-CAM collects images of hand gestures and transmits them to the Raspberry Pi.

- YDLIDAR X4 Pro: Helps build environmental maps and detect obstacles, allowing the robot to move safely in space.

CHAPTER 3. SYSTEM ANALYSIS AND DESIGN

3.1 System Block Diagram Design

The mobile robot system controlled by finger gestures is built with a structure consisting of two main parts: the central processing control part (located outside the robot) and the robot body (located in the moving frame).

Central Control Section:

- CAM-COMPUTER: captures hand images in real time.

- COMPUTER: used to monitor, program, and communicate remotely with the Raspberry Pi over the network (for example via VNC or Remote Desktop). It displays the graphical interface, activity logs, the map built from LiDAR data, and the robot control status, and receives the video stream from the ESP32-CAM.

Robot body:

- Raspberry Pi 4 Model B: The central processor, which receives image data from the ESP32-CAM and scan data from the YDLidar sensor, processes gesture recognition and maps, and then sends control signals to the motor.

- ESP32-CAM: a camera that captures images while the robot is moving, used for direct robot control.

3.2 System Connection Diagram

The actual connection diagram of the system shows the power supply to the Raspberry Pi 4 Model B and the ESP32-CAM, an intermediate PCB that connects the Raspberry Pi to the two TB6612FNG modules, and the motor battery that powers the motors and both TB6612FNG modules.

The PCB circuit is specifically designed to connect the GPIO pins of the Raspberry Pi to the control pins of TB6612FNG such as IN1, IN2, PWM A/B, STBY.

3.3 Communication Protocol

The system uses a variety of communication protocols to connect between hardware, ensuring that the data transmission between sensors, processors, and control devices is stable, synchronous, and real-time.

3.3.1 Communication between ESP32-CAM and Laptop

Protocol: HTTP

Description:

- The ESP32-CAM performs a live video stream over Wi-Fi to a local IP address using the HTTP protocol.

- The laptop uses the OpenCV library to receive video streams from the ESP32-CAM, process frames, and recognize hand gestures using MediaPipe.

Properties: Wireless, real-time communication, low latency.
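
As a sketch of the laptop side (the IP address and the :81/stream path are assumptions that depend on the ESP32-CAM firmware and the local network), OpenCV can read the HTTP video stream like a normal camera:

import cv2

STREAM_URL = "http://192.168.1.50:81/stream"   # hypothetical ESP32-CAM stream address

cap = cv2.VideoCapture(STREAM_URL)
while cap.isOpened():
    ok, frame = cap.read()                     # one decoded frame from the MJPEG stream
    if not ok:
        break
    cv2.imshow("ESP32-CAM", frame)             # display; gesture processing would go here
    if cv2.waitKey(1) & 0xFF == ord('q'):      # press q to stop
        break
cap.release()
cv2.destroyAllWindows()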

3.3.3 Internal communication in the Raspberry Pi

The hardware communication in the Raspberry Pi uses the following protocols:

- Flask Server → TB6612FNG Motor Driver Protocol: GPIO digital output + PWM

Description: Raspberry Pi controls TB6612FNG through GPIO pins, which sends PWM signals to adjust speed, IN1/IN2 to adjust motor direction.

- Flask Server ↔ MPU6050 Protocol: I2C (SDA/SCL)

Description: The Raspberry Pi reads data from the MPU6050 sensor via I2C communication. The data includes acceleration, rotation rate, and orientation, which are used to calibrate the robot's movement (a minimal read sketch follows this list).

- Flask Server ↔ LiDAR 2D Protocol: USB-to-Serial

Description: The Raspberry Pi receives data from LiDAR (e.g. YDLidar X4) for use in SLAM and Mapping algorithms in the real environment. Information from the LiDAR is processed through ROS2 nodes to build maps or avoid obstacles.
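
As referenced in the MPU6050 item above, a minimal sketch of the I2C read with the smbus2 package is shown below; bus number 1 and the default address 0x68 are assumptions, and the register addresses follow the MPU6050 datasheet.

from smbus2 import SMBus

MPU_ADDR = 0x68          # default I2C address of the MPU6050
PWR_MGMT_1 = 0x6B        # power management register
ACCEL_XOUT_H = 0x3B      # high byte of the X-axis acceleration
GYRO_ZOUT_H = 0x47       # high byte of the Z-axis angular rate

def read_word(bus, reg):
    # combine high and low bytes and convert from two's complement
    value = (bus.read_byte_data(MPU_ADDR, reg) << 8) | bus.read_byte_data(MPU_ADDR, reg + 1)
    return value - 65536 if value > 32767 else value

with SMBus(1) as bus:
    bus.write_byte_data(MPU_ADDR, PWR_MGMT_1, 0)          # wake the sensor from sleep mode
    ax = read_word(bus, ACCEL_XOUT_H) / 16384.0           # acceleration in g (+/- 2 g range)
    yaw_rate = read_word(bus, GYRO_ZOUT_H) / 131.0        # angular rate in deg/s (+/- 250 deg/s range)
    print("accel x:", ax, "g, yaw rate:", yaw_rate, "deg/s")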

CHAPTER 4. IMPLEMENTATION FACILITIES

4.1 Finger Gesture Recognition Program

The program consists of the following main components:

- User Interface: Created with Tkinter

- Image Processing: Using OpenCV and Mediapipe for Hand Recognition

- Network Connection: Communicates with Raspberry Pi 4 via HTTP

- Processing Flows: Use multiple threads for parallel processing

Initialize the app:

- Create a main window with a size of 1100x750 px

- Initialize Mediapipe Hands for Hand Recognition

- Open 2 cameras: webcam (for gesture recognition) and OBS Virtual Camera (for displaying maps)

- Create separate threads to handle the streams from the ESP32-CAM and OBS

Check the connection:

- If successful: Continue running the program

- If it fails: Display an error message and exit the app

- Loop continuously every 10ms to update the interface

Read images from webcam: get the latest frame from the webcam.

Hand Detection Test:

- Use MediaPipe to detect if there is a hand in the image

- If applicable: Continue processing

- Otherwise: Skip

Feature Extraction: Locate the knuckle points (21 points per hand)

Left/Right Hand Differentiation:

- MediaPipe provides information whether this is left or right hand

- Each hand has its own function: Right hand: Control the direction of movement, Left hand: Adjust the speed

Close the app:

When the user closes the window:

- Release the camera

- Stop the processing threads

- Close network connection

- Clean program exit

Gesture Test:

The system checks each gesture of the right hand based on the raised fingers:

- Index finger raised only:

If yes, send a 'forward' command. If No, move on to the next check.

- The middle finger is raised only:

If yes, send a 'backward' command. If No, move on to the next check.

- Ring finger raised only:

If yes, send a 'left' command.

If No, move on to the next check.
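
A minimal sketch of this per-finger check is shown below; it assumes the 21 landmarks come from MediaPipe Hands (index 8 is the index fingertip, 6 the joint below it, and so on), and the simple tip-above-joint test assumes the hand is held upright in the image.

# landmark pairs: fingertip index and the joint below it, per MediaPipe Hands numbering
FINGERS = {"index": (8, 6), "middle": (12, 10), "ring": (16, 14), "pinky": (20, 18)}

def raised_fingers(hand_landmarks):
    # hand_landmarks is one entry of results.multi_hand_landmarks from MediaPipe
    lm = hand_landmarks.landmark
    # image y grows downward, so a raised fingertip has a smaller y than its joint
    return {name: lm[tip].y < lm[pip].y for name, (tip, pip) in FINGERS.items()}

def right_hand_command(hand_landmarks):
    up = raised_fingers(hand_landmarks)
    if up["index"] and not (up["middle"] or up["ring"] or up["pinky"]):
        return "forward"
    if up["middle"] and not (up["index"] or up["ring"] or up["pinky"]):
        return "backward"
    if up["ring"] and not (up["index"] or up["middle"] or up["pinky"]):
        return "left"
    if up["pinky"] and not (up["index"] or up["middle"] or up["ring"]):
        return "right"
    return "stop"   # no single-finger pattern recognized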

4.2 Robot Control Program

System Initialization:

- Configure GPIO (e.g., Raspberry Pi) and PWM (pulse width modulation) pins to control the motor

- Create a Flask application to handle the API.

- Declare the storage variables for speed and direction

Speed smoothing thread:

- If yes → this thread continuously adjusts the robot's actual speed toward the target speed and updates the PWM.

- For example, the robot accelerates/decelerates gradually instead of changing speed suddenly.

- If no → skip this thread

- Handle:

- Write the new speed value to the buffer.

- The speed smoother thread will use this value to adjust the speed gradually.

Status:

- Function: returns the current information.

- Response: direction.
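
A minimal sketch pulling these steps together is shown below: a Flask endpoint stores the requested direction and target speed, and a background thread smooths the PWM duty cycle toward the target. The GPIO pin numbers, endpoint names, and speed values are hypothetical placeholders rather than the project's actual assignment, and only one motor channel is shown.

import threading, time
from flask import Flask, request, jsonify
import RPi.GPIO as GPIO

PWMA, AIN1, AIN2 = 18, 23, 24          # hypothetical pins for one TB6612FNG channel
GPIO.setmode(GPIO.BCM)
GPIO.setup([PWMA, AIN1, AIN2], GPIO.OUT)
pwm = GPIO.PWM(PWMA, 1000)             # 1 kHz PWM on the speed pin
pwm.start(0)

app = Flask(__name__)
state = {"target": 0.0, "actual": 0.0, "direction": "stop"}

def speed_smoother():
    # move the actual duty cycle toward the target in small steps so the robot
    # accelerates and decelerates gradually instead of jumping
    while True:
        step = 2.0
        if state["actual"] < state["target"]:
            state["actual"] = min(state["actual"] + step, state["target"])
        elif state["actual"] > state["target"]:
            state["actual"] = max(state["actual"] - step, state["target"])
        pwm.ChangeDutyCycle(state["actual"])
        time.sleep(0.05)

@app.route("/command", methods=["POST"])      # hypothetical endpoint name
def command():
    cmd = request.json.get("cmd", "stop")
    state["direction"] = cmd
    GPIO.output(AIN1, cmd == "forward")       # IN1/IN2 select the motor direction
    GPIO.output(AIN2, cmd == "backward")
    state["target"] = 0.0 if cmd == "stop" else 60.0
    return jsonify(state)

@app.route("/status")                         # returns the current direction and speed
def status():
    return jsonify(state)

threading.Thread(target=speed_smoother, daemon=True).start()
app.run(host="0.0.0.0", port=5000)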

Export data:

- Read and log the filtered data

- Encapsulate the data into a ROS2 message

- Publish data to topic /imu/data
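
The publishing step above can be sketched with a small rclpy node; the node name, frame_id, and the get_filtered_reading() helper are hypothetical stand-ins for the project's own MPU6050 reading and filtering code.

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Imu

def get_filtered_reading():
    # placeholder for the project's filtered MPU6050 reading (ax, ay, az, gx, gy, gz)
    return 0.0, 0.0, 9.81, 0.0, 0.0, 0.0

class ImuPublisher(Node):
    def __init__(self):
        super().__init__('imu_publisher')
        self.pub = self.create_publisher(Imu, '/imu/data', 10)
        self.timer = self.create_timer(0.02, self.tick)            # 50 Hz, matching the EKF rate below

    def tick(self):
        ax, ay, az, gx, gy, gz = get_filtered_reading()
        msg = Imu()
        msg.header.stamp = self.get_clock().now().to_msg()
        msg.header.frame_id = 'imu_link'                           # assumed frame name
        msg.linear_acceleration.x, msg.linear_acceleration.y, msg.linear_acceleration.z = ax, ay, az
        msg.angular_velocity.x, msg.angular_velocity.y, msg.angular_velocity.z = gx, gy, gz
        self.pub.publish(msg)                                      # publish to the /imu/data topic

def main():
    rclpy.init()
    rclpy.spin(ImuPublisher())

if __name__ == '__main__':
    main()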

4.4 Configure slam_toolbox

slam_toolbox:
  ros__parameters:
    mode: mapping
    odom_topic: /odom/filtered
    use_sim_time: false
    map_frame: map
    odom_frame: odom
    base_frame: base_link
    scan_topic: /scan_throttled
    max_laser_range: 10.0
    minimum_range: 0.1
    angle_increment: 0.0075
    do_compute_covariance: true
    transform_tolerance: 0.2
    correlation_search_space_scale: 0.6
    correlation_search_space_smear_deviation: 0.05

- mode: mapping: Active mode (mapping - creating a map)

- use_sim_time: false: Does not use simulated time (uses real time)

- queue_size: 50: Data processing queue size
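
For reference, parameters like these are normally saved in a YAML file and passed to slam_toolbox's online asynchronous node when launching; the file path here is only a placeholder:

ros2 launch slam_toolbox online_async_launch.py slam_params_file:=/path/to/mapper_params.yaml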

4.6 EKF configuration combining IMU and ODOM

ekf_node:
  ros__parameters:
    frequency: 50.0
    sensor_timeout: 0.1
    two_d_mode: true
    transform_time_offset: 0.0
    transform_timeout: 0.0
    print_diagnostics: true
    debug: false
    map_frame: map
    odom_frame: odom
    base_link_frame: base_link
    world_frame: odom

    # Odometry from wheel encoders
    odom0: /odom

    # IMU data
    imu0: /imu/data
    imu0_config: [false, false, false,   # x, y, z
                  false, false, true,    # roll, pitch, yaw
                  false, false, false,   # vx, vy, vz
                  false, false, true]    # vroll, vpitch, vyaw
    imu0_differential: false
    imu0_queue_size: 10
    imu0_remove_gravitational_acceleration: true
    imu0_nodelay: false
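
These parameters are typically loaded into the ekf_node of the robot_localization package (assumed here, since ekf_node is that package's standard EKF node); a typical invocation with a placeholder file path:

ros2 run robot_localization ekf_node --ros-args --params-file /path/to/ekf.yaml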

4.7 User Interface

The user interface consists of 4 parts:

- Camera Display:

Webcam: Display images from the camera with the landmark points of the hand

Robot View: displays the stream from the robot's camera (ESP32-CAM)

Map: shows the map from the OBS Virtual Camera

- Instructions:

Explaining hand gestures to control:

Right hand: Index finger (forward), middle finger (backward), ring finger (left), little finger (right)
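
A minimal sketch of the display update loop described in sections 4.1 and 4.7 is shown below; Pillow is assumed for converting OpenCV frames into Tkinter images, and only the webcam panel is shown.

import cv2
import tkinter as tk
from PIL import Image, ImageTk

root = tk.Tk()
root.geometry("1100x750")            # main window size from section 4.1
label = tk.Label(root)
label.pack()
cap = cv2.VideoCapture(0)            # webcam used for gesture recognition

def update():
    ok, frame = cap.read()
    if ok:
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        photo = ImageTk.PhotoImage(Image.fromarray(rgb))
        label.configure(image=photo)
        label.image = photo          # keep a reference so the image is not garbage-collected
    root.after(10, update)           # schedule the next refresh in 10 ms

root.after(10, update)
root.protocol("WM_DELETE_WINDOW", lambda: (cap.release(), root.destroy()))
root.mainloop()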

4.9 Testing

CHAPTER 5. RESULTS AND REVIEWS

5.1 Results Achieved

- Function: recognizes hand gestures to control the robot's movement: go straight, turn left, turn right, move backward, stop, move left, move right, and adjust the speed

- Installed the operating system and the OpenCV library on the Raspberry Pi

- Studied Python programming theory and OpenCV

- Learned the theory of image processing

- Learned and used the slam_toolbox configuration

- Built a hand gesture recognition algorithm that works successfully

- Built a mobile robot control algorithm

5.2 Assessment

5.2.1 Hand Cover Assessment

In practice, the hand gesture recognition system still achieves high accuracy, comparable to unoccluded gestures, as long as at least 60% of the hand is visible in the camera image.

5.2.2 Hand Gesture Recognition Accuracy Assessment

The accuracy of the system in recognizing hand gestures under favorable lighting conditions and at a suitable distance also depends on the recognition quality of the MediaPipe library; below is a table of accuracy recorded over 100 experiments.

5.2.4 Hand Gesture Recognition Latency Assessment

The average latency measured over 82 processed gestures is 1.9 ms.

5.3 Limitations

- Dependence on the network connection: an unstable or interrupted connection will cause the robot to stop or not work properly

- The accuracy of the movement commands is not sufficient for control in narrow spaces or for tasks that require precise movements

- Performance limitations: the high CPU/GPU requirements make the system unsuitable for low-end machines

5.4 The development direction of the project

- Incorporate AI to analyze maps and make decisions autonomously, such as avoiding obstacles

- Build 3D maps by upgrading from a 2D lidar to a 3D lidar

- Integrate a humanoid robot or a robotic arm to perform other operations such as picking up objects and pressing buttons

CONCLUSION

In the process of researching and implementing the project "Finger Gesture-Based Control Algorithm for Mobile Robots", I have achieved some initial results as follows:

1. Mastered the operating principle of a mobile robot system controlled by hand gestures and integrated a lidar sensor to build a map of the surrounding environment.

2. Applied MediaPipe to recognize and process hand gestures through the camera, thereby transmitting control signals to the robot smoothly and accurately.

3. Designed and assembled the robot's mechanical parts using DC motors, mecanum wheels, and an ESP32-CAM controller connected to a Raspberry Pi.

4. Built the system connection diagrams and programmed the robot controls in Python, integrating ROS2 to process data from the lidar sensor (YDLIDAR X4 Pro) and map the actual environment.

However, the above results are only the first steps in realizing the robot control system by gestures. There are still many points that need to be improved and optimized to make the system operate more stably and efficiently. With the spirit of progress, I will continue to research and improve the topic to develop a better practical application product in the future.

Development orientation:

This is a highly applicable topic and brings a lot of useful knowledge in the field of robotics, artificial intelligence and automation. I plan to continue to develop this project into a graduation project, which focuses on the following contents:

1. Improve the real-time mapping system (SLAM) and improve the accuracy of gesture recognition with machine learning techniques.

2. The intuitive control interface design makes it easy for users to interact with the robot through gestures or assistive applications.

3. Optimize the connection between hardware modules (Raspberry Pi, ESP32, Lidar, motor...) to improve stability, reduce latency in response and control.

Finally, I would like to express my deep gratitude to the teachers in the department, especially Assoc. Prof. Dr. ………….….. for wholeheartedly guiding, supporting, and inspiring me to complete this project.

Thank you very much!


"TẢI VỀ ĐỂ XEM ĐẦY ĐỦ ĐỒ ÁN"