
Strawberry Robotic Hand Recognition Algorithm Based on Color Prior Knowledge and Deep Learning

Early strawberry-picking machinery generally harvested the fruit together with the stems and leaves, after which the fruit had to be separated from the stems and leaves by hand.

In 1996, Japan was the first to develop a robotic hand that uses machine vision to recognize fruit for picking. The general process of picking strawberries with a machine-vision-based robotic hand is to first find ripe strawberries, separate them from the plant without damaging the fruit, and then place them in a tray. Automating this process requires the following steps: (1) judge whether a strawberry can be picked according to its ripeness; (2) determine the position of the strawberry to be picked; (3) cut the strawberry's stem and catch the fruit; (4) place the strawberry on a tray. The first two of these four steps constitute the positioning stage, which plays a decisive role in the design of the whole robot. Designing an accurate positioning algorithm is therefore the key to the design of the strawberry robotic hand, and most mainstream positioning algorithms are based on machine vision technology.

Since the robotic hand must be able to judge the ripeness of strawberries, traditional machine-vision localization algorithms identify strawberries by their color. However, judging by color alone can mistake other fruits of similar color, such as cherry tomatoes, for the target. Recognition algorithms based on texture features and a support vector machine (SVM) classifier have therefore been proposed, but SVM classification requires feature vectors designed in advance, and such hand-designed features have difficulty describing the appearance and texture of strawberries accurately, so the misrecognition rate is high.


This paper develops a practical strawberry-picking robot and proposes a recognition algorithm that combines color prior knowledge with deep learning to identify strawberries accurately. Since ripeness can be pre-classified by color, the image captured by the robotic hand is first converted to HSV space and segmented on the H channel, which yields multiple candidate target regions. To obtain the most accurate positioning target, the candidate regions are fed into a pre-trained deep convolutional neural network (CNN), and the candidate with the highest recognition probability is output as the final positioning result. When training the CNN, the positive samples include ripe strawberries of various shapes; to prevent the robotic hand from picking incomplete candidates, a large number of partial-strawberry images are included in the negative samples. Experiments confirm that the algorithm based on color prior knowledge obtains accurate localization results.


1. Structural design of the strawberry robotic hand

The robot body consists mainly of three robotic arms and a cable-carrying hose, as shown in Figure 1. Two of the arms lie in the same plane and are responsible for linear telescopic motion in the horizontal plane, while the third arm scans and moves in the vertical plane. To simplify control and recognition, the three-arm system converts cylindrical coordinates into Cartesian coordinates for accurate positioning. The gripping part is a robotic hand that grips the stalk of the strawberry; during picking the hand reaches out and plucks the fruit. The manipulator is driven by servo control. The walking system is a four-wheeled trolley with a shock-absorbing system to improve the robot's stability and off-road performance. The intelligent recognition algorithm of the robotic hand is the key topic of this research: on the basis of pre-segmentation using color prior knowledge, deep learning assists the strawberry robotic hand in locating the target, identifying strawberries effectively and accurately and ensuring that the hand picks correctly.

2. Strawberry candidate target segmentation based on color prior knowledge

In this paper, the strawberry image is first segmented based on the HSV color model to obtain candidate strawberry regions. The image acquired by the robotic hand is in RGB color space, so it is first converted to HSV space using formula (1).
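Formula (1) itself is not reproduced here; a minimal OpenCV (C++) sketch of the same RGB-to-HSV conversion might look like the following. The COLOR_BGR2HSV_FULL flag, which maps hue to [0, 255], is an assumption made so that the hue thresholds quoted below remain representable; the paper does not state which hue scale it uses.

#include <opencv2/opencv.hpp>
#include <vector>

// Convert a captured BGR frame to HSV and extract the H channel (formula (1) equivalent).
cv::Mat extract_hue(const cv::Mat& bgr) {
    cv::Mat hsv;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV_FULL);   // H in [0, 255] (assumed scale)
    std::vector<cv::Mat> channels;
    cv::split(hsv, channels);
    return channels[0];                                // H channel only
}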

The strawberry robotic hand determines the ripeness of strawberries from their color. According to statistics [2], the hue value h of ripe strawberries lies in the ranges [0, 5] and [150, 220], so the image is thresholded according to formula (2) to obtain a binarized image.
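A hedged sketch of the formula (2) thresholding, continuing the OpenCV code above: the two hue bands [0, 5] and [150, 220] are taken literally from the paper, and whether they apply directly depends on the hue scale assumed in the conversion step.

#include <opencv2/opencv.hpp>

// Keep pixels whose hue falls in either ripe-strawberry band; the result is the binarized image.
cv::Mat threshold_hue(const cv::Mat& hue) {
    cv::Mat low, high, mask;
    cv::inRange(hue, 0, 5, low);          // first ripe-hue band
    cv::inRange(hue, 150, 220, high);     // second ripe-hue band
    cv::bitwise_or(low, high, mask);      // 255 = candidate strawberry pixel
    return mask;
}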

After binarization based on the h channel, adjacent candidate regions exhibit some adhesion; morphological processing [5] is then applied to obtain the final candidate regions, as shown in Figure 2.
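The paper does not specify which morphological operators or kernel it uses; the sketch below assumes an opening followed by a closing with a small elliptical kernel, a common choice for removing speckle noise and merging fragments of the same fruit.

#include <opencv2/opencv.hpp>

// Morphological cleanup of the binarized mask (operator choice and kernel size are assumptions).
cv::Mat clean_mask(const cv::Mat& mask) {
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(7, 7));
    cv::Mat cleaned;
    cv::morphologyEx(mask, cleaned, cv::MORPH_OPEN, kernel);     // remove small speckles
    cv::morphologyEx(cleaned, cleaned, cv::MORPH_CLOSE, kernel); // merge fragments of one fruit
    return cleaned;
}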

Figure 2(e) is obtained by taking the circumscribed rectangle of each binarized region in Figure 2(d). As Figure 2(e) shows, thanks to the morphological filtering, a complete strawberry candidate region is obtained even when the strawberry is partially occluded by stems and leaves.
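A sketch of extracting the candidate regions of Figure 2(e): each connected region in the cleaned mask is given its circumscribed (bounding) rectangle, and the corresponding patch is cropped from the original image as a candidate target. The minimum-area filter and the 32 × 32 resize (which anticipates the network input size of section 3) are assumptions for illustration.

#include <opencv2/opencv.hpp>
#include <vector>

// Crop one candidate patch per connected region, resized to the assumed 32x32 network input.
std::vector<cv::Mat> candidate_patches(const cv::Mat& bgr, const cv::Mat& cleaned) {
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(cleaned, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    std::vector<cv::Mat> patches;
    for (const auto& c : contours) {
        cv::Rect box = cv::boundingRect(c);       // circumscribed rectangle of the region
        if (box.area() < 400) continue;           // assumed threshold to discard noise blobs
        cv::Mat patch;
        cv::resize(bgr(box), patch, cv::Size(32, 32));
        patches.push_back(patch);
    }
    return patches;
}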

3. Target recognition and positioning based on deep convolutional network

After the target candidate regions are obtained from color prior knowledge, the strawberry robotic hand needs to select the most suitable region among the candidates as the picking object. The traditional approach computes texture features of the strawberry, such as hand-designed feature vectors based on co-occurrence matrices, and then trains a support vector machine (SVM) as the classifier [3]. However, hand-designed features have difficulty covering the many forms of strawberries, so the classification performance of the SVM is poor.

With the development of deep learning, deep convolutional networks have shown great advantages in image classification [6]: there is no need to define features manually, image features are extracted automatically during learning, and as the number of training samples increases, the convolutional network can learn increasingly precise target features.

The strawberry targets in the images captured by the robotic hand are small, so the convolutional network is first trained on the Cifar10 data set, and the network in this paper is then initialized with the resulting parameters. To match the Cifar10 network's input, each candidate target image is scaled to a standard size of 32 × 32 before classification. The specific steps are as follows (a code sketch follows the list):


(1) Pre-train the convolutional network on the Cifar10 dataset;


(2) Initialize the convolutional network in this paper with the network parameters obtained in step (1). To reduce the number of training parameters, and because the candidate targets are small, the network used in this paper has one fewer fully connected layer than the network in step (1);


(3) Fine-tune the convolutional network initialized in step (2) on the strawberry samples;


(4) Input the target candidate regions obtained by segmentation based on color prior knowledge into the convolutional network obtained in step (3) for recognition;


(5) Output the candidate region with the highest recognition probability in step (4) as the final positioning result.
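The paper does not give its network architecture or training code; the following sketch illustrates steps (2) to (5) with the PyTorch C++ API (libtorch), where the layer sizes, the two-class output, and the candidate-selection helper are assumptions made purely for illustration, not the paper's actual network.

#include <torch/torch.h>

// Assumed small CNN for 32x32 candidate patches (strawberry vs. not-strawberry).
struct StrawberryNet : torch::nn::Module {
    StrawberryNet() {
        conv1 = register_module("conv1",
            torch::nn::Conv2d(torch::nn::Conv2dOptions(3, 32, 5).padding(2)));
        conv2 = register_module("conv2",
            torch::nn::Conv2d(torch::nn::Conv2dOptions(32, 64, 5).padding(2)));
        fc = register_module("fc", torch::nn::Linear(64 * 8 * 8, 2));
    }
    torch::Tensor forward(torch::Tensor x) {                 // x: [N, 3, 32, 32]
        x = torch::max_pool2d(torch::relu(conv1->forward(x)), 2);
        x = torch::max_pool2d(torch::relu(conv2->forward(x)), 2);
        return fc->forward(x.view({x.size(0), -1}));
    }
    torch::nn::Conv2d conv1{nullptr}, conv2{nullptr};
    torch::nn::Linear fc{nullptr};
};

// Step (3): fine-tune on the strawberry samples. The paper trains in batches of
// 10 positive + 10 negative samples; this sketch uses a single batch for brevity.
void fine_tune(StrawberryNet& net, const torch::Tensor& images, const torch::Tensor& labels) {
    torch::optim::SGD opt(net.parameters(), torch::optim::SGDOptions(1e-4)); // paper's fine-tune rate
    for (int iter = 0; iter < 400; ++iter) {                                 // ~400 iterations (Fig. 6)
        opt.zero_grad();
        auto loss = torch::nll_loss(torch::log_softmax(net.forward(images), 1), labels);
        loss.backward();
        opt.step();
    }
}

// Steps (4)-(5): score every candidate patch and return the index of the one with the
// highest "ripe strawberry" probability as the final positioning result.
int pick_best_candidate(StrawberryNet& net, const torch::Tensor& patches /* [N, 3, 32, 32] */) {
    torch::NoGradGuard no_grad;
    auto probs = torch::softmax(net.forward(patches), /*dim=*/1);
    return static_cast<int>(probs.select(1, 1).argmax().item<int64_t>());
}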


The training set and the validation set each contain 600 samples, with 300 positive and 300 negative samples in each. To avoid the over-fitting caused by the small number of samples, the network is first trained on the Cifar10 data set, the resulting parameters are used to initialize the training model in Figure 3(a), and the samples collected in this paper are then used for fine-tuning. Because the sample set is small, each iteration of the network in Figure 3 is trained on 10 randomly chosen positive and 10 randomly chosen negative samples, so 30 iterations cover the entire data set exactly once.


Since the strawberries picked by the robotic hand must meet a preset standard, the positive strawberry samples used for training are all images that meet this standard, while the negative samples include a large number of images that do not, such as incomplete strawberries, plants resembling strawberries, and strawberry stems and leaves. Some samples are shown in Figure 4.


4. Simulation results and analysis

To verify the proposed algorithm, the network is first trained on the Cifar10 data set for 10 000 iterations [8]. Because of the large amount of data, a learning rate of 0.001 is used to speed up training. The model is then fine-tuned with the training and validation samples of this paper; because the sample set is small, the learning rate is set to 0.0001. The training accuracy and loss curves of the Cifar10 model are shown in Figure 5, and the accuracy and loss curves after fine-tuning with the data in this paper are shown in Figure 6.

As Figure 5 shows, after 100 000 training iterations on Cifar10 the accuracy reaches 0.73, which is not sufficient for the strawberry robotic hand. The results of fine-tuning the network model are shown in Figure 6: after 400 iterations, the accuracy on the validation samples of this paper approaches 1.0 and the loss function approaches 0. The fine-tuned model used in this paper therefore fully achieves the precision required by the strawberry robot.

Figure 7 shows the output for two images captured by the strawberry robotic hand after color segmentation and classification by the convolutional network of Figure 3(b). As the figure shows, the recognition results largely agree with the subjective definition of a ripe strawberry, and the score of the candidate that best meets the subjective standard can even reach 1.0000. Even when several candidate regions satisfy the human-eye standard, the proposed algorithm still gives an objective score ranking, and for the adhered (touching) targets in the figure it produces separate, objective recognition results.

The algorithm in this paper is implemented in C++. The running platform is configured as follows: an Intel i5 CPU, 4 GB of memory, and an NVIDIA Quadro K620 GPU with 2 GB of video memory. Color segmentation of a 1280 × 960 image takes 80 ms, and recognition of the candidate regions takes 320 ms, so the algorithm fully meets the real-time requirements of the strawberry robot.

5. Conclusion

This paper designs a strawberry robotic hand recognition algorithm based on color prior knowledge and deep learning. Experiments verify that the algorithm meets the accuracy and real-time requirements of the robotic hand. However, because the number of samples is limited, some special cases are not covered, which occasionally causes positioning failures; this can be addressed in future work by appropriately increasing the sample size.