
Visual Localization and Detection Method for Apple Picking Based on Improved Faster R-CNN (OA; PKU Core; CSTPCD)

Vision Detection Method for Picking Robots Based on Improved Faster R-CNN

Abstract

To address the limited ability of picking robots to detect and localize fruit in scenes with densely distributed targets and mutually occluding apples, an improved Faster R-CNN detection and localization method was proposed that introduces an efficient channel attention (ECA) mechanism and a multiscale feature pyramid network (FPN). First, the original VGG16 backbone was replaced with a ResNet50 network fused with an FPN; the stronger representational capacity of this backbone eliminates the network degradation problem and extracts more abstract, richer semantic information, improving detection of multiscale and small targets. Second, the ECA attention module was introduced so that the feature extraction network focuses on the locally informative parts of the feature map, reducing interference from invalid targets and improving detection accuracy. Finally, a branch-and-leaf grafting data augmentation method was used to expand the apple dataset and alleviate the shortage of image data. Based on the constructed dataset, a genetic algorithm was used to optimize K-means++ clustering and generate adaptive anchor boxes, improving localization accuracy.
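
The ECA block referred to above is a lightweight channel-attention mechanism. Below is a minimal PyTorch sketch of the standard ECA formulation (global average pooling, a 1-D convolution whose kernel size adapts to the channel count, and a sigmoid gate); it illustrates the technique rather than reproducing the authors' implementation, and the class name, the gamma/b defaults, and the 256-channel usage example are assumptions.

```python
# Minimal sketch of an Efficient Channel Attention (ECA) block in PyTorch,
# following the commonly published ECA-Net formulation; not the paper's code.
import math
import torch
import torch.nn as nn


class ECA(nn.Module):
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Adaptive kernel size: k = |log2(C)/gamma + b/gamma|, rounded up to odd
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 == 1 else t + 1
        self.avg_pool = nn.AdaptiveAvgPool2d(1)  # (B, C, H, W) -> (B, C, 1, 1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel descriptor via global average pooling
        y = self.avg_pool(x)                                  # (B, C, 1, 1)
        # 1-D convolution across channels: local cross-channel interaction
        y = self.conv(y.squeeze(-1).transpose(-1, -2))        # (B, 1, C)
        y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1))   # (B, C, 1, 1)
        # Re-weight the input feature map channel-wise
        return x * y.expand_as(x)


if __name__ == "__main__":
    feat = torch.randn(2, 256, 32, 32)   # e.g. one FPN level's feature map (assumed size)
    print(ECA(256)(feat).shape)          # torch.Size([2, 256, 32, 32])
```

In an architecture like the one described, such a block would re-weight the backbone/FPN feature maps channel-wise before region proposals are generated.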

Experimental results showed that the improved model achieved average precision of 96.16% for graspable apples and 86.95% for apples that cannot be grasped directly, with a mean average precision of 92.79%, 15.68 percentage points higher than that of the traditional Faster R-CNN. Localization accuracy for graspable and not directly graspable apples reached 97.14% and 88.93%, improvements of 12.53 and 40.49 percentage points over the traditional Faster R-CNN, respectively. Memory usage was reduced by 38.20% and the average computation time per frame was shortened by 40.7%. With its smaller parameter count and good real-time performance, the improved model is well suited to the vision system of a fruit-picking robot.
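
As a reproduction aid, the sketch below shows one way to combine the pieces named in the abstract: K-means++ clustering of ground-truth box sizes to obtain dataset-adaptive anchor scales and aspect ratios, plugged into torchvision's Faster R-CNN with a ResNet-50 + FPN backbone. It is a sketch under stated assumptions, not the paper's code: the box statistics are synthetic placeholders, the cluster counts (5 scales, 3 ratios) and the class layout (background, graspable apple, not directly graspable apple) are assumptions, and the genetic-algorithm refinement of the clustering is only noted in a comment.

```python
# Illustrative sketch: dataset-adaptive anchors via K-means++ plugged into
# torchvision's Faster R-CNN (ResNet-50 + FPN). Not the authors' code.
import numpy as np
from sklearn.cluster import KMeans
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.anchor_utils import AnchorGenerator

# Placeholder ground-truth boxes as (width, height) pairs in pixels;
# in practice these come from the annotated apple dataset.
rng = np.random.default_rng(0)
wh = rng.uniform(20, 300, size=(500, 2))

# K-means++ on box scale sqrt(w*h): one cluster centre per FPN level (5 levels).
scales = np.sqrt(wh[:, 0] * wh[:, 1]).reshape(-1, 1)
km = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=0).fit(scales)
level_sizes = sorted(float(c) for c in km.cluster_centers_.ravel())

# K-means++ on aspect ratio h/w: a shared ratio set across levels (3 clusters here).
# The paper additionally refines the clustering with a genetic algorithm
# (e.g. by maximizing mean anchor/ground-truth IoU); that step is omitted.
ratios = (wh[:, 1] / wh[:, 0]).reshape(-1, 1)
km_r = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0).fit(ratios)
level_ratios = tuple(sorted(float(c) for c in km_r.cluster_centers_.ravel()))

anchor_generator = AnchorGenerator(
    sizes=tuple((round(s),) for s in level_sizes),  # one size per FPN level
    aspect_ratios=(level_ratios,) * 5,              # same ratios at every level
)

# num_classes = 3: background, graspable apple, not directly graspable apple.
# torchvision >= 0.13 API; weights are skipped so nothing is downloaded.
model = fasterrcnn_resnet50_fpn(
    weights=None,
    weights_backbone=None,
    num_classes=3,
    rpn_anchor_generator=anchor_generator,
)
```

Anchors matched to the dataset's box statistics help the region proposal network cover small and partially occluded fruit, which is the role the abstract assigns to the adaptive anchor boxes.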

李翠明 (Li Cuiming); 杨柯 (Yang Ke); 申涛 (Shen Tao); 尚拯宇 (Shang Zhengyu)

School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China

Computer Science and Automation

apple picking robot; target localization and detection; Faster R-CNN; attention mechanism; feature pyramid

Transactions of the Chinese Society for Agricultural Machinery (《农业机械学报》), 2024, No. 1

Pages 47-54 (8 pages)

Supported by the National Natural Science Foundation of China (52265065, 51765031)

DOI: 10.6041/j.issn.1000-1298.2024.01.004
