
Survey of Vision Transformer in Fine-Grained Image Classification

Indexing: OA (open access); Peking University Core; CSTPCD

Fine-grained image classification (FGIC) is a long-standing problem in computer vision. Compared with conventional image classification, FGIC is harder because objects from different classes are extremely similar. With the development of deep learning, Vision Transformer (ViT) models have drawn wide attention in computer vision and have been applied to FGIC. This paper introduces the challenges of FGIC and analyzes the ViT model and its characteristics. It then reviews ViT-based FGIC algorithms, organized mainly by model structure into four aspects: feature extraction, feature relation modeling, feature attention, and feature enhancement. Each algorithm is summarized, and its advantages and disadvantages are analyzed. The performance of different ViT models is compared on the same public dataset to validate their effectiveness on FGIC. Finally, the limitations of current research are pointed out, and future research directions are proposed to further explore the potential of ViT in FGIC.

Sun Lulu (孙露露); Liu Jianping (刘建平); Wang Jian (王健); Xing Jialu (邢嘉璐); Zhang Yue (张越); Wang Chenyang (王晨阳)

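School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
Key Laboratory of Images and Graphics Intelligent Processing of the State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China
Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China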

Computers and Automation

fine-grained image classification; Vision Transformer; feature extraction; feature relation modeling; feature attention; feature enhancement

Computer Engineering and Applications (《计算机工程与应用》), 2024, Issue 10

Pages 30-46 (17 pages)

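Ningxia Key Research and Development Program (Talent Introduction Special Project) (2022BSB03044); Natural Science Foundation of Ningxia (2021AAC03205); North Minzu University Scientific Research Start-up Fund Project (2020KYQD37); North Minzu University Graduate Innovation Project (YCX23168).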

10.3778/j.issn.1002-8331.2310-0395
