
Brain-Inspired Agents: The Road to General-Purpose Agents — A Survey of Current Agent Capabilities

Authors: Yu Bo, Wei Jiangning, Hu Minzhen, Han Zejie, Zou Tianjian, He Ye, Liu Jun (liujun@bupt.edu.cn)

Abstract

Artificial general intelligence (AGI) is widely regarded as the ultimate goal of artificial intelligence, representing human-level cognitive ability on general tasks. Researchers in brain-inspired AI draw on the operating mechanisms of the human brain, aiming to reproduce its functional principles in intelligent models. In addition, with the rapid development of large-scale models in recent years, the concept of the AI agent has attracted growing attention and is widely regarded as a necessary path toward AGI. This paper proposes the concept of the brain-inspired AI agent and analyzes how to extract, from the brain's intricate mechanisms, relatively tractable cortical-region functions and the associated functional connectivity networks that are applicable to agents. Embedding these structures in an agent enables basic cognitive intelligence resembling human abilities. Finally, we examine the limitations and challenges of realizing brain-inspired agents and discuss their future directions.

1. Overview

AGI refers to the creation of (semi-)autonomous, adaptive computer systems with the general cognitive abilities typical of humans, including abstraction, analogy, planning, and problem solving [1]. The concept of artificial intelligence dates back to the 1950s; the term "artificial intelligence" first appeared at the 1956 Dartmouth Summer Research Project [2]. Since then, despite countless failed attempts, researchers have continued to explore the feasibility of AGI [3–5]. Over time, the growing complexity of AGI shifted the research focus of the 1970s and 1980s from general intelligent systems to domain-specific problem solving. The introduction of the backpropagation algorithm in the 1980s [6] laid the foundation for modern deep networks such as convolutional neural networks (CNNs) [7] and recurrent neural networks (RNNs) [8]. The Transformer model, built on self-attention, became the basis of several state-of-the-art neural networks such as BERT [9] and GPT [10]. The emergence of these models marks the explosive growth of modern AI.

To date, the AI systems described above still fall within the scope of special-purpose AI. Research on AGI systems, the ultimate goal of AI, regained momentum in the early 2000s and became a central topic of academic discussion [11–13]. The concept of AGI was further developed in Ben Goertzel's 2007 book [14] and has attracted broad attention since. Recent breakthroughs in deep learning, especially the advent of large language models, have rekindled interest in AGI; these models are viewed as sparks on the path toward it, and architectures built on large language models are widely considered a key step toward AGI [15, 16].

The concept of the AI agent, first proposed in the 1980s, is a framework in artificial intelligence intended to exhibit intelligent behavior characterized by autonomy, reactivity, proactiveness, and social ability [17]. With the rapid development of deep learning and the emergence of large language models [18], agents have come to be seen as an important direction for future AI research. In recent years, agent applications have expanded rapidly across many fields, including autonomous driving [19, 20], game AI [21], autonomous robotics [22, 23], and intelligent healthcare [24].

However, current agent architectures remain largely task-specific: their workflows are built around task characteristics and depend heavily on the processing power of large language models (LLMs). A significant gap remains between these architectures and the cognitive processing capabilities of the human brain. Some agent architectures do incorporate elements that mimic brain structure to a degree — for example, the Talker-Reasoner dual-system architecture of [25] models the brain's fast- and slow-thinking systems, and [26] introduces Theory of Mind (ToM) into a spiking neural network (SNN) to improve multi-agent decision making in cooperative and competitive settings — but clear limitations remain.

While agent-based AI has advanced markedly, brain-inspired AI has likewise made great strides. Recent studies show a high degree of similarity between artificial neural networks (ANNs) and biological neural networks (BNNs) [27–31], drawing many researchers to the field. For example, [32] was among the earliest studies to use brain regions as model nodes connected by simple rules. The BIAVAN network proposed in [33] models the biased-competition process of the human visual system to decode visual attention. In addition, spiking neural networks (SNNs) [34] have been used to model neuronal behavior in the brain, building an effective bridge between brain-inspired AI and brain simulation [35]. Despite important progress in both agent-based and brain-inspired AI, no prior work has focused on agent architectures designed around brain-inspired structures.

Based on the above discussion, we propose that a brain-inspired agent architecture may be a promising path to AGI. The perception-planning-decision-action (PPDA) model in agent architectures is usually interpreted as mirroring human cognition and behavior. Although this model and its extensions have achieved notable progress, they remain macro-level simulations of human cognition and do not reach into the brain's actual functional mechanisms. The human brain is an extraordinarily complex structure whose neural systems and inter-regional functional connectivity are still not fully explored [36, 37]. Although we cannot directly simulate neural circuits, we can shift the focus to mesoscale cortical networks. In our architecture, different cortical regions serve as the functional modules of a brain-inspired agent, each responsible for one or more tasks. Different functional modules cooperate to complete a given task, a property researchers often attribute to the brain's small-world characteristics [38]. These functional modules are designed according to task complexity and specific functional requirements, and can be implemented with large language models (LLMs) or other appropriate tools. For example, area V1 of the human brain performs the initial extraction of complex visual features and can be implemented with a CNN or YOLO [39], while the frontopolar cortex (FPC), one of the most complex regions of the brain, is responsible for higher-order cognition, which we attempt to simulate with LLMs. Moreover, the entire prefrontal cortex, as a large cortical region, is governed by a core module and interacts with other large cortical regions.
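The region-to-module mapping described above can be sketched in code. This is a minimal illustrative sketch, not a fixed design: the class names, the `dispatch` routing rule, and the task labels are our own assumptions, while the V1→CNN/YOLO and FPC→LLM pairings follow the text.

```python
class FunctionalModule:
    """One cortical area mapped to a computational backend (assumed design)."""
    def __init__(self, region, backend, functions):
        self.region = region        # e.g. "V1", "FPC"
        self.backend = backend      # e.g. "CNN", "LLM"
        self.functions = functions  # set of task labels this module handles

class CorticalRegionAgent:
    """A large cortical region whose core module coordinates its
    sub-modules and mediates interaction with other regions."""
    def __init__(self, name, core):
        self.name = name
        self.core = core            # name of the governing core module
        self.modules = {}

    def register(self, module):
        self.modules[module.region] = module

    def dispatch(self, task):
        # The core module routes a task to every sub-module that lists it.
        return [m.region for m in self.modules.values() if task in m.functions]

# Example wiring following the paper's mapping: V1 -> CNN/YOLO, FPC -> LLM.
frontal = CorticalRegionAgent("prefrontal_cortex", core="FPC")
frontal.register(FunctionalModule("FPC", "LLM", {"planning", "decision"}))
occipital = CorticalRegionAgent("occipital_lobe", core="V1")
occipital.register(FunctionalModule("V1", "CNN", {"visual_feature_extraction"}))

planners = frontal.dispatch("planning")  # ["FPC"]
```

In a full system each `backend` string would be replaced by an actual model; the point here is only the routing structure: large regions own core modules, and core modules own sub-modules keyed by function.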

At the same time, the agent can reproduce how the brain extracts multiple types of information: different cortical regions process the features of specific modalities such as vision, hearing, and smell, rather than being limited to a single feature input. After information enters the system, features are extracted in the relevant functional modules and stored in the corresponding memory modules. Only when the brain region responsible for executive function issues a command is that information transferred from the memory module to the designated region for analysis.
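The gated memory flow above can be sketched as follows. This is an assumption-laden illustration: the `MemoryModule` interface and the `authorized` flag standing in for an executive command are ours, not part of the paper's specification.

```python
class MemoryModule:
    """Per-modality feature store; reads are gated by an executive command
    (modeled here as a simple `authorized` flag — an illustrative assumption)."""
    def __init__(self):
        self._store = {}

    def write(self, modality, features):
        # Extracted features are stored under their modality.
        self._store.setdefault(modality, []).append(features)

    def read(self, modality, authorized):
        # Transfer to an analysis region happens only on an executive command.
        if not authorized:
            raise PermissionError("no executive command issued")
        return self._store.get(modality, [])

memory = MemoryModule()
memory.write("visual", {"edges": 12})      # feature dicts are placeholders
memory.write("auditory", {"pitch": 440})

# The executive region requests the visual trace for analysis.
trace = memory.read("visual", authorized=True)
```

The gate is the essential point: perception writes freely, but nothing leaves memory until the executive node asks for it, mirroring the command-driven transfer described in the text.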

Another important aspect of realizing the functions of a brain-inspired agent is how we draw on the connectivity mechanisms of cortical regions to activate different regions and pathways for different kinds of tasks. According to [40], inter-regional connectivity is usually divided into structural connectivity and functional connectivity. Our connection design is based mainly on the functional-connectivity model, which supports parallel task processing and cross-region information integration. In this model, the role of each region is clear, the order of task execution is explicit, and parallel task execution is allowed. First, we simplify the brain's complex functional connectivity into region-level connections between cortical areas. Each large cortical region contains several smaller areas, with a core module responsible for its functions and interactions. Interactions among cortical regions cause different pathways to be activated under specific functional demands. We believe this brain-inspired connectivity mechanism has the potential to give agents intelligence approaching human cognitive ability.
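The simplified region-level connectivity can be sketched as an undirected graph over which a task activates one pathway. The specific regions, edges, and task names below are hypothetical placeholders chosen for illustration; only the mechanism — a task selecting and validating a pathway through the connectivity model — reflects the text.

```python
# Simplified functional-connectivity model: undirected region-to-region edges.
FUNCTIONAL_CONNECTIONS = {
    ("V1", "parietal"), ("parietal", "PFC"),
    ("auditory", "temporal"), ("temporal", "PFC"),
}

# Hypothetical task pathways through the graph.
TASK_PATHWAYS = {
    "visual_decision": ["V1", "parietal", "PFC"],
    "speech_comprehension": ["auditory", "temporal", "PFC"],
}

def activate(task):
    """Return the connections activated along a task's pathway, checking
    that each hop exists in the simplified connectivity model."""
    path = TASK_PATHWAYS[task]
    hops = list(zip(path, path[1:]))
    for a, b in hops:
        assert (a, b) in FUNCTIONAL_CONNECTIONS or (b, a) in FUNCTIONAL_CONNECTIONS, \
            f"no functional connection between {a} and {b}"
    return hops

active = activate("visual_decision")
```

Because distinct tasks touch disjoint pathways here, the two tasks could in principle run in parallel, consistent with the parallel task execution the connectivity model is said to support.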

In summary, the widespread adoption of agent architectures marks an important step toward AGI. Inspired by the human brain, we propose a brain-inspired agent architecture intended to maximize the potential of agents and advance the realization of AGI. Although the human brain, the most complex structure known, cannot be fully replicated by artificial intelligence, continued research and technological progress — especially the advent of large language models — renew the hope of achieving general-purpose function as we gradually approximate the brain's architecture. The proposed agent architecture offers a promising framework toward this goal, bringing us a step closer to AGI with human-level or even super-human cognitive abilities. Section 2 reviews research on the human brain; Section 3 systematically analyzes the development of agent architectures and their relation to brain function, validating the feasibility of our approach; Section 4 discusses the limitations of current methods and directions for future work.

2. Mechanisms of the Human Brain

The brain is one of the most complex systems known, regulating the body's physiological, cognitive, and behavioral functions. It processes sensory information, issues motor commands, supports higher cognition such as thinking, decision making, and memory, and controls the body's basic physiological processes. As the core of the central nervous system, the brain is organized hierarchically. Macroscopically, it comprises the cerebral hemispheres, the cerebellum, and the brainstem; microscopically, it consists of more than 86 billion neurons [41], each forming up to roughly 10,000 synaptic connections with other neurons [42], yielding an extraordinarily complex network that underpins the emergence of intelligence. As a highly hierarchical structure, the brain is difficult to model from the macroscopic or microscopic perspective alone. At the mesoscale, different cortical regions are functionally independent yet interconnected through efficient functional-connectivity mechanisms [43, 44]. Owing to the complexity of the brain's structural organization, however, mapping the distribution of cortical regions has long been a central topic in biological research [40, 45, 46].


Figure 1: Conceptual comparison of the human brain and the brain-inspired agent

2.1 The Cerebral Cortex

The cerebral cortex is the outermost structure of the brain, divided into left and right hemispheres, each containing four major functional regions: the frontal, parietal, occipital, and temporal lobes. Each region is responsible for different functions and works in concert with the others. The areas within each lobe can be further subdivided into smaller cortical areas, with a core module controlling the lobe's functions and managing its interactions with other lobes. Taking the frontal lobe as an example, its core module is the prefrontal cortex (PFC), responsible for decision making, planning and regulation, executive control, and task management [47]. The parietal lobe processes body-related sensory information such as touch and spatial orientation; its core control modules are the superior and inferior parietal lobules, and the PFC integrates this sensory information through interaction with them to guide complex motor behavior or decisions [48]. Interaction between the frontal and occipital lobes relates to visual perception and cognition: visual input is processed by the primary visual cortex while the PFC assists in forming action decisions [49]. Connections between the frontal and temporal lobes play an important role in memory processing, particularly in language, face recognition, and long-term memory retrieval. The PFC itself can be further divided into several core modules, including the dorsolateral prefrontal cortex (DLPFC), the ventromedial prefrontal cortex (VMPFC), and the anterior cingulate cortex (ACC), each with a distinct division of labor [50], such as cognitive flexibility, risk assessment, and emotion regulation.

2.2 Frameworks for Parcellating the Cerebral Cortex

● The Brodmann Areas framework: proposed by the German neuroanatomist Korbinian Brodmann in 1909, this parcellation divides the cerebral cortex according to differences in its cytoarchitecture [45]. Brodmann partitioned the cortex into 52 areas based on its microscopic anatomical features. Each area is assigned a number according to its morphological and functional characteristics, providing a framework for identifying specific brain regions. A key feature of Brodmann areas is that different areas typically serve different functions, although these attributions are revised as research progresses. Although the parcellation was not defined with modern neuroimaging techniques such as fMRI, it remains an important theoretical framework for understanding the functional organization of the brain.

● The Human Connectome Project (HCP) framework: the Human Connectome Project, proposed by Van Essen and colleagues in 2012, used multimodal MRI data combined with a semi-automated neuroanatomical approach to precisely delineate 180 areas per hemisphere in group-average images of 210 healthy young adults. The method defines areas by detecting sharp transitions in cortical architecture, function, connectivity, and topography, identifying 97 new areas alongside 83 previously reported ones for a total of 180. By combining multimodal MRI with semi-automated neuroanatomy, it offers a new perspective on cortical parcellation and provides essential baseline data for understanding the brain's complexity and functional networks.

● The framework of inter-regional connectivity: connections among cortical regions can be divided into structural connectivity and functional connectivity [51]. Structural connectivity is established through anatomical fiber bundles (such as white-matter tracts), with axons linking different regions. These connections are static and do not change with brain activity. The main structural connections include local circuits, long-range fiber pathways, and cortico-cortical axonal pathways [52], which form the basic scaffold of brain-network operation. Functional connectivity refers to the activation relationships among cortical regions under different functional states; these canonical functional network pathways play a key role in cognition and have been widely studied [53–55]. The best-known functional network is the default mode network (DMN), typically active during rest, reflection, or memory processing, comprising mainly the medial prefrontal cortex, posterior cingulate cortex, hippocampus, and inferior temporal cortex. Another key network is the executive control network (ECN), responsible for higher-order cognitive tasks such as working memory, decision making, problem solving, and planning. Other important functional networks include the emotion-regulation, audiovisual, motor, language, and self-referential networks.
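The functional networks named above can be encoded as region sets with a simple state-to-network lookup. The DMN membership follows the text; the ECN regions and the state labels are illustrative assumptions of ours, since the text does not enumerate them.

```python
FUNCTIONAL_NETWORKS = {
    # Default mode network: active at rest, reflection, memory processing
    # (membership as listed in the text).
    "DMN": {"medial_prefrontal_cortex", "posterior_cingulate_cortex",
            "hippocampus", "inferior_temporal_cortex"},
    # Executive control network: working memory, decisions, planning
    # (membership here is an assumption for the sketch).
    "ECN": {"dorsolateral_prefrontal_cortex", "posterior_parietal_cortex"},
}

# Hypothetical mapping from cognitive state to the network it engages.
STATE_TO_NETWORK = {
    "rest": "DMN", "reflection": "DMN", "memory_processing": "DMN",
    "working_memory": "ECN", "decision_making": "ECN", "planning": "ECN",
}

def engaged_regions(state):
    """Regions activated for a cognitive state under this simplified map."""
    return FUNCTIONAL_NETWORKS[STATE_TO_NETWORK[state]]

resting_regions = engaged_regions("rest")
```

For an agent, such a table is the natural bridge between Section 2 and Section 3: a task state selects a network, and the network selects which functional nodes to activate.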

Building on the parcellation methods and functional-connectivity mechanisms above, we can selectively model the brain's mesoscale operating mechanisms in an agent architecture. Different cortical regions can be designed as functional modules according to their functional characteristics, each implementing a specific function. Upon receiving the corresponding task instruction, the relevant key functional-connectivity mechanisms and functional modules are activated. Our goal is to build a brain-inspired agent architecture and validate its performance on general cognitive tasks.

3. Brain-Inspired Agents

Agents are regarded as a key step toward AGI, and the rapid development of large language models in recent years has brought them broad attention. However, no unified agent architecture has yet been thoroughly studied and established; existing agent structures and workflows are mostly customized for specific target tasks, leaving them short on flexibility and generality. We therefore aim to define a brain-inspired agent structure based on the functional parcellation of mesoscale cortical regions and their connectivity networks. Building on the classical Perception-Planning-Action framework [56], this structure incorporates other basic brain functions and their corresponding functional-connectivity networks, as shown in Figure 1. Section 3.1 details how the proposed brain-inspired structure is defined; Section 3.2 analyzes existing agent architectures and compares current agent models with the brain-inspired structure.

3.1 The Brain-Inspired AI Agent

The brain-inspired AI agent is an agent model built on the structure of the cerebral cortex. It differs from the conventional conception of the agent as a machine-like entity with a large language model (LLM) serving as its "brain." Instead, it is conceived as a brain-inspired intelligent entity capable of handling a wide range of general tasks. Its capabilities include, but are not limited to, perception, planning, decision making, action, memory, reflection, optimization, emotion, language, and other common brain functions, capturing the human-like intelligence that current agents aspire to.

Based on these functions, we identify the major cortical regions associated with each capability and use them as the functional nodes of the brain-inspired agent. We also simplify the corresponding functional-connectivity networks, retaining only the interactions that exist between functional nodes. Each functional node is configured as one or more functional modules for its primary function and external interactions. For example, the primary visual cortex is configured as a vision-language model (VLM) combined with other object-detection models to extract sufficient visual information for downstream analysis. Table 1 summarizes the cortical-region nodes and functional-connectivity networks corresponding to each brain-like function. In addition, each functional node carries an activation state, determined by whether the node is currently engaged in task processing within a relevant functional-connectivity network.
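The notion of a functional node with an activation state can be sketched minimally. The class shape and method names are our assumptions; the V1 node carrying both a VLM and a detector follows the example in the text.

```python
class FunctionalNode:
    """A cortical-region node in the brain-inspired agent. `models` lists
    the backends configured for the node (e.g. a VLM plus a detector for V1);
    `active` flips while the node participates in a functional-connectivity
    task, as described in the text."""
    def __init__(self, region, models):
        self.region = region
        self.models = models
        self.active = False  # nodes start idle

    def join_task(self):
        # Called when a connectivity network involving this node is engaged.
        self.active = True

    def leave_task(self):
        self.active = False

# V1 configured as a VLM combined with an object-detection model.
v1 = FunctionalNode("V1", ["VLM", "YOLO"])
v1.join_task()
```

The activation flag is what lets the architecture run only the pathways a task needs rather than every module at once.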

Table 1: Cortical-region nodes and functional-connectivity networks for each brain-like function


3.2 A Review of Current Agent Architectures

We summarize and review a number of agent architectures that have emerged over the past two years and examine their relationship to the brain-inspired structure, including but not limited to embodied agents, virtual-world agents, logical-innovation agents, web-indexing agents, and multimodal agents. Table 2 focuses mainly on LLM-based single-agent architectures, whose dimensions are comparable to those of our proposed brain-inspired agent.

As Table 2 shows, current agent architectures broadly embody similar brain-inspired ideas. Depending on the task, agents are equipped with different brain-like capability modules, indicating a general recognition that an LLM alone cannot meet performance requirements and must be augmented with brain-inspired capabilities. This approach has been strikingly effective on specialized tasks, but it is not enough to advance general-purpose intelligence. It is therefore essential to explore and implement agent architectures with general brain-like functions. Moreover, we abandon the traditional task-oriented workflow design and instead design agent workflows according to the brain's functional-connectivity mechanisms. We believe these advances will lay a solid foundation for achieving AGI.

Table 2: Capability comparison between current, mainly LLM-based single-agent architectures and the brain-inspired agent


4. Discussion and Outlook

In this paper, we designed a brain-inspired agent architecture, proposed the concept of a general-purpose AI agent by modeling the structure of the human brain, and examined the feasibility of realizing it. However, substantial limitations and challenges remain on the way to a brain-inspired agent with human-level intelligence, including:

● Limited understanding of the human brain: despite marked progress in neuroscience and brain-inspired intelligence, fully understanding the human brain and reproducing its intelligent functions in AI remains a formidable challenge.

● An underspecified brain-inspired architecture: this work focuses mainly on mesoscale cortical regions and their functional-connectivity networks, with less attention to subcortical regions and finer-grained neural connections. We believe these regions and their corresponding functional-connectivity networks are equally essential to brain-inspired intelligence.

● Computational cost: a brain-inspired agent will inevitably consume substantial computational resources. Allocating and using these resources efficiently is thus an unavoidable challenge in our research.

● Framework integration: integrating the many functional nodes and connectivity networks poses another key challenge. We hope to move beyond existing integration frameworks and develop a new architecture dedicated to brain-inspired agents.

The future of brain-inspired agents is promising, with growth potential that may parallel the recent rise of large language models (LLMs). The era of agents is approaching, making this the right time to focus on agents that approach human-like intelligence; such agents will inject new vitality and possibility into many industries.

Realizing brain-inspired agents will not only offer a new paradigm for achieving AGI but also serve as an important reference for solving general intelligence tasks of all kinds.

In addition, advances in fundamental computer-science algorithms and hardware will continue to shape the development of AI. Techniques such as retrieval-augmented generation (RAG) [79], reinforcement learning [80], and imitation learning [81] have likewise played an important role in driving AI forward.

Finally, we will continue to refine the components of this work, push the brain-inspired agent structure toward realization, and make substantive contributions to achieving AGI.

References

[1] P. Voss and M. Jovanovic, “Why we don’t have agi yet,” 2023. [Online]. Available: https://arxiv.org/abs/2308.03598

[2] J. McCarthy, M. Minsky, N. Rochester, and C. Shannon, “Proposal for the dartmouth summer research project on artificial intelligence,” Unpublished Manuscript, Dartmouth College Archives, Hanover, New Hampshire, USA, 1956. [Online]. Available: https://en.wikipedia.org/wiki/Dartmouth_workshop

[3] A. Roland and P. Shiman, Strategic Computing: DARPA and the Quest for Machine Intelligence. Cambridge, MA, USA: MIT Press, 2002.

[4] A. Newell and H. A. Simon, “Gps, a program that simulates human thought,” in Computers and Thought, E. Feigenbaum and J. Feldman, Eds. New York, NY, USA: McGraw-Hill, 1961, pp. 279–293.

[5] T. Motooka, Ed., Fifth Generation Computer Systems: Proceedings of the International Conference on Fifth Generation Computer Systems, Tokyo, Japan, October 19-22, 1981. Amsterdam, Netherlands: North-Holland Publishing Company, 1982.

[6] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural Computation, vol. 1, no. 4, pp. 541–551, 1989.

[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” Commun. ACM, vol. 60, no. 6, p. 84–90, May 2017. [Online]. Available: https://doi.org/10.1145/3065386

[8] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 27, 2014, pp. 3104–3112.

[9] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), J. Burstein, C. Doran, and T. Solorio, Eds. Minneapolis, Minnesota: Association for Computational Linguistics, Jun. 2019, pp. 4171–4186. [Online]. Available: https://aclanthology.org/N19-1423

[10] A. Radford and K. Narasimhan, “Improving language understanding by generative pre-training,” 2018. [Online]. Available: https://api.semanticscholar.org/CorpusID:49313245

[11] J. Moor, “The dartmouth college artificial intelligence conference: The next fifty years,” AI Magazine, vol. 27, no. 4, pp. 87–91, 2006. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1609/aimag.v27i4.1911

[12] J. Hawkins and S. Blakeslee, On Intelligence. New York, NY, USA: Henry Holt and Co., 2004.

[13] T. Everitt and M. Hutter, Universal Artificial Intelligence. Cham: Springer International Publishing, 2018, pp. 15–46. [Online]. Available: https://doi.org/10.1007/978-3-319-64816-3_2

[14] B. Goertzel and P. Wang, Eds., Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms. Amsterdam, Netherlands: IOS Press, 2007.

[15] M. Wooldridge and N. R. Jennings, “Intelligent agents: theory and practice,” The Knowledge Engineering Review, vol. 10, no. 2, p. 115–152, 1995.

[16] S. Bubeck, V. Chandrasekaran, R. Eldan, J. A. Gehrke, E. Horvitz, E. Kamar, P. Lee, Y. T. Lee, Y.-F. Li, S. M. Lundberg, H. Nori, H. Palangi, M. T. Ribeiro, and Y. Zhang, “Sparks of artificial general intelligence: Early experiments with gpt-4,” ArXiv, vol. abs/2303.12712, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:257663729

[17] R. Goodwin, “Formalizing properties of agents,” USA, Tech. Rep., 1993.

[18] A. Radford and K. Narasimhan, “Improving language understanding by generative pre-training,” 2018. [Online]. Available: https://api.semanticscholar.org/CorpusID:49313245

[19] M. Bojarski, D. W. del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba, “End to end learning for self-driving cars,” ArXiv, vol. abs/1604.07316, 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:15780954

[20] B. R. Kiran, I. Sobh, V. Talpaert, P. Mannion, A. A. A. Sallab, S. Yogamani, and P. Pérez, “Deep reinforcement learning for autonomous driving: A survey,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 6, pp. 4909–4926, 2022.

[21] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. P. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of go with deep neural networks and tree search,” Nature, vol. 529, pp. 484–489, 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:515925

[22] R. Stern, Multi-Agent Path Finding – An Overview. Cham: Springer International Publishing, 2019, pp. 96–115. [Online]. Available: https://doi.org/10.1007/978-3-030-33274-7_6

[23] W. Zhao, J. P. Queralta, and T. Westerlund, “Sim-to-real transfer in deep reinforcement learning for robotics: a survey,” in 2020 IEEE Symposium Series on Computational Intelligence (SSCI), 2020, pp. 737–744.

[24] Y. G. Kim, S. Lee, J. Son, H. Bae, and B. D. Chung, “Multi-agent system and reinforcement learning approach for distributed intelligence in a flexible smart manufacturing system,” Journal of Manufacturing Systems, vol. 57, pp. 440–450, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0278612520301916

[25] K. Christakopoulou, S. Mourad, and M. Matarić, “Agents thinking fast and slow: A talker-reasoner architecture,” 2024. [Online]. Available: https://arxiv.org/abs/2410.08328

[26] Z. Zhao, F. Zhao, Y. Zhao, Y. Zeng, and Y. Sun, “A brain-inspired theory of mind spiking neural network improves multi-agent cooperation and competition,” Patterns, vol. 4, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:259256823

[27] M. T. C. Olmedo, M. Paegelow, J.-F. Mas, and F. Escobar, “Geomatic approaches for modeling land change scenarios,” Geomatic Approaches for Modeling Land Change Scenarios, 2018. [Online]. Available: https://api.semanticscholar.org/CorpusID:134145688

[28] Y. Chen, Y. Du, Z. Xiao, L. Zhao, L. Zhang, D. Liu, D. Zhu, T. Zhang, X. Hu, T. Liu, and X. Jiang, “A unified and biologically-plausible relational graph representation of vision transformers,” IEEE Transactions on Neural Networks and Learning Systems, vol. PP, 2022. [Online]. Available: https://api.semanticscholar.org/CorpusID:249926529

[29] L. Zhao, H. Dai, Z. Wu, Z. Xiao, L. Zhang, D. Liu, X. Hu, X. Jiang, S. Li, D. Zhu, and T. Liu, “Coupling visual semantics of artificial neural networks and human brain function via synchronized activations,” IEEE Transactions on Cognitive and Developmental Systems, vol. 16, pp. 584–594, 2022. [Online]. Available: https://api.semanticscholar.org/CorpusID:249927004

[30] X. Liu, M. Zhou, G. Shi, Y. Du, L. Zhao, Z. Wu, D. Liu, T. Liu, and X. Hu, “Coupling artificial neurons in BERT and biological neurons in the human brain,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 7, pp. 8888–8896, Jun. 2023. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/26068

[31] J. You, J. Leskovec, K. He, and S. Xie, “Graph structure of neural networks,” in Proceedings of the 37th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, H. D. III and A. Singh, Eds., vol. 119. PMLR, 13–18 Jul 2020, pp. 10881–10891. [Online]. Available: https://proceedings.mlr.press/v119/you20b.html

[32] K. E. Joyce, P. J. Laurienti, and S. Hayasaka, “Complexity in a brain-inspired agent-based model,” Neural Networks, vol. 33, pp. 275–290, 2012. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0893608012001578

[33] H. Huang, L. Zhao, X. Hu, H. Dai, L. Zhang, D. Zhu, and T. Liu, “BIAVAN: Brain inspired adversarial visual attention network,” ArXiv, vol. abs/2210.15790, 2022. [Online]. Available: https://api.semanticscholar.org/CorpusID:253223845

[34] S. Ghosh-Dastidar and H. Adeli, “Spiking neural networks,” International Journal of Neural Systems, vol. 19, no. 4, pp. 295–308, August 2009.

[35] Y. Zeng, D. Zhao, F. Zhao, G. Shen, Y. Dong, E. Lu, Q. Zhang, Y. Sun, Q. Liang, Y. Zhao, Z. Zhao, H. Fang, Y. Wang, Y. Li, X. Liu, C. Du, Q. Kong, Z. Ruan, and W. Bi, “BrainCog: A spiking neural network based, brain-inspired cognitive intelligence engine for brain-inspired AI and brain simulation,” Patterns, vol. 4, no. 8, p. 100789, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2666389923001447

[36] L. Zhao, L. Zhang, Z. Wu, Y. Chen, H. Dai, X. Yu, Z. Liu, T. Zhang, X. Hu, X. Jiang, X. Li, D. Zhu, D. Shen, and T. Liu, “When brain-inspired AI meets AGI,” Meta-Radiology, vol. 1, no. 1, p. 100005, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S295016282300005X

[37] B. C. M. van Wijk, C. J. Stam, and A. Daffertshofer, “Comparing brain networks of different size and connectivity density using graph theory,” PLOS ONE, vol. 5, no. 10, pp. 1–13, 10 2010. [Online]. Available: https://doi.org/10.1371/journal.pone.0013701

[38] X. Liao, A. V. Vasilakos, and Y. He, “Small-world human brain networks: Perspectives and challenges,” Neuroscience & Biobehavioral Reviews, vol. 77, pp. 286–300, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:13001431

[39] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” 2016. [Online]. Available: https://arxiv.org/abs/1506.02640

[40] L. Fan, H. Li, J. Zhuo, Y. Zhang, J. Wang, L. Chen, Z. Yang, C. Chu, S. Xie, A. R. Laird, P. T. Fox, S. B. Eickhoff, C. Yu, and T. Jiang, “The human brainnetome atlas: A new brain atlas based on connectional architecture,” Cerebral Cortex (New York, NY), vol. 26, pp. 3508–3526, 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:21767790

[41] J. Zhang, “Basic neural units of the brain: Neurons, synapses and action potential,” arXiv: Neurons and Cognition, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:174799219

[42] E. T. Bullmore and O. Sporns, “Complex brain networks: Graph theoretical analysis of structural and functional systems,” Nature Reviews Neuroscience, vol. 10, pp. 186–198, 2009. [Online]. Available: https://api.semanticscholar.org/CorpusID:205504722

[43] M. D. Fox, A. Z. Snyder, J. L. Vincent, M. Corbetta, D. C. V. Essen, and M. E. Raichle, “The human brain is intrinsically organized into dynamic, anticorrelated functional networks.” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 27, pp. 9673–9678, 2005. [Online]. Available: https://api.semanticscholar.org/CorpusID:512175

[44] K. Brodmann, “Vergleichende lokalisationslehre der großhirnrinde: in ihren prinzipien dargestellt auf grund des zellenbaues,” 1985. [Online]. Available: https://api.semanticscholar.org/CorpusID:142722366

[45] M. F. Glasser, T. S. Coalson, E. C. Robinson, C. D. Hacker, J. W. Harwell, E. Yacoub, K. Uğurbil, J. L. R. Andersson, C. F. Beckmann, M. Jenkinson, S. M. Smith, and D. C. V. Essen, “A multi-modal parcellation of human cerebral cortex,” Nature, vol. 536, pp. 171–178, 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:205249949

[46] E. K. Miller and J. D. Cohen, “An integrative theory of prefrontal cortex function,” Annual Review of Neuroscience, vol. 24, pp. 167–202, 2001. [Online]. Available: https://www.annualreviews.org/content/journals/10.1146/annurev.neuro.24.1.167

[47] R. C. O’Reilly, S. A. Herd, and W. M. Pauli, “Computational models of cognitive control,” Current Opinion in Neurobiology, vol. 20, no. 2, pp. 257–261, 2010. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0959438810000097

[48] J. M. Fuster, “Frontal lobe and cognitive development,” Brain Research Bulletin, vol. 57, no. 5, pp. 367–370, March 2002.

[49] P. S. Goldman-Rakic, “Circuitry of primate prefrontal cortex and regulation of behavior by representational memory,” in Handbook of Physiology: The Nervous System: Higher Functions of the Brain, F. Plum, Ed. Bethesda, MD, USA: American Physiological Society, 1987, vol. V, pp. 373–417.

[50] T. E. Behrens and O. Sporns, “Human connectomics,” Current Opinion in Neurobiology, vol. 22, no. 1, pp. 144–153, 2012. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0959438811001449

[51] P. Hagmann, L. Cammoun, X. Gigandet, R. Meuli, C. J. Honey, V. J. Wedeen, and O. Sporns, “Mapping the structural core of human cerebral cortex,” PLOS Biology, vol. 6, no. 7, pp. 1–15, 2008. [Online]. Available: https://doi.org/10.1371/journal.pbio.0060159

[52] R. Buckner, J. Andrews-Hanna, and D. Schacter, “The brain’s default network: Anatomy, function, and relevance to disease,” Annals of the New York Academy of Sciences, no. 1124, pp. 1–38, 2008. [Online]. Available: http://www.nyas.org/Publications/Annals/Default.aspx

[53] W. W. Seeley, R. K. Crawford, J. Zhou, B. L. Miller, and M. D. Greicius, “Neurodegenerative diseases target large-scale human brain networks,” Neuron, vol. 62, no. 1, pp. 42–52, 2009. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0896627309002499

[54] S. M. Smith, P. T. Fox, K. L. Miller, D. C. Glahn, P. M. Fox, C. E. Mackay, N. Filippini, K. E. Watkins, R. Toro, A. R. Laird, and C. F. Beckmann, “Correspondence of the brain’s functional architecture during activation and rest,” Proceedings of the National Academy of Sciences of the United States of America, vol. 106, no. 31, pp. 13040–13045, August 2009. [Online]. Available: https://europepmc.org/articles/PMC2722273

[55] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. USA: Prentice Hall Press, 2009.

[56] W. Huang, P. Abbeel, D. Pathak, and I. Mordatch, “Language models as zero-shot planners: Extracting actionable knowledge for embodied agents,” in Proceedings of the 39th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, K. Chaudhuri, S. Jegelka, L. Song, C. Szepesvari, G. Niu, and S. Sabato, Eds., vol. 162. PMLR, 17–23 Jul 2022, pp. 9118–9147. [Online]. Available: https://proceedings.mlr.press/v162/huang22a.html

[57] M. Gramopadhye and D. Szafir, “Generating executable action plans with environmentally-aware language models,” in 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, pp. 3568–3575.

[58] S. Yao, H. Chen, J. Yang, and K. Narasimhan, “Webshop: Towards scalable real-world web interaction with grounded language agents,” in Advances in Neural Information Processing Systems, vol. 35. Curran Associates, Inc., 2022, pp. 20744–20757. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2022/file/82ad13ec01f9fe44c01cb91814fd7b8c-Paper-Conference.pdf

[59] R. Feldt, S. Kang, J. Yoon, and S. Yoo, “Towards autonomous testing agents via conversational large language models,” in 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 1688–1693, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:259108951

[60] H. Li, Y. Hao, Y. Zhai, and Z. Qian, “The hitchhiker’s guide to program analysis: A journey with large language models,” ArXiv, vol. abs/2308.00245, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:260351308

[61] D. A. Boiko, R. MacKnight, and G. Gomes, “Emergent autonomous scientific research capabilities of large language models,” 2023. [Online]. Available: https://arxiv.org/abs/2304.05332

[62] A. M. Bran, S. Cox, O. Schilter, C. Baldassari, A. White, and P. Schwaller, “Augmenting large language models with chemistry tools,” in NeurIPS 2023 AI for Science Workshop, 2023. [Online]. Available: https://openreview.net/forum?id=wdGIL6lx3l

[63] Y. S. Kang and J. Kim, “ChatMOF: An autonomous AI system for predicting and generating metal-organic frameworks,” ArXiv, vol. abs/2308.01423, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:260438479

[64] Z. Wang, S. Cai, G. Chen, A. Liu, X. S. Ma, and Y. Liang, “Describe, explain, plan and select: Interactive planning with LLMs enables open-world multi-task agents,” in Advances in Neural Information Processing Systems, vol. 36. Curran Associates, Inc., 2023, pp. 34153–34189. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2023/file/6b8dfb8c0c12e6fafc6c256cb08a5ca7-Paper-Conference.pdf

[65] G. Wang, Y. Xie, Y. Jiang, A. Mandlekar, C. Xiao, Y. Zhu, L. J. Fan, and A. Anandkumar, “Voyager: An open-ended embodied agent with large language models,” Trans. Mach. Learn. Res., vol. 2024, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:258887849

[66] K. Nottingham, P. Ammanabrolu, A. Suhr, Y. Choi, H. Hajishirzi, S. Singh, and R. Fox, “Do embodied agents dream of pixelated sheep?: Embodied decision making using language guided world modelling,” in International Conference on Machine Learning, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:256389514

[67] H. Yuan, C. Zhang, H. Wang, F. Xie, P. Cai, H. Dong, and Z. Lu, “Plan4MC: Skill reinforcement learning and planning for open-world Minecraft tasks,” ArXiv, vol. abs/2303.16563, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:257805102

[68] Y. Wu, S. Y. Min, Y. Bisk, R. Salakhutdinov, A. Azaria, Y.-F. Li, T. M. Mitchell, and S. Prabhumoye, “Plan, eliminate, and track – language models are good teachers for embodied agents,” ArXiv, vol. abs/2305.02412, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:258480064

[69] I. Gur, H. Furuta, A. V. Huang, M. Safdari, Y. Matsuo, D. Eck, and A. Faust, “A real-world web agent with planning, long context understanding, and program synthesis,” in The Twelfth International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=9JQtrumvg8

[70] P.-L. Chen and C.-S. Chang, “Interact: Exploring the potentials of ChatGPT as a cooperative agent,” ArXiv, vol. abs/2308.01552, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:260438734

[71] H. Furuta, K.-H. Lee, O. Nachum, Y. Matsuo, A. Faust, S. S. Gu, and I. Gur, “Multimodal web navigation with instruction-finetuned foundation models,” in The Twelfth International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=efFmBWioSc

[72] G. Kim, P. Baldi, and S. McAleer, “Language models can solve computer tasks,” in Advances in Neural Information Processing Systems, vol. 36. Curran Associates, Inc., 2023, pp. 39648–39677. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2023/file/7cc1005ec73cfbaac9fa21192b622507-Paper-Conference.pdf

[73] L. Zheng, R. Wang, X. Wang, and B. An, “Synapse: Trajectory-as-exemplar prompting with memory for computer control,” 2024. [Online]. Available: https://arxiv.org/abs/2306.07863

[74] X. Deng, Y. Gu, B. Zheng, S. Chen, S. Stevens, B. Wang, H. Sun, and Y. Su, “Mind2Web: Towards a generalist agent for the web,” 2023. [Online]. Available: https://arxiv.org/abs/2306.06070

[75] W. Zhang, K. Tang, H. Wu, M. Wang, Y. Shen, G. Hou, Z. Tan, P. Li, Y. Zhuang, and W. Lu, “AgentPro: Learning to evolve via policy-level reflection and optimization,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), L.-W. Ku, A. Martins, and V. Srikumar, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug. 2024, pp. 5348–5375. [Online]. Available: https://aclanthology.org/2024.acl-long.292

[76] W. Zhang, Y. Shen, L. Wu, Q. Peng, J. Wang, Y. Zhuang, and W. Lu, “Self-contrast: Better reflection through inconsistent solving perspectives,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), L.-W. Ku, A. Martins, and V. Srikumar, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug. 2024, pp. 3602–3622. [Online]. Available: https://aclanthology.org/2024.acl-long.197

[77] J. Han, W. Buntine, and E. Shareghi, “Towards uncertainty-aware language agent,” in Findings of the Association for Computational Linguistics: ACL 2024, L.-W. Ku, A. Martins, and V. Srikumar, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug. 2024, pp. 6662–6685. [Online]. Available: https://aclanthology.org/2024.findings-acl.398

[78] T. Xie, D. Zhang, J. Chen, X. Li, S. Zhao, R. Cao, T. J. Hua, Z. Cheng, D. Shin, F. Lei, Y. Liu, Y. Xu, S. Zhou, S. Savarese, C. Xiong, V. Zhong, and T. Yu, “OSWorld: Benchmarking multimodal agents for open-ended tasks in real computer environments,” ArXiv, vol. abs/2404.07972, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:269042918

[79] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-T. Yih, T. Rocktäschel, S. Riedel, and D. Kiela, “Retrieval-augmented generation for knowledge-intensive NLP tasks,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 33, 2020, pp. 9459–9474. [Online]. Available: https://proceedings.neurips.cc/paper/2020/file/6b493230205f780e1bc26945df7481e5-Paper.pdf

[80] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA, USA: MIT Press, 1998.

[81] P. Abbeel and A. Y. Ng, “Apprenticeship learning via inverse reinforcement learning,” in Proceedings of the Twenty-first International Conference on Machine Learning (ICML). ACM, 2004, pp. 1–8.
