What Is ChatGPT Doing... and Why Does It Work? (Part 14)

So … What Is ChatGPT Doing, and Why Does It Work?


The basic concept of ChatGPT is at some level rather simple. Start from a huge sample of human-created text from the web, books, etc. Then train a neural net to generate text that’s “like this”. And in particular, make it able to start from a “prompt” and then continue with text that’s “like what it’s been trained with”.

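In code terms, that loop is about as simple as it sounds. Here is a minimal sketch in Python, where `model.next_token_probs` and the `tokenizer` are hypothetical stand-ins (not ChatGPT's actual interface) for a net that returns next-token probabilities:

```python
import numpy as np

def generate(model, tokenizer, prompt, max_new_tokens=50, temperature=0.8):
    """Repeatedly ask the net for a probability distribution over the
    next token, sample from it, and feed the choice back in."""
    tokens = tokenizer.encode(prompt)            # the prompt, as token IDs
    for _ in range(max_new_tokens):
        probs = model.next_token_probs(tokens)   # hypothetical call: one forward pass
        logits = np.log(probs) / temperature     # temperature reshapes the distribution
        p = np.exp(logits - logits.max())
        p /= p.sum()
        next_token = int(np.random.choice(len(p), p=p))
        tokens.append(next_token)                # the net never "plans ahead"
    return tokenizer.decode(tokens)
```

Nothing in this loop looks ahead: each new token is sampled only from the distribution implied by the text generated so far.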

As we’ve seen, the actual neural net in ChatGPT is made up of very simple elements—though billions of them. And the basic operation of the neural net is also very simple, consisting essentially of passing input derived from the text it’s generated so far “once through its elements” (without any loops, etc.) for every new word (or part of a word) that it generates.

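As a toy illustration of "very simple elements, once through" (a generic feed-forward net, not ChatGPT's actual transformer stack), each element below just forms a weighted sum of its inputs and applies a fixed nonlinearity, and the signal visits each layer exactly once:

```python
import numpy as np

def gelu(x):
    # A smooth nonlinearity of the kind used in transformer-style nets
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def forward_once(x, layers):
    """One strictly feed-forward pass: each (weights, bias) layer is
    applied exactly once, in order, with no loops back to earlier layers."""
    for W, b in layers:
        x = gelu(W @ x + b)   # every "element" is just a weighted sum + nonlinearity
    return x

rng = np.random.default_rng(0)
layers = [(rng.normal(size=(16, 8)), np.zeros(16)),
          (rng.normal(size=(8, 16)), np.zeros(8))]
print(forward_once(rng.normal(size=8), layers))
```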

But the remarkable—and unexpected—thing is that this process can produce text that’s successfully “like” what’s out there on the web, in books, etc. And not only is it coherent human language, it also “says things” that “follow its prompt” making use of content it’s “read”. It doesn’t always say things that “globally make sense” (or correspond to correct computations)—because (without, for example, accessing the “computational superpowers” of Wolfram|Alpha) it’s just saying things that “sound right” based on what things “sounded like” in its training material.


The specific engineering of ChatGPT has made it quite compelling. But ultimately (at least until it can use outside tools) ChatGPT is “merely” pulling out some “coherent thread of text” from the “statistics of conventional wisdom” that it’s accumulated. But it’s amazing how human-like the results are. And as I’ve discussed, this suggests something that’s at least scientifically very important: that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought. ChatGPT has implicitly discovered it. But we can potentially explicitly expose it, with semantic grammar, computational language, etc.


What ChatGPT does in generating text is very impressive—and the results are usually very much like what we humans would produce. So does this mean ChatGPT is working like a brain? Its underlying artificial-neural-net structure was ultimately modeled on an idealization of the brain. And it seems quite likely that when we humans generate language many aspects of what’s going on are quite similar.

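The "idealization" here is the classic artificial neuron: whatever the electrochemical complexity of real neurons, the abstraction that neural nets keep is just a weighted sum passed through a threshold-like activation. A minimal version:

```python
import numpy as np

def idealized_neuron(inputs, weights, bias):
    """The textbook abstraction of a biological neuron: sum the weighted
    inputs, add a bias, and squash through a saturating activation."""
    return np.tanh(np.dot(weights, inputs) + bias)

print(idealized_neuron(np.array([0.5, -1.0, 2.0]),
                       np.array([0.1, 0.4, -0.2]), bias=0.3))
```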

When it comes to training (AKA learning) the different “hardware” of the brain and of current computers (as well as, perhaps, some undeveloped algorithmic ideas) forces ChatGPT to use a strategy that’s probably rather different (and in some ways much less efficient) than the brain. And there’s something else as well: unlike even in typical algorithmic computation, ChatGPT doesn’t internally “have loops” or “recompute on data”. And that inevitably limits its computational capability—even with respect to current computers, but definitely with respect to the brain.

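One way to make the cost of "no loops" concrete is to compare open-ended iteration with a fixed-depth unrolling, which is all a single feed-forward pass can express. The Collatz rule below is just a stand-in for any computation whose step count isn't known in advance:

```python
def collatz_steps(n):
    """Open-ended iteration: keeps looping until the condition is met,
    however many steps that takes."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

def collatz_fixed_depth(n, depth=10):
    """What a loop-free, feed-forward pass can express: at most `depth`
    applications of the rule, regardless of what the input needs."""
    for _ in range(depth):
        if n != 1:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
    return n

print(collatz_steps(27))        # 111 steps, far beyond a depth-10 unrolling
print(collatz_fixed_depth(27))  # still far from 1 when the depth runs out
```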

It’s not clear how to “fix that” and still maintain the ability to train the system with reasonable efficiency. But to do so will presumably allow a future ChatGPT to do even more “brain-like things”. Of course, there are plenty of things that brains don’t do so well—particularly involving what amount to irreducible computations. And for these both brains and things like ChatGPT have to seek “outside tools”—like Wolfram Language.

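The "outside tools" pattern can be sketched roughly as follows; the `model.complete` method and the `CALC:` convention are invented for illustration, with Python's exact arithmetic standing in for a tool like Wolfram Language:

```python
def answer_with_tool(model, question):
    """Let the language model delegate exact computation to a tool
    instead of guessing at what merely "sounds right"."""
    draft = model.complete(question)              # hypothetical call
    if draft.startswith("CALC:"):                 # made-up tool-request convention
        expression = draft[len("CALC:"):].strip()
        result = eval(expression, {"__builtins__": {}})  # exact evaluation as the "tool"
        return model.complete(f"{question}\nTool result: {result}")
    return draft
```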

But for now it’s exciting to see what ChatGPT has already been able to do. At some level it’s a great example of the fundamental scientific fact that large numbers of simple computational elements can do remarkable and unexpected things. But it also provides perhaps the best impetus we’ve had in two thousand years to understand better just what the fundamental character and principles might be of that central feature of the human condition that is human language and the processes of thinking behind it.


Thanks


I’ve been following the development of neural nets now for about 43 years, and during that time I’ve interacted with many people about them. Among them—some from long ago, some from recently, and some across many years—have been: Giulio Alessandrini, Dario Amodei, Etienne Bernard, Taliesin Beynon, Sebastian Bodenstein, Greg Brockman, Jack Cowan, Pedro Domingos, Jesse Galef, Roger Germundsson, Robert Hecht-Nielsen, Geoff Hinton, John Hopfield, Yann LeCun, Jerry Lettvin, Jerome Louradour, Marvin Minsky, Eric Mjolsness, Cayden Pierce, Tomaso Poggio, Matteo Salvarezza, Terry Sejnowski, Oliver Selfridge, Gordon Shaw, Jonas Sjöberg, Ilya Sutskever, Gerry Tesauro and Timothee Verdier. For help with this piece, I’d particularly like to thank Giulio Alessandrini and Brad Klee.


PS: Deep learning is not the only art involved here; translation is one too. Different languages embody different ways of thinking, which makes it hard to find words in one language that exactly match the original in another. Even with machine translation followed by manual editing, this turned out to be much more work than expected.


This article is an analysis of how ChatGPT works that I came across online. The author explains how ChatGPT operates, building up from the basics, without ever going into the concrete implementation of specific models and algorithms, which makes it a good read for developers without a machine-learning background.

The author, Stephen Wolfram, is a well-known scientist, entrepreneur, and computer scientist. He is the founder and CEO of Wolfram Research, which has developed many software systems and technologies, including Mathematica and the Wolfram|Alpha computational knowledge engine.

This translation was first produced with ChatGPT and then revised by me; the passages in red are notes I added. Since the original is very long, I have split it into multiple posts by section. If you would like to read the original, follow the link below.

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
