What Is ChatGPT Doing ... and Why Does It Work? (Part 13)

Semantic Grammar and the Power of Computational Language

What does it take to produce “meaningful human language”? In the past, we might have assumed it could be nothing short of a human brain. But now we know it can be done quite respectably by the neural net of ChatGPT. Still, maybe that’s as far as we can go, and there’ll be nothing simpler—or more human understandable—that will work. But my strong suspicion is that the success of ChatGPT implicitly reveals an important “scientific” fact: that there’s actually a lot more structure and simplicity to meaningful human language than we ever knew—and that in the end there may be even fairly simple rules that describe how such language can be put together.

As we mentioned above, syntactic grammar gives rules for how words corresponding to things like different parts of speech can be put together in human language. But to deal with meaning, we need to go further. And one version of how to do this is to think about not just a syntactic grammar for language, but also a semantic one.

For purposes of syntax, we identify things like nouns and verbs. But for purposes of semantics, we need “finer gradations”. So, for example, we might identify the concept of “moving”, and the concept of an “object” that “maintains its identity independent of location”. There are endless specific examples of each of these “semantic concepts”. But for the purposes of our semantic grammar, we’ll just have some general kind of rule that basically says that “objects” can “move”. There’s a lot to say about how all this might work (some of which I’ve said before). But I’ll content myself here with just a few remarks that indicate some of the potential path forward.

It’s worth mentioning that even if a sentence is perfectly OK according to the semantic grammar, that doesn’t mean it’s been realized (or even could be realized) in practice. “The elephant traveled to the Moon” would doubtless “pass” our semantic grammar, but it certainly hasn’t been realized (at least yet) in our actual world—though it’s absolutely fair game for a fictional world.
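
To make the idea of a rule like “objects” can “move” concrete, here is a minimal Python sketch of a semantic-grammar check. Everything in it is an assumption made for illustration: the category names, the toy lexicon, the relation phrases, and the function `passes_semantic_grammar` are hypothetical and do not come from the Wolfram Language or from any existing system.

```python
# Hypothetical semantic categories (illustrative names only).
OBJECT = "object"  # something that keeps its identity independent of location
PLACE = "place"
TIME = "time"

# Toy lexicon mapping words to semantic categories (an assumption for this sketch).
LEXICON = {
    "elephant": OBJECT,
    "ball": OBJECT,
    "Moon": PLACE,
    "kitchen": PLACE,
    "Tuesday": TIME,
}

# One general rule, stated for two relation phrases: an "object" can "move" to a "place".
RULES = {
    "traveled to": (OBJECT, PLACE),
    "rolled into": (OBJECT, PLACE),
}


def passes_semantic_grammar(subject: str, relation: str, target: str) -> bool:
    """Return True if the triple fits a rule.

    This only checks that the categories fit together; it says nothing
    about whether the statement has been realized in the actual world.
    """
    rule = RULES.get(relation)
    if rule is None:
        return False
    return (LEXICON.get(subject), LEXICON.get(target)) == rule


print(passes_semantic_grammar("elephant", "traveled to", "Moon"))  # True
print(passes_semantic_grammar("Tuesday", "traveled to", "Moon"))   # False
```

The first call returns True because an elephant is an “object” and the Moon is a “place”, even though no such trip has happened. The check is about what can fit together, not about what has been realized.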

When we start talking about “semantic grammar” we’re soon led to ask “What’s underneath it?” What “model of the world” is it assuming? A syntactic grammar is really just about the construction of language from words. But a semantic grammar necessarily engages with some kind of “model of the world”—something that serves as a “skeleton” on top of which language made from actual words can be layered.

Until recent times, we might have imagined that (human) language would be the only general way to describe our “model of the world”. Already a few centuries ago there started to be formalizations of specific kinds of things, based particularly on mathematics. But now there’s a much more general approach to formalization: computational language.

And, yes, that’s been my big project over the course of more than four decades (as now embodied in the Wolfram Language): to develop a precise symbolic representation that can talk as broadly as possible about things in the world, as well as abstract things that we care about. And so, for example, we have symbolic representations for cities and molecules and images and neural networks, and we have built-in knowledge about how to compute about those things.

And, after decades of work, we’ve covered a lot of areas in this way. But in the past, we haven’t particularly dealt with “everyday discourse”. In “I bought two pounds of apples” we can readily represent (and do nutrition and other computations on) the “two pounds of apples”. But we don’t (quite yet) have a symbolic representation for “I bought”.
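
As a rough illustration of what it means to represent “two pounds of apples” symbolically and compute with it, here is a small Python sketch. It is not Wolfram Language code, and the `Quantity` class and its fields are hypothetical names chosen for this example. The only fact relied on is the international definition of the pound as exactly 0.45359237 kg.

```python
from dataclasses import dataclass

KG_PER_POUND = 0.45359237  # exact, by the international definition of the pound


@dataclass(frozen=True)
class Quantity:
    """A hypothetical symbolic quantity: an amount, a unit, and the item measured."""
    amount: float
    unit: str  # this sketch only knows "pounds" and "kilograms"
    item: str

    def to_kilograms(self) -> "Quantity":
        """Return an equivalent quantity expressed in kilograms."""
        if self.unit == "kilograms":
            return self
        if self.unit == "pounds":
            return Quantity(self.amount * KG_PER_POUND, "kilograms", self.item)
        raise ValueError(f"unknown unit: {self.unit}")


apples = Quantity(2, "pounds", "apples")
print(apples.to_kilograms())
```

The noun phrase has a precise representation that supports computation, while the sketch has nothing at all to say about “I bought”, which is the gap the paragraph above describes.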

It’s all connected to the idea of semantic grammar—and the goal of having a generic symbolic “construction kit” for concepts, that would give us rules for what could fit together with what, and thus for the “flow” of what we might turn into human language.

But let’s say we had this “symbolic discourse language”. What would we do with it? We could start off doing things like generating “locally meaningful text”. But ultimately we’re likely to want more “globally meaningful” results—which means “computing” more about what can actually exist or happen in the world (or perhaps in some consistent fictional world).

Right now in Wolfram Language we have a huge amount of built-in computational knowledge about lots of kinds of things. But for a complete symbolic discourse language we’d have to build in additional “calculi” about general things in the world: if an object moves from A to B and from B to C, then it’s moved from A to C, etc.
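
The movement rule just mentioned (from A to B and from B to C implies from A to C) can be sketched as a tiny “calculus” in Python. This is only an assumption about how such a rule might be encoded: the function name `derive_moves` and the representation of facts as (origin, destination) pairs are hypothetical, and a real symbolic discourse language would need far more than this.

```python
from typing import Set, Tuple

Move = Tuple[str, str]  # (origin, destination)


def derive_moves(moves: Set[Move]) -> Set[Move]:
    """Close a set of movement facts under the rule
    moved(A, B) and moved(B, C) => moved(A, C)."""
    derived = set(moves)
    changed = True
    while changed:  # repeat until no new fact can be added
        changed = False
        for a, b in list(derived):
            for c, d in list(derived):
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived


facts = {("A", "B"), ("B", "C")}
print(("A", "C") in derive_moves(facts))  # True
```

The loop simply reapplies the one rule until nothing new follows, which is enough to show how a stated fact about the world can be turned into something that is computed rather than merely asserted.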

Given a symbolic discourse language we might use it to make “standalone statements”. But we can also use it to ask questions about the world, “Wolfram|Alpha style”. Or we can use it to state things that we “want to make so”, presumably with some external actuation mechanism. Or we can use it to make assertions—perhaps about the actual world, or perhaps about some specific world we’re considering, fictional or otherwise.

Human language is fundamentally imprecise, not least because it isn’t “tethered” to a specific computational implementation, and its meaning is basically defined just by a “social contract” between its users. But computational language, by its nature, has a certain fundamental precision—because in the end what it specifies can always be “unambiguously executed on a computer”. Human language can usually get away with a certain vagueness. (When we say “planet” does it include exoplanets or not, etc.?) But in computational language we have to be precise and clear about all the distinctions we’re making.

It’s often convenient to leverage ordinary human language in making up names in computational language. But the meanings they have in computational language are necessarily precise—and might or might not cover some particular connotation in typical human language usage.

How should one figure out the fundamental “ontology” suitable for a general symbolic discourse language? Well, it’s not easy. Which is perhaps why little has been done in this area since the primitive beginnings Aristotle made more than two millennia ago. But it really helps that today we know so much about how to think about the world computationally (and it doesn’t hurt to have a “fundamental metaphysics” from our Physics Project and the idea of the ruliad).

But what does all this mean in the context of ChatGPT? From its training ChatGPT has effectively “pieced together” a certain (rather impressive) quantity of what amounts to semantic grammar. But its very success gives us a reason to think that it’s going to be feasible to construct something more complete in computational language form. And, unlike what we’ve so far figured out about the innards of ChatGPT, we can expect to design the computational language so that it’s readily understandable to humans.

When we talk about semantic grammar, we can draw an analogy to syllogistic logic. At first, syllogistic logic was essentially a collection of rules about statements expressed in human language. But (yes, two millennia later) when formal logic was developed, the original basic constructs of syllogistic logic could now be used to build huge “formal towers” that include, for example, the operation of modern digital circuitry. And so, we can expect, it will be with more general semantic grammar. At first, it may just be able to deal with simple patterns, expressed, say, as text. But once its whole computational language framework is built, we can expect that it will be able to be used to erect tall towers of “generalized semantic logic”, that allow us to work in a precise and formal way with all sorts of things that have never been accessible to us before, except just at a “ground-floor level” through human language, with all its vagueness.

We can think of the construction of computational language—and semantic grammar—as representing a kind of ultimate compression in representing things. Because it allows us to talk about the essence of what’s possible, without, for example, dealing with all the “turns of phrase” that exist in ordinary human language. And we can view the great strength of ChatGPT as being something a bit similar: because it too has in a sense “drilled through” to the point where it can “put language together in a semantically meaningful way” without concern for different possible turns of phrase.

So what would happen if we applied ChatGPT to underlying computational language? The computational language can describe what’s possible. But what can still be added is a sense of “what’s popular”—based for example on reading all that content on the web. But then—underneath—operating with computational language means that something like ChatGPT has immediate and fundamental access to what amount to ultimate tools for making use of potentially irreducible computations. And that makes it a system that can not only “generate reasonable text”, but can expect to work out whatever can be worked out about whether that text actually makes “correct” statements about the world—or whatever it’s supposed to be talking about.

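This article is an analysis of how ChatGPT works that I came across online. The author explains how ChatGPT operates, moving from the basics to more advanced ideas without going into the details of specific models or algorithm implementations, which makes it suitable reading for developers who do not work in machine learning.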

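The author, Stephen Wolfram, is a well-known scientist, entrepreneur, and computer scientist. He is the founder and CEO of Wolfram Research, the company that has developed many computer programs and technologies, including Mathematica and the Wolfram|Alpha computational knowledge engine.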

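The translation was first produced with ChatGPT and then revised by me; text in red marks explanatory notes that I added. Because the original is very long, I have split it by chapter into multiple posts. Readers who want to see the original can follow the link below.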

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
