Language as an analogy in the natural sciences

A conference held in Munich, November 20-23, 1997.

Belaboring the obvious: Chemistry as sister science to linguistics

Pierre Laszlo
Ecole polytechnique, Palaiseau, France
and University of Liège, Belgium

Introduction
I see the purpose of this conference as two-fold: to impress upon us participants, whether we are active scientists, historians of science or philosophers of science, the centrality of the language metaphor; and to make us, at the same time, question this metaphor as to its limits, its usefulness, its risks and its pathologies.
I have chosen in my presentation to deal with the science of chemistry in its resemblance to the science of linguistics. As we shall see, just as linguistics projects onto an empirical continuum, that of sound (or rather phonological) production, a small number of discrete units or phonemes, so chemistry interprets the material world with a small number of discrete units or modules. These are identified with radicals such as methyl, phenyl or benzyl, and with other units such as a molecule in a molecular solid, a triad in a stereoregular polymer or an amino acid residue in a protein.
Thus, my concern today is very little with the language of chemistry. I am not talking about nomenclature and terminology, I'm talking about what I've termed la parole des choses.
I'm speaking to the issue of the parallel between chemistry and a natural language. In doing so, I'll try to steer clear of two rival temptations, two pitfalls which we might call the anthropomorphic fallacy and the anthropocentric fallacy. The former would assert too much and claim linguistic status for the substances and the operations of chemistry. The latter would allow no such claim whatsoever, because only the human animal, it would assert, is graced with language. In the last resort, the anthropocentric fallacy, with its emphasis on Logos, goes back to the Judeo-Christian tradition of divine utterances.
At this juncture, the distinction introduced by Saussure between langue and parole is most helpful. Some of the points I shall be making will refer to language as langue; others, including some of the examples, will refer to language as parole.

My approach to the question of locating chemistry on a map of knowledge is to look at the everyday activity of chemists. Since I am myself a practicing chemist, this is a natural thing to do. Some sociologists of science would however rule out such testimony as tainted and irrelevant. In their view, only an outsider is in a position to locate the lines of force in a scientific field. What sociologists may gain in objectivity, they lack in competence, being unable for instance to distinguish between a passing scientific fad (such as use of NMR shift reagents in the 1970s) and a genuine breakthrough (such as the discovery of fullerenes in 1985). In any case, I hope that this mini-essay will convince you that the attempt at self-scrutiny was worth making, that the somewhat original position I shall be building up to is worth a hearing at the least, and that it offers enough avenues for further exploration to make it valuable.

1. The elucidation of molecular structure.

Chemistry is to a very large extent a molecular science. The first task in a laboratory is to determine the structure, viz. the arrangement of the atoms relative to one another, in a new molecule, whether natural or man-made. I won't dwell on the technical aspects: mass spectrometry and nuclear magnetic resonance are the tools of choice for such structural determination. What these methodologies provide, however, is not a three-dimensional model of the molecule, showing the atoms with their actual positions. Such model building comes afterwards, almost as an afterthought, and is not that important. The information that chemists need and obtain from such structural analysis is of an altogether different type. Let me make an analogy. If the molecular model were to be built from a Lego set, the pieces of Lego would be identified, one by one, and handed to the chemist in the correct sequence for building up the appropriate structure, as contrasted with one of the numerous isomers.

The Lego units or modules I am referring to are methyl groups CH3, hydroxy groups OH, phenyl groups C6H5, carbonyls CO, etc. They are the radicals that were introduced in the 1850s by Dumas, Laurent, Liebig, only after a heated controversy. These radicals had the beneficial effect of removing chemistry from the status of a natural science and propping it up into what Popper would term World 3, i.e. of replacing positivist chemical species with what, at that time of the mid-Nineteenth Century, were merely Platonic archetypes. But radicals, these idealized concepts, have been fantastically productive, and they continue to constitute the core of structural chemistry.

Hence, I beg you to trust me if I assert, to summarize this first observation of what chemists really do in their laboratories, that

the everyday practice of chemists in elucidating the structure of an unknown compound produces semantic units, analogous to phonemes in language.

2. Building a molecule

The complementary standard activity of chemists is synthesis. We make molecules. We build both natural substances and artificial products. There are many reasons for and many aspects to such synthetic endeavor. Chemistry is both a science and an industry. Advancement of knowledge demands that we devise all sorts of new and crazy structures, molecules in the shape of knots, molecules in the shape of the Platonic solids, molecules in the shape of dendrimers or footballs, etc. The pharmaceutical industry has a voracious need for synthesizing rare but biologically active natural products as antitumor agents, antihypertensives, antiretrovirals, insect antifeedants, etc. As a rule, such molecules stem from plants and they sport complex and delicate architectures, demanding exquisite skill in their step-by-step construction. Not only do we need to duplicate the natural construct, in order to make it abundant, we also synthesize hundreds of variants in order to try and improve potency as a drug while reducing undesirable side effects.

There is no need to elaborate further here about the need for synthesis. Let me only say that the activity, like that of structural analysis, occupies the core of chemical science. The everyday practice of chemists in performing the multi-stage synthesis of a complex molecule assembles parts or modules, near-identical to those involved in (and identified by) the structural determination.

This buildup process is akin to word and sentence construction from the units of speech.

Historical reminder

The ethanol molecule (alcohol) had its elemental composition established early in the XIXth century. It was shown as equivalent to the sum of an ethylene molecule and a water molecule. And indeed, it proved feasible soon thereafter to synthesize ethanol by adding water to ethylene. This continues also to be one of its industrial preparations.

However, from the standpoint of chemical thought, equating for synthesis' sake a molecule to the sum of another two stable, existing molecules is sterile. Productive, fruitful thought does not consider ethanol as C2H6O = C2H4 + H2O, but as C2H5OH, i.e. as the union of the two radicals ethyl C2H5 and hydroxy OH. Why is it productive? Because these two units alone account also for other molecules, (C2H5)2 (butane) and (OH)2 (hydrogen peroxide).
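
The bookkeeping behind this remark can be checked with a few lines of Python. This is only a sketch: the helper names `combine` and `formula` are invented for the illustration, and a radical is reduced to a bare count of its atoms, which ignores everything about bonding and structure.

```python
from collections import Counter

# Atom counts for the radicals and molecules named in the text.
ETHYL = Counter({"C": 2, "H": 5})
HYDROXY = Counter({"O": 1, "H": 1})
ETHYLENE = Counter({"C": 2, "H": 4})
WATER = Counter({"H": 2, "O": 1})


def combine(*units):
    """Add up the atom counts of the given units (illustrative helper)."""
    total = Counter()
    for unit in units:
        total += unit
    return total


def formula(composition):
    """Write carbon first, hydrogen second, then other elements alphabetically."""
    first = [e for e in ("C", "H") if composition[e] > 0]
    rest = sorted(e for e in composition if e not in ("C", "H"))
    parts = []
    for element in first + rest:
        count = composition[element]
        parts.append(element if count == 1 else f"{element}{count}")
    return "".join(parts)


print(formula(combine(ETHYL, HYDROXY)))    # ethanol: C2H6O
print(formula(combine(ETHYL, ETHYL)))      # butane: C4H10
print(formula(combine(HYDROXY, HYDROXY)))  # hydrogen peroxide: H2O2

# Both decompositions give the same atom count for ethanol.
print(combine(ETHYLENE, WATER) == combine(ETHYL, HYDROXY))  # True
```

Both ways of splitting ethanol balance equally well; the count alone cannot say which is the more fruitful. The advantage of the radical description is that the same two units can be reused to write other molecules.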

3. Mixing and combining

Allow me at this point a short excursion. Piagetian epistemology is the inspiration. Its contribution has been a typology of the gradual acquisition by children, as they grow up, of various mental skills then related, as so many basic elements, to our advancement of knowledge about the world.

I'll draw on two well-established behaviors, quite a few years apart in child development and yet related to one another in a deep sense. The first is echolalia, as an infant starts to speak. His or her first utterances do not make sense (and perhaps do not attempt to make sense, or to communicate yet with parents and siblings). The child is learning the sounds of a particular language. After having heard speech around him, he makes noises with twin characteristics: a chanting according to the phonology of the language being acquired, thus following the rules of what can indeed be termed prosody, and the emission of a series of phonemes pertaining to the language being learned. The point of the matter is that the string of phonemes, in such echolalia, is assembled seemingly at random but of course in rather strict conformity with the phonological rules for this particular language. It seems as if the maturing brain were combining modules of speech, in both a playful and an experimental manner. Gradually, as the baby becomes more and more familiar both with the discrete elements of human speech (phonemes) and with their further temporal organization into e.g. tonal modulation (phonology), he will start uttering syllables and words and he will start associating meanings with these phonic emissions.

And one might complement such data, regarding speech acquisition, with other and structurally similar observations, regarding for instance sensorimotor skills, as when the baby tries sorting out objects of various shapes, in a task such as slipping a disk or triangle through the appropriate slots in a box. The observer cannot help noticing that, rather than a moronic trial-and-error procedure, the child tries out, in more or less systematic manner, various configurations resulting from combining the elements at hand. Play here is inseparable from cognition.

A few years later, a young boy or girl will start a chemistry of sorts. The usual first step is play, at home or at kindergarten, with colors. The child mixes colors from the paintbox in an experimental spirit, to find out the result of combining a red and a yellow, or a purple and a green, and so on. Of course, there is no essential difference between such an activity and the earlier echolalia. Both aim at establishing a store of knowledge about generating new species of artefacts from the combining of available discrete elements, whether these are speech units or chemical dyes.

Later on, usually in an age range of say 6 to 12, the child is sometimes presented with a chemistry toy kit. With all recommendations from the instruction booklet ignored, the boy or girl, if left alone to play with the set of chemicals, will again investigate, in a playful inquisitive mood, whether combining two or several ingredients will have an interesting result, such as an unexpected color change, a sudden heating or, better yet, a spectacular explosion to startle the parents with the magic prowess that the little brat has been able to endow himself with by his own wits! Children love to mix things together. Chemists perhaps are the segment of society allowed to continue indulging in such "childish" behavior.

Let me summarize this section thus:

The small child spontaneously mixes together substances, paints for instance, in what can be termed as "creative curiosity." Thus, from the standpoint of genetic epistemology, the combining urge is basic to chemistry.

4. The lesson from science history

Mixing and combining are intrinsic to chemistry from its very beginnings in Modern Times, when it parted from alchemy during the XVIIth and XVIIIth centuries, as can be documented from the books of Jean Béguin (Tyrocinium chymicum), Lémery, and Macquer. Already in this early period, there are numerous indications, from the various authors, that they conceptualized chemical species as the union of component particles. The art of the chemist consisted in separating these, using operators such as heat or menstrues (solvents), prior to recombining them in novel ways.

The evidence is widespread. The entry CHYMIE by Gabriel Venel in the Encyclopédie is very clear on this point. The whole chapter in the history of chemistry devoted to the devising of tables of chemical affinities in the 1770s partakes of the same notion of a systematic exploration of the binary interactions of various elements, to use a modern terminology. At that time of the 1770s, one can argue for the existence of an explicit research program to systematically chart the map of such binary compounds. One example makes that point very effectively. This is the book by Baumé: Chymie expérimentale et raisonnée. (Paris, Didot, 1773). It is a systematically organized description of both existing and unknown chemical compounds. The pages of Baumé's treatise swarm with negative observations such as those quoted below, pointing to an overall view of chemistry as a combinatorial science:

vol. 2. p. 145 "Borax and Lime Water. The effects both of lime and of lime water on borax are unknown"

"Borax and Sulfur. The effects of sulfur on borax are also unknown."

(p. 146) "Borax and Nitre. The effects of nitre on borax are unknown; one only knows that nitre does not detonate."

(p. 238) "Arsenic regule with distilled vinegar. I did not perform experiments to find out the action of distilled vinegar upon arsenic; but it is to be presumed that it would act no better than distilled water."

(p. 255) "Arsenic and borax. The effect of these two substances one upon the other is unknown."

(p. 304) "Nickel with nitre. The effects of pure nickel with nitre remain unknown."

volume 3, p. 176

"Platinum and nickel. We have not examined the properties of these two substances one on the other."

I can do no better than to close this section with a quotation from G.B. de Saint-Romain. It expresses very articulately the powerful metaphor of a chemical compound being analogous to a word. Atoms in a compound are, according to Saint-Romain, just like letters and syllables in a word:

There is a difference between the characteristics of simple elements, which are atoms, and the characteristics of substances that are composed. The first are immutable and incorruptible like atoms, and the second are changing and transient like compounds. (...) Thus, atoms being immutable by their stability, their characteristics have the same immutability, but substances that are composed of several distinct parts are liable to change as their parts change location, or separate entirely.

(...)

The letters that compose syllables and words provide a very appropriate example for explaining this doctrine. Letters are immutable and by changing location they change the syllable or the word, without any change occurring in the appearance and in the substance or essence of the letters, which always remain the same in whichever condition and in whichever arrangement they are put. Now it is certain that letters, of which there are twenty-four, provide what is necessary for the formation of all syllables, all words, all dictions and all speeches, and even all books that are composed in the world. And as words and dictions, and syllables and speeches and even books change without the letters undergoing any change, similarly large and small compounds change and corrupt, without the atoms changing and perishing in any way.

(...)

Letters are the true portrait of atoms with regard to the composition or decomposition of things. As the substance, essence and nature of words depend on syllables, those of syllables depend on letters and their arrangement. Similarly the substance, essence and nature of substances depend on smaller ones, that are called corpuscles, and those of corpuscles depend on atoms and their arrangement.

Let me sum up this section:

That chemistry is a combinatorial art was established, or at least recognized, very soon after it parted ways with alchemy. Some of the XVIIth and XVIIIth century chemists were very lucid on this score.

5. Combinatorial chemistry

Another line of evidence is striking. Chemistry has acquired during the last few years a brand-new subdiscipline, known as "combinatorial chemistry." It bridges the science and the industry, since most of the present applications are in the area of drug design for the pharmaceutical industry.

First, let me explain the concept. Let us assume, for simplicity's sake, that we want to synthesize quadripartite molecules, consisting of the union of four modules A, B, C, D. Thus, the resulting product can be written A-B-C-D. To give a concrete example, if the component units A, B, C and D are amino acid residues, A-B-C-D would be a tetrapeptide. This example points to the desirability of such a preparation, since numerous tetrapeptides are endowed with interesting and often beneficial biological activities, as hormones, neurotransmitters, etc.

Techniques are available for the automated synthesis of all possible entities with the general formula A-B-C-D, separated from one another on small beads of polymer, from a variety of choices for each of the A-D modules. In the example of a tetrapeptide, and drawing from the store of the existing 20 natural amino acids, one can thus synthesize (in a matter of a few days!) no fewer than 20 x 20 x 20 x 20 = 20^4 = 160,000 tetrapeptides. This chemical library is then examined, using sensitive biological tests, to locate those (and only those) products that look promising for future development as drugs.
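
The size of such a library can be illustrated by listing it on paper, so to speak. The following Python sketch enumerates every four-module string; the function name `library` and the use of the standard one-letter amino acid codes are choices made for this illustration only, and nothing here models the actual bead-based synthesis.

```python
from itertools import product

# One-letter codes for the 20 natural amino acids (an illustrative choice
# of labels; any 20 distinct module names would do).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library(modules, length):
    """Yield every ordered string of `length` modules, written A-B-C-D style."""
    for sequence in product(modules, repeat=length):
        yield "-".join(sequence)


tetrapeptides = list(library(AMINO_ACIDS, 4))

print(len(tetrapeptides))  # 160000, i.e. 20**4
print(tetrapeptides[0])    # A-A-A-A
print(tetrapeptides[-1])   # Y-Y-Y-Y
```

Adding a fifth position multiplies the count by another factor of 20, which is why such libraries outgrow any one-at-a-time synthesis so quickly.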

This activity is not confined to the pharmaceutical industry. It can also be applied to many other goals, such as the systematic screening of novel types of materials, organic, inorganic or composite, for instance for the discovery of new superconducting ceramics.

One can foresee quite a few other applications. This is a rather peculiar branch of science, unashamedly Edisonian, trial-and-error attempting to gain success (discoveries) through random chance rather than enlightened rational insight. The best analogy is the improbability of a monkey playing with the keyboard of a word processor and coming out with the text of Coleridge's "Kubla Khan."

Present-day chemistry includes explicitly a sub-discipline of "combinatorial chemistry."

6. Parallels with linguistics

To provide a first instance, let's return to this daily activity of chemists, structural analysis. It translates, for the most part, into the "reading" of a document, known as a nuclear magnetic resonance spectrum. This consists of a series of lines, occurring at various frequencies in the radiofrequency range (MHz). Looking at such an array of discrete peaks, the chemist identifies them and declares to himself: "This is a methyl group, this is a phenyl ring, this is a methyl ketone, this is an aldehyde", and so on. In other words, his brain recognizes semantic pointers, in very much the same way as we identify phonemes, syllables and words in the acoustic spectrum (30 Hz - 30 kHz). Thus, reading the underlying structural elements from an NMR spectrum of a molecule exploits cognitive skills very similar to those employed by the competent speaker of a language.

But the parallel is yet more general. To hear words when listening to speech - or likewise for the musician to hear notes when listening to sound vibrations extending in time - means a transformation by the neurones in the brain of a continuum, of a physical disturbance that varies continually with time, into a string of discrete signals.

Chemists constantly make similar transforms. Arguably the most important and most implicit among these is their constant reference to atoms in molecules. This aporia makes the assumption that, even though the electronic distribution has been altered, an oxygen atom (say) retains its identity in environments as diverse as a carbon dioxide CO2 molecule, a water H2O molecule, or an iron pentacarbonyl Fe(CO)5 molecule. Transferability, i.e. recognition of atoms and radicals in molecules, is basic to chemistry. It is a property shared with all natural languages, where the analog is the recognition of phonemes.

Let me turn to yet another parallel. Very often, in the course of the multistage synthesis of a complex molecule, that of a natural product typically, the chemist brings in an (aesthetic) element of surprise and elegance, by way of a deep-seated rearrangement. To give a schematic idea of what it consists of, and taking as an example a molecule presenting in sequence four modules A-X-Y-B, a rearrangement would turn it into X-A-B-Y, for instance. As such, rearrangements are a kind of molecular wordplay, such as in the anagram or in the pun.
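
The anagram comparison can be made literal. In the Python sketch below, a molecule is reduced to its ordered list of modules, and the hypothetical helper `is_rearrangement` simply checks that two such lists contain the same modules in a different order. This is a deliberately crude picture: a real rearrangement involves bond breaking and bond making, which a list of labels cannot show.

```python
from collections import Counter


def is_rearrangement(before, after):
    """True if `after` uses exactly the modules of `before`, in another order."""
    return before != after and Counter(before) == Counter(after)


start = ["A", "X", "Y", "B"]
rearranged = ["X", "A", "B", "Y"]

print(is_rearrangement(start, rearranged))            # True: same modules, new order
print(is_rearrangement(start, ["A", "X", "Y", "C"]))  # False: a module was swapped out
print(is_rearrangement(start, start))                 # False: nothing moved
```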

The next point I'll make, though brief, is arguably the most important. Chemistry and language share creativity, viz. the ability to bring into existence brand-new statements. Language allows us to flesh out our thoughts, on whatever topic and in whatever form. It enables us to express the poetry of the world, and to give a linguistic representation of any natural object. Chemistry allows us to make artificial species, to such an extent that mankind now lives in a chemisphere of its own devising. From head to toe, we are covered with chemicals that we have invented.

At this point, I'll stop the parallel. I could go on and on about the deep-seated analogy between chemical science and linguistic science. For those of you interested in further comparisons, I have written a whole book, La parole des choses, to deal with this issue. You'll find in it many more examples and arguments in support of the same thesis.

There is a deep analogy between chemistry and linguistics, that can be elaborated on fruitfully quite far.
Let me give a couple of examples.
All languages, a majority of which use CVC syllables consisting of a vowel V in between two consonants C, have given themselves rules to exclude series of consecutive consonants, depending on their type and on their number. In the French language, for instance, sequences of three consonants are not rare, as in astronaute, apart of course from cases in which duplication through gemination occurs, with functional value in forms such as "courrais" or "mourrais". Still in the French language, the word with the greatest accumulation of successive consonants (which may have led to its premature disappearance, since it has become archaic) is dextre, with four consonants in a row. In other words, such as exprès or extrême, the group of consonants bridges two syllables: [eks-trem]. An example in German would be the name Karlsruhe, with four consonants bridging two adjacent syllables. Consider now this chemical analogy: with the exception of polymers, to a large extent derived from carbon chemistry, molecules, especially when they include atoms from the right of the periodic table, i.e. atoms that tend to be electron-rich, dislike bringing such atoms together in a row beyond a number of about three.
Take the oxygen atom as an example. Dioxygen, a relatively unstable molecule already, has two such atoms. The ozone molecule, with three oxygen atoms, is yet more reactive. It is capable, on account of this activation, of adding to ethylenic double bonds. The five-membered ring first formed, with three oxygens in a row, is so unstable that it rearranges spontaneously to an isomeric structure with one oxygen on one side, and a peroxo bridge on the other.
Let me provide a second example of the analogy and of its fruitfulness. A central phenomenon in chemistry is conjugation, that is to say the interaction between chemical groups due to electronic delocalization. A Michael acceptor such as, for instance, acrolein shows such conjugation between the carbonyl group and the ethylenic double bond. According to the circumstances of reagents and conditions, such a conjugated entity will either react as a whole and undergo 1,4-addition, or it will behave as the sum of its independent parts, and then undergo for instance 1,2-addition.
Languages show us rather similar phenomena with respect to syllabification. Where the French language says table continuously, the English language, because its deep phonological structure is different, says table [teibl] in two syllables. Likewise, where we say peuple [pœpl], the English say it quite differently: [pi:pl]. This interesting problem in syllabification is reflected in a German word such as haben [ha:bn].

Conclusions

I am reaching the end of my presentation and the time has come to sum it up. In order to emphasize the resemblance between the two sciences of chemistry and linguistics (they have in common the assembly of discrete entities into meaningful sets or strings, together with rules of assembly and also with control devices to adjudge whether a well-formed product is apt to belong to the common stock), I have used a pragmatic approach, what might be termed a constructivist one, to infer from the observable daily behavior of present-day chemists some of the aspects of the underlying deeper structure.
I did not address explicitly the question central to this conference, that is to say whether language as it is wired into our brains provides the template, as it were (what a German might term the Ur-Modell), for all the theories and the models of science. It is a difficult and complex issue, and I fully plan to address it some day; I have been aware of this prime question for philosophers of science for quite a while. Thus, I plead guilty to not yet having tackled this problem, because I find its difficulty awesome.
However, there are other types of criticism that can be leveled at my approach, which I'd like to deal with before closing. The first such critique is that the history of chemistry shows no trace of any influence from philosophy or, later on, from linguistics. The answer is that, during the crucial period between 1840 and 1860, when chemistry invented the structural theory that was so strongly needed, linguistics had a single overwhelming concern, that of diachrony, of the evolution and history of language (which incidentally explains the extent of its influence upon Charles Darwin). Nevertheless, one can find a few points of convergence, then, between chemistry and philology. The theory of radicals is one such contact area. The benzoyl radical, for example, is conceived by analogy with the root in a word, with the observation that a series of words such as, in English, a glance, a glow, a gleam, glamour, to glisten, and a few others share the same root gl* (stemming ultimately from Indo-European, but that is another story).
A second critique would be methodological: how come I did not devote my attention to the language of chemistry, to its nomenclature for instance, since a priori this is where the language analogy ought to thrive and to be at its strongest? The answer to this objection is that the language of chemistry is indeed a mere nomenclature and that it is too impoverished a language to qualify as a counterpart of natural language. The true language of chemistry is to be found instead in chemical praxis, in what chemists actually do and in their everyday communication.

A third critique, and the last I shall consider before the discussion, is that I offered too sweeping an assertion when I termed chemistry a combinatorial art. One might declare likewise that physics deals with combinations that are all formed from elementary particles, that biology is also a combinatorial game based on genes and on cells, in short that any science of matter, living or inanimate, is at the same time a science of the combination of a small number of elements that make up the alphabet of that science. This objection was eloquently dealt with by Berthelot, when he wrote La chimie crée son objet. Chemical science is voluntaristic: it is an exercise in will power submitted to the strong constraint of having to decide a priori which, out of an astronomical number of potential structures, one should try and bring into existence.