Semantic biology has been developed in stages since the 1970s. The mathematical papers appeared in 1974 and 1987 (Barbieri, 1974a, 1974b, 1987). The first biological paper was "The ribotype theory on the origin of life" (Barbieri, 1981) and the first general theory (the concept of evolution by natural conventions) was proposed in The Semantic Theory of Evolution (Barbieri, 1985). The idea that splicing and signal transduction are based on organic codes was introduced much later (Barbieri, 1998), and so was the term semantic biology (Barbieri, 2001). There are at least five new concepts in this biology (ribotype, organic codes, organic memories, reconstruction from incomplete information and evolution by natural conventions), and they have all gone unnoticed for a long time. Something similar happened to the ideas of Edward Trifonov, who has been calling attention to sequence codes since 1988, and perhaps that was not a coincidence. Things, however, seem to be changing. The discoveries of the sugar code (Gabius, 2000) and of the histone code (Strahl and Allis, 2000; Turner, 2000) have made some impact, and there is a growing awareness that real organic codes do exist in nature. A parallel development has also taken place in philosophy and in linguistics. In 1963, Thomas Sebeok proposed that semiotics must have a biological basis, and has campaigned ever since for a more general approach which today is known as biosemiotics (Sebeok, 2001). In the long run, the place of organic codes in nature is bound to be acknowledged, and biology will need a proper theoretical framework for them. Today we only have a preliminary outline of that framework, and this last chapter is going to underline it by showing that semantic biology can be summarised in eight propositions. More precisely, it can be expressed by four general principles and four biological models.
Embryonic development was defined by Aristotle as an epigenesis, i.e. a chain of one genesis after another, a step by step generation of new structures, and, apart from the brief interval of preformationism, this view has been endorsed throughout the history of biology, and still holds good. Despite its popularity, however, epigenesis is not an easy concept to handle, and for most practical purposes it is convenient to define it as the property of a system to increase its own complexity. More precisely, epigenesis can be defined as a convergent increase of complexity, in order to emphasise that the oriented character of embryonic development is qualitatively different from the divergent increase of complexity that may take place, for example, in evolution.
The historical association of epigenesis with embryonic development has been so close that the two terms are sometimes taken as synonymous, and this has been unfortunate because it has probably prevented biologists from realising that a convergent increase of complexity is a universal feature of life. The definitions of life which have appeared in the last 200 years (starting with Lamarck's entry) have produced a long list of supposedly essential characteristics (heredity, metabolism, reproduction, homeostasis, adaptation, autopoiesis, etc.), but none of them has explicitly mentioned epigenesis (for a list of such definitions see Appendix).
The first principle of semantic biology is precisely this: epigenesis is a defining characteristic of life. Any living organism is a system that is capable of increasing its own complexity. Even single cells, for example, can be defined as systems where the phenotype is more complex than the genotype (Barbieri, 2001).
Modern biology has already acknowledged that complexity is at the very heart of life, but semantic biology goes further than that. It states that what is crucial to life is not complexity as such, but the ability to produce a convergent increase of complexity. The first principle of semantic biology, in short, is nothing less than a new definition of life.
Complexity has a straightforward intuitive meaning (it is the opposite of simplicity), but its scientific history is littered with the corpses of discarded definitions. There simply is no hope of achieving a general consensus on a comprehensive definition of complexity, and this implies that any attempt to give a mathematical formulation to the problem of epigenesis is apparently crippled at the very beginning by lack of a definition.
The second principle of semantic biology has the purpose of cutting the Gordian knot of complexity by formulating the problem of epigenesis without any explicit reference to it. More precisely, the principle states that achieving a convergent increase of complexity is equivalent, to all practical purposes, to reconstructing a structure from incomplete information.
The reconstruction of structures from projections is a problem that arises in many fields (for example in computerised tomography), and its mathematics is well known. This makes it possible to calculate the number of projections (the initial information) that allows a complete reconstruction of any given structure, and so it is also possible to define precisely what a reconstruction from incomplete information is. Such a reconstruction amounts to producing structures that belong to the object in question but for which there is insufficient initial information, and this is equivalent to saying that the reconstruction is producing a convergent increase of complexity. In the same way, to say that the phenotype of an organism is more complex than its genotype is equivalent to saying that any phenotype is reconstructed from a genotype which contains incomplete information.
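What "incomplete information" means can be illustrated with a very small case. The Python sketch below is only an illustration: the function name `projections` and the 2×2 matrices are assumptions chosen for brevity, not taken from the cited papers. It computes two projections of a matrix (its row sums and its column sums) and shows that two different structures can share exactly the same projections.

```python
def projections(matrix):
    """Return the row sums and the column sums of a rectangular matrix."""
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    return row_sums, col_sums


# Two different hypothetical structures.
structure_a = [[1, 0],
               [0, 1]]
structure_b = [[0, 1],
               [1, 0]]

# Both give row sums [1, 1] and column sums [1, 1].
same_projections = projections(structure_a) == projections(structure_b)
print(projections(structure_a), same_projections)
```

With only these two projections the initial information cannot tell the two structures apart. There are four unknowns, and the four sums supply only three independent equations, because the row sums and the column sums must add up to the same total.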
The problem, of course, is to show that such reconstructions are possible, but this has been achieved by a particular class of iterative algorithms (Barbieri, 1974a). We have, therefore, mathematical models that allow us to simulate the problem of epigenesis in a meaningful way, and hopefully to understand the logic of its various steps. The second principle of semantic biology, in conclusion, is a new definition of epigenesis. It states that epigenesis is a reconstruction from incomplete information.
The iterative algorithms that have been proposed for the reconstruction of structures from insufficient information differ from all other methods because they perform in parallel two distinct reconstructions: one for the structure matrix, and one for the so-called memory matrix, i.e. for a matrix where any convenient feature can be stored. This is why these algorithms are collectively referred to as the Memory Reconstruction Method (MRM).
With non-linear operations, for example, it is noticed that values appear at each iteration which are above the maximum or below the minimum. The space distribution of these "illegal" values is apparently random, but if they are recorded in the memory matrix, a new kind of information becomes available. It is seen that the illegal values are truly random only in some points, while in others they keep reappearing with regularity at each iteration. These last points are called vortices and, once recognised, they can be fixed and taken away from the number of the unknowns. This steadily decreases the unknowns, and when their number becomes equal to the number of equations a complete reconstruction can be performed in a straightforward way. The memory matrix, in other words, is a place where new information about the original structure appears, thus compensating for the incomplete information that was given at the beginning.
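The bookkeeping described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the published Memory Reconstruction Method: the additive row and column corrections, the signed counter kept in the memory matrix, the `threshold` parameter and the choice of fixing a vortex to the minimum or maximum value are all simplifications introduced here, and every name in the code is hypothetical.

```python
def reconstruct(row_sums, col_sums, vmin=0.0, vmax=1.0, sweeps=20, threshold=5):
    """Toy iterative reconstruction that keeps a memory matrix of illegal values."""
    n, m = len(row_sums), len(col_sums)
    eps = 1e-9                                    # tolerance for rounding errors
    structure = [[0.0] * m for _ in range(n)]     # current estimate
    memory = [[0] * m for _ in range(n)]          # signed count of illegal values
    fixed = [[False] * m for _ in range(n)]       # cells removed from the unknowns

    for _ in range(sweeps):
        # Spread each row-sum error evenly over the free cells of that row.
        for i in range(n):
            free = [j for j in range(m) if not fixed[i][j]]
            if free:
                delta = (row_sums[i] - sum(structure[i])) / len(free)
                for j in free:
                    structure[i][j] += delta
        # Do the same for each column.
        for j in range(m):
            free = [i for i in range(n) if not fixed[i][j]]
            if free:
                total = sum(structure[i][j] for i in range(n))
                delta = (col_sums[j] - total) / len(free)
                for i in free:
                    structure[i][j] += delta
        # Record illegal values in the memory matrix and fix recurring ones.
        for i in range(n):
            for j in range(m):
                if fixed[i][j]:
                    continue
                if structure[i][j] < vmin - eps:
                    memory[i][j] -= 1
                elif structure[i][j] > vmax + eps:
                    memory[i][j] += 1
                if abs(memory[i][j]) >= threshold:   # treated here as a vortex
                    structure[i][j] = vmin if memory[i][j] < 0 else vmax
                    fixed[i][j] = True
    return structure, memory, fixed


# Row and column sums of a hypothetical 3x3 binary structure.
structure, memory, fixed = reconstruct([2, 1, 0], [2, 1, 0])
print(sum(cell for row in fixed for cell in row), "cells fixed")
```

In this example run the bottom-right cell falls below the minimum on every sweep, so after five sweeps it is treated as a vortex, fixed to the minimum and removed from the unknowns. The sketch stops at the bookkeeping stage and does not attempt the final direct solution that becomes possible once the unknowns equal the equations.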
The memory space is the only space where such novel information can be found, and it follows therefore that any reconstruction from incomplete information is possible only if some kind of memory is used. In biology, this amounts to saying that any living system must contain two distinct types of structures: some have the visible role of the phenotype, while others act as depositories of information. The third principle of semantic biology, in short, states that there cannot be a convergent increase of complexity without memory. Or, in other words, organic epigenesis requires organic memories.
The information that appears in the memory space cannot be transferred automatically to the structure space, and can be used only by employing specific conventions (the recognition of vortices in the memory matrix, for example, can be used only if a convention gives a meaning to the corresponding points of the structure matrix). This is another conclusion that leads to a universal principle, because it is necessarily valid for all systems.
New information can appear in a memory only if the memory space is truly independent from the structure space, because if they were linked (as real space and Fourier space, for example) one could only have the same information in different forms. Between two independent spaces, on the other hand, there is no necessary correspondence, and therefore a link can be established only by conventions, i.e. by the rules of a code.
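The remark about linked spaces can be checked directly. The following Python sketch, offered only as an illustration with an arbitrary four-point signal, computes a discrete Fourier transform and its inverse with the standard library. Because each representation can be recovered exactly from the other, moving to Fourier space adds no new information.

```python
import cmath


def dft(values):
    """Discrete Fourier transform of a sequence of numbers."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * k / n) for f in range(n)) / n
        for k in range(n)
    ]


signal = [1.0, 0.0, 2.0, 3.0]            # arbitrary example values
spectrum = dft(signal)                   # the same information in Fourier space
recovered = [z.real for z in inverse_dft(spectrum)]

# The round trip returns the original values, up to rounding errors.
print(all(abs(a - b) < 1e-9 for a, b in zip(signal, recovered)))
```

Two spaces tied together by an invertible transformation of this kind are not independent in the sense required above, which is why a memory linked in this way could not supply anything new.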
This is the point where meaning enters the scene as a necessary entity, because the operation of establishing a correspondence between two independent worlds is equivalent to attaching a meaning to the structures of those worlds. Independent worlds, in other terms, can only be connected by codes, and if independent organic worlds do exist in life, then organic codes must also exist (the protein world and the nucleic acid world, for example, contribute to life only because there is a genetic code that builds a bridge between them).
The fourth principle of semantic biology, in short, states that there cannot be a convergent increase of complexity without codes. Or, in other words, organic epigenesis requires organic codes.
It may appear that only the fourth principle introduces the semantic dimension into biology, because it is only there that codes and meaning are explicitly mentioned, but this conclusion would be short-sighted. Organic codes and organic memories exist in life only because they are necessary to produce epigenetic systems, and so the fourth principle is dependent upon the idea that every living being is such a system (the first principle). In a similar way, any one of the above principles is a complement to the other three, and therefore all contribute to the building of semantic biology.
In 1981, the Journal of Theoretical Biology published "The ribotype theory on the origin of life", a paper which proposed two novel ideas:
(1) an origin-of-life scenario based on ribosome-like particles, and
(2) a theory of the cell as a system of three fundamental categories, more precisely as a system made of genotype, ribotype and phenotype.
It is worth noting that the term ribotype has since become fairly popular in the scientific literature, but has completely lost its original meaning. Now it is commonly used only to label RNA classes, and not to convey the idea that the ribotype is a true cell category, with the same "ontological" status as genotype and phenotype.
The origin-of-life scenario was instrumental for the new theory of the cell, because it led to the conclusion that the ribotype had an evolutionary priority over genotype and phenotype. More precisely, the scenario described a precellular ribotype world (not to be confused with the RNA world) where some ribosoids could act as templates (ribogenotype), others as enzymes (ribophenotype), and others as polymerising ribosoids (ribotype) that were responsible for the growth and the quasi-replication of the ribonucleoprotein systems.
The first precellular systems were therefore made of three categories (ribogenotype-ribotype-ribophenotype) that evolved in different ways, the ribogenes being replaced by DNAs, and the ribozymes being dethroned by protein enzymes. In this way, the ribogenotype became a DNA genotype, and the ribophenotype turned into a protein phenotype, but the ancestral ribotype evolved without giving up the original function of making phenotypic products from genotypic instructions, even when quasi-replication evolved into exact replication. The precellular systems, in short, were based on three fundamental categories, and gave origin to cellular systems that have been based on equivalent categories ever since. Hence the idea that all cells have a genotype, a phenotype and a ribotype.
The ribotype cannot be given up because there is no DNA and no protein that can do the job of protein synthesis. Proteins and DNAs are two independent worlds, and only an organic code can build a bridge between them. The genetic code is that bridge, and that code is a quintessential RNA business (a correspondence between messenger RNAs and transfer RNAs). This concept can also be expressed in another way. As proteins are the seat of biological energy, and DNAs the seat of biological information, so RNAs are the seat of genetic coding, i.e. of biological meaning.
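The idea of a code as a table of correspondences between two independent worlds can be made concrete with a small Python sketch. It is a deliberately reduced illustration: `CODON_TABLE` holds only six entries of the standard genetic code, the function name `translate` and the example messenger sequence are assumptions, and a dictionary lookup stands in for the whole apparatus of transfer RNAs and ribosomes.

```python
# A fragment of the standard genetic code (codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "UGG": "Trp",
    "UAA": "Stop",
}


def translate(mrna):
    """Read a messenger RNA three bases at a time until a stop codon."""
    peptide = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        residue = CODON_TABLE[codon]   # KeyError if the codon is not in the fragment
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


peptide = translate("AUGUUUGGCUAA")   # hypothetical messenger sequence
print(peptide)
```

Nothing in the string "UUU" points to phenylalanine. In the sketch the link exists only because the table says so, which is the sense in which the text speaks of a bridge built by the rules of a code.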
The first model of semantic biology, in conclusion, is the idea that "The cell is a trinitary system made of genotype, ribotype and phenotype." A more detailed version is the semantic theory of the cell: "The cell is an epigenetic system made of three fundamental categories (genotype, ribotype and phenotype) which contains at least one organic memory (the genome) and one organic code (the genetic code)."
The definition of epigenesis as a reconstruction from incomplete information suggests that embryonic development can be simulated (in a very abstract way) by the reconstruction of a super matrix made of a growing number of individual matrices, each of which would represent a cell. In this case, however, the reconstruction could be performed with two different strategies: one where the memory information is extracted only from individual memory matrices, and a second one where it is also extracted from a collective memory.
The biological equivalents of these strategies are two different kinds of embryonic development: one which exploits only cellular memories, and another which also makes use, from a certain point onwards, of a supracellular memory (the supracellular memory can exist only from a certain point onwards, because it is built by embryonic cells which must have already gone through a transformation phase).
The first kind of development (being continuous or single-phased) is an evolutionary precondition for the second one (which is two-phased or discontinuous), and this suggested that there might have been a transition from the first to the second developmental strategy in the history of life. Such a transition, incidentally, could well correspond to the Cambrian explosion, i.e. to the appearance of all known animal phyla in a geologically brief period of time.
Apart from the Cambrian explosion model, the interesting point is that the supracellular memory predicted by the reconstruction method does have the characteristics that are normally attributed to the body plan: they are both structures that appear from a certain point onwards in ontogenesis, and which function as depositories of supracellular information for the rest of the body's life. What has become known as the phylotypic stage of development, in other words, corresponds to the appearance of a structure which acts as a supracellular memory, and which can rightly be called phylotype, because it is characteristic of each phylum. And the phylotype is an intermediary between genotype and phenotype at the supracellular level, just as the ribotype is at the level of the single cell.
The second model of semantic biology, in conclusion, is the idea that "An animal is a trinitary system made of genotype, phylotype and phenotype." Another, more detailed, version of the model is the semantic theory of embryonic development: "Embryonic development is a sequence of two distinct processes of reconstruction from incomplete information, each of which increases the complexity of the system in a convergent way. The first process builds the phylotypic body and is controlled by cells. The second leads to the individual body and is controlled not only at the cellular level but also at the supracellular level of the body plan."
The scientific study of mental development has produced two outstanding discoveries. One is that there is an enormous gap between inputs and outputs (the so-called poverty of the stimulus), because children receive only very limited and erratic inputs of words in their learning period, and yet in the end they come up with a complete set of rules. The second is that children are predisposed to learn any language whatsoever, and so must develop, at some stage, a common inborn mind, a set of general rules that Noam Chomsky (1965) called universal grammar. So far, these discoveries have not been properly explained, probably because they have only been interpreted with ad hoc hypotheses. It may be worth noticing, therefore, that in the reference system of semantic biology they are accounted for in a very natural way.
The poverty of the stimulus is only another way of saying that mental structures are reconstructions from incomplete information, i.e. that they are the result of epigenetic processes. The universal grammar, on the other hand, is a structure that appears in human development from a certain point onwards, and which remains for the rest of the mind's life as a core deposit of information. As there is a phylotypic stage in embryonic development which is common to all members of a phylum, so there is a specietypic stage in mental development which is common to all members of our species. We can also say that as the body plan is the phylotype of an animal group, so the universal grammar is the specietype of mankind. According to semantic biology, in other words, mental development has the same fundamental logic as embryonic development, because both are reconstructions from incomplete information, and therefore both require memories and codes. The differences that divide them are mere by-products of the fact that organic structures and mental structures do not have identical physical substrates.
The third model of semantic biology, in conclusion, is the idea that "The mind is a trinitary system made of mental genotype, mental specietype and mental phenotype." Another version is the semantic theory of mental development: "Mental development is a sequence of two distinct processes of reconstruction from incomplete information, each of which increases the complexity of the system in a convergent way. The first process builds the specietypic mind (the universal grammar), while the second leads to the individual mind."
The idea that cultural evolution can teach us some deep truths about organic evolution has been dismissed by representatives of very different schools. The argument is that cultural evolution works with a Lamarckian mechanism, and therefore must have been produced by natural selection only after the arrival of nervous systems that were extravagant enough to start playing Lamarckian games.
Before such high-table reasoning, semantic biology can only take refuge in good old stubborn facts: Does the genetic code exist? Is it an organic code? Do signal transduction codes exist? Are they organic codes? Do splicing codes exist? And cell adhesion codes? And cytoskeleton codes, and compartment codes, and so on and on. If we agree that these are questions about nature-as-it-is (as opposed to nature-as-we-want-it-to-be), and if the answers are what the evidence tells us, then organic codes do exist. And if they exist, they had origins and histories. If organic meaning belongs to organic life, we must humbly accept that nature is just made that way. This is what semantic biology is about.
Such a view may help us to clarify at least two points of some weight. The first is Max Delbrück's question: "How could mind arise from matter?" (Delbrück, 1986). The answer from semantic biology is that organic life and mental life are both concerned with reconstructing structures from incomplete information, and so there is no vacuum between them. The materials are different, but the logic is the same. Nature could produce mental codes with the same craft with which she had been producing organic codes for 4 billion years.
The second point is about sudden changes in macroevolution. As long as the rules of an organic code evolve individually, not much seems to be happening, but when they are all in place and a new code emerges, something totally novel comes into existence. And that does explain how sudden changes of great magnitude could have taken place in the history of life.
Apart from these speculative detours, the basic issue is about the stuff of life, and the fourth model of semantic biology is merely the logical consequence of acknowledging the experimental reality of the organic codes. A particular version of that model is the semantic theory of evolution: "The origin and the evolution of life took place by natural selection and by natural conventions. The great events of macroevolution have always been associated with the appearance of new organic codes."
One day the above eight propositions will probably be regarded as a preliminary step, and that will be excellent news because it will mean that semantic biology has grown into a mature science, finally able to match the complexity of life. But that is a long way ahead. Today we have only just landed on an unexplored new continent, and we are in for many encounters with the unexpected. The natives seem friendly enough, though. The announcements of the sugar code (Gabius, 2000) and of the histone code (Strahl and Allis, 2000; Turner, 2000) prove that some pilgrim fathers have already reached a few nearby territories, and this is just the beginning. Our children will have the fortune of knowledge. We have before us the struggle and the thrill of discovery.