The theory for the origin of life that emerges from bioepistemic evolution is further developed on the author's newer site, Evolution and Origin.
In its rank0 section, that site includes all the origin of life work given here and also extends it to describe chemical, evolutionary mechanisms for the emergence of "bacterial protocells;" although lacking genetics, such protocells would otherwise have resembled bacteria, both chemically and morphologically.
Application of Bioepistemic Evolution to Prebiosis
A summary of bioepistemic evolution, with special reference to its use in constructing a theory of prebiotic evolution and the origin of life from simple chemistry.
The discussion addresses the distinction between "data" and "information" and the significance of this distinction for biology. The nature of the gene and its definition. Evolution as the sequence: data input, data interpretation, selection of interpretation into knowledge and the replication of knowledge with variation and its encoding as data for transmission to the next generation.
The "evolving data systems" approach to evolution. The need for high-powered data input for prebiotic evolution. The sun as a high-powered data source and prebiotic oscillations in the primordial soup as interpretations of solar data. The need for a self-created boundary around competing evolving systems.
Bioepistemic Evolution for Prebiosis
Bioepistemic Evolution
2.1 Genes, Biology and Bioepistemic Evolution
2.2 The Meaning of "Data"
2.3 Evolving Data and Evolving Systems
2.4 Utilizing Evolving Data
2.5 Design and Adaptation
Copyright Statement
2.1 Genes, Biology and Bioepistemic Evolution
The basic concept of bioepistemic evolution is that evolutionary theory must be based on data, not genes. That basic concept seems unarguable and will not change. However, the idea is new and its ramifications remain in a state of flux. Consequently, even those few readers who have previously encountered bioepistemic evolution will need a review of some of its relevant aspects before applying it to the prebiotic situation. This can best be done by reviewing it in the familiar setting of biology and then extending the discussion to the prebiotic situation.
2.1.1 Genes
The bioepistemic interpretation of biological evolution can be introduced most quickly by giving a bioepistemic definition of the gene.
Genes are subsets of the data set defined by the nucleotide sequence of DNA. To qualify as a gene, the data subset must be so formatted that it can be interpreted by an organism into a distinct biochemical activity. An important implication of this definition is that, because biochemical activities are distinct and chemically separable from other such activities, genes may become manifest as distinct and distinguishable biological phenotypes. (The author would like to refine this definition of the gene to maximise its generality and would like to hear any critiques.)
This definition does not portray genes as fundamental entities, basic to science, but as derived entities. Many other subsets of chromosomal data could be defined but genes are a certain type of subset - one that can be interpreted, usually by transcription and translation, into separate molecules possessed of biochemical activity. It is this process of interpretation into molecules with separate biochemical activities that gives genes the apparently atomic properties noted in breeding studies. Nonetheless, because this definition does not define genes as fundamental entities, it means that the evolution of organisms cannot be solely about genes. In part, this is because some of the biologically important data on chromosomes is extragenic, not found in any gene, an example being the sequence of genes on a chromosome. Such data is not denoted by standard genetic terminology, which simply treats genes as acting independently of one another. Hence, population genetics does not amount to, and should not be treated as, a complete theory of evolution. Genetics is empirically valid within its range but only within its range and it should not be applied outside that range.
In addition, many living things contain forms of evolving data that are not present in DNA sequence at all, important examples being sensory and social data. The amount of this data is often greater than that in DNA but it is involved in different ranks of evolution so that its evolutionary impact is entirely unrecognized by genetics. These different forms of data lead to different ranks of evolution operating in a hierarchy. Genetics alone cannot describe the hierarchies of evolution actually seen in real organisms, for which a more general approach is needed. Bioepistemic evolution, with its data systems and subsystems approach, can aim to be a more complete theory, a description of evolution in terms that are consistent with genetics but general enough to incorporate non-genic forms of evolution. For a fuller description of bioepistemic evolution, see the appropriate document on the web site or Hewitt (2002).
The purpose of this paper is to apply bioepistemic evolution to the prebiotic situation. The discussion will centre on possible prebiotic evolving systems, their data inputs, their bounding processes, the way input data is replicated with variation, the means by which the data is interpreted into information and then selected into knowledge and, finally, the ways in which that knowledge is output as encoded data that will be used as a data input for the next generation of that evolving system.
2.2 The Meaning of "Data"
Data is not about objects or individual molecules; data is about pattern or arrangement. We should not, properly, talk about the evolution of objects, chemicals or even organisms: in each case, the evolution applies to the data patterns that underlie them.
2.2.1 Data and Information
The concept of data has close links with physics and thermodynamics and can be given a fairly formal, statistical mechanical meaning, the best known of which was given by Shannon (1948). However, there is a common confusion associated with the words "data" and "information." During the 1940s these words were used more or less interchangeably and, in his definition, Shannon used the word "information," not the word "data." Terminology has moved on since that time and, even in school textbooks, the field of information technology (IT) now separates the meanings of these two words. Shannon's definition of information applies to what we now call data, while information is defined as "interpreted data." This modern distinction between data and information is significant both for IT and for biology but, unfortunately, many observers still fail to distinguish them correctly.
Bioepistemic evolution does make this distinction and regards Shannon's definition of "information" as a definition of "data." On this basis, genetic data is found in the pattern or sequence of nucleotides in DNA molecules, while genetic information is found in the biochemical activities of the compounds formed when that data is interpreted. In the most common situation, DNA contains genetic data while proteins interpret that data into the enzymatic activity of proteins. Thus, the distinction between data and information encapsulates the different roles played by DNA sequence and biochemical function in the gene.
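Shannon's measure, read here as a measure of data, can be illustrated with a short calculation. The following is a minimal sketch in Python, not part of the original work: the function name and the example sequences are assumptions introduced for illustration. It estimates the entropy of a nucleotide string in bits per symbol from symbol frequencies alone, so it ignores any correlation between positions in the sequence.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Estimate Shannon entropy in bits per symbol, H = sum(p * log2(1/p)),
    using the observed symbol frequencies as probability estimates."""
    if not sequence:
        return 0.0
    counts = Counter(sequence)
    total = len(sequence)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical example sequences: a uniform string carries no data by this
# measure, while an even mix of four nucleotides carries two bits per symbol.
print(shannon_entropy("AAAAAAAA"))
print(shannon_entropy("ACGTACGT"))
```

On this view the number returned describes only the data, the pattern in the sequence. It says nothing about the information, which would depend on how an organism interprets that pattern.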
As noted earlier, data is about pattern and Shannon, who worked in telecommunications, considered the data patterns retained during transmission down a noisy communications channel. A telephone signal might enter one end of an electrical cable, be transmitted along its length and then be received after emerging from the other end. However, electrical cables are noisy. They can generate their own signals through thermal randomization or receive other, external signals that permeate through their insulation. The result is that signals degrade during communication and Shannon considered how much of an input data pattern would be detectable in the output.
Much of biological evolution can be considered in similar terms. One might, for example, consider the evolving data patterns communicated down the generations – with the variation in the data treated as noise in the channel. Hence, the genetic data pattern of an offspring arises from the data pattern inherent in the sequence of nucleotides transmitted from parents varied by random mutational factors that would be analogous to noise in a cable.
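The analogy between mutation and channel noise can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of real mutational processes: the function names, the four-letter alphabet, the uniform substitution rule and the mutation rate are all illustrative choices made here.

```python
import random


def transmit(sequence: str, mutation_rate: float, rng: random.Random) -> str:
    """Copy a nucleotide sequence, replacing each symbol with a different
    one with probability mutation_rate (the 'noise' in the channel)."""
    alphabet = "ACGT"
    copied = []
    for symbol in sequence:
        if rng.random() < mutation_rate:
            # Substitute a different nucleotide chosen at random.
            symbol = rng.choice([s for s in alphabet if s != symbol])
        copied.append(symbol)
    return "".join(copied)


def fraction_retained(original: str, received: str) -> float:
    """Fraction of positions at which the input pattern survives.
    Assumes a non-empty original and equal-length sequences."""
    matches = sum(a == b for a, b in zip(original, received))
    return matches / len(original)


# Pass a hypothetical parental sequence down ten generations and report
# how much of the original pattern is still detectable at each step.
rng = random.Random(0)
parent = "ACGT" * 25
current = parent
for generation in range(1, 11):
    current = transmit(current, mutation_rate=0.01, rng=rng)
    print(generation, fraction_retained(parent, current))
```

The quantity printed corresponds loosely to Shannon's question of how much of an input pattern remains detectable in the output. In this sketch there is no selection, so the pattern simply degrades.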
Bioepistemic evolution extends biological evolution in that it recognizes that other data patterns, such as social data, can also be transmitted down the generations and become subject to evolution. Bioepistemic evolution recognizes all such data transmission and does not accept that one form of evolving data is more important than another. In particular, bioepistemic evolution denies that genetic data and biological evolution are prototypical of evolution in general or that they are more fundamental than non-genetic data and non-biological evolution.
Bioepistemic evolution is concerned not just with data but also with the ways in which evolving data is processed and organized. The aim is to track and follow the fate of data during each evolutionary generation and, so far as possible, to so construct bioepistemic evolution that it becomes independent of the forms of data or types of evolution being described. In other words, it is hoped to describe evolution in a way that applies equally to all evolution, whatever form the evolving data takes, such as biological or social, and whatever evolving systems (genes, organisms, superorganisms, groups, cultures or otherwise) become manifest as a result of its data processing.
2.3 Evolving Data and Evolving Systems
The science of information technology describes data processing in terms of "systems" and it is both natural and useful to consider evolution in those terms. Bioepistemic evolution approaches evolution in this way and analyzes evolution in terms of data processing systems.
2.3.1 Evolving Systems
An evolving system is a data system but a data system is not necessarily an evolving system. Any data system can be described by three sets of attributes, by its data inputs, by the data processes that occur within the system and by its data outputs. An evolving system is a data system with inputs, processes and outputs that are such as to generate a process of evolution and successive evolutionary generations. Bioepistemic evolution is concerned to identify those properties and specific characteristics that make data systems into evolving systems. It seeks to describe the data inputs, data processes and data outputs required if a data system is to be an evolving system.
2.3.2 Evolutionary Hierarchies
The data output from one data system can be used as input for another. This means that data systems have a very important general property, namely that they can be, and usually are, embedded within a hierarchy of systems. In other words, data systems are usually assembled from subsystems which are, themselves, built from sub-subsystems, and so on. Most commercial data systems are hierarchical in this sense and bioepistemic evolution identifies such hierarchies in biological evolution. Thus, individual organisms are evolving systems in their own right. What is more, organisms can contain evolving subsystems and organisms can become subsystems within larger evolving systems. More concisely, an organism is an evolving system, can contain evolving subsystems and can become a subsystem to a larger evolving system. In short, evolution occurs in hierarchies.
Our own species manifests this hierarchical nature of evolution very well. The human organism is an evolving system which contains other evolving systems. Humans contain genes that are subject to evolution and Darwinian machines, the brain and the immune system, which process exogenous data. Thus humans contain evolving subsystems. Humans can also become subsystems to larger evolving systems of groups and cultures. Thus, the biological evolution of the human organism is merely one rank in a multilevel hierarchy of evolutions.
Bioepistemic evolution recognizes the existence of this hierarchy in human evolution. It aims to identify those ranks and to understand how they interact with one another.
2.3.3 Providing Energy and Power to Data Systems
Data systems require power. All systems, subsystems and individual data processing operations must have power sources. All data systems and subsystems must have power inputs and outputs, through which free energy flows to drive the system and its processes. This is a very basic requirement and it applies, without exception, to all data processing devices and all data processes everywhere.
This requirement for a power source is so fundamental that the diagrams used to describe IT systems are usually simplified by omission of their power inputs and outputs. This makes the diagrams much simpler and readers are expected to assume that a power supply will be needed to operate the system. However, the requirement for a power supply places important constraints on feasible mechanisms of early prebiosis. Accordingly, this article will explicitly discuss the power and data inputs into prebiotic evolving systems. As will be noted later, a prebiotic evolving system must receive both data and power from the same source so these inputs are best discussed together. The power sources of possible prebiotic evolving systems need to be identified, as do the power source(s) of each of their data processing operations.
The data inputs for evolving systems are normally the outputs from selection during the previous generation. This cannot apply to early prebiotic evolution because, at that time, few, if any, previous generations of evolution will have occurred. Hence we will ask, "what is the data input for the very first stages of evolution?"
2.4 Utilizing Evolving Data
Bioepistemic evolution is concerned with how the data in evolving systems is processed. In general, one finds that, for all evolving systems,
Data is interpreted into information which is selected from to produce knowledge which will be replicated with variation and encoded as input data for the next generation of evolution.
(This sequence is based partly on standard IT terminology, discussed above, and partly on evolutionary epistemology; see "The Nature of Knowledge" (Plotkin, 1994), which shows how knowledge can be viewed as the general product of evolution.)
The most scientifically familiar example of this sequence comes from molecular genetics, in which the data within genes is interpreted into the information of biochemical activities that are selected by natural selection to produce the level1 knowledge that is encoded as data and used as input for the next generation of evolution. The stages in the sequence are data, information and knowledge while the processes are interpretation, selection and replication with variation. The same stages and processes arise in other ranks and orders of evolution but the details of each vary.
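The sequence of stages and processes can be sketched as a simple generational loop. The following Python toy model is illustrative only and is not drawn from the original work: the scoring rule used for interpretation (counting G symbols), the selection rule (keep the highest scorers), the population sizes and the variation rate are all arbitrary assumptions chosen to make the loop runnable.

```python
import random

ALPHABET = "ACGT"


def interpret(data: str) -> int:
    """Interpretation: turn data into 'information'. The score used here
    (number of G symbols) is an arbitrary stand-in for an activity."""
    return data.count("G")


def select(population: list[str], keep: int) -> list[str]:
    """Selection: retain the data sets whose interpretations score highest.
    The survivors stand for the 'knowledge' of this generation."""
    return sorted(population, key=interpret, reverse=True)[:keep]


def replicate(knowledge: list[str], size: int, rate: float,
              rng: random.Random) -> list[str]:
    """Replication with variation: copy survivors with random substitutions
    and return them as the data input for the next generation."""
    offspring = []
    for i in range(size):
        parent = knowledge[i % len(knowledge)]
        child = "".join(
            rng.choice(ALPHABET) if rng.random() < rate else s for s in parent
        )
        offspring.append(child)
    return offspring


def run(generations: int, seed: int = 0) -> list[str]:
    """Run the data -> information -> knowledge -> data cycle repeatedly."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(ALPHABET) for _ in range(20))
                  for _ in range(30)]
    for _ in range(generations):
        knowledge = select(population, keep=10)
        population = replicate(knowledge, size=30, rate=0.02, rng=rng)
    return population


final = run(generations=20)
print(sum(interpret(d) for d in final) / len(final))
```

The point of the sketch is only the shape of the cycle: each generation's output data becomes the next generation's input. The same loop could in principle be written with quite different interpretation and selection rules for other ranks of evolution.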
An evolving system will often involve entities that act upon the data input and perform the interpretations and selections involved in maintaining the activities of the system. Since these entities act on the data input they are known as actors. An important conclusion from bioepistemic evolution is that, while evolving systems will often be selected by competition with other evolving systems, the separate actors within an evolving system will cooperate and even behave altruistically toward one another.
2.4.1 The Evolutionary Hierarchy in Humans
Bioepistemic evolution groups the evolving systems associated with humans into ranks depending on the type of data they use and, especially, according to the location of the selection used to produce knowledge.
Rank1 evolution refers to biological evolution and occurs in the biosphere. The evolving systems are organisms and use the data on DNA, much of which is genetic data. That data is interpreted into biochemical activities which assemble into the biological activities of organisms. In rank1 evolution, selection is usually taken to be by natural selection or sexual selection and takes place in the biosphere.
Rank2 evolution involves data that originates outside an individual organism but is then received, interpreted and selected within that organism. Selection occurs within the receiver and the evolving system is the brain or the immune system. Most data for rank2 evolution is gathered by sense organs and interpreted and selected into sensory knowledge, level2 knowledge, within the Darwinian machine of the brain but the immune system also produces level2 knowledge.
Rank3 evolution takes input from social data, also known as cultural data, and produces social knowledge. The evolving systems of rank3 evolution are social groups composed of individuals communicating data to one another. Within social groups data is transmitted by communicative transmitters. In other words, the transmitters within social groups select the data they wish to transmit on the basis of the interpretation they want to communicate to the receiver. The data receivers gather that data through their own sense organs and interpret it within their brains but much of the selection has already been made by communicative transmitters. "Selection by transmitters" is the factor distinguishing rank3 evolution, producing level3 knowledge, which is social or cultural knowledge. By contrast, "selection by receivers" produces rank2 evolution and leads to sensory knowledge.
Rank4 evolution involves subcultural data and will not concern biologists.
Bioepistemic evolution does not conflict with biological evolution. Indeed, rank1 evolution is biological evolution but the processes of molecular genetics are too complex to have arisen by chance alone. Some earlier prebiotic, prebiological evolution must have occurred before the emergence of organisms, genes or biology as we now see it. That earlier evolution must have been the very first form of evolution to have occurred on the earth and be the first rank in the hierarchy of bioepistemic evolution. Prebiotic evolution is rank0 evolution and is the topic of this essay. The discussion of prebiotic evolution presents several special problems for the "systems" view of evolution.
2.5 Design and Adaptation
Commercial IT systems incorporate complex devices that have been engineered to perform specific data processing tasks. These devices are machines that were intelligently designed by engineers to fill a specific role. All such machines have separate data and power inputs. They all use the input data to control the effects generated by the power input, often through actuators, such as the arms of manufacturing robots.
2.5.1 Design and Adaptation in Biological and Prebiotic Evolution
Biological systems incorporate complex enzymes that are machine-like data processing devices that replicate the data in DNA sequence or interpret that data into biochemical activities. These data processing enzymes draw their power from the cell's energy supplies and their data from DNA. Examples of such data processing enzymes are DNA polymerase, the enzyme that copies DNA sequence and is thus involved in replication of data, RNA polymerase, the enzyme that uses the data in DNA sequence as a guide from which to make RNA, and the proteins of the ribosome, a large complex of RNA with proteins that uses RNA sequence as a guide to make protein. Both of the latter are thus involved in the interpretation of genetic data. All these systems draw their power input from the high-energy polyphosphate, ATP, or similar compounds. Always, in these systems, we see a separation between the data input of the evolving system and the power input.
Evolutionary selection is also a data process and requires a free energy source. In the higher ranks of evolution selection has become adaptively designed. For example, Darwin's sexual selection requires functioning sense organs which draw their power from respiration. The clonal selection found in the immune system or the neuronal group selection theorised by Edelman as part of brain function both need energy inputs, from respiration, that are separate from the data being selected. These are complex data processing devices that could not have arisen by chance. The theory of evolution requires that they arose by progressive adaptation - that they were designed by the prior evolution that had occurred up to that point.
Natural selection is a selective data process but it is fundamentally different in character from those previously mentioned processes. Natural selection is governed by chance and thus draws its free energy supply from the increase in entropy associated with randomizing events. Thus natural selection has only one input, which provides both data and power. As will become clear in section 3.2.1, the presence of separate power and data inputs into a data process is a mark of previous design, be it adaptive or intelligent design.
Therefore, natural selection may occur even in the absence of any previous history of design but other selective processes may also arise and natural selection alone is not sufficiently distinctive of early prebiosis. There is a need to say something more general about the differences between the kinds of process that might be found in prebiosis and those that are often found in modern biological systems. We need to say more about the things that are actually evolving and the processes involved. The most accessible distinction can be made clearer by distinguishing between "controlled" and "uncontrolled" processes in chemistry.
2.5.2 Controlled and Uncontrolled Chemistry
The distinction between "controlled" and "uncontrolled" chemistry can be demonstrated by comparing a chemistry laboratory with the outside world. In a laboratory, chemists can achieve many results that could never occur by chance in the outside world. They achieve their results by carefully controlling the purity of their chemical reagents and the laboratory conditions under which their reactions are performed. This is the difference between controlled and uncontrolled chemistry. Uncontrolled chemistry is the kind of chemistry that occurs, or might reasonably be expected to occur, within the complex mixtures expected in the real world outside. Uncontrolled processes are those that can occur even though no external agency has purified the chemicals involved, adjusted the temperature, pH or rate of stirring or added reagents at exactly the right time.
By contrast, controlled chemistry includes all reactions and processes that require control of chemical composition or any variable reaction condition. A process is a controlled process if the chemicals involved must be purified or if the physical environment of the reaction must be carefully maintained to achieve the desired results. Any step that requires measurement of variables or precise energy inputs or purified reagents will make a process into a controlled process. A similar distinction between controlled and uncontrolled processes could be applied to physics. The general point is that any controlled process requires prior data inputs.
This discussion has three important implications.
First, the idea of previous adaptive design cannot apply to early prebiosis, because that period has no evolutionary history. Hence, an acceptable scientific theory of prebiotic evolution must describe the interpretation, replication with variation and selection steps of prebiotic evolving systems purely in terms of chemistry and physics. It can make no reference to ideas from theology because, if it did so, it would cease to be a scientific theory. Equally, it can make no reference to ideas from biology because, if it did so, it would no longer be a theory of biological origins.
Second, since controlled processes require prior data inputs, they will be prohibited as mechanisms of prebiotic evolution. Only uncontrolled chemical and physical processes can be invoked during that phase of evolution. Modern biological systems are exquisitely controlled - they clearly involve controlled chemistry and physics, but their control demands prior data inputs and, therefore, a previous history of adaptive design. A valid mechanism for early prebiotic evolution must invoke uncontrolled chemical and physical processes for each data processing step of replication, interpretation and selection.
Third, given that controlled processes arose by adaptive design, they were selected for and did not arise by chance. A successful theory of prebiosis needs to suggest, by hypothetical examples, how evolutionary selection, operating using uncontrolled chemistry and physics, might produce and be replaced by adaptively designed, powered data processing devices that resemble those found in modern biological systems. That is, how evolutionary selection, applied to uncontrolled organic chemistry, might lead to the controlled processes of biochemistry.
2.5.3 Boundaries around Evolving Systems
Evolution is about selection but, before an evolving system can be selected, it must be identifiably distinct from other evolving systems. For them to be distinguishable, the evolving systems must be bounded and, what is more, they must be self-bounding. In other words, the evolving system must interpret its data input in such a way that it becomes identifiably distinct from other, competing evolving systems. This concept of self-bounding is emerging as an important aspect of bioepistemic evolution. The system interprets its data to form a boundary and the boundary effectively defines the data set. The data set that interprets itself to create its own boundary becomes a unit of evolutionary selection.
In biology, this means that the component actors of the evolving system, its proteins, build a boundary around the system, its skin, wall or outer epidermal layer, that both ensures mutual cooperation between the actors of the system and allows competition with other evolving systems. Such boundaries define evolving systems as separately identifiable entities, such as organisms, that can compete with one another and prove fit or unfit during the competition. To phrase this another way, it means that the various genes are interpreted in such a way as to produce actors, mostly proteins, that normally build a physical boundary around the organism, such as a cell wall or an outer epidermal layer. (Other ranks of evolution also need to be self-bounding and readers interested in the boundary forming processes of human social evolution might like to read my study into the nature and origin of humour.)
Understanding the bounding processes of prebiotic evolution and the formation of biological boundaries presents special problems that are very similar to those involved in understanding data processing in prebiotic evolution. Prebiotic evolving systems need to compete with one another and must be bounded but how can prebiotic evolving systems come to be bounded by uncontrolled chemistry and physics? How might adaptive design, arising from uncontrolled chemistry and physics, lead to the controlled processes that set boundaries around biological systems?
This article will propose a mechanism for prebiotic evolution that begins with organic chemistry and with uncontrolled chemical and physical processes. It will identify the sun as the most likely source of both power and data during prebiotic evolution. So far as possible, using bioepistemic evolution as a guide, it will identify the most likely interpretative and selective processes to produce level0 knowledge when the sun acts as both a data and power source on the primordial soup. It will discuss how this mechanism can lead to adaptively designed data processing. In the same vein, it will discuss boundary formation in uncontrolled prebiotic evolving systems and suggest how that led to boundary formation as a controlled, chemical process.
2.5.4 Note about "Vehicles" and Evolution
Evolutionary theorists (e.g. Dawkins, 1986) sometimes use the term "vehicle" to indicate that an organism essentially functions as a carrier for its own genes, while giving it a meaning that, in some ways, resembles that of a bounded evolving system. However, the author dislikes this word "vehicle," which misses the evolutionary point and connotes little that is accurate about evolving systems. The word "vehicle" implies some kind of mobile machine but evolving systems are not actually machines, do not need to be mobile and do not need to be bounded in physical space. This author prefers the term evolving system to "vehicle."
Evolution is about data and the finite data sets into which that data is formatted; it is about the boundaries that define those data sets and the evolving data systems that are linked to them. Evolving data sets are interpreted and replicated within whole bounded systems in competition with other evolving systems. Evolving systems are fit or unfit as whole systems and are selected as whole systems. In principle, bioepistemic evolution does not require that evolving systems contain genes but, even when they do, as in biology, the evolving system that is an organism is no mere assemblage of genes, no mere gene carrier. An organism exists as an entity in its own right and can only be described or defined in terms of its holistic properties. It is those properties that determine its fitness, whether it is selected and whether it reproduces.
Copyright Statement
© John A Hewitt MA PhD (Cantab.)
The work described here was performed as an independent investigation by John A Hewitt who asserts the right to be recognized as its author and as the originator of the novel ideas presented here. The topics to which this claim applies include, but are not limited to, the application of bioepistemic evolution to the prebiotic situation, the discussion of the sun as a data and power source for prebiotic evolving systems, the recognition of sun-induced chemical oscillations as information carriers subject to evolutionary selection and to the theories for the origin of biochemical pathways and self-oscillatory, allosteric and cyclic biochemistry that result.
This study is a greatly extended version of a poster originally presented at the Royal Society meeting on conditions for the emergence of life on the early earth, London, 13 & 14 February, 2006. This internet version was made available on 6 September, 2006. Comments and criticism are solicited - see the "contact & copyright" link for contact details.
