Saturday, 19 May 2012

A Flirt with Models in Biology

In "Biologists flirt with Models", Gordon Webster of The Digital Biologist says that "little seems to have changed beyond the fact that the crisis in the pharmaceutical industry has deepened to the point that even the biggest companies in the sector are starting to question whether their current business model is sustainable."

Correlation has failed, which is no surprise since it is a fake one. It raises more questions than it answers. Now biologists pin their hopes on modeling!

Modeling can provide the kind of intellectual frameworks needed to transform data into knowledge, yet very few modeling methodologies currently exist that are applicable to the large, complex systems of interest to biologists.

The systems of interest to biologists tend to be far more refractory to modeling than the systems that are studied in physics and chemistry. Living systems are more open, with feedbacks and feedforwards.
At the molecular level the fundamental processes that occur in living systems can also be described in terms of physics and chemistry. However, at the more macroscopic scales at which these systems can be studied as “biological” entities, the researcher is confronted with an enormous number of moving parts, a web of interactions of astronomical complexity, significant heterogeneity between the many “copies” of the system and a degree of stochasticity that challenges any intuitive notion of how living systems function, let alone survive and thrive in hostile environments. Biology is, in a word, messy.

Decoherent, say the physicists. What exactly does this mean? Noncommutativity? Complexity is the problem, so how do we create more coherent systems? Coherence and correlation are friends.

How do we master the complexity?
A tsunami of data in genetics due to HUGO? Linked to a cure for cancer, which is a symptom of decoherence? No kidding?

Looking back over almost a decade since the first working draft of the human genome was completed - while there have certainly been some medical benefits from this work, I think it is fair to say that the impact has not been on anything like the scale that was initially anticipated, largely as a result of having underestimated how difficult it is to translate such a large and complex body of data into real knowledge. Even today, our understanding of the human genome remains far from complete and the research to fill the gaps in our knowledge continues apace. The as yet unfulfilled promise of genomics is also reflected in the fact that many of the biotechnology companies that were founded with the aim of commercializing its medical applications, have disappeared almost as rapidly as they arose. This is not to say that the Human Genome Project was in any way a failure - quite the opposite.

Yes, to learn how little we know? That we really need a theory of biology, but how do we get started? The complexity must be mastered?

The crucial lesson for biology however, is that as our capacity to make scientific observations and measurements grows, the need to deal with the complexity of the studied systems becomes more not less of an issue, requiring the concomitant development of the means by which to synthesize knowledge from the data. Real knowledge is much more than just data - it does not come solely from our ability to make measurements, but rather from the intellectual frameworks that we create to organize the data and to reason with them.

So much, just to learn that? How much knowledge could we not gain by reinterpreting old results? But today we think the only way is to measure and collect more data. Phew! Maybe we need less data instead? I cannot but admire the old thinkers who came up with brilliant ideas from far less data. The difficulty is how to sort out the unnecessary measurements.

Scientists of all persuasions are (and always have been) modelers, whether or not they recognize this fact or would actually apply the label to themselves. All scientific concepts are essentially implicit models since they are a description of things and not the things themselves. The advancement of science has been largely founded upon the relentless testing and improvement of these models, and their rejection in the case where they fail as consistent descriptions of the world that we observe through experimentation. As in other fields, implicit models are in fact already prevalent in biology and are applied in the daily research of even the most empirical of biologists.

One such implicit model is the genome. We take it to be real. Usually we forget that the key is the phenotype, the result of Nature and Nurture, a blend of Self and Environment. And the soul? Is life a trinity, making evolution and the unknown possible?

The successes that explicit modeling approaches have enjoyed in biology tend to be confined to a rather limited set of circumstances in which already established modeling methodologies are applicable. One example is at the molecular level where the quantitative methods of physics and chemistry can be successfully applied to objects of relatively low complexity, or to biological systems that exhibit behavior that can be captured by the language of classical mathematics. Many if not most of the big questions in biology today deal with large, complex systems that do not lend themselves readily to these kinds of modeling approaches.

No, because life is coherent and quantum-like? Living things are open systems, not classical ones. The environment is part of the organism.

How do cells make decisions based upon the information processed in cell signaling networks? How does phenotype arise? How do co-expressing networks of genes affect one another? These are the kinds of questions for which the considerable expenditure of time, effort and resources to collect the relevant data typically stands in stark contrast to the relative paucity of models with which to organize and understand these data.

Yes, exactly! The theory is wrong?

If the data are measured carefully enough and can be weighted and scaled meaningfully with respect to one another, parametric divergences that can be detected between similar biological systems under differing conditions may reveal important clues about the underlying biology as well as identify the critical components in the system.

But this is classical physics. Why do we get such different measurements? Why do we have all these curves? The Gauss distribution? Because Life is fuzzy and shows uncertainty. It is more quantum-like?
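The idea in the quote above, scaling measurements against one another before looking for divergences, can be made concrete with a small sketch. This is only my own illustration, not Webster's method: the function name `scaled_shift`, the component names and all the numbers are invented, and the spread of the control measurements is assumed to be a sensible yardstick.

```python
from statistics import mean, stdev


def scaled_shift(control, treated):
    """Shift of the treated mean, in units of the control's standard deviation."""
    return (mean(treated) - mean(control)) / stdev(control)


# Invented numbers, for illustration only: two components measured in
# different units under a control and a treated condition.
measurements = {
    "component_a": ([100, 110, 90, 105, 95], [102, 108, 94, 101, 99]),
    "component_b": ([0.50, 0.52, 0.48, 0.51, 0.49], [0.60, 0.62, 0.58, 0.61, 0.59]),
}

# Put both components on the same dimensionless scale.
shifts = {name: scaled_shift(c, t) for name, (c, t) in measurements.items()}

if __name__ == "__main__":
    # List the components with the largest scaled shift first.
    for name, shift in sorted(shifts.items(), key=lambda item: -abs(item[1])):
        print(f"{name}: {shift:+.2f} control standard deviations")
```

In raw units component_a moves more, but against its own scatter it hardly moves at all, while component_b shifts by several standard deviations. Only after the scaling does the "critical component" stand out.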

...have given genomic and proteomic profiling, biomarker discovery and drug target identification, screening methods... they do tend to compound the central problem alluded to earlier of generating data without knowledge. Moreover, their limitations are now starting to become apparent... drug companies have seen the approval rate of new drugs continue to fall as levels of R&D investment soar. The field of biomarkers has also seen a similar stagnation, despite years of significant investment in correlative approaches. Cancer biomarkers are a prime example of this stagnation. Since we don’t yet have a good handle on the subtle chains of cause and effect that divert a cell down the path...

No, because we have no universal theory for illness either.

We are forced to wait until there are obvious alarm bells ringing, signaling that something has already gone horribly wrong. To use an analogy from the behavioral sciences, broken glass and blood on the streets are the "markers" of a riot already in progress but what you really need for successful intervention are the early signs of unrest in the crowd before any real damage is done.

So our whole health industry is an emergency service without planning? And we just keep paying the price?

The lack of new approaches has also created a situation in which many of the biomarkers in current use are years or even decades old and most of them have not been substantially improved upon since their discovery.

So we just live in an illusion of progress?

Given the general lack of useful mechanistic models or suitable intellectual frameworks for managing biological complexity, the tendency to fall back on phenomenology is easy to understand. Technology in the laboratory continues to advance, and the temptation to simply measure more data to try to get to where you need to be, grows ever stronger as the barriers to doing so get lower and lower.

So, we just close our eyes? Another, equally important question is why patients do not follow the advice they get. Why do they not care about their own health? What can possibly be more important? Do we take the human into consideration at all in our health care?

What is it to be a human?

In effect what we have witnessed in biology over the last decade or so is a secular movement away from approaches that deal with underlying causation, in favor of approaches that emphasize correlation. However, true to the famous universal law that there’s no such thing as a free lunch, the price to be paid for avoiding biological complexity in this way is a significant sacrifice with respect to knowledge about mechanism of action in the system being studied. Any disquieting feeling in the healthcare sector that it is probably a waste of time and money to simply invest more heavily in current approaches is perhaps the result of an uneasy acknowledgement that much of the low-hanging fruit has already been picked and that any significant future progress will depend upon a return to more mechanistic approaches to disease and medicine.

Have we lost the battle just because we refuse to deal with theory?

Biological models.
Biological modeling has to date tended to be almost exclusively the realm of theoretical biology, but as platforms for generating and testing hypotheses, models can also be an invaluable adjunct to experimental work.

That is the simulation technique, and this should be obvious.

One misconception that is common amongst scientists who are relatively new to modeling is that models need to be complete to be useful. Many (arguably all) of the models that are currently accepted by the scientific community are incomplete to some degree or other, but even an incomplete model will often have great value as the best description that we have to date of the phenomena that it describes.

Scientists have also learned to accept the incomplete and transient nature of such models since it is recognized that they provide a foundation upon which more accurate or even radically new (and hopefully better) models can be arrived at through the diligent application of the scientific method. Models can clearly have predictive value, even when they diverge from experimental observations and appear to be “wrong”.

It is essential that the chosen modeling system be transparent and flexible. Transparency here refers to the ease with which the model can be read and understood by the modeler (or a collaborator). Flexibility is a measure of how easily the model can be modified. A model that is hard to read and understand is also difficult to modify and, very importantly in this age of interconnectedness, difficult to share with others. The importance of this last point cannot be overstated since one of the most often ignored and underestimated benefits of models is their utility as vehicles for collaboration and communication. It is interesting to note that biological models based upon classical mathematical approaches generally fall far short of these ideals with respect to both transparency and flexibility.

 In fact “biology” is essentially the term that we apply to the complex, dynamic behavior that results from the combinatorial [synergetic and coherent] expression of their myriad components. For this reason, models that can truly capture the “biology” of these complex systems are also going to need to be dynamic representations.

Yes, so why do we not research the synergy and coherence then? Why use classical math that focuses on borders and decoherence, creating static pathway maps?

 ...the elements of causality and time are absent...

An ideal modeling platform would offer a “Play” button on such maps, allowing the biologist to set the system in motion and explore its dynamic properties.

But the biology community in large part are unlikely to adopt modeling approaches that require them to become either mathematicians or computer scientists...

A movie, yes?
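To get a feeling for what such a “Play” button would do, here is a minimal sketch that sets a tiny pathway map in motion. Everything in it is an assumption of mine and not taken from Webster's article: the three-node map (a signal switches on a kinase, the kinase switches on a phosphatase, and the phosphatase switches the kinase off again), the rate constants, and the function name `play`. It uses plain Euler stepping, which is crude but enough for a toy.

```python
# Toy pathway, invented for illustration only:
# signal -> kinase -> phosphatase -| kinase (a negative feedback loop).
# All rate constants below are arbitrary assumptions.

def play(signal=1.0, steps=2000, dt=0.01):
    """Press "Play": step the toy pathway forward in time with Euler's method.

    Returns a list of (time, kinase, phosphatase) tuples, where kinase and
    phosphatase are the active fractions (between 0 and 1).
    """
    kinase, phosphatase = 0.0, 0.0
    frames = []
    for step in range(steps):
        frames.append((step * dt, kinase, phosphatase))
        # Signal activates the kinase; active phosphatase switches it off.
        d_kinase = signal * (1.0 - kinase) - 2.0 * phosphatase * kinase
        # Active kinase switches the phosphatase on; it also decays by itself.
        d_phosphatase = 0.5 * kinase * (1.0 - phosphatase) - 0.2 * phosphatase
        kinase += dt * d_kinase
        phosphatase += dt * d_phosphatase
    return frames


if __name__ == "__main__":
    # Print every 200th frame, like a very low-frame-rate movie.
    for t, k, p in play()[::200]:
        print(f"t={t:5.2f}  kinase={k:.3f}  phosphatase={p:.3f}")
```

The static map only says who acts on whom. Once it is set in motion, the kinase level ends up where the two opposing arrows balance, and calling `play(signal=0.0)` shows a pathway that never starts. Causality and time are back in the picture.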

Finally, let us not forget that thanks to the internet, we live in an era of connectivity that offers hitherto unimaginable possibilities for communication and collaboration. The monster of biological complexity is in all likelihood, too huge to ever be tamed by any single research group or laboratory working in isolation and it is for this reason that collaboration will be key. With knowledge and data distributed widely throughout the global scientific community, a constellation of tiny pieces of a colossal puzzle resides in the hands of many individual researchers who now have the possibility to connect and to work together as never before, and to assemble a richer and more complete picture of the machinery of life than we have ever seen.

Yes, but the CV? The Big Ego? Can it be overcome? Again, what is it to be HUMAN?
What happens the day we have 40 supercomputers, and the computers find out that humans are idiots?

1 comment:

  1. Bradley K Sherman commented on LinkedIn:

    Biochemistry was reanimated at the end of the 19th century by redoing all the badly done experiments, very, very carefully. Big Pharma does not have the time or the inclination for this kind of project.

    The belief that biology is unique in its complexity is not really true. Chemistry is no more reducible to physics than is biology. No chemist has a model of hydrogen and oxygen that explains the properties of water. James D. Watson's chapter heading "Cells obey the laws of chemistry" is oft repeated and may well be true but unfortunately the chemists have yet to descend from the mountain with the stone tablets.