Quantum Rhetoric, An Introduction

Ideas are powerful, particularly ideas that help to shape our understanding of what reality is. Stewart Brand once said that the only real news is the revelations that science brings us about reality. What I’d like to do with this blog is to take a host of recent science news and tell you a story about that news. I’d like to build a new framing for how we think about reality, one that’s based in real, solid news from the scientific frontier.

This aim is, by nature, philosophical. The news that I present, and much of the framing for it, is not novel. I’m not the first person to write about the many-worlds interpretation or to explain how spooky action at a distance has been observed. The overall framework for thinking about it, though, isn’t something I’ve seen elsewhere. I’ve been calling it quantum rhetoric. Quantum because it’s rooted in what quantum behavior tells us about the nature of reality; rhetoric because it provides a coherent, rich vocabulary for thinking and conversing and understanding each other and our world.

If physics is the study of the laws that describe the behavior of all things, quantum mechanics is the search for equations that generically describe the actions, movement, and relationships of the smallest, most fundamental particles in existence.

A central discovery of quantum mechanics is that many of the things we can know about the minute movements of these tiny particles aren’t fixed, but rather probabilistic. Our modeled understanding of an electron’s movement, for example, used to be simply expressed as a set of concentric rings, each ring representing a different “level” of energy. These rings, in a sense, are a lie. Electrons don’t move in set trackways about a central nucleus. Their true location about a nucleus is better expressed as an electron cloud: a map of probabilities that depicts where an electron is likely to be found at any given moment. The behavior of the electron isn’t tracked; it’s probabilistic.

It turns out that probabilities are a fundamental property of quantum behavior. Feynman shows that the behavior of light, whose particle-wave duality is never fully resolved, at least not at a high school physics level of understanding, is largely a probabilistic process as well. The location where a light beam ends up is but the sum of the contributions from every path it could have taken.

There’s some fishiness to these probabilities, however, a fishiness that has long perplexed even the leading lights of physics. Einstein called it “spooky action”. In many ways, this observed fishiness appears to break laws: the speed of light limit on how fast information can travel, and even probability itself. You can see this fishiness at work in two ways. David Deutsch, in his book The Fabric of Reality, outlines one such experiment.

The double-slit experiment, as it’s called, involves setting up a screen in front of a light source such that light can pass through two slits in the screen. Light particles are then sent through the screen, one particle at a time, kind of like a ball being thrown between slats in a fence. If the ball were covered in paint and there were a white wall behind the fence, what pattern would you expect to see on the white wall when you throw a single ball?

The naive rules of everyday physics would suggest that you’d only see a single mark from the ball, or one shaft of light in the case of light particles. The reality, however, is much stranger. As the marks from particles sent one at a time accumulate, what you end up observing is a wave-like interference pattern. Each particle was sent alone, but it appears to be interacting with unseeable and immeasurable other particles, leaving not a single shaft of light on the wall, but instead a wavelike pattern of interfering ripples.

You can solve this “problem” of waves and get the light beams to act in a rational manner in two ways. First, by closing off one of the slits in the fence. Second, by placing a sensor at one or both of the slit openings, such that you can track with absolute certainty which of the two slits the single light particle passes through. In both of these cases, the light pattern on the wall resolves to a single shaft of light. Something about observing the light passing through the slit seems to “fix” the strange interference problem.
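Here’s a minimal sketch of the arithmetic behind those two outcomes. It is not Deutsch’s own presentation. It uses the standard textbook rule that when the path is unknown you add the two complex amplitudes and then square, and when the path is known you square each amplitude and then add. The function name, the wavelength, the slit separation, and the screen distance are all illustrative assumptions.

```python
import cmath
import math


def screen_intensity(x, wavelength=500e-9, slit_separation=50e-6,
                     screen_distance=1.0, which_path_known=False):
    """Relative brightness at position x (metres) on the far wall.

    All default values are illustrative assumptions, not measured data.
    """
    # Distance from each slit to the point x on the wall.
    r1 = math.hypot(screen_distance, x - slit_separation / 2)
    r2 = math.hypot(screen_distance, x + slit_separation / 2)
    k = 2 * math.pi / wavelength
    # One complex amplitude per path, each carrying half the probability.
    a1 = cmath.exp(1j * k * r1) / math.sqrt(2)
    a2 = cmath.exp(1j * k * r2) / math.sqrt(2)
    if which_path_known:
        # Path observed: probabilities add, no ripples.
        return abs(a1) ** 2 + abs(a2) ** 2
    # Path unobserved: amplitudes add first, producing ripples.
    return abs(a1 + a2) ** 2


if __name__ == "__main__":
    for x_mm in (0.0, 2.5, 5.0, 7.5, 10.0):
        x = x_mm / 1000
        print(x_mm, round(screen_intensity(x), 3),
              round(screen_intensity(x, which_path_known=True), 3))
```

With these assumed numbers the unobserved case swings between bright and dark as you move along the wall, while the observed case stays flat.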

Why? How? These are questions that physicists have some theories about, but as of yet have not been able to settle on a single, unifying framework for understanding why light seems to act like a probabilistic wave in one instance, and a normal “ball” in the other.

Further experimentation has only led to more puzzles of the same type. One such example is entangled particles. What we now call quantum entanglement is most famously observed in two particles bound together in extended unknowability. The classic entanglement experiment goes as follows.

Take two particles that have been blasted apart by a laser. These particles are known to be spinning in opposite directions, yet which particle spins left and which spins right is unknowable at the time that they are split. This state of joint unknowingness is called quantum entanglement. The particles are entangled in that their final destiny, left or right, is bound to the fate of the other, yet both are in a state of suspended decision.

This may seem like a strange way to talk about the spin of two particles. If these were balls in an urn, one black and one white, it’d be a simple calculation of probability to figure which ball you might get when pulling one out at random. When you reach in, you have a fifty fifty chance to get the white ball. The same goes for the black ball — fifty fifty. Once you’ve drawn the white ball, you know with absolute certainty that the ball you’ve left behind must be black. So which ball will you get when you pull one out? It’s a toss up.
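The urn reasoning is easy to check with a quick simulation. This is only a sketch of the classical, everyday-probability case; the function names, the trial count, and the seed are my own illustrative choices.

```python
import random


def draw_from_urn(rng):
    """Draw one ball at random; return (drawn, left_behind)."""
    balls = ["black", "white"]
    rng.shuffle(balls)
    return balls[0], balls[1]


def simulate(trials=10_000, seed=0):
    """Return the fraction of trials in which the white ball was drawn."""
    rng = random.Random(seed)
    white_draws = 0
    for _ in range(trials):
        drawn, left_behind = draw_from_urn(rng)
        # Seeing the drawn ball fixes the colour of the one left behind.
        assert {drawn, left_behind} == {"black", "white"}
        if drawn == "white":
            white_draws += 1
    return white_draws / trials


if __name__ == "__main__":
    print(f"fraction of white draws: {simulate():.3f}")
```

The fraction hovers around one half, and the ball left behind is always the opposite colour. That is the baseline against which the entangled case should be compared.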

Entangled particles are like this urn, with a black and a white ball, with one exception. Instead of reaching into the urn, let’s say that you bounce one of these unknown particles through a filter. The filter is set up such that only left-spin particles can pass through it. If we took a ball from the urn while blindfolded and then threw it to a judge who would accept only white balls, we’d expect the ball to be accepted in half of the trials. We’d expect the same for particles passing through a “left only” filter. What actually happens is far stranger. Instead of a fifty percent pass rate, we get one hundred percent. If you change the filter from a left filter to a right one, your pass rate remains at 100%. The particle that passes through the filter is now the right-spinning particle, and the particle not passed through spins left.

That’s the same as saying, “I’m going to throw balls at a judge who accepts only white balls,” and then drawing only white balls. Then you say, “OK, I will only throw balls at a judge who accepts only black balls,” and proceed to draw only black balls from the urn. The urn still contains two choices, black or white, but you’ve managed to predict with 100% certainty which ball you will draw based on the type of judge that looks at your ball.

How can these particles manage to spin the correct direction, every time? Are the particles communicating? Are you the luckiest physicist in the world? Einstein was perplexed by these results, so much so that he coined the phrase “spooky action at a distance” to describe how these particles managed to spin exactly opposite, yet the correct way for the filter, every time.

Every time.

These results of entanglement and the infallibility of the filtering mechanism have been replicated at great distances. The spooky action persists. Are the particles communicating? Are they traveling back in time? How is it that a single physicist can be so lucky, so many times in a row?

If the particles are communicating, physics has a problem. That problem is called the speed of light. Einstein himself showed that nothing, especially not information, can travel faster than this speed. The synchronized behavior of the particles is instantaneous, however. There is no delay between measuring an unknown particle with a “left spin” filter (thus making it left spinning) and observing a particle with right spin. If lightspeed still matters, then these particles aren’t communicating.

What else might explain the physicist’s unfathomable luck?

David Deutsch posits that the explanation for this is simple — that we don’t in fact live in a universe, but rather a multiverse, a multiverse constructed out of all the possibilities that can physically exist. Our multiverse is defined by the probability set. One universe of a black ball drawn, another with a white.

This many-worlds interpretation, or MWI as it’s colloquially known among the physicist set, explains the spooky action as follows: entangled particles are an urn of two balls, one black, one white. There are two different universes that exist, forward in time. One universe in which you pull a black ball. Another in which you pull the white.

You can deterministically decide which universe you’d like to exist in. You do this by picking a filter through which you would like to observe the world — either the black filter or the white filter. Selecting the black filter and then applying it to the urn, or entangled particles, fixes you into the reality where the ball is black.

The double-slit experiment and the filtered entangled particles share one key commonality: the power of observation and its undeniable role in fixing the observed results. This is important. The probability of what ball will be picked has moved from the random chance of the universe to an explicit choice on the part of the observer. The observation is the sound of one universe, of two possible, being chosen.

Quantum rhetoric is simply this: the reckoning of existence in a branching multiverse, that becomes fixed into a coherent reality.

The Power of Explanation

What I’m hoping to do over the course of several posts is to lay out a foundation for a new way of thinking about ethics and reality, definitively casting a vote in favor of one interpretation of reality. My goal is to persuade popular opinion in favor of what is currently considered a niche outlook.

How does one move an idea from the fringes to mainstream, though? Why should you change your mind about how you think about reality? This question of persuasion seems like an important place to start.

In a lot of ways, I am re-tracing the footsteps of David Deutsch, the British physicist who penned a couple of books around the topic of reality. Much of my own thinking on the concept of reality and explanation is derivative of his. So let’s start at a similar place, then, by moving towards an understanding of the role of explanation, of narrative, as a method of transmitting and cementing paradigmatic thought.

A Survey of Theories about Information

I tweeted at @vgr a few days back about Deutsch’s theory of explanatory power, and he responded with a list of some other theories of information. 

vgr’s informative list of information theories

Admittedly, Venkat doesn’t (yet) have a good grasp of what I mean by Deutsch’s theory of the power of explanation, and therefore doesn’t exactly offer up comparable theories or interpretations. Since these are the theories of information that he’s using to judge how to change his mind, though, it feels in scope to at least walk through what they can tell us about information and how humans process it.

As a way of classifying these theories (Kuhn’s paradigm shifts, Schmidhuber’s compression progress, Occam’s razor, Deutsch’s explanatory power, Kolmogorov complexity), we can roughly divide them into two groups: 1) those concerning the measurement and encodeable size of information, and 2) those concerning the validity of an explanation. The first, in other words, classifies ideas and thoughts based on how many bits I need to send you in order to wholly and completely communicate an idea (measurement and density of encodeability); the second deals with what information I need to present you with, and possibly in what order or framework, in order to change your mind about a topic (explanatory validity).

With these two buckets and a bit of explanation as to what the theories entail, we can now categorize these quite effectively.

Let’s start with Kolmogorov. In a way, Kolmogorov wholly defines the first category of theories: the encodeability of an idea. What does it mean, though, to be able to ‘encode’ an idea? The classic, computer science focused explanation usually involves saying something along the lines of “consider a photograph”. You can either represent it as a matrix of color points or in JPG format. The first, or matrix representation, often takes up far more space, in terms of bits, than the ‘condensed’ JPG format. Kolmogorov was concerned with the ultimate limit: the smallest number of bits from which a complex thing such as a photo can be reproduced. You can then judge information or an idea based on how compactly it can be expressed. The ‘complexity’ of an idea is measured by how few bits you need to transmit over the wire. The lower the number of bits needed to represent it, the lower its stated complexity.
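To make ‘how few bits’ a little more concrete, here’s a rough sketch. True Kolmogorov complexity can’t be computed in general, so this uses the output size of an ordinary compressor (Python’s zlib) as a stand-in upper bound; that substitution is my assumption, not Kolmogorov’s definition. The function name and the two sample inputs are made up for illustration.

```python
import os
import zlib


def compressed_bits(data: bytes) -> int:
    """Size in bits of data after zlib compression (a crude complexity proxy)."""
    return 8 * len(zlib.compress(data, 9))


# Two inputs of identical raw size (10,000 bytes each).
patterned = b"ab" * 5_000     # highly regular: a short rule describes it
noise = os.urandom(10_000)    # random bytes: no structure to exploit

if __name__ == "__main__":
    print("raw bits:      ", 8 * len(patterned))
    print("patterned bits:", compressed_bits(patterned))
    print("noise bits:    ", compressed_bits(noise))
```

The regular input shrinks to a tiny fraction of its raw size, while the random input barely shrinks at all. In this rough sense the first has low complexity and the second has high complexity.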

Schmidhuber took this concept of the compressibility of information and developed several examples of how ‘low Kolmogorov complexity’ ideas can still, in reality, be quite complex.

As an aside, my favorite example of these is his Femme Fractale, an equation for a drawing that when executed, creates a set of intersecting lines. Schmidhuber then goes on to explain how insight or creativity can be derived from this relatively simplistic pattern, eventually highlighting one particular pathway that, to him, is evocative of women’s silhouettes. The syntactic expression of the original drawing (top left, below) is as follows: “The frame is a circle; its leftmost point is the center of another circle of the same size. Wherever two circles of equal size touch or intersect are centers of two more circles with equal and half size, respectively. Each line of the drawing is a segment of some circle, its endpoints are where circles touch or intersect”.

Tracing femmes in a low Kolmogorov complexity fractal, source http://people.idsia.ch/~juergen/femmefractale.html

Schmidhuber’s compression progress builds upon Kolmogorov’s idea of information compressibility. Schmidhuber’s contribution was the insight that the extent to which an idea can be compressed is dependent upon the existing context which an information storage system has accumulated or that exists between communication partners. For example, when developing an encoding mechanism to use between two parties, the density of the encoding that is possible is based on how accurately you can predict, given a pattern of input or output, what any series of bits expands to represent. In the Femme Fractale example above, you as the decoder can “predict” what a circle looks like, so the English description doesn’t need to encode a definition of a circle. The definition of circle is a part of your shared context.

In an incredibly general sense, compression algorithms, then, are a manner of building and then transmitting data within a shared context. As the shared context becomes more descriptive, the encoding required for an idea decreases.
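One way to see shared context at work is zlib’s preset dictionary feature, where both sides agree on some text in advance and the encoder can point back into it instead of spelling things out. This is only a sketch of the general idea, not Schmidhuber’s formalism; the helper names and the sample message are my own, and the shared context is one sentence borrowed from the Femme Fractale description.

```python
import zlib

# Text that both parties are assumed to hold before any message is sent.
SHARED_CONTEXT = (
    b"The frame is a circle; its leftmost point is the center "
    b"of another circle of the same size."
)


def encode(message: bytes, shared_context: bytes = b"") -> bytes:
    """Compress message, optionally priming zlib with a shared dictionary."""
    if shared_context:
        compressor = zlib.compressobj(level=9, zdict=shared_context)
    else:
        compressor = zlib.compressobj(level=9)
    return compressor.compress(message) + compressor.flush()


def decode(payload: bytes, shared_context: bytes) -> bytes:
    """Decompress a payload that was encoded with the same shared dictionary."""
    decompressor = zlib.decompressobj(zdict=shared_context)
    return decompressor.decompress(payload) + decompressor.flush()


if __name__ == "__main__":
    message = b"the center of another circle is the leftmost point of the frame"
    print("without shared context:", len(encode(message)), "bytes")
    print("with shared context:   ", len(encode(message, SHARED_CONTEXT)), "bytes")
```

The message reuses phrases the decoder already holds, so the encoding with shared context comes out noticeably shorter. The decoder can only recover it because it has the same context.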

Schmidhuber proposes that as the amount of information that a system has seen increases, the amount of space or number of bits that an encoder needs in order to transmit or store a new idea decreases. This is ‘compression progress’, or the ability to more greatly compress ideas as you progress deeper into a context.
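A toy version of that claim: the more text a simple model has already seen, the fewer bits it needs for the next, similar piece of text. This is a deliberately simplified stand-in (an adaptive character-frequency model with add-one smoothing), not Schmidhuber’s actual formulation. The function name, the alphabet size, and the sample text are assumptions for illustration.

```python
import math
from collections import Counter


def bits_to_encode(new_text: str, already_seen: str,
                   alphabet_size: int = 256) -> float:
    """Ideal code length, in bits, of new_text for a model primed on already_seen."""
    counts = Counter(already_seen)
    total = len(already_seen)
    bits = 0.0
    for ch in new_text:
        # Add-one smoothing keeps never-seen characters encodable.
        probability = (counts[ch] + 1) / (total + alphabet_size)
        bits += -math.log2(probability)
        # The model keeps learning as it encodes.
        counts[ch] += 1
        total += 1
    return bits


if __name__ == "__main__":
    history = "memes spread like genes "
    for repeats in (0, 1, 5, 20):
        cost = bits_to_encode("meme", history * repeats)
        print(f"history repeated {repeats:>2}x -> {cost:.1f} bits")
```

As the history grows, the cost of encoding the same new word drops. That shrinking cost is the sort of quantity compression progress is about.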

If you’re not incredibly up to speed on computer science primitives and what it means for an image to be ‘represented as a matrix’, there’s another, less pure yet more revelatory example I can present you with: Dawkins’s term ‘meme’.

As I mentioned earlier, there’s a bit of nuance here with measuring information encoding: all information encoding requires a decoder. How tightly you can pack information is a function of the pre-negotiated symbolism between two parties. An alphabet, for example, encodes information uniformly, sort of, by the construction of words. These words themselves encode meaning, however, such as ‘meme’. Broadcasting the word ‘meme’ takes very little information. It’s four ASCII letters. On a typical late 2010s transmission line, we can get that idea ‘across’ a wire in approximately 32 bits, without compression (ignoring any transmission control or protocol data).

These 32 bits, however, are only enough to transmit the word itself. The concept of what I mean by ‘meme’ is assumed to be encoded already within the recipient of the information. There’s already a shared understanding between the message sender and the message recipient.

Is it possible to encode the bigger question, namely ‘what is a meme’, in 32 bits? That, again, depends on the existing, shared context of the two parties wishing to communicate. If I have to explain what a meme is, in English, that will most definitely require more than 32 bits of information, given that at the least it requires a sentence or two of explanation. If I need to explain it in German, it’ll take even more bits, as I’ll need to do a fair amount of translation in addition.
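The 32-bit figure is just four characters at eight bits each, which a couple of lines can confirm. The one-sentence explanation below is my own illustrative wording, not a quoted definition, and the helper name is made up.

```python
def uncompressed_bits(text: str, encoding: str = "ascii") -> int:
    """Bits needed to send text at one byte per ASCII character, no compression."""
    return 8 * len(text.encode(encoding))


WORD = "meme"
# Illustrative one-sentence explanation (an assumption, not a quoted definition).
EXPLANATION = (
    "A meme is an idea or behavior that spreads "
    "from person to person within a culture."
)

if __name__ == "__main__":
    print("word:       ", uncompressed_bits(WORD), "bits")
    print("explanation:", uncompressed_bits(EXPLANATION), "bits")
```

Even a single short sentence costs many times more bits than the bare word. That gap is the price of not having the shared context.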

This progression, of building a shared understanding from an agreed upon alphabet, to shared words such as English, to paragraph long explanations, to just sending you the short, four-character word meme, is compression of expression: an encoding of information into progressively smaller amounts of signal that need to be transmitted as the base context of how and what we’re communicating becomes richer and, in some sense, more predictive.

Both Schmidhuber’s and Kolmogorov’s theories about information transmission can be loosely classified as theories about the measurement and compressibility of information. These theories give us guidelines for how to measure the encodeability of an idea or information, as well as some general guidelines for understanding how we might be able to better compress information.

Let’s look now at Occam’s Razor and Explanatory Power, theories that deal with the explanatory validity of an idea. That is, how do we judge the validity of an idea as a framing for understanding our reality?

This question is related to the context question. A “valid framing” is a context that enables a compacted representation of reality. In fact, compactness is what Occam’s Razor expresses: that the simplest explanation is often the correct one.

Consider the following quote from Ludwig Wittgenstein’s Tractatus Logico-Philosophicus:

If a sign is not necessary then it is meaningless. That is the meaning of Occam’s Razor.

Ludwig Wittgenstein, Tractatus Logico-Philosophicus 3.328

An unnecessary sign would be one that does nothing to contribute to the compactness of an idea, or that does not lend to the task of further compressing additional information or observation. Thus, the task of scientific endeavor is to find theories about reality that allow us to maximally compress the way that we represent it. We do this through the construction of the minimally required context.

But what is this context constructed of? How does this context get built? This is where Deutsch’s theory of Explanatory Power fills the gap. Deutsch, in his book The Fabric of Reality, extends Karl Popper’s anti-inductivism to conclude that all scientific theory is the building of a story-like explanation. Deutsch gives several guidelines for how explanatory power can be observed or identified:

  • if more facts or observations are accounted for; (compression)
  • if it changes more “surprising facts” into “a matter of course”; (prediction)
  • if it offers greater predictive power, i.e., if it offers more details about what we should expect to see, and what we should not; (prediction)
  • if it depends less on authorities and more on observations; (?)
  • if it makes fewer assumptions; (compression)
  • if it is more falsifiable, i.e., more testable by observation or experiment; (prediction)
  • if it is hard to vary.

Taken from the Wikipedia entry on Explanatory Power.

This transition, from ‘surprising facts’ to ‘matter of course’, closely mirrors the language used by Schmidhuber to describe the process of building a more compressible context for information.

Let’s apply this concept of compressibility to the example that Deutsch gives in his TED talk on the subject: why it gets colder in the winter. He uses the ancient Greek explanation, that of Demeter getting sad because her daughter Persephone goes down into the underworld to pay a debt for the last part of the year. Deutsch gives a good account of why, with additional observations, this explanation fails several of his above outlined criteria for a good explanation.

Let’s consider this same story in terms of compressibility. Knowing the story of Demeter and Persephone only gives us the ability to understand this one, specific phenomenon: why the sun weakens in the winter. It’s not very compact, as it doesn’t give us much ability to predict any other observations about the world.

Our current explanation, of the Earth orbiting the Sun while spinning on a tilted axis, on the other hand, does a lot to predict many things that we can observe about our reality. It predicts seasons. It predicts the changes in day length that you observe at the equator as opposed to the poles. It predicts why the moon’s appearance is different on different parts of the planet. The mere concept of being on a planet that orbits the Sun on a tilted axis gives us the context to situate and condense other phenomena that we observe about the world. In the terminology of Deutsch, we’d say that this idea of a tilted, orbiting planet has high explanatory power, but it also goes a long way toward contributing to our ability, as humans, to compress the knowledge that we have about the world into tighter and more succinct representations. Deutsch is right that we create explanations about the world. Those explanations become the context that we can situate our observations and predictions into.
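To show how much a single idea can compress, here’s a sketch that derives day length anywhere on Earth from one number, the axial tilt. It uses common textbook approximations (a cosine model of the Sun’s declination and the sunrise hour-angle formula) and ignores atmospheric refraction and the size of the Sun’s disk. The function names are my own, and the results are approximate.

```python
import math

AXIAL_TILT_DEG = 23.44  # Earth's axial tilt in degrees


def solar_declination_deg(day_of_year: int) -> float:
    """Approximate angle of the Sun north of the equator on a given day."""
    return -AXIAL_TILT_DEG * math.cos(2 * math.pi * (day_of_year + 10) / 365)


def day_length_hours(latitude_deg: float, day_of_year: int) -> float:
    """Approximate hours of daylight at a latitude on a given day of the year."""
    lat = math.radians(latitude_deg)
    decl = math.radians(solar_declination_deg(day_of_year))
    x = -math.tan(lat) * math.tan(decl)
    # Clamp to [-1, 1]: beyond that range the Sun never sets or never rises.
    x = max(-1.0, min(1.0, x))
    hour_angle = math.acos(x)
    return 24 * hour_angle / math.pi


if __name__ == "__main__":
    for latitude in (0, 40, 60, 80):
        june = day_length_hours(latitude, 172)      # around the June solstice
        december = day_length_hours(latitude, 355)  # around the December solstice
        print(f"lat {latitude:>2}: June {june:4.1f} h, December {december:4.1f} h")
```

One parameter and a dozen lines give roughly twelve-hour days at the equator all year, long summer days and short winter days at mid latitudes, and midnight sun near the pole. The Demeter story predicts none of that.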

This just leaves us with Kuhn’s paradigm shifts, which is largely Kuhn’s observation that shifts in the broader contextualization of information do happen. In other words, Kuhn points out that when a compression algorithm (be it a human’s understanding or an actual programmatic decision matrix) discovers a new, more predictive way of organizing information, it will re-encode its prior observations using this new, denser understanding.

In Exitus

So, how does one change their mind about a thing? These theories tell us that it is by finding new explanations that lend themselves to more compressible encodings, which then allow us to communicate and store our understanding of our reality in richer and more meaningful ways.

Errata

This condensed video of Schmidhuber talking about it is pretty good; if you’ve got time, the whole video might be worth watching. Schmidhuber’s work is largely descriptive of how learning systems learn new things and of the representations in which they then store that information, but he’s also got some really great observations about the mechanism of discovery and curiosity, or why humans are driven to look for more compressible interpretations.