Gervais, Reframed

I’ve been doing more reading of Venkat’s (@vgr) writings as of late, mostly driven by a friend, far more well versed in Ribbonfarm than I am, whose references and worldview I’d like to better understand. I got a copy of The Gervais Principle a few weeks ago, and finally got a chance to dig into it this weekend.

Briefly, the Gervais principle is an analysis for how human interactions play out in organizations. Venkat’s discussion leans heavily on examples from The Office. In fact, I’d argue that it’s one part organizational theory, two parts literary criticism of the television series. If you’d like to read the whole thing (and I highly recommend it) Venkat has made the entire series of essays available on his Ribbonfarm blog.

I’ve been getting a lot of mileage lately out of the now long-running realization that there are often multiple plausible interpretations for any set of human interactions. Communication is layered; human intention is rarely straightforward; it’s widely acknowledged that our brains filter out the patterns that make sense to them from the vast actuality of stimuli that occur at any given time.

What we see is a lossy view of the world. This lossiness gives rise to the plausibility of misunderstanding, of differing valid interpretations.

While the best piece of criticism of the GP comes, naturally, from Venkat himself (in an analysis of how other theories of organizational behavior mechanics can be fit into the Gervais Principle’s triangle), what I’d like to do in this post is provide an alternative framing for viewing office politics. I’d like to put a different lens on the view of the organization that the Gervais Principle, a beautiful piece of cinematic crit[1], presents. The savage imagery of bureaucratic interactions betrays a dark worldview of organizational machinations, one that I’d argue isn’t necessarily true.


Sociopaths, in their own best interests, knowingly promote over-performing losers into middle-management, groom under-performing losers into sociopaths, and leave the average bare-minimum-effort losers to fend for themselves.

The Gervais Principle, source Ribbonfarm

In the way that only brilliant pieces of writing can, Gervais got under my skin. It’s insightful, compelling, and, in the case of The Office references at any rate, incredibly predictive. It’s also frustrating, flat, and not very instructive. Most damning of all: I find that I don’t personally fit into the framework. Clearly there’s something missing here.

What Gervais Gets Right

The org chart of any company is messy and illogical. It is true that men who talk to upper management as peers often get treated and rewarded as peers (and, as a corollary, men who talk to upper management as sycophants often get treated as loyalists). Self-interest is a valuable currency in organizational politics; so are blamelessness and the ability to claim ownership of “successful” projects.

A lot of people who don’t run the company or have the social skills to treat upper management as peers spend most of their time at work making friends and building a reputation for themselves in some unrelated field. Favors get traded at every level of the organization; personal brand and likability are hard currency that can be traded on. Middle managers can be loyal, self-satisfied individuals, a caste of like minded individuals who understand their role in protecting the organization and grooming new members for their ranks. Like understands like.

These are all true and accurate observations about human social structure. They’re especially true of large organizations.

Life, from a Different Angle

There are two ways to view evolution. One, the most widely accepted and talked about in our current age, is through the lens of death. That is, evolution is driven by who survives long enough to pass on their genes to the next generation.

The butterflies with large spots on their wings? They’re the ones that fool would-be attackers into believing they’re animals, not tasty prey.

Ebola virus? Hasn’t wiped out humanity yet because it kills its host too fast to be able to spread to other humans fast enough.

Evolution isn’t a steady winnowing of the most competitive versus the best camo, however. It’s punctuated, it’s messy, it’s faster and more live than terminal survival. There are bacteria involved, and environment, and random gene mutations, and, from at least one scientist’s perspective, cross-species love. It’s not about who eats whom, but who seduces whom.

Likewise, there’s another way to look at office politics. Venkat’s Gervais principle asks us to look at them through the lens of a sociopath: pitting the losers against each other, laughing at the clueless’s devotion to a job when they’re clearly the pawns of the situation. Viewing corporate hierarchy through this lens isn’t wrong. In fact, much like seeing evolution only through the lens of death, it’s useful and instructive and leads to deep insights about the motivations and attitudes of your coworkers.

However, it’s not particularly useful for driving change or moving within the structure that you’re surrounded by. Most people who subscribe to a death cult think that the only way out is in a coffin. Similarly, if you believe in evolution and only see movement and change as possible via death, you’re missing the deeper, broader implications of the here, of the now.  If evolution has set and fixed who you are — beta, alpha, omega, sociopath, loser, clueless — then you are stuck to act out the script that your genes established.

If you believe that finding and cultivating clueless people to be your fall men is the only way forward in the org chart, you wouldn’t be wrong. You’d also be limiting yourself to a reality where the only organizations you know and see are the playgrounds of self-interested sociopaths. Perhaps this is a worldview you think you’d enjoy, because in your imagining of it, you’d be the sociopath. The Gervais interpretation is seductive because it explains your, the loser’s, hatred for your middle managers’ apparent idiocy. It excuses your slacker mentality. (Let’s be honest, the middle manager type probably isn’t spending many cycles reading into bureaucratic revenge porn.)

This is not to say that there aren’t self-interested sociopaths at work who have developed a coded, aristocratic way of sorting peerage, or that some workers’ attitude and relationship with employment is entirely fixated on the transactional and formulaic (it’s just a job).  But there are other ways to understand these dynamics.

A Classification With Moral Precepts

Personally, I find Jane Jacobs’ Guardian vs Commercial syndromes to be quite accurate for understanding organizations, as well as uplifting. Her identification of core values of different mindsets provides a blueprint for understanding the core values of people and social groups. (It’s not all rosy though: the monstrous hybrids truly are monstrous, but for a clear and specific corruption of values.)

A quick mapping, for those of you who are unfamiliar with the syndromes. Jane establishes two guiding value systems (Jane calls them ‘moral precepts’) for human organization: guardian and commercial. Guardians shun trading, seek vengeance, treasure honor, respect hierarchy and value loyalty. Commercial types shun force, compete, value honesty, respect contracts, and collaborate. (You can find the whole list of attributes and values on Wikipedia.)

You can sort organizational structures themselves into these categories — ‘commercial’: open source software projects, manufacturing firms; ‘guardian’: the army, sales organizations; and ‘monstrous hybrids’: modern police forces, the mafia.

Although I think that most actors inside of a bureaucracy tend toward the guardian mindset, you can loosely map sociopaths and the clueless (definitely the most clueless) as operating from the guardian mindset; losers tend to float more commercial, depending on the broader industry that the organization is framed within.

Importantly, the guardian/commercial split offers something that the Gervais principle does not: a cohesive framing for understanding the value system of the syndromes. Being able to ascribe a value system to a ‘clueless’ person grants them a dignity that the sociopathic ‘clueless’ label would seek to rob them of.  Loyalty and honor above honesty may not be a value system that I subscribe to, but it is one that I can, albeit grudgingly, acknowledge as valid, and learn to at least respect, if not one that I can personally work within.

In Exitus

I love the Gervais Principle. It’s masterful, it’s insightful, it’s opinionated. Its insights are thrifty, efficient, and honest.

I feel that I haven’t quite lived up to my promise to show a lighter way of viewing self-interest; rather, I bowed out and pointed you instead towards Jane Jacobs’ syndromes. I’m afraid that a value framework and deep appreciation for the game is all I’ve got to offer.

Appendix: Situating Myself, Gervaically 

Let’s see if I can do it using what I learned from another lossy framework: I’m an early Scorpio with a double helping of Aquarius influence in my rising sign and moon. This maps loosely to a passive-aggressive honest sociopath who fails miserably at working towards my own self-interest. I tend to be employed in organizations at the Loser class. As an organizational operative of the Sociopathic bent, however, I find that my principal motivator often leans toward revolution. This manifests itself in either organizing revolt*, in the face of the obvious injustices meted out by the actually self-interested, or in doomed attempts at drumming up organizational support for exploring new projects or business ventures.

I tend not to last very long or exist very happily when embedded inside large, dysfunctional organizations, as I find office machinations endlessly fascinating, wholly distracting, and completely rage inducing.

*AMA about the time I successfully organized a lower-class faction at Walmart to (almost) sweep the end of summer intern awards.

[1] As an aside, the Gervais Principle is the best writing I’ve ever seen on my favorite usage of television — as a brilliant foil for the structure of our own reality. It reminds me of a short piece on political lessons from a few TV shows I wrote a few years back, with a more political than organizational bent.

Reflecting on Intelligence

My friend John recently published a pretty thoughtful review of Flowers for Algernon. I really liked his multiple interpretations of intelligence and wanted to add one of my own.

It’s been a long time since I last read Flowers, but the story has stuck with me. Briefly, it’s about a low IQ man who goes through a medical experiment which raises his IQ to astronomical levels, only to have it eventually regress to his original baseline.

In my mind, it’s a great story because it starkly questions so much of what we understand about ‘intelligence’: what does it mean to have a high IQ, or a low one for that matter? Is intelligence itself life-making? Culturally, Americans are pretty obsessed with intelligence qua intelligence, in both the negative, fearful sense and the awe-struck one. We love and fear our geniuses; our cultural pantheon of modern gods is almost entirely devoted to them.

John’s review highlights a good number of different ways of understanding or interpreting the story. I’d like to add another, more personalized interpretation of why “IQ” can be isolating. 

What is IQ?

Let’s start off by saying that I don’t think I really understand what IQ measures. It seems to measure something real and tangible and descriptive about a capability of a mind, but the exact what isn’t something I feel qualified to opine on. Intelligence is something that we Americans use to bludgeon each other with, in both the ‘has too much’ and the ‘doesn’t have enough’ sense. Given the propensity for abuse, it feels safest to talk from personal experience, as that’s both personally trustworthy (I trust my experiences) and also hard to generalize.

I’d like to conjecture that IQ is roughly a measure of someone’s ability to grasp and draw conclusions about reality-based systems. Under this definition, there are a few things that become important. The first is your ability to notice and understand the nature of the reality that you exist in. This includes the ability to notice and appreciate deep details. Venkat retweeted a great article a few days back, about how being successful at systems building requires an almost maniacal attention to detail, how even beautifully simple constructions such as a set of stairs require a niche and complete understanding of the realities of angles and the nature of how wood bends.

It’s been a while since I’ve taken an IQ test, so I went to look one up online to test my theory about the ability to notice and appreciate the depth of detail about reality. I ended up doing this ‘might you be qualified to apply to Mensa’ test that doesn’t actually give you an IQ score. Instead it tells me that I got 28 out of 33 answers correct, or 84%, using an unlimited amount of time — if I had to guess, I probably didn’t spend any longer than 30 minutes on it. The whole thing is minute pattern matching. I never do very well on the number pattern ones but exposure to computer science has made it easier to spot certain patterns; my visual pattern matching skills have definitely improved in the last decade or so.

More interesting, in my mind, is that I could probably tell you which of the 5 questions I got wrong. Tests like this don’t give you credit for knowing when you’re wrong or need more information; chalk that up as at least one concrete aspect of ‘intelligence’ that this IQ test is undercounting. Being certain about what I know and why is relatively new ground for me, so maybe most people wouldn’t notice it.

But I noticed. And that’s the whole point of these tests. They’re all reasoning from patterns, drawing conclusions based on scant yet important information. It’s literally a rough measurement of what you notice about the reality aka pattern that they’re presenting to you. One nice thing about such encapsulated puzzles is that they’re guaranteed to provide at least enough signal to draw conclusions from — that level of guarantee is a rare thing for real world observations.

Noticing, then, is a large component of what IQ is a measure of. In fact, I think I can say with some confidence that the skill this test is judging for is noticing, and the consequent ability to draw a conclusion from the set of observations. IQ, then, is a measure of what you notice and can predict from those observations.

The Nature of Alienation 

Charlie, in his ascent up the notice-patterning ability scale, finds himself increasingly unable to communicate with the woman character who fills the role of teacher, friend, and lover. At the pinnacle of his observation/pattern-matching performance, he’s as alienated from her as he was at the lower end of the IQ spectrum.

Why would the ability to notice things about reality make it difficult for you to interact with others?

One way of interpreting this is to say that a higher IQ means that what you notice about reality is far different from what most people notice. Your context for what there is to see about the world, and what that leads you to know about reality, is so radically different that you are, in all practical ways, living in a different reality. Alienation, then, is the diverging of contexts such that communication loses a lot of its ability to be compressed. You have to send more signal to get ideas across, as the contexts break down.

In some ways, this is not unlike the struggle that Americans are having with the divergence in news outlets’ views and presentations of reality. An IQ-observation gap is one based, presumably, on the ability to pick out greater detail or signal from the same set of images. Modern ‘social’ media in the US is taking the secondary tack of presenting two different images of events, each of which lends itself to a differing interpretation, such that the reality you experience as a consumer of political news can be entirely skewed by who you follow. It’s hard to know ‘what to believe’ when multiple images are presented. It’s even harder to communicate between these two realities, because the details that you’ve observed and picked up from the images presented to you are so incredibly disjoint as to have robbed us of the common context needed for more compressible communication. In this reading, the political alienation across the aisle is real and quantifiable.

In Exitus

Charlie eventually loses both his ability to notice detail, as well as his memory[1] of what it was even like. Eventually he falls back into a state where he’d like to know what it’s like to be able to see the details of reality that other, ‘normal IQ’ people see.

Kind of puts that old aphorism “the devil is in the details” in an entirely new light.

[1] Memory definitely plays a part in intelligence, but given the test that I took and the points that followed from those observations, I think this post can stand independent of a discussion on the importance and influence of memory. It’s definitely important and plays a large role, but there’s a nuance to observing details that doesn’t rely on memory.

The Power of Explanation

What I’m hoping to do over the course of several posts is to lay out a foundation for a new way of thinking about ethics and reality, definitively casting a vote in favor of one interpretation of reality. My goal is to persuade popular opinion in favor of what is currently considered a niche outlook.

How does one move an idea from the fringes to mainstream, though? Why should you change your mind about how you think about reality? This question of persuasion seems like an important place to start.

In a lot of ways, I am re-tracing the footsteps of David Deutsch, the British physicist who penned a couple of books around the topic of reality. Much of my own thinking on the concept of reality and explanation is derivative of his. So let’s start at a similar place, then, with a movement towards understanding the role of explanation, of narrative, as a method of transmitting and cementing paradigmatic thought.

A Survey of Theories about Information

I tweeted at @vgr a few days back about Deutsch’s theory of explanatory power, and he responded with a list of some other theories of information. 

vgr’s informative list of information theories

Admittedly, Venkat doesn’t (yet) have a good grasp of what I mean by Deutsch’s theory of the power of explanation and therefore doesn’t exactly offer up comparable theories or interpretations, but since these are the methods of information theory that he’s using to judge how to change his mind, it feels in scope to at least walk through what these different theories can tell us about information, and how humans process it.

These theories (Kuhn’s paradigm shifts, Schmidhuber’s compression progress, Occam’s razor, Deutsch’s explanatory power, Kolmogorov information) propose explanations about two distinct domains, so we can roughly divide them into two groups: 1) those concerning the measurement and encodeable size of information, and 2) those concerning the validity of an explanation. The first group, in other words, provides a classification for ideas and thoughts based on how much pure physical matter of bits I need to send you in order to wholly and completely communicate an idea (measurement and density of encodeability); the second deals with what information I need to present you with, and possibly in what order or framework, in order to change your mind about a topic (explanatory validity).

With these two buckets and a bit of explanation as to what the theories entail, we can now categorize these quite effectively.

Let’s start with Kolmogorov. In a way, Kolmogorov wholly defines the first category of theories: the encodeability of an idea. What does it mean, though, to be able to ‘encode’ an idea? The classic, computer science focused explanation usually involves saying something along the lines of “consider a photograph”. You can either represent it as a matrix of color points or in JPG format. The first, or matrix representation, often takes up orders of magnitude more space, in terms of bits, than the ‘condensed’ JPG format. Kolmogorov was concerned with the ultimate ability to take a complex idea such as a photo and represent it in the smallest number of bits. You can then judge information or an idea based on how compactly it can be expressed. The ‘complexity’ of an idea is measured by how few bits you need to transmit over the wire. The lower the number of bits needed to represent it, the lower its stated complexity.
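To make the photograph example concrete, here is a minimal sketch in Python. It rests on a few assumptions of mine: true Kolmogorov complexity can’t be computed, so the size of a zlib encoding stands in as a rough upper bound, and zlib is lossless where JPG is not. The two 256x256 ‘images’ are invented for the example, one a simple repeating gradient and one pure noise.

```python
import os
import zlib

# Hypothetical 256x256 grayscale "photo" with an obvious pattern:
# every row is the same left-to-right gradient (one byte per pixel).
WIDTH, HEIGHT = 256, 256
patterned = bytes(x for _ in range(HEIGHT) for x in range(WIDTH))

# The same number of pixels, drawn at random: no pattern to exploit.
noise = os.urandom(WIDTH * HEIGHT)


def compressed_bits(data: bytes) -> int:
    """Bits in a zlib encoding of data.

    This is only an upper bound on Kolmogorov complexity, not the real thing.
    """
    return len(zlib.compress(data, 9)) * 8


raw_bits = WIDTH * HEIGHT * 8  # the plain "matrix of color points"
patterned_bits = compressed_bits(patterned)
noise_bits = compressed_bits(noise)

print("raw matrix:      ", raw_bits, "bits")
print("patterned image: ", patterned_bits, "bits after compression")
print("random noise:    ", noise_bits, "bits after compression")
```

The patterned image shrinks to a small fraction of its matrix size, while the noise barely shrinks at all. In Kolmogorov’s terms the gradient is a ‘simple’ idea and the noise is a ‘complex’ one, even though both start out as the same number of pixels.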

Schmidhuber took this concept of the compressibility of information and developed several examples of how ‘low Kolmogorov complexity’ ideas can still, in reality, be quite complex.

As an aside, my favorite example of these is his Femme Fractale, an equation for a drawing that when executed, creates a set of intersecting lines. Schmidhuber then goes on to explain how insight or creativity can be derived from this relatively simplistic pattern, eventually highlighting one particular pathway that, to him, is evocative of women’s silhouettes. The syntactic expression of the original drawing (top left, below) is as follows: “The frame is a circle; its leftmost point is the center of another circle of the same size. Wherever two circles of equal size touch or intersect are centers of two more circles with equal and half size, respectively. Each line of the drawing is a segment of some circle, its endpoints are where circles touch or intersect”.

Tracing femmes in a low Kolmogorov complexity fractal, source

Schmidhuber’s compression progress builds upon Kolmogorov’s idea of information compressibility. Schmidhuber’s contribution was the insight that the extent to which an idea can be compressed is dependent upon the existing context which an information storage system has accumulated or that exists between communication partners. For example, when developing an encoding mechanism to use between two parties, the density of the encoding that is possible is based on how accurately you can predict, given a pattern of input or output, what any series of bits expands to represent. In the Femme Fractale example above, you as the decoder can “predict” what a circle looks like, so the English description doesn’t need to encode a definition of a circle. The definition of circle is a part of your shared context.

In an incredibly general sense, compression algorithms, then, are a manner of building and then transmitting data within a shared context. As the shared context becomes more descriptive, the encoding required for an idea decreases.

Schmidhuber proposes that as the amount of information that a system has seen increases, the amount of space or number of bits that an encoder needs in order to transmit or store a new idea decreases. This is ‘compression progress’, or the ability to more greatly compress ideas as you progress deeper into a context.
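The effect of shared context on encoding size can be sketched with zlib’s preset dictionary feature, which lets an encoder and decoder agree on some bytes ahead of time. This is only an illustration of the idea, not Schmidhuber’s formal definition; the message is adapted from the Femme Fractale description, and the three contexts are hypothetical ones I picked for the example.

```python
import zlib

# One sentence adapted from the Femme Fractale description.
message = b"Wherever two circles of equal size touch or intersect are centers of two more circles."

# Hypothetical shared contexts of increasing richness (chosen for this example).
contexts = {
    "nothing shared": b"",
    "a few shared phrases": b"two circles of equal size touch or intersect",
    "the whole description already seen": (
        b"The frame is a circle; its leftmost point is the center of another "
        b"circle of the same size. " + message
    ),
}


def encode(msg: bytes, shared_context: bytes) -> bytes:
    """Compress msg, letting zlib refer back into shared_context."""
    if shared_context:
        encoder = zlib.compressobj(level=9, zdict=shared_context)
    else:
        encoder = zlib.compressobj(level=9)
    return encoder.compress(msg) + encoder.flush()


def decode(blob: bytes, shared_context: bytes) -> bytes:
    """Recover msg; this only works if the decoder holds the same context."""
    if shared_context:
        decoder = zlib.decompressobj(zdict=shared_context)
    else:
        decoder = zlib.decompressobj()
    return decoder.decompress(blob) + decoder.flush()


bits = {name: len(encode(message, ctx)) * 8 for name, ctx in contexts.items()}
for name, size in bits.items():
    print(f"{name}: {size} bits")
```

The richer the context both sides already hold, the fewer bits the same sentence costs to send. The catch is in `decode`: a receiver without the matching context can’t unpack the message at all.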

If you’re not incredibly up to speed on computer science primitives and what it means for an image to be ‘represented as a matrix’, there’s another, less pure yet more revelatory example I can present you with: Dawkins’s term ‘meme’.

As I mentioned earlier, there’s a bit of nuance here with information encoding measurement: all information encoding requires a decoder. How tightly you can pack information is a function of the pre-negotiated symbolism between two parties. An alphabet, for example, encodes information uniformly, sort of, by the construction of words. These words themselves encode meaning, however, such as ‘meme’. In terms of raw size, broadcasting the word ‘meme’ costs very little. It’s four ASCII letters. On a typical late-2010s transmission line, at one 8-bit byte per letter, we can get that idea ‘across’ a wire in approximately 32 bits, without compression (ignoring any transmission control or protocol data).

These 32 bits, however, are only enough to transmit the word itself. The concept of what I mean by ‘meme’ is assumed to be encoded already within the recipient of the information. There’s already a shared understanding between the message sender and the message recipient.

Is it possible to encode the bigger question, namely ‘what is a meme’, in 32 bits? That, again, depends on the existing, shared context of the two parties wishing to communicate. If I have to explain what a meme is, in English, that will most definitely require more than 32 bits of information, given that at the least it requires a sentence or two of explanation. If I need to explain it in German, it’ll take even more bits as I’ll need to do a fair amount of translation in addition.
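The arithmetic is easy to check. A small sketch, assuming one 8-bit byte per ASCII character and no compression; the one-sentence explanation is a hypothetical stand-in written for this example.

```python
word = "meme"

# Hypothetical one-sentence explanation, written for this example.
explanation = (
    "A meme is an idea or behavior that spreads from person to person "
    "within a culture."
)


def ascii_bits(text: str) -> int:
    """Uncompressed size in bits, at one 8-bit byte per ASCII character."""
    return len(text.encode("ascii")) * 8


print(ascii_bits(word))         # 32
print(ascii_bits(explanation))  # many times larger than the bare word
```

The gap between the two numbers is the work that the shared context is doing: the word alone is only enough when the recipient already carries the explanation.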

This progression, of building a shared understanding from an agreed-upon alphabet, to shared words such as English, to paragraph-long explanations, to just sending you the short, four-character word meme, is compression of expression: an encoding of information into progressively smaller and smaller amounts of signal that need to be transmitted as the base context of how and what we’re communicating becomes richer and, in some sense, more predictive.

Both Schmidhuber’s and Kolmogorov’s theories about information transmission can be loosely classified as theories about measurement and condensibility of information. These theories give us guidelines for how to measure the encodeability of an idea or information, as well as some general guidelines for understanding how we might be able to better compress information.

Let’s look now at Occam’s Razor and Explanatory Power, theories that deal with the explanatory validity of an idea. That is, how do we judge the validity of that idea, in terms of using it as a framing for how to understand our reality.

This question is related to the context question. A “valid framing” is a context that enables a compacted representation of reality. In fact, compactness is what Occam’s Razor expresses: that the simplest explanation is often the correct one.

Consider the following quote from Ludwig Wittgenstein’s Tractatus Logico-Philosophicus:

If a sign is not necessary then it is meaningless. That is the meaning of Occam’s Razor.

Ludwig Wittgenstein, Tractatus Logico-Philosophicus 3.328

An unnecessary sign would be one that does nothing to contribute to the compactness of an idea, or that does not lend to the task of further compressing additional information or observation. Thus, the task of scientific endeavor is to find theories about reality that allow us to maximally compress the way that we represent it. We do this through the construction of the minimally required context.

But what is this context constructed of? How does this context get built? This is where Deutsch’s theory of Explanatory Power fills the gap. Deutsch, in his book The Fabric of Reality, extends Karl Popper’s anti-inductivism to conclude that all scientific theory is the building of a story-like explanation. Deutsch gives several guidelines for how explanatory power can be observed or identified:

  • if more facts or observations are accounted for; (compression)
  • if it changes more “surprising facts” into “a matter of course”; (prediction)
  • if it offers greater predictive power, i.e., if it offers more details about what we should expect to see, and what we should not; (prediction)
  • if it depends less on authorities and more on observations; (?)
  • if it makes fewer assumptions; (compression)
  • if it is more falsifiable, i.e., more testable by observation or experiment; (prediction)
  • if it is hard to vary.

Taken from the Wikipedia entry on Explanatory Power.

This transition, from ‘surprising facts’ to ‘matter of course’, closely mirrors the language used by Schmidhuber to describe the process of building a more compressible context for information.

Let’s apply this concept of compressibility to the example that Deutsch gives in his TED talk on the subject: why it gets colder in the winter. He uses the ancient Greek explanation, that of Demeter getting sad because her daughter Persephone goes down into the underworld to pay a debt for the last part of the year. Deutsch gives a good account of why, with additional observations, this explanation fails several of his above-outlined criteria for a good explanation.

Let’s consider this same story in terms of compressibility. Knowing the story of Demeter and Persephone only gives us the ability to understand this one, specific phenomenon — why the sun weakens in the winter. It’s not very compact, as it doesn’t give us much ability to predict any other observations about the world.

Our current explanation, that the Earth orbits the Sun on a tilted axis, on the other hand, does a lot to predict many things that we can observe about our reality. It predicts seasons. It predicts the changes in day length that you observe at the equator as opposed to the poles. It predicts why the moon’s appearance is different on different parts of the planet. The mere concept of being on a planet that orbits the Sun on a tilted axis gives us the context to situate and condense other phenomena that we observe about the world. In the terminology of Deutsch, we’d say that this idea of a planet orbiting on a tilted axis has high explanatory power, but it also goes a long way toward contributing to our ability, as humans, to compress the knowledge that we have about the world into tighter and more succinct representations. Deutsch is right that we create explanations about the world. Those explanations become the context that we can situate our observations and predictions into.
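As a rough illustration of how much a single idea can predict, here is a sketch that derives hours of daylight for any latitude and date from one number, the Earth’s axial tilt (the tilted, or skewed, axis). It uses the standard textbook sunrise equation with a simple cosine approximation for the sun’s declination. The function names are my own, and the model ignores atmospheric refraction and the slight eccentricity of the orbit, so treat the outputs as approximate.

```python
import math

AXIAL_TILT_DEG = 23.44  # Earth's axial tilt: the one number this model needs


def solar_declination_deg(day_of_year: int) -> float:
    """Approximate latitude at which the sun is directly overhead at noon."""
    return -AXIAL_TILT_DEG * math.cos(2 * math.pi * (day_of_year + 10) / 365)


def day_length_hours(latitude_deg: float, day_of_year: int) -> float:
    """Hours of daylight from the sunrise equation (no refraction)."""
    lat = math.radians(latitude_deg)
    decl = math.radians(solar_declination_deg(day_of_year))
    cos_hour_angle = -math.tan(lat) * math.tan(decl)
    # Clamp: beyond +/-1 the sun never sets (polar day) or never rises (polar night).
    cos_hour_angle = max(-1.0, min(1.0, cos_hour_angle))
    # The Earth turns 15 degrees per hour; daylight spans twice the hour angle.
    return 2 * math.degrees(math.acos(cos_hour_angle)) / 15


JUNE_SOLSTICE, DECEMBER_SOLSTICE = 172, 355  # approximate days of the year

for latitude in (0, 40, 60, 80):
    summer = day_length_hours(latitude, JUNE_SOLSTICE)
    winter = day_length_hours(latitude, DECEMBER_SOLSTICE)
    print(f"latitude {latitude:>2}: {summer:5.2f} h in June, {winter:5.2f} h in December")
```

One constant and a few lines of geometry yield steady twelve-hour days at the equator, long summer days and short winter days at mid latitudes, and midnight sun and polar night near the poles. The Demeter story would need a separate tale for each of those observations.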

This just leaves us with Kuhn’s paradigm shifts, which is largely an observation that Kuhn made about the fact that a shift in the broader contextualization of information happens. In other words, Kuhn points out that when a compression algorithm (be it a human’s understanding or an actual programmatic decision matrix) discovers a new, more predictive way of organizing information, it will shift its interpretation or encoding of prior observations to be re-encoded using this new, more dense understanding.

In Exitus

So, how does one change their mind about a thing? These theories tell us that it is by finding new explanations that lend themselves to more condensible encodings, which then allow us to communicate and store our understanding of our reality in richer and more meaningful ways.


This condensed video of Schmidhuber talking about it is pretty good; if you’ve got time the whole video might be worth watching. Schmidhuber’s work is largely descriptive of how learning systems learn new things and the representations in which they then store that information, but he’s also got some really great observations about the mechanism of discovery and curiosity, or why humans are driven to look for more compressible interpretations.