What Can AI Safety Teach Us about Strategic Planning? Part 4: Building the House

Through parts 1, 2 and 3 we have established an understanding of the foundational thinking that supports the idea of CEV as an effective way of thinking about strategic planning in business.

In this final blog let’s establish, by way of summary, some things we need our initial dynamic to encapsulate so we can wrap up with an exploration of some of the steps we might like to take to put that initial dynamic in place.

The things

Rules for the initial dynamic:

  1. Survival of a company and financial success are simply emergent properties of a well-defined end goal.
  2. The suitability of that goal can be defined through an understanding of the organisation’s evolutionary purpose, and tested against it.
  3. We want to avoid resistance to changes that, in light of new and better information, need to be made to the final goal.
  4. We must allow for the accumulation of resources in line with pursuing a final goal state.
  5. We must enable informed choices through the continuous expansion of our bounded rationality.
  6. We must avoid perverse instantiation.
  7. All of the above should be defined according to the guiding principles of Coherent Extrapolated Volition.

Steps 1 and 2: Understanding and discovering evolutionary purpose

Frédéric Laloux summarises understanding evolutionary purpose as moving from a paradigm of “predict and control” to one of “sense and respond”.

A good first step on this journey is having a co-created mission and vision statement and values for your organisation. Not just something that looks great on your website and sounds noble, but something against which the people who work for the organisation can test the actions they’re taking. As with a well-defined BHAG, they should also act as a filter, allowing your employees to decide which actions not to take and which goals not to pursue.

What do I mean when I say co-created? Involve as many people in the organisation as you can: get the input from as many levels and as many functions as is practical.

Remember that under the guise of evolutionary purpose we’re “looking at the organisation as if it’s a living organism with its own sense of direction”. A key differentiation to make here is that we’re not trying to anthropomorphize—we’re not trying to attribute human behaviour and values to the organisation—rather we’re trying to understand the things that drive the organisation if we imagine it as a living organism. What does this thing need, what does it want, might there be something that it is like to be this organisation? We don’t judge the dung beetle’s desire to roll poo into a ball based on human values, behaviours and desires (or, at least, we shouldn’t): impartiality is important here.

Let’s take a look at Container Solutions’ mission, vision and values as an example.


We use our technical expertise and proven practices to bring out the best in people and the businesses they’re part of.


Our vision is to lead change, future-proof organisations and improve the technology sector - one person, practice and project at a time.


We do this through:

  • prioritising the growth of people, both as technical experts and as humans.
  • proven practices rooted in psychological safety, inclusivity, collaboration and creativity.
  • technical excellence driven by industry-leading experts, continuous learning and untameable ambition.

We defined these through conversations with the people who work for CS, its founders and its executive team, going back and forth several times between the team responsible for drafting them and the people at the company before we settled on our final versions. Any action that anyone within the organisation wants to take can be tested against this text, and that test should give a simple go/no-go answer.

Remember that when we view the organisation as an organism we understand that, as with our own personal decision making process, there is no central control point in which volition resides for a company. No matter how much you wish this to be the case. Rather, as a group, employees are implementing three processes which define an organisation’s tactical path:

  1. “an early ‘what’ process, specifying which action to make
  2. a subsequent ‘when’ process determining the timing of the action, and
  3. a late-breaking ‘whether’ process, which allows for its last-minute cancellation or inhibition”

Well-constructed, co-created and socialised mission, vision and values provide the framework for those that work at an organisation to take the right action at the right time in the right way.

We can’t expect that to work properly if we don’t have that co-creation as part of the definition process. Think about how your own mind works: any decision you make about an action you have the option of taking (affordance) is taken (or not) based on your own utility—what is best for you—within the context of your own values and, as one example, whether you believe the “rightness” of an action should be judged on intent (deontology), result (consequentialism) or virtue. More often than not that decision feels entirely automated:

  1. There’s a purse on the floor with money in it—affordance.
  2. What’s best for me is to take that money—utility.
  3. But my moral values indicate that keeping the money for myself is fundamentally wrong so I will find its owner and return it—context defined by values leading to deciding the appropriate course of action.

That is not a “thought process”, it’s as automatic as 2+2=4 - I just know it. So I just do it.
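For readers who like to see reasoning written down, the purse example can be sketched as a few lines of Python. This is only an illustration of the idea that values filter the available actions before utility gets a say. The names `Affordance`, `choose` and `no_keeping_what_is_not_mine`, and the utility numbers, are all hypothetical and do not come from any formal model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Affordance:
    """An action that is available to take (hypothetical model)."""
    name: str
    utility: float  # how much the action benefits the person taking it


# A value is modelled as a predicate: True means the action is acceptable.
Value = Callable[[Affordance], bool]


def choose(affordances: Iterable[Affordance],
           values: Iterable[Value]) -> Optional[Affordance]:
    """Return the highest-utility affordance that every value permits."""
    checks = list(values)
    permitted = [a for a in affordances if all(check(a) for check in checks)]
    return max(permitted, key=lambda a: a.utility, default=None)


def no_keeping_what_is_not_mine(action: Affordance) -> bool:
    """Illustrative moral value: keeping someone else's money is ruled out."""
    return action.name != "keep the money"


# Illustrative utilities only; the ordering is what matters.
options = [
    Affordance("keep the money", utility=10.0),
    Affordance("return the purse", utility=1.0),
    Affordance("walk past", utility=0.0),
]

print(choose(options, [no_keeping_what_is_not_mine]).name)  # return the purse
print(choose(options, []).name)                             # keep the money
```

With the value in place the decision comes out as "return the purse" without any deliberation. Remove the value and raw utility takes over.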

If the actions the organisation needs and desires to take, based on its evolutionary purpose, are at odds with the values of its employees, every decision would take conscious thought and almost certainly doublethink. Everything would slow down; fail; collapse. It would be an unmitigated disaster.

The rightness of an action needs to align from top to bottom, just as it did for Nike and Colin Kaepernick:

  1. Nike has a well-defined mission and values.
  2. The mission and values are well-understood by, and align with, the values of the people who work for Nike.
  3. So the decision was taken to stick with Kaepernick based on those values and mission.
  4. Stock value increased by 5%, sales increased by 31%.

Which neatly covers number 1 on our list: survival of a company and financial success are simply emergent properties of a well-defined end goal.

Step 3: Avoid resistance to changes that need to be made to the final goal

Once again let’s consider how we do things at Container Solutions for guidance here, specifically Hermes: The Container Solutions Strategic Execution Toolkit.

We start our year with a series of overall goals for the year ahead, defined as the intended strategy, which “forms the guardrails for the whole of the next year, providing focus for both the leadership teams and the organisation as a whole.” What’s important is that we have several opportunities to redefine those goals in light of new and better information that we gather as the year progresses.

In a way “avoid resistance to changes that need to be made to the final goal” is the wrong way of putting this. What you actually need to do as an organisation is to actively encourage changes based on new information and put in a well-defined framework to facilitate those changes—that’s basically what being a learning organisation, which CS prides itself on being, means.

Of course, that’s not enough on its own, and the framework “must be accompanied by four key organisational characteristics: openness, judgement, creativity and flexibility”.

The framework for this is well-defined in the Hermes whitepaper.

Step 4: Allow the accumulation of resources in line with pursuing a final goal state

The accumulation of resources, for our purposes, is what facilitates cognitive enhancement (you’re more likely to achieve your final goal if you can think/act faster, more effectively and more efficiently) and technological perfection (improvements to the infrastructure that facilitates that thinking/acting), which are the things that allow more effective and faster what/when/whether decisions by the business.

Before you can allow the accumulation of resources, you need to understand what you actually mean by “resources”. I would argue that resources are anything that allows the people within the organisation to do their job more easily and/or more effectively and/or with greater happiness. This extends from something as simple and low-key yet enormously valuable as the unlimited book budget we have at Container Solutions all the way up to the much more organisationally challenging structure at FAVI, in which machinists can order new CNC machines without approval from anyone.

We’re going back to Reinventing Organisations for this one, and the “Advice Process” which is used in Teal organisations:

“In principle any person within the organisation can make any decision. But, before doing so, that person must seek advice from all affected parties and people with expertise on the matter. The person is under no obligation to integrate every piece of advice; the point is not to achieve a watered-down compromise that accommodates everybody’s wishes. But advice must be sought and taken into serious consideration. The bigger the decision, the wider the net must be cast - including, when necessary, the CEO or board of directors. Usually the decision maker is the person who noticed the issue or the opportunity or the person most affected by it.”
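The rule in that quote is simple enough to write down as a small Python sketch. This is my own minimal illustration, not something from Reinventing Organisations or any tool we use: the `Decision` class, its fields and the example names are all hypothetical. The point it shows is that the only gate is whether advice has been sought from everyone affected and from the relevant experts, not whether anyone approved.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Decision:
    """A proposed decision under the advice process (hypothetical model)."""
    owner: str          # the person who noticed the issue or opportunity
    summary: str
    affected: Set[str]  # people affected by the decision
    experts: Set[str]   # people with expertise on the matter
    advice: Dict[str, str] = field(default_factory=dict)  # adviser -> advice

    def record_advice(self, adviser: str, text: str) -> None:
        """Note that advice was sought from, and given by, this person."""
        self.advice[adviser] = text

    def still_to_consult(self) -> Set[str]:
        """Everyone whose advice must be sought but is not yet recorded."""
        required = (self.affected | self.experts) - {self.owner}
        return required - set(self.advice)

    def can_proceed(self) -> bool:
        """Advice must be sought from everyone; it need not be followed."""
        return not self.still_to_consult()


# Illustrative example; the roles and advice are made up.
decision = Decision(
    owner="machinist",
    summary="order a new CNC machine",
    affected={"machinist", "finance", "floor lead"},
    experts={"maintenance"},
)
decision.record_advice("finance", "consider leasing rather than buying")
print(sorted(decision.still_to_consult()))  # ['floor lead', 'maintenance']

decision.record_advice("floor lead", "schedule delivery outside the peak run")
decision.record_advice("maintenance", "pick a model we already hold spares for")
print(decision.can_proceed())  # True
```

Notice that nothing in the sketch checks whether the owner agreed with the advice. That is deliberate: the process asks for serious consideration, which a program cannot verify and which comes down to trust.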

This is not about “do whatever the hell you like” and letting, for example, your engineers spin up AWS resources willy-nilly on their corporate cards because they fancied it and couldn’t be bothered to go through the approval process. Rather, it’s about removing the approval processes that form blockers to the fulfilment of the organisation’s evolutionary purpose and trusting that people will communicate in the right way to the right people at the right time.

This fundamentally comes down to the question of whether or not you trust the people that work for you. If you do and you’ve done the work outlined in steps 1-3, there should be no issue with implementing the advice process within your organisation.

Step 5: Continuous expansion of our bounded rationality

Quick reminder of the definition of bounded rationality: “people make quite reasonable decisions based on the information they have. But they don’t have perfect information, especially about more distant parts of the system.”

As much as anything else this is a cultural issue within an organisation. Realistically the only way of expanding the bounds of one’s own rationality is through the acquisition of information and, preferably, knowledge.

The steps outlined in 1-4 should help a great deal with this requirement. Well-defined and co-created mission, vision and values will help people to understand the motivations, desires and utility of others within the organisation and the organisation itself. Setting a framework for the continual review and alteration of established final goals through dialogue between departments (more on this in the next section) will create transparency on how and why people are succeeding or struggling, increasing collaboration and empathy for the work other people do. Unblocking the accumulation of resources, specifically the accumulation of expertise, will build a culture of learning. In combination, as Diana Wright et al. put it quite simply:

“Change comes first from stepping outside the limited information that can be seen from any single place in the system and getting an overview. From a wider perspective, information flows, goals, incentives, and disincentives can be restructured so that separate, bounded, rational actions do add up to results that everyone desires.”

Step 6: Avoid perverse instantiation

Fortunately, unlike the hypothetical superintelligent agent that inspired this piece of work, it’s relatively easy to put the brakes on a group of humans operating within an organisation. The challenge, of course, is to make sure the appropriate methods of communication are in place so that the brakes can be applied in time to prevent an incident.

Unfortunately that falls very much under the column marked “cure”, which is problematic for many reasons, not least that all the time and effort spent getting the team to the point where people begin to scream has gone to waste.

Far better is to work on prevention. In the conclusion of his 2012 paper on superintelligent will, Nick Bostrom points out that:

“...we cannot blithely assume that a superintelligence with the final goal of calculating the decimals of pi (or making paper clips, or counting grains of sand) would limit its activities in such a way as to not materially infringe on human interests. An agent with such a final goal would have a convergent instrumental reason, in many situations, to acquire an unlimited amount of physical resources and, if possible, to eliminate potential threats to itself and its goal system.”

As I alluded to earlier, if departments and teams are allowed to set their own goals and targets in a vacuum and just “start”, problems will inevitably arise, especially if financial incentives are attached to the delivery of those specific personal targets rather than to the overall success of the organisation. Adapting Bostrom’s words, therefore:

“...we cannot blithely assume that a person with a final goal whose attainment would bring financial or professional success would limit their activities in such a way as to not materially infringe on the interests of others or even the business itself. A person with such a final goal would have a convergent instrumental reason, in many situations, to gain control of an unlimited amount of resources and, if possible, to eliminate potential threats to themselves and their goal system.”

It is, therefore, critical that all targets and final goals are set within the context of strong communication and alignment between teams and departments at all levels. Once the overall targets of the organisation have been set, in line with its values, mission and vision and an understanding of its evolutionary purpose, the manner in which the organisation goes about pursuing those goals must be collaborative, meaning that “information flows, goals, incentives, and disincentives can be restructured so that separate, bounded, rational actions do add up to results that everyone desires.”

Step 7: Follow the guiding principles of Coherent Extrapolated Volition

All of the above should be done within the context of adhering to the following principles, which I covered in detail in Part 3 of this series, but as a reminder:

1. Defend humans, the future of humankind, and human nature.

2. Encapsulate moral growth.

3. Humankind should not spend the rest of eternity desperately wishing the programmers had done something differently.

4. Avoid hijacking the future of mankind

5. Avoid creating a motive for modern day humans to fight over the initial dynamic

6. Keep humankind ultimately in charge of its own destiny

7. Help people

Bringing it all together

This has been a lengthy series of blogs. What have we learned?

By breaking Coherent Extrapolated Volition down into its key components, reframing them as questions and examining them through the lens of established strategic thinking, systems thinking and the complexities of AI value alignment, we can structure the way we set our initial dynamic. We can do so in a way that takes into account the inherent unpredictability of organisations, which are fundamentally made up of a bunch of people with their own motivations, beliefs and values. We can minimise the possibility of confusion and stagnation, and facilitate a working culture that allows for quick reactions, incentivises trust and communication and makes selfish acts as impossible as possible.

We can know more and increase the speed of our organisation’s thinking and, by extension, the speed of action by facilitating the continuous expansion of our bounded rationality through, among other things, the unblocking of resource accumulation, the use of the Teal advice process for decision making and establishing relevant, co-created vision, mission and values statements.

We can try to be more the people we wish we were by establishing relevant, co-created vision, mission and values statements informed by a thorough understanding and recognition of the organisation’s evolutionary purpose and making every effort to interrogate our current moral position and how it might (will) grow and change as time goes on.

We can grow up farther together by moving from predict and control to sense and respond, putting a strategic framework like Hermes in place that allows for reflection, dialogue and adjustment to the direction of the organisation based on an understanding of its evolutionary purpose.

All of those things can ensure the extrapolation of our organisation’s volition converges rather than diverges if we always remember that “If we scream, the rules change; if we predictably scream later, the rules change now.”

Finally, by taking steps to avoid perverse instantiation and viewing our organisations as entities with their own volition, whose actions are the result of a host of independent but interrelated processes, we can go some way to making sure wishes cohere rather than interfere, volition is extrapolated as we wish that extrapolated and our plans are interpreted as we wish that interpreted.

The creation of future-proof organisations is impossible. What CEV allows us to do is create the most future-resistant organisations possible, built on trust, communication and value alignment to create a form of instrumental convergence that makes success as much of an inevitability as it can possibly be.

Most importantly of all, however, it creates an organisation that is actually pleasant to work for and with. An organisation employees understand and feel connected to. An organisation in which employees feel served by what the organisation is and what it is trying to do.

