Structure of a 2-Day Workshop on RCM

Day 1 
Session 1 – Introduction to RCM, History and 7 Questions
* Definition of Reliability, RCM and the 7 Vital Questions
* Maintenance Strategies
* Waddington Effect
* Nowlan & Heap’s Failure Patterns
* Inherent Reliability and its improvement strategy
Session 2 — Operating Context and Functions 
* Introduction to Operating Context
* Operating Context for a System
* Elements to be included
* Operating Context and Functions
* 5 general operating contexts
* Operating Context and Functional Failures
Session 3 – Failure Modes and Failure Effects  
* Introduction to Failure Modes
* A few thoughts about data
* Exploring Failure Modes
* 4 Rules for Physical Failure Modes
* Failure Effect
* Evidence that failure is occurring
Session 4 — Failure Consequence and Risk 
* Introduction to Decision Diagram
* Risk assessment — how each failure matters
* Is the function Hidden or Evident?
* The relation of time to Hidden vs Evident failures
* Safety and Environmental Consequences
* Operational and non-operational Consequences
Day 2 
Session 5 — Strategies and Proactive Tasks 
* Introduction to Proactive Tasks and PF interval
* CBM/On-condition tasks
* Scheduled Restoration and Scheduled Discard Tasks
* Determining Task Effectiveness
* Risk and Tolerability
* General Rules for following the decision diagram
Session 6 — Default Actions 
* Introduction to Default Actions
* Default tasks for hidden failures
* Failure Finding Task
* Failure finding Interval
* Design Out Maintenance — to do or to be
* Walk around checks with right timing
Session 7 — RCM Audits 
* Introduction to Audits
* Fundamentals of Technical Audit
* Technical Audit process
* Fundamentals of Management Audit
* General Management Audit process
* What RCM achieves
Session 8 — Setting up a Successful Living Program 
* Using the power of facilitated groups
* RCM Training
* Knowledge development and its process
* Failure Modes and Design Maturity
* RCM during scale up or expansion
* Summary and Conclusion

Rethinking Maintenance Strategy

As of now, maintenance strategy looks remarkably similar to the strategy taken by the medical fraternity, in its themes, concepts and procedures.

If things suddenly go wrong, we just fix the problem as quickly as possible. A person is healthy right up to the point at which the person becomes unhealthy.

That might work fine for simple ailments like a harmless flu, infections, wounds and fractures. It is, in fact, necessary to do so during such infrequent periods of crisis.

But that does not work for more serious diseases or chronic ones.

For such serious and chronic ones, we either go for preventive measures, like general cleanliness, hygiene, good food and restoring normal living conditions, or predictive measures through regular check-ups that detect problems like high or low blood pressure, diabetes and cancer.

Once detected, we treat the symptoms post-haste, resorting to prolonged doses of medication or surgery or both, as in the case of cancer. But unfortunately, the chance of survival, or of prolonging the patient's life, is rather low.

However, it is time we rethink our strategy for maintaining the health of a human being, or of any machine or system.

We may do so by orienting our strategy towards understanding the dynamics of a disease. By doing so, our approach changes radically. For example, let us take Type 2 diabetes, which is becoming a global epidemic. Acute or chronic stress initiates or triggers the disease (Initiator). Poor or inadequate nutrition, or the wrong choice of food, accelerates the process (Accelerator), whereas regular physical exercise retards or slows down the process (Retarder).

It is worth mentioning that the Initiator(s), Accelerator(s) and Retarder(s) act together to produce the changes that trigger unhealthy or undesirable behavior or failure patterns. Such interactions between Initiators, Accelerators and Retarders, which I call ‘imperfections’, change the gene expression that gives rise to the disease, which then often has to be treated over the entire lifecycle of the patient or system with a low probability of success.

The present strategy for fighting diabetes is to modulate insulin levels through oral medication or injections so as to keep blood sugar at an acceptable level. It often proves to be a frustrating process for patients to maintain their blood sugar levels in this manner. More importantly, the present strategy is not geared to reverse Type 2 diabetes or eliminate the disease.

The difference between the two approaches lies in “responding to the symptom” (high blood sugar) vs “responding to the imperfection”, the interaction between Initiators, Accelerators and Retarders. The response to the symptom is made through constant monitoring and action based on the condition of the system, without attempting to take care of the inherent imperfections. The response to the imperfections, on the other hand, involves appropriate and adequate actions around the Initiators, Accelerators and Retarders (IARs), along with monitoring of their presence and levels of severity.

So a successful strategy to reverse diabetes would be to eliminate or avoid the Initiator (or keep it as low as possible), weaken or eliminate the Accelerator, and strengthen or improve the Retarder. A custom-made strategy might be formulated by careful observation and analysis of the dynamics of the patient.
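The three action principles can be sketched as a few lines of code. This is purely an illustrative sketch; the names (`Imperfection`, `formulate_actions`) and the example entries are my own inventions, not part of any published IAR tooling.

```python
from dataclasses import dataclass, field

@dataclass
class Imperfection:
    """An 'imperfection': the interaction of Initiators, Accelerators
    and Retarders behind a disease or failure pattern."""
    failure_mode: str
    initiators: list = field(default_factory=list)
    accelerators: list = field(default_factory=list)
    retarders: list = field(default_factory=list)

def formulate_actions(imp):
    """Apply the three action principles to each identified element."""
    actions = []
    for i in imp.initiators:
        actions.append(f"Eliminate or minimise initiator: {i}")
    for a in imp.accelerators:
        actions.append(f"Weaken or eliminate accelerator: {a}")
    for r in imp.retarders:
        actions.append(f"Strengthen or improve retarder: {r}")
    return actions

# The Type 2 diabetes example from the text
diabetes = Imperfection(
    failure_mode="Type 2 diabetes",
    initiators=["acute or chronic stress"],
    accelerators=["poor or inadequate nutrition"],
    retarders=["regular physical exercise"],
)
plan = formulate_actions(diabetes)
# plan[0] -> "Eliminate or minimise initiator: acute or chronic stress"
```

The same skeleton applies unchanged to a machine, with failure modes such as bearing seizure in place of the disease.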

As a passing note, by following this simple strategy of addressing the “system imperfections”, I could successfully reverse my Type 2 diabetes, which even doctors considered impossible. Moreover, the consequences of diabetes were also reversed.

Fixing diseases as and when they appear is similar to the Breakdown Maintenance strategy that most industries adopt. Clearly, other than in cases where the consequences of a failure are really low, this strategy is not beneficial in terms of maintenance effort, safety, availability or costs.

As a parallel in engineering, tackling diseases through preventive measures is like Preventive Maintenance and Total Productive Maintenance, a highly evolved form of Preventive Maintenance. Though such a strategy can prove very useful for maintaining basic operating conditions, its limitation, as in the case of human beings, is that it does not usually ensure ‘mission reliability’ (a high chance of survival, or of prolonging healthy life to the maximum), as demonstrated by the Waddington Effect. (You may refer to my posts on the Waddington Effect here 1 and here 2.)

Similarly, a predictive strategy in medical science, along with its follow-up actions, is similar to Predictive Maintenance, Condition Based Maintenance and Reliability Centered Maintenance in the engineering discipline. Though we can successfully avoid or eliminate the consequences of failures, any improvement in reliability (extending MTBF, Mean Time Between Failures) or performance is limited by the degree of existing “imperfections” in the system (the system's gene expression), which the above strategies hardly address.

For an illustration of the IAR method, you may like to visit my post — Application of IAR technique.

To summarize, a successful maintenance strategy that aims at zero breakdowns, zero safety and performance failures, and a useful extension of the MTBF of any system may be as follows:

  1. Observe the dynamics of the machine or system. This might be done by observing energy flows, material movements and their dynamics, vibration patterns or failure patterns, or by conducting design audits, etc. Such methods can be employed individually or in combination, depending on the context.
  2. Understand the failures, abnormal behavior or performance patterns from the equipment history, or review the existing equipment maintenance plan.
  3. Identify the Initiators, Accelerators and Retarders (IARs).
  4. Formulate a customized, comprehensive strategy and a detailed maintenance and improvement plan around the identified IARs, keeping in mind the action principles of eliminating, weakening and strengthening the IARs appropriately. This ensures Reliability of Equipment Usage (REU) over the lifecycle of the equipment at the lowest possible cost and effort. The advantage lies in the fact that, once done, REU gives ongoing benefits to a manufacturing plant over the years.
  5. Keep upgrading the maintenance plan, sensors and analysis algorithms based on new evidence and information. This leads to custom-built Artificial Intelligence for the system, which proves invaluable in the long run.
  6. Improve the system in small steps that give measurable benefits.

By Dibyendu De



Predicting Black Swans – Part II

In the earlier post we dealt with the concept of predicting a ‘black swan’.

In this post, I intend to explore the concept a bit further: what exactly do we monitor to notice a ‘black swan’ in time?

In doing so we would be forced to consider the natural response of a system.

The starting point of our exploration is to understand how any system as a whole, whether natural or engineered, is disturbed by a ‘black swan’. A system is disturbed in three possible ways, which are as follows:

a) A system loses energy till it reaches a tipping point

b) A system gains more and more energy till it crosses the point of system resilience

c) A part of a system emits more energy than it normally should, that is, it goes beyond the linear response of the part.

So the natural way to watch a system, so as to expect a ‘black swan’ in time, is to keep tabs on its ‘energy’ in the following ways:

a) Monitor the entropy of the system. As the system functions, its entropy gradually rises till it hits a threshold limit, indicating the appearance of a ‘black swan’ or an outlier.

b) Monitor the energy gain of a system till it crosses the ‘resilience’ point to give birth to a ‘black swan’, outlier or a ‘wicked problem’. 

c) Monitor critical parts of a system for excess emission of energy till it goes beyond the linear response of a part. 

It is useful to remember that energy is transferred in ‘quanta’, or packets of energy. Therefore, it is natural to expect jumps in energy levels, which we record by capturing the different manifestations of energy on monitoring trend charts. So when a ‘jump’ is big enough to cross the threshold limit, resilience point or linear response level, indicated by its falling outside the Gaussian distribution range, we can be quite sure that a ‘black swan’, an outlier or a ‘wicked problem’ will soon arrive on the scene. We call such an indicator a signal.
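As a crude sketch of what capturing such a signal might look like on a trend chart, the rule below flags a reading that jumps outside a k-sigma band computed from a trailing window of recent readings, a simple stand-in for falling outside the Gaussian distribution range. The function name, window size and threshold are illustrative assumptions, not a prescribed algorithm.

```python
import statistics

def detect_signals(readings, window=8, k=3.0):
    """Return indices of readings that deviate from the trailing-window
    mean by more than k standard deviations (a crude 'signal')."""
    signals = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = statistics.fmean(baseline)
        sigma = statistics.stdev(baseline)
        # Guard against a perfectly flat baseline (zero variance)
        if sigma > 0 and abs(readings[i] - mu) > k * sigma:
            signals.append(i)
    return signals

# Steady readings with small noise, then an abrupt energy jump at index 10
trend = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 10.1, 25.0]
detected = detect_signals(trend)  # -> [10]
```

In practice the window and threshold would be tuned to the system being monitored, and the readings could be entropy estimates, energy levels or vibration amplitudes, as in the three cases above.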

Therefore, the central idea is to capture such signals in time, just before a ‘black swan’ makes its way onto the scene to dominate and change the system.

However, the question is how early can we detect that signal to effectively deal with the inherent ‘black swan’ in a system, which is yet to appear on the scene?

That would be explored in the next post.

Dealing with Authority Bias – A Blind Spot

In the old days of the airlines, the captain of an aircraft was the king. His commands could not be doubted or challenged, not even by the co-pilot, who was second in command. Even if the co-pilot noticed an oversight, he could not openly point it out to his captain. Sometimes it was out of fear; sometimes out of respect. Such behavior caused many fatal accidents. For example, when a well-respected captain of a Russian (former USSR) plane allowed his teenage son to take control of the aircraft, his co-pilot lacked the power, or the courage, to point out the obvious flaw in that decision. The young boy made a mistake in controlling the plane, which led to a fatal accident that killed everyone on board. Such blind deference to authority, termed ‘Authority Bias’, was rampant in the airline industry.

Since this behavior was discovered, nearly every airline has instituted something called ‘Crew Resource Management’ (CRM), which coaches pilots and their crews to discuss any reservations they have openly and quickly. This was a very creative way to slowly deprogram the authority bias. Needless to say, CRM has contributed more to flight safety in the past twenty years than any technical advances have.

Many companies are light years away from such foresight. Especially at risk are firms with domineering CEOs and JV partners. In such cases, employees are more likely to keep their opinions and judgments to themselves rather than express or exchange their thoughts and observations openly, much to the detriment of the business. Authorities routinely crave recognition, so they constantly find ways to reinforce their decisions and their status. Slowly this sort of demand on employees leads to what is known as ‘groupthink’ or ‘hivethink’, perhaps one of the most dangerous phenomena to emerge in any organization.

One of my clients was suffering from this ‘Authority Bias’. They were almost bleeding to death. They had a domineering JV partner, who was the technology provider to the firm. The partner would simply not allow anyone else a say in anything; they always had their way, and blamed the client for being lazy, undisciplined and what not. As a result, employees just stopped doing anything on their own. They actively disengaged from work. Performance and profitability nose-dived to the point where the domineering JV partner decided to quit.

Soon after the partner was gone, performance rose to unexpected levels and stayed there. Productivity improved more than tenfold. The organization saw profits for the first time in its five years of operation.

And they managed to survive.

But how does one check for the blind spot of ‘Authority Bias’ in an organization?

Rise and Fall of Nokia in India: Missing Patterns

On 28th March 2013, Nokia’s senior VP (India, Middle East, Asia) D Shivakumar quit the company after serving it for eight long years.

Shiv was known for his personal conviction on the importance of leadership. His conviction ran so deep that he sponsored many leadership programs throughout the region.

However, his tenure in India saw mixed results. While Nokia gained in brand image, it suffered in sales.

Why was that?

Firstly, it completely missed the emerging dual-SIM wave till it was too late. While competitors launched dual-SIM models in quick succession, Nokia had nothing to offer. When it finally entered the market, it was just too late. By that time its competitors had already grabbed 60% of the market, leaving Nokia with little or no elbow room to leverage. This substantially weakened Nokia's leadership position.

Secondly, the company also failed to notice the emergence of smartphones running Android and Apple's iOS.

Nokia paid a price for not noticing two significant new market patterns in time: dual-SIM and smartphones. Its once-enviable 60% market share quickly eroded to less than 40% in a matter of about two years. It now seems that this slide is irreversible.

All because leadership failed to see emerging patterns and act in time. And their aspirations did not match the aspirations of their consumers.

A costly mistake indeed.

Do you think ‘seeing patterns’ is leadership’s number 1 job?


Note: 11th Feb 2014:

That the above analysis, made about a year back, was correct is confirmed by this article dated 11th Feb on Nokia's attempt to stop the slide.

My prediction is that they will still not be able to stage a comeback. They missed a few more vital perspectives in their strategy.

Learning Complexity — Leadership Series – 1

Here is one of the many toys I use in my classes on Leadership in Complexity to demonstrate complexity through play. It is a simple and common toy: a double pendulum. It is interesting to see how interactions between a few elements really do produce complexity. So, the question I ask at the beginning of a session is: ‘Can we predict what is going to happen?’

We have made a video demonstration of it. It is about 5 minutes long. I hope you find it engaging. You may choose to skip it if you like, but I suggest a try. While you are viewing it, mentally start predicting what might happen the next instant…

Predicting Complexity? ( <– click on the adjoining link to view the video)

What do you find?

Is complexity predictable or not?

On the face of it, it appears that it isn't predictable at all. The movements of the loose limbs of the double pendulum simply go crazy. It is nearly impossible to predict. Every time we think something in particular might happen, it usually turns out to be something else. There appear to be no definite patterns to it. It is too random to make sense of. No doubt this is what always happens in complex adaptive systems.

But then I show how complexity can be predicted along with many of its principles.

At first it feels rather strange to realize how all complex systems or complex adaptive systems are inherently predictable as an ensemble in the short run and how they all follow the same rules of the game.

That is really fascinating. It gives us tremendous hope to embrace complexity with faith. There is no point in ignoring complexity, since we are entangled with it every moment of our lives. But once we embrace it, knowing full well how to read it, learn from it and go about it, life is simple indeed. The objective of learning about complexity and applying its principles is to make life simpler, not more complex.
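The double pendulum itself can show this short-run predictability numerically. The sketch below integrates the standard equations of motion for equal masses and rod lengths with a classic fourth-order Runge-Kutta scheme; the parameter choices (release from horizontal, a step of 0.01 s, a perturbation of a millionth of a radian) are my own illustrative assumptions.

```python
import math

G = 9.81  # gravity; unit masses and rod lengths are assumed for simplicity

def deriv(state):
    """Equations of motion for a double pendulum with m1 = m2 = 1, l1 = l2 = 1."""
    t1, w1, t2, w2 = state
    d = t2 - t1
    den = 2.0 - math.cos(d) ** 2
    dw1 = (w1 ** 2 * math.sin(d) * math.cos(d)
           + G * math.sin(t2) * math.cos(d)
           + w2 ** 2 * math.sin(d)
           - 2.0 * G * math.sin(t1)) / den
    dw2 = (-w2 ** 2 * math.sin(d) * math.cos(d)
           + 2.0 * (G * math.sin(t1) * math.cos(d)
                    - w1 ** 2 * math.sin(d)
                    - G * math.sin(t2))) / den
    return (w1, dw1, w2, dw2)

def rk4_step(state, dt):
    """One classic fourth-order Runge-Kutta step."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shift(state, k1, dt / 2))
    k3 = deriv(shift(state, k2, dt / 2))
    k4 = deriv(shift(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + e)
                 for s, a, b, c, e in zip(state, k1, k2, k3, k4))

def simulate(theta1, steps=2000, dt=0.01):
    """Trajectory of the first angle, released from rest with the
    second arm horizontal (theta2 = pi/2)."""
    state = (theta1, 0.0, math.pi / 2, 0.0)
    traj = [state[0]]
    for _ in range(steps):
        state = rk4_step(state, dt)
        traj.append(state[0])
    return traj

# Two releases that differ by a millionth of a radian
a = simulate(math.pi / 2)
b = simulate(math.pi / 2 + 1e-6)
```

In runs of this sketch the two trajectories typically stay together for the first second or so and then diverge completely: the short-run-predictable, long-run-divergent behavior seen in the video.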

That promises us an alternative way to lead our own lives through creativity and adaptation.

This alternative Leadership path can be summed up by three simple rules, which are —

1. Explain what is happening.

2. Institute methods to Foresee what might happen in the short term

3. Envision desired Interventions to make the system flow in the right direction.

Three of the best-designed interventions that I have found are a) Education, b) Interactions and c) Design. These give long-term, ongoing benefits to many.

So what do you feel and think about it?


(I personally thank my colleague Trichur for prototyping complexity through this model.)