Maintenance and Reliability Best Practices by Ramesh Gulati
SECOND EDITION
Foreword by
Terrence O’Hanlon, CMRP
ISBN 978-0-8311-3434-1
Preface and Acknowledgements
to the Second Edition
The first edition of this book came out about three years ago;
since then, it has gone through several printings. The book has become the
most widely read by maintenance, operations, reliability, and safety pro-
fessionals and has been used as the textbook for maintenance and reliabil-
ity curricula in many colleges and universities. In 2011, the book received
the first prize (Gold Award) in the RGVA book competition, in the main-
tenance and reliability category, at MARTS/Chicago.
I want to thank you, my readers, professional friends, and col-
leagues at AEDC and in Jacobs for the book’s success and continued sup-
port. This second edition is the result of your valuable feedback which I
have received through emails, letters, comments, and personal discussions
at many conferences and professional society meetings.
I would like to thank David Hurst, Walt Bishop, Vijay Narain,
Sheila Sullivan, Scott Bartlett, Bart Jones, Lynn Moran, and many others among
my colleagues and management at AEDC/ATA – Jacobs for their contin-
ued support (and also for reviewing the manuscript). I would be unjust if
I didn’t acknowledge and give special thanks to Christopher Mears, a very
young and energetic engineer. Christopher is now head of the continuous
improvement section at AEDC; he has transformed himself into a true
reliability and maintenance expert in just a few years. Christopher and I
spent many hours reviewing and critiquing the manuscript to improve the
quality of the final product. I’m also thankful to Terrence O’Hanlon of
Reliabilityweb for his continued support and encouragement and for writ-
ing a very special foreword for this book.
I don’t want to miss the opportunity to thank John Carleo, Janet
Romano, and Robert Weinstein of Industrial Press, my publisher, for their
editorial support and for their patience with my constantly changing
schedule. And finally, of course, my wife, Prabha, and daughter, Sona, for their
continued support, without which I wouldn’t have finished this valuable
work.
Ramesh Gulati
Chapter 1
Introducing Best Practices
I have not failed. I have found 10,000 ways that won’t work.
— Thomas Edison
In addition, you will also be able to assess your knowledge about the
basics of Maintenance and Reliability by taking a short test.
The notion of a best practice is not new. Frederick Taylor, the father
of modern management, said nearly 100 years ago, “Among the various
methods and implements used in each element of each trade, there is
always one method and one implement which is quicker and better than
any of the rest.” In recent times, this viewpoint has come to be known as
the “one best way” or “best practice.”
“Best practice” is an idea which asserts that there is a technique,
method, or process that is more effective at delivering a desired outcome
than any other technique, method, or process. The idea is that with this
technique, a project or an activity such as maintenance can be completed
with fewer problems and unforeseen complications. Simply, we can say
that a technique, method, or process may be deemed a “best practice”
when it produces superior results. A best practice is typically a document-
ed practice used by the most respected, competitive, and profitable organ-
izations. When implemented appropriately, it should improve perform-
ance and efficiency in a specific area. We also need to understand that
“best practice” is a relative term. To some it may be a routine or a stan-
dard practice; but to others, it may be a best practice because a current
practice or method is not effective in producing the desired results.
History is filled with examples of people who were unwilling to
accept or adopt the industry standard as the best way to do anything. The
enormous technological changes since the Industrial Revolution bear wit-
ness to this fact. For example, at one time horses were considered the best
form of transportation, even after “horseless carriages” were invented.
Today, most people drive a gasoline or diesel vehicle — all improvements
on the original horseless carriage. Yet concerns over oil costs, supplies,
and global warming are driving the next group of transportation improve-
ments.
In the 1968 Summer Olympics, a young athlete named Dick Fosbury
revolutionized the high-jumping technique. Using an approach that
became known as the Fosbury Flop, he won the gold medal by going over
the bar back-first instead of head-first. Had he relied on “standard prac-
tice,” as did all of his fellow competitors, he probably would not have won
the event. Instead, by ignoring standard practice, he raised the perform-
ance bar — literally — for everyone. The purpose of any standard is to
provide a kind of reference. Therefore, that standard must be, “what is
possible?” and not, “what is somebody else doing?”
Benchmark
Process of identifying, sharing, and using knowledge and best
practices. It focuses on how to improve any given business
Best Practices
Techniques, methods, or processes that are more effective at
delivering a desired outcome than any other techniques, methods, or
processes. These are usually documented practices used by the
most respected, competitive, and profitable organizations.
Maintenance
The act of maintaining, or the work of keeping an asset in proper
operating condition.
Reliability
The probability that an asset or item will perform its intended
functions for a specific period of time under stated conditions. It
is usually expressed as a percentage and calculated using Mean
Time Between Failures (MTBF).
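As a rough illustration of how a reliability figure can be derived from MTBF, the sketch below assumes a constant failure rate (the exponential model), under which R(t) = exp(-t / MTBF). That model, the function name, and the sample figures are assumptions made for illustration and are not taken from the text.

```python
import math


def reliability(mission_hours: float, mtbf_hours: float) -> float:
    """Probability that an asset runs for mission_hours without failure.

    Assumes a constant failure rate (exponential model):
    R(t) = exp(-t / MTBF).
    """
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if mission_hours < 0:
        raise ValueError("mission time cannot be negative")
    return math.exp(-mission_hours / mtbf_hours)


# Hypothetical asset: MTBF of 1,000 hours, evaluated over a 100-hour run.
r = reliability(100, 1000)
print(f"Reliability over 100 hours: {r:.1%}")
```

Under these assumed numbers the result is about 90.5%. Other failure distributions (for example, Weibull) would give different values for the same MTBF.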
What are the best practices in the maintenance and reliability (M&R)
area and how could those be implemented to get better results? M&R best
practices are practices that have been demonstrated by organizations that
are leaders in their industry. These companies are the quality producers
with very competitive costs, usually the lowest in their industry. A few
examples (by no means an exhaustive list) of maintenance and reliability
best practice benchmarks along with their related typical world class val-
ues are listed in Figure 1.2.
ly misdirected until they have reviewed and improved their processes, and
applied best practices.
best practices. Operators must see assets as something they own. The
only way this transformation can occur is through education and
empowerment.
In working with many organizations over the years, we’ve noted that
benchmarking is not an easy process, particularly when there are no stan-
dard definitions of terms to benchmark. For example, RAV (replacement
asset value) may not have the same meaning for Organizations A and B; each
may define it differently. This problem has been a major
challenge in M&R-related benchmarking initiatives. The Society for
Maintenance and Reliability Professionals (SMRP) has taken the lead
toward standardized maintenance and reliability terms, definitions, and
metrics. The author—along with Bob Baldwin, retired editor of
Maintenance Technology Magazine, and Jerry Kahn of Siemens—has
also published the manual Professional’s Guide to Maintenance and
Reliability Terminology to standardize the M&R terms.
When measuring performance against known benchmarks of best
practices, we will find that all benchmarks are interconnected and interde-
pendent. This is why an organization must have a clearly defined group of
maintenance and reliability processes to implement best practices.
Tailoring a best practice to suit your needs and working environment is
essential for its successful and effective implementation.
So far, we have discussed just a few examples of best practices and
their benchmarks. Throughout this book, we will be discussing practices
which may be standard, good, or best depending upon where you stand in
your journey for excellence in maintenance and reliability.
Once you complete the test, go to the Appendix and score yourself
appropriately. Try not to guess when answering any of the questions. If
you are uncertain, skip the question and review the text later; otherwise,
your results may give you a false sense of how well you know “best prac-
tices” when it comes to maintenance and reliability.
19. Understanding the known and likely causes of failures can help
design a maintenance strategy for an asset to prevent or predict
failure.
a. True
b. False
20. Reliability can be improved easily after a maintenance plan has been
put into operation.
a. True
b. False
24. The “F” on the P–F Interval indicates that equipment is still
functioning.
a. True
b. False
30. The 6th S in the 6 S (also called 5 S plus) process stands for safety.
a. True
b. False
34. The inventory turnover ratio for MRO store should be:
a. Less than 2
b. Between 4–6
c. Over 6
35. PM compliance is a _________ KPI.
a. Lagging
b. Lagging or Leading
c. Leading
43. Properly training the M&R workforce can increase asset and plant
availability.
a. True
b. False
47. New incoming oil from the supplier is always clean and ready to be used.
a. True
b. False
48. Which phase of the asset life cycle has the highest cost?
a. Design
b. Acquisition
c. O & M
54. The biggest benefit of a Failure Modes and Effects Analysis occurs
during:
a. Operations phase
b. Maintenance phase
c. Design phase
d. None of the above
1.6 Summary
The work force needs to have the knowledge of best practices in the
area of maintenance and reliability to implement them effectively. A com-
mitment to using the best practices in the M&R field and utilizing all the
knowledge and technology at one’s disposal ensures success.
Q1.2 What are the key factors that impact the performance of plant
machinery?
2.1 Introduction
2.2 Key Terms and Definitions
2.3 Leadership and Organizational Culture
2.4 Strategic Framework: Vision, Mission, and Goals
2.5 Change Management
2.6 Reliability Culture
2.7 Measures of Performance
2.8 Summary
2.9 Self Assessment Questions
2.10 References and Suggested Reading
• Organizational culture
• Leadership and its role
• Vision, mission, and goals
• Reliability culture
• Change management and role of change agents
Chapter 2
Culture and Leadership
2.1 Introduction
Change Management
The process of bringing planned change to an organization.
Change management usually means leading an organization
through a series of steps to meet a defined goal. Synonymous
with management of change (MOC).
Culture
A common set of values, beliefs, attitudes, perceptions, and
accepted behaviors shared by individuals within an organization.
Cultural Change
A major shift in the attitudes, norms, sentiments, beliefs, values,
operating principles, and behavior of an organization.
Leadership
An essential organizational role that must establish a clear vision,
communicate that vision to those in the organization, and provide
direction, resources, and knowledge necessary to achieve goals
and accomplish the vision. It may require coordinating and bal-
ancing the conflicting interests of all members or stakeholders.
Mission
An organization’s purpose.
Mission Statement
A broad declaration of the basic, unique purpose, and scope of
operations that distinguishes the organization from others of its
type.
Organizational Culture
The beliefs and values that define how people interpret
experiences and how they behave, both individually and in groups.
Strategy
An action plan that sets the direction for the coordinated use of
resources through programs, projects, policies, and procedures,
as well as organizational design and the establishment of per-
formance standards.
Vision
The achievable dream of what an organization or a person wants
to do and where it wants to go.
Vision Statement
An overarching statement of the way an organization wants to be;
an ideal state of being at a future point.
Recently General (ret.) Colin Powell said that “Leadership is the art
of accomplishing more than the science of management says is possible.” In
fact, observes author Oren Harari, in The Leadership Secrets of Colin
Powell, aspiring business leaders would do well to adopt Powell’s style.
Powell captured his leadership approach within 18 lessons:
Lesson 2: The day soldiers stop bringing you their problems is the day
you have stopped leading them. They have either lost confidence that
you can help them or concluded that you do not care. Either case is a
failure of leadership.
Lesson 4: Don’t be afraid to challenge the pros, even in their own back-
yard.
Lesson 6: You don’t know what you can get away with until you try.
Lesson 9: Organization charts and fancy titles count for next to nothing.
Lesson 10: Never let your ego get so close to your position that when
your position goes, your ego goes with it.
Lesson 11: Fit no stereotypes. Don’t chase the latest management fads.
The situation dictates which approach best accomplishes the team’s mis-
sion.
Lesson 13: Powell’s Rules for Picking People: Look for intelligence
and judgment and, most critically, a capacity to anticipate, to see around
corners. Also look for loyalty, integrity, a high energy drive, a balanced
ego, and the drive to get things done.
Lesson 14: Great leaders are almost always great simplifiers, who can
cut through argument, debate, and doubt, to offer a solution everybody
can understand.
Lesson 15: Part I: Use the formula P = 40-to-70, in which P stands for
the probability of success and the numbers indicate the percentage of
information acquired. Part II: Once the information is in the 40-to-70
range, go with your gut.
Lesson 16: The commander in the field is always right and the rear ech-
elon is wrong, unless proved otherwise.
Lesson 17: Have fun in your command. Don’t always run at a break-
neck pace. Take leave when you’ve earned it. Spend time with your
families. Corollary: Surround yourself with people who take their work
seriously, but not themselves; those who work hard and play hard.
In modern days, Lincoln’s principle of “Get out of the office and cir-
culate among the troops” is known to us as Management by Wandering
Around (MBWA), as dubbed by Tom Peters and Robert Waterman in
their 1982 book, In Search of Excellence. The principle has also been
referred to by other names and phrases, such as “roving leadership,”
“being in touch,” and “get out of the ivory tower.” It is simply the
process of getting out of the office and interacting with people. Peters
and Nancy Austin, in A Passion for Excellence, define MBWA as “the
technologies of the obvious.”
heading and why they should be proud of it. An effective vision empow-
ers people and prepares for the future while also having roots in the past.
Leadership creates vision and energizes people to make organizations
and people successful. Figure 2.3 shows the results of a survey ranking
five key attributes of leadership.
1. Charisma
2. Competence
3. Communication
4. Energizing people
5. Vision (in creating)
Vision
A vision statement is a short, succinct, and inspiring declaration of
what the organization intends to become or to achieve at some point in the
future. Vision refers to the category of intentions that are broad, all-inclu-
sive, and forward-thinking. It is the image that a business must have of its
goals before it sets out to reach them. It describes aspirations for the
future, without specifying the means that will be used to achieve those
desired ends.
Corporate success depends on the vision articulated by the organiza-
tional leaders and management. For a vision to have any impact on the
employees of an organization, it has to be conveyed in a dramatic and
enduring way. The most effective visions are those that inspire, asking
employees for their best and communicating that constantly. A vision
statement is a pronouncement about what an organization wants to
become. It should resonate with all members of the organization and help
them feel proud, excited, and part of something much bigger than them-
selves. A vision statement should stretch the organization’s capabilities
and image of itself. It gives shape and direction to the organization’s
future. Visions range in length from a couple of words to several pages.
Sample vision statements are given in Figure 2.5.
Kraft Foods: “Helping People Around the World Eat and Live Better.”
Wal-Mart: “To give ordinary folk the chance to buy the same things as rich people.”
The Society for Maintenance & Reliability Professionals (SMRP): “To become the global organization known for providing competitive advantage through improved physical asset management.”
The vision must convey the essence of how the organization desires
to accomplish feats that prove to be big, exciting and compelling.
Mission Statements
Mission and vision statements are very different. A mission statement
is an organization’s vision translated into written form. It’s the leader’s
view of the direction and purpose of the organization. For many corporate
leaders, it is a vital element in any attempt to motivate employees and to
give them a sense of priorities.
A mission statement should be a short and concise statement of goals
and priorities. In turn, goals are specific objectives that relate to specific
time periods and are stated in terms of facts. The primary goal of any busi-
ness is to increase stakeholder value. The most important stakeholders are
shareholders who own the business, employees who work for the busi-
ness, and clients or customers who purchase products or services from the
business.
The mission should answer four questions:
What are our values and beliefs? Values and beliefs guide our plans,
decisions, and actions. Values become real when we demonstrate them in
the way we act and the way we insist that others behave. In forward-look-
ing and energized organizations, values are the real boss. They drive and
keep the workforce moving in the right direction.
In his book, First Things First, Stephen Covey points out that mission
statements are often not taken seriously in organizations because they are
developed by top executives, with no buy-in at the lower levels. But it’s a
pretty safe assumption that there probably is buy-in when we develop our
own mission statements.
First Things First is actually about time management, but Covey and
his co-authors use the personal mission statement as an important princi-
ple. The idea is that if we live by a statement of what’s really important to
us, we can make better time-management decisions. The authors ask,
“Why worry about saving minutes when we might be wasting years?”
A mission statement may be valuable, but how in the world do we go
about crafting one? As one way to develop a mission statement, Covey
talks about visualizing your 80th birthday or 50th wedding anniversary
and imagining what all your friends and family would say about you. A
somewhat more morbid, but effective approach is writing your own obit-
uary.
Can we visualize what it would be like if there were no asset failures
or if production met their schedule without any overtime for one month,
even three months, or if there was not a single midnight call for three
months and we were able to sleep without worrying?
Corporate Strategy
Strategy is a very broad term which commonly describes any think-
ing that looks at the bigger picture. Successful organizations are those
that focus their efforts strategically. To meet and exceed customer satis-
faction, the business team needs to follow an overall organizational
strategy. A successful strategy adds value for the targeted customers
over the long run by consistently meeting their needs better than the
competition does.
Strategy is the way in which an organization orients itself towards the
market in which it operates and towards the other companies in the mar-
ketplace against which it competes. It is a plan based on the mission an
organization formulates to gain a sustainable advantage over the competi-
tion.
Setting Goals
The major outcome of strategic planning, after gathering all neces-
sary information, is the setting of goals for the organization based on its
vision and mission statement. A goal is a long-range aim for a specific
period. It must be specific and realistic. Long-range goals set through
strategic planning are translated into activities that will ensure reaching
the goal through operational planning. Examples of M&R goals include
achieving 90% PM compliance or reducing the overtime to less than 5%
in a specified time period.
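The two example goals can be checked against period data with a short calculation. This is a minimal sketch: it assumes PM compliance is measured as PM work orders completed divided by PM work orders scheduled, and overtime as overtime hours divided by total labor hours. Those definitions, the helper name, and all of the figures are illustrative assumptions, since the text gives only the targets.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Hypothetical figures for one period (assumptions for illustration only).
pm_scheduled = 120          # PM work orders due in the period
pm_completed = 111          # PM work orders completed as scheduled
overtime_hours = 180.0
total_labor_hours = 4000.0

pm_compliance = percent(pm_completed, pm_scheduled)
overtime_pct = percent(overtime_hours, total_labor_hours)

# Targets taken from the example goals in the text.
meets_pm_goal = pm_compliance >= 90.0
meets_overtime_goal = overtime_pct < 5.0

print(f"PM compliance: {pm_compliance:.1f}% (goal met: {meets_pm_goal})")
print(f"Overtime: {overtime_pct:.1f}% (goal met: {meets_overtime_goal})")
```

Stating a goal as a number with a time period, as here, is what makes it possible to test whether the goal was reached.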
No matter how many goals are set or how grand the vision, an organ-
ization can go only as far as the organizational culture will allow it. Any
organization that will succeed in a culture change must have some form
of a change management process. Even if the organization does not have
a formal (written-down) change management process, a process will exist
in that organization. However, the success rate of these changes will prob-
ably be limited by the charisma of the leadership or the vision of its team.
Changing a culture from reactive thinking to proactive follows a sim-
ilar process to what was mentioned in Section 2.2 in answering the ques-
tion “What Is Organizational Culture?” It has to be shown why preventive
and condition-based approaches are better than reactive work. Making
people change what they do or how they think takes time. After all, it took
them a long time to build their current habit. People do certain things cer-
tain ways. In turn, when we ask them to do things differently or ask them
to buy into our plan (vision), we are taking them out of their comfort zone.
We must have very convincing reasons for people to change; we must
inspire them to accept change. These reasons would help greatly in the
implementation process. We have found that implementing changes in
small steps or in a small pilot area helps the process. Figure 2.7 shows the
culture change process.
Organizational change management takes a structured approach to
change, helping executive management, business units, and individual
employees make the transition from the current state to the desired future
state. The goal is to assist employees in assimilating change: to minimize
the disruption of expectations and loss of control that can easily lead to
resistance on the part of those who must actually change.
Two key elements of any successful effort to change the culture are:
If an asset fails, it gets fixed quickly, the root cause is determined, and
action is taken to prevent future failures. Facility/asset reliability analysis
is performed on a regular basis to increase uptime. The workforce is
trained and taught to practice reliability-based concepts and best practices
on a continuous basis.
What kind of culture does this plant have? What kind of message is
being delivered to the workforce? It appears that this organization has a
reactive culture. Fixing things is recognized and appreciated.
Now, let us look into another plant with the same breakdown scenario,
but where the sequence of events happened a little differently:
Now let’s consider what happened in this plant. It seems like this plant
was doing fairly well. A lot of things went well during this repair and in
the follow-up actions suggested by the Maintenance Manager. But is
everything going as well as it could? Is the CMMS/EAM system provid-
ing the data we need to make the right decisions? What kind of message
is being delivered to the workforce? What kind of culture is in this plant?
In this plant, the CMMS/EAM system has provided the information
to help make the right decisions. The maintenance manager is emphasiz-
ing failure prevention. It’s a proactive culture — a step in the right direc-
tion.
Now let us look into another plant, a similar type of situation, but
where events happened a little differently. In this case, the plant opera-
tions (Operator) noted that on Valve # P–139:
What happened in this plant? What kind of culture does this organiza-
tion have? In this plant, “Failure” was identified and addressed before it
happened. Additionally,
Competencies
Typical critical skills/competencies for executives include:
Knowledge
Depending on their role, senior leaders need:
Experience
Before assuming leadership roles, leaders might require experience
in:
• Creating a corporate culture.
• Managing a diversity of functional areas.
• Having profit/loss responsibility for a business.
• Global leadership assignments.
• Managing a large operation.
• Playing a key role in a joint venture or merger.
Personal Attributes
Some of the key attributes successful leaders should demonstrate
include:
2.8 Summary
“Your system is perfectly designed to give you the results that you get.”
-- W. Edwards Deming
3.1 Introduction
3.2 Key Terms and Definitions
3.3 Maintenance Approaches
3.4 Other Maintenance Practices
3.5 Maintenance Management System: CMMS
3.6 Maintenance Quality
3.7 Maintenance Assessment and Improvement
3.8 Summary
3.9 Self Assessment Questions
3.10 References and Suggested Reading
• Why do maintenance?
• The objective of maintenance
• Benefits of maintenance
• Types of maintenance approaches
• Purpose of CMMS/EAM
• Maintenance quality challenges
• Importance of assessing your maintenance program regularly
Chapter 3
Understanding Maintenance
3.1 Introduction
This definition implies that the term maintenance includes tasks per-
formed to prevent failures and tasks performed to restore the asset to its
original condition.
However, the new paradigm of maintenance is related to capacity
assurance. With proper maintenance, the capacity of an asset can be real-
ized at the designed level. For example, the designed capacity of produc-
tion equipment of x units per hour could be realized only if the equipment
is operated without considerable downtime for repairs.
An acceptable capacity level is a target capacity level set by manage-
ment. This level cannot be any more than the designed capacity. Consider
production equipment that is designed to make 500 units per hour at a
maintenance cost of $150 per hour. If the equipment is down 10% of the
time at this level of maintenance, the production level will be reduced to
450 units per hour. However, if the maintenance department, working
together with the production department as a team, can find a way to
reduce the downtime from 10% to 5% at a slightly increased maintenance
cost/hour, this reduction will increase the output by another 25 units/hour.
Therefore, it is conceivable that management would be able to justify the
increased maintenance cost. Thus capacity could be increased closer to
designed capacity by reducing downtime.
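The arithmetic in this example can be reproduced with a short sketch. It uses the 500 units per hour and the 10% and 5% downtime figures from the text, and it assumes that output falls in direct proportion to downtime. The function name is hypothetical.

```python
def effective_output(design_rate: float, downtime_fraction: float) -> float:
    """Units per hour actually produced, given the fraction of time the asset is down.

    Assumes output is lost in direct proportion to downtime.
    """
    if not 0.0 <= downtime_fraction <= 1.0:
        raise ValueError("downtime_fraction must be between 0 and 1")
    return design_rate * (1.0 - downtime_fraction)


design_rate = 500.0  # units per hour, from the example in the text

at_10_percent = effective_output(design_rate, 0.10)
at_5_percent = effective_output(design_rate, 0.05)
gain = at_5_percent - at_10_percent

print(f"Output at 10% downtime: {at_10_percent:.0f} units/hour")
print(f"Output at 5% downtime:  {at_5_percent:.0f} units/hour")
print(f"Gain from the reduction: {gain:.0f} units/hour")
```

Whether the extra 25 units per hour justifies the added maintenance cost depends on the value of each unit, which the text does not specify.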
Unfortunately, literature related to maintenance practices over the past
few decades indicates that most companies did not commit the necessary
resources to maintain assets in proper working order. Rather, assets were
allowed to fail; then whatever resources were needed were committed to repair
or replace the failed asset or components. In fact, the maintenance function
was viewed as a necessary evil and did not receive the attention it
deserved.
However, in the last few years, this practice has changed dramatical-
ly. The corporate world has begun recognizing the reality that mainte-
nance does add value. It is very encouraging to see that maintenance is
moving from so-called “backroom” operations to the corporate board
room. A case in point: in the 2006 annual report to investment brokers
on Wall Street, the CEO of Eastman Chemical included a couple of
slides in his presentation related to maintenance and reliability stressing
the company’s strategy of increasing equipment availability by commit-
ting adequate resources for maintenance.
Component
An item or subassembly of an asset, usually modular and replace-
able, sometimes serialized depending on the criticality of its
application; interchangeable with other standard components
such as the belt of a conveyor, the motor of a pump unit, or a bearing.
Maintenance, Backlog
Maintenance tasks that are essential to repair or prevent
equipment failures and that have not yet been completed.
Proactive (Maintenance) Work
The sum of all maintenance work that is completed to avoid fail-
ures or to identify defects that could lead to failures (failure find-
ing). It includes routine preventive and predictive maintenance
activities and work tasks identified from them.
Reliability
The probability that an asset or item will perform its intended
functions for a specific period of time under stated conditions.
• Vibration analysis
• Infrared (IR) thermography
• Acoustic / Ultrasonic — sound level measurements
• Oil analysis
• Electrical — amperage plus other data
• Shock Pulse Method (SPM)
• Partial discharge & Corona detection
• Operational performance data — pressure, temperature, flow
rates, etc.
used, e.g., change filter, adjust drive belts, and take bearing clearance
data. The observers also document the abnormalities and other findings.
These abnormalities should be corrected before they turn into failures for
a PM program to add any value.
These PM inspections can be based on either calendar time or asset
runtime. If CBM is not being performed on a particular piece of equip-
ment, or if CBM cannot detect a particular failure, then the next best
approach is a runtime-based PM program, but only for equipment and
failure modes that have a time basis. If a calendar time-based PM program
is all that really adds value, then that approach is still better than a
run-to-failure strategy. The exception is when an analysis has shown that
run-to-failure is the most cost-effective strategy because the total cost
of the corrective maintenance it requires is less than the cost of the
preventive maintenance it would replace (assuming that run-to-failure has
no safety impact).
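The cost comparison behind a run-to-failure decision can be sketched as follows. All figures, the function names, and the simple events-times-cost model are hypothetical assumptions for illustration. A real analysis would also account for lost production and secondary damage.

```python
def annual_cost(events_per_year: float, cost_per_event: float) -> float:
    """Expected yearly cost of a recurring maintenance event."""
    return events_per_year * cost_per_event


def choose_strategy(rtf_cost: float, pm_cost: float, safety_impact: bool = False) -> str:
    """Pick run-to-failure only when it is cheaper and has no safety impact."""
    if safety_impact:
        return "preventive maintenance"
    return "run-to-failure" if rtf_cost < pm_cost else "preventive maintenance"


# Hypothetical figures for one asset (assumptions for illustration only).
rtf_cost = annual_cost(events_per_year=0.5, cost_per_event=2000.0)  # one failure every two years
pm_cost = (
    annual_cost(events_per_year=12, cost_per_event=150.0)      # monthly PM task
    + annual_cost(events_per_year=0.1, cost_per_event=2000.0)  # failures that PM does not prevent
)

print(f"Run-to-failure: ${rtf_cost:,.0f}/year")
print(f"Preventive:     ${pm_cost:,.0f}/year")
print("Lower-cost strategy:", choose_strategy(rtf_cost, pm_cost))
```

With these assumed numbers run-to-failure costs less, but the `safety_impact` flag overrides the cost result, matching the condition stated in the text.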
The objective of preventive maintenance can be summarized as fol-
lows:
Proactive Maintenance
Proactive Maintenance is one of those terms used to mean different
things to different people. The term can be a controversial one. Some con-
sider CBM and PM approaches to be proactive —these approaches do
take a proactive approach as opposed to simply reacting to equipment fail-
ure. In some cases, tasks that are generated based on what is found during
CBM and PM tasks (including work identified as a result of root cause
and failure analysis) are considered proactive. In some organizations,
proactive maintenance is calculated as a ratio of all maintenance work
minus unscheduled corrective maintenance, divided by all maintenance
work. Another definition is that anything on the maintenance schedule is
proactive — that is, any maintenance work that has been identified in
advance and is planned and scheduled. These last definitions make better
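The ratio definition of proactive maintenance described above (all maintenance work minus unscheduled corrective work, divided by all maintenance work) can be sketched as a short calculation. Measuring work in labor hours, the function name, and the sample figures are assumptions made for illustration.

```python
def proactive_ratio(total_hours: float, unscheduled_corrective_hours: float) -> float:
    """Share of maintenance work that is not unscheduled corrective work."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= unscheduled_corrective_hours <= total_hours:
        raise ValueError("unscheduled hours must be between 0 and total_hours")
    return (total_hours - unscheduled_corrective_hours) / total_hours


# Hypothetical month: 1,000 maintenance labor hours, 250 of them unscheduled corrective.
ratio = proactive_ratio(1000.0, 250.0)
print(f"Proactive maintenance: {ratio:.0%}")
```

Under these assumed figures, 75% of the work counts as proactive.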
The operators use the following four sensory tools to identify prob-
lem areas, then either fix them or get help to get the problems repaired
before they turn into major failures.
a. Look for any abnormalities — clean, in place, accessible.
b. Listen for abnormal noises, vibrations, leaks.
c. Feel for abnormal hot or cold surfaces.
d. Smell abnormal burning or unusual odors.
a) Operator Involvement
Operators can detect any abnormalities and symptoms at an early
stage and get them corrected before they turn into major failures. O&M
personnel can ensure that all the assets are properly secured and bolted
and that the support structures (piping, hoses, guards, etc.) are not loose
or vibrating. These should be properly fastened.
b) Cleaning
Cleaning leads to inspection and timely detection of any incipient fail-
ures like cracks and damaged belts. Dirt and dust conceal small cracks and
leaks. If an asset is clean, we can see whether something is not working
right, e.g., leaking, rubbing, or bolt loosening, which may be an indication
of an incipient failure.
Keep assets and the surrounding area clean. A clean asset creates a
good feeling and improves employee safety and morale.
c) Lubricating
Lubrication helps to slow down wear and tear. Check if components
are being lubricated properly with the correct type of lubricants and that
oil is being changed at the proper frequencies. Don’t over-lubricate; use
the right amount. Ultrasonic guns can be used to ensure the required
amount of lubricant is used. Apply 5S plus or 6S practices to have a lubri-
cation plan, with pictures identifying all lube points and the type of lubri-
cant to be used.
d) Operating Procedures
All operating procedures available at the site should be current. Are
these procedures easily understood? Do operators know how to shut down
or provide lockout / tag out for the asset safely in case of an emergency?
Do they know what operating parameters (pressures, temperatures,
trip/alarm settings, etc.) to watch? Make sure that operators and other
support personnel have a good understanding of the answers to these
questions. It is a good practice and very desirable to have these operating
instructions laminated and attached to the asset.
e) Maintenance Procedures
Be sure that maintenance / repair procedures are current when used.
Maintenance personnel should have the right tools available to perform
maintenance correctly and effectively. Having a current procedure is an
ISO principle.
When an asset is ready to be repaired, all items identified in the work
plan should be staged at the asset site for craft personnel to execute their
work in the most effective and efficient manner. Specialized tools should
be kept at or near the asset with proper markings.
It is a good practice to laminate the procedures, drawings, parts lists,
wiring diagrams, logic diagrams, etc., and make them available at or near
each asset location.
f) Operating Conditions
All assets are designed to operate under specific conditions. Check
that assets are operating in the correct environment and are not being
misused, i.e., overloaded or used unsafely. If they are not being operated
in their designed environment (e.g., they are being run at a much higher
speed than in normal use), take steps to see that appropriate safety
precautions are being taken and all concerned personnel are aware of the
risks involved.
g) Workforce Skills
Ensure that the entire workforce (operators, maintainers, and support
staff) is properly trained and has the right skill sets to operate and
maintain the asset effectively. Although ignorance and lack of skill can be
overcome fairly easily by proper training, the human attitude and mindset
towards asset failure are more difficult to change. It takes a lot of effort
and time to create the right culture.
h) Repair Documentation
Repair documentation — what we did, with some details — is very
important when performing an analysis. We often see entries such as
“Pump broke – repaired” or “Mechanical seal replaced.” Such entries
merely help in maintaining failure statistics, but not in failure analysis.
The challenge is usually how to make data input easy for our crafts
personnel. For a good reliability analysis, we need to have quality data to
understand how the asset was found before and after the failure, what
actions were taken to repair, parts used, time taken to repair, etc.
market, starting from $1,000 to over $250,000 depending upon the num-
ber of users or the size of the plant. Most of these systems are now web-
based. Basically, now there are no major differences in the way both types
of systems function so the terms CMMS and EAM are often used inter-
changeably.
CMMS / EAM systems should have the following capabilities, al-
though they are not limited to them:
Scheduling Module
Scheduling is an area where different CMMS packages provide
significant capabilities. A CMMS should provide a schedule that matches
the work demand for maintenance (open work orders) with the labor
resource availability. Some systems compare the work backlog with a
listing of available hours, all similarly sorted and filtered. Some systems
display this data in graphs to help in workload balancing. A good way to
display this data is a bar graph in the top half of the screen and the list
of work orders in the bottom half.
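The comparison of backlog hours with available hours can be illustrated with a short Python sketch. It is not taken from any particular CMMS package; the function name, the (craft, estimated hours) record layout, and the sample figures are all assumptions made for illustration.

```python
from collections import defaultdict


def workload_balance(work_orders, available_hours):
    """Compare backlog demand with available labor hours for each craft.

    work_orders: iterable of (craft, estimated_hours) pairs (hypothetical layout).
    available_hours: dict mapping craft -> labor hours available in the period.
    Returns a dict: craft -> (demand, available, available - demand).
    """
    demand = defaultdict(float)
    for craft, hours in work_orders:
        demand[craft] += hours

    result = {}
    for craft in sorted(set(demand) | set(available_hours)):
        available = available_hours.get(craft, 0.0)
        needed = demand.get(craft, 0.0)
        # A negative margin means the craft is overloaded for the period.
        result[craft] = (needed, available, available - needed)
    return result


# Illustrative data only.
backlog = [("mechanical", 30), ("electrical", 12), ("mechanical", 25)]
availability = {"mechanical": 40, "electrical": 16}

for craft, (needed, available, margin) in workload_balance(backlog, availability).items():
    print(f"{craft}: demand {needed} h, available {available} h, margin {margin} h")
```

A negative margin flags a craft whose backlog exceeds the hours available, which is the kind of imbalance a scheduler would want to see in the bar graph.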
Some CMMS packages have increased their level of sophistication
through seamless linkage to home-grown or third-party project management
software. This gives users access to comprehensive features such as
critical path analysis, Gantt charting, and resource utilization optimization.
Probably the most exciting breakthrough in scheduling functionality
is the ability to perform “what-if” analysis. By playing with variables such
as estimated duration of work, work order priority, and labor availability,
the maintenance scheduler can fine-tune the schedule without having to
make a permanent change in the source data. Only after the scheduler and
craft supervisors are satisfied with the schedule, the data is frozen and the
Mobile Technology
The popularity of mobile technology continues to rise as more users
realize its power. Meanwhile, the telecommunications networks continue
to expand their geographic reach and their ability to handle interference.
Handheld devices are also improving in terms of functionality and afford-
ability. Much of the functionality of a desktop terminal can be put in the
hands of a mobile user, including uploading and downloading work order
and spare parts inventory information, accessing equipment history and
reports, and even viewing or redlining drawings and maps. Mobile
technology is one of the most important trends being adopted in the
CMMS industry, just as BlackBerry devices, smartphones, and iPads took
the business world to a whole new level.
System Affordability
The need for and use of a CMMS is not specific to any one industry
or type of application. Any organization using assets to make products or
provide services is a potential candidate for a CMMS.
Computerized systems are becoming more attractive as more mainte-
nance personnel have become computer literate and prices of hardware
and software have dropped significantly. These factors make a CMMS an
attractive option for even smaller plants. CMMS packages are available in
modular format. In other words, organizations don’t have to buy all the
modules and options. For example, smaller plants can purchase only the
asset, PM, and work order modules to start. They can add other modules
later on. Also, many CMMS programs are designed with scaled-down
functionalities for smaller plants. These programs are fully functional and
relatively inexpensive. However, organizations must determine if a
CMMS is beneficial to their operations and have buy-in from all
stakeholders.
Workforce average age and continuity of the organization’s knowledge
base are other important issues to consider. How much information
will leave the company when a key maintenance employee retires? Years
of critical information can be lost the moment that employee walks out the
door.
System Features
There are numerous features that the system should include. One such
feature is flexibility. The CMMS should be flexible enough to allow
users to enter information pertaining to their organization. It should also
accommodate both present and future needs. Organizations should also be
Ease of Use
The CMMS should be easy to learn and should come with training
aids and documentation. It should also be easy to use. The package should
be icon and menu driven, contain input screens to enter information in an
orderly manner, and provide error handling and context-sensitive help.
Vendor Support
Consider the qualifications of the potential CMMS vendors.
Obviously we want a vendor who is both knowledgeable and experienced
when it comes to CMMS. Also consider the vendor’s financial strength. A
CMMS project is an investment in time, resources, and money. Therefore,
the vendor must be established. Ask about references, delivery, payment
options, source code, and warranty.
Also, investigate the level of vendor support for training. Whether this
training is provided at their facility or on site, this small investment can
save a great deal of money and frustration in the long run. Other factors
to consider include the vendor’s system support, upgrade policy, and over-
all system cost. Select the vendor that provides the best combination of
characteristics for your particular situation.
The bottom line is that there is a need for a CMMS for maintenance no
matter how small or large the plant is. We should be aware of the barriers
and be well prepared to face them during the justification process. We can
avoid failure by looking at why so many installations have failed and by
making the right selection for the organization’s application.
It is said that “Accidents do not happen, they are caused”. The same
is true for asset failure. Assets fail for basically two reasons: poor
design and human error. Our negligence, ignorance, and attitude are the
prime factors in human error. Several studies have indicated that over 70
percent of failures are caused by human errors such as overloading,
operational errors, ignoring failure symptoms, not repairing an asset when
it needs attention, and inadequate workforce skill levels. There is
usually a human factor behind most asset failures. Because most failures
are caused and do not happen independently, they are preventable.
If a survey is taken among operations and maintenance personnel as
to whether there can be zero failures, the overwhelming answer will be
that zero failures are theoretically possible but impossible in an actual work
environment. Yes, zero failures are difficult to achieve, but they may not
Some key maintenance metrics, with some benchmark data, are listed in
Figure 3.1.
3.8 Summary
Q3.6 List five maintenance metrics and discuss why they are
important.
Q3.7 Name five PdM technologies. Discuss how they can help
reduce maintenance costs.
4.1 Introduction
4.2 Key Terms and Definitions
4.3 Work Flow and Roles
4.4 Work Classification and Prioritization
4.5 Planning Process
4.6 Scheduling Process
4.7 Turnarounds and Shutdowns
4.8 Measures of Performance
4.9 Summary
4.10 Self Assessment Questions
4.11 References and Suggested Reading
4.1 Introduction
Coordinators
Individuals who oversee the execution of all work within a facility,
including maintenance. They are accountable to the asset or process
owner for ensuring that the asset or process is available to perform its
function in a safe and efficient manner and for helping to prioritize the
work according to operational needs.
Planned Work
Work that has gone through a formal planning process to identi-
fy labor, materials, tools, work sequence, safety requirements,
etc., to perform that work effectively. This information is assem-
bled into a job plan or work package and is communicated to craft
workers prior to the start of the work.
Planner
A dedicated role with the single function of planning work tasks
and activities.
Planning
The process of determining the resources and method needed
including safety precautions, tools, skills, and time necessary to
perform maintenance work efficiently and effectively. Planning is
Schedule Compliance
A measure of adherence to the schedule. It is calculated as the
number of scheduled jobs (or scheduled labor hours) actually
accomplished during the period covered by an approved daily/weekly
schedule, divided by the total number of jobs (or labor hours) on that
schedule, expressed as a percentage.
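As a minimal sketch of the schedule compliance calculation in Python: the function name and the sample numbers are illustrative assumptions, and the denominator is taken to be the total number of jobs (or labor hours) on the approved schedule.

```python
def schedule_compliance(completed_scheduled, total_scheduled):
    """Schedule compliance as a percentage.

    completed_scheduled: scheduled jobs (or labor hours) actually accomplished.
    total_scheduled: all jobs (or labor hours) on the approved schedule.
    Both arguments must use the same unit (jobs or hours).
    """
    if total_scheduled <= 0:
        raise ValueError("total_scheduled must be positive")
    return 100.0 * completed_scheduled / total_scheduled


# Illustrative week: 45 of 50 scheduled jobs were completed.
print(f"Schedule compliance: {schedule_compliance(45, 50):.1f}%")
```

Unscheduled work completed during the period is deliberately left out of the numerator, since the measure concerns adherence to the approved schedule only.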
Scheduled Work
The work that has been identified in advance and is logged in a
schedule so that it may be accomplished in a timely manner based
upon its criticality.
Schedulers
Individuals who establish daily, weekly, monthly, and/or rolling
yearly maintenance work schedules of executable work in their
facility. The schedule includes who will perform the work and when it
will be performed. The schedule is developed in concert
with the maintenance craft supervisor and operations.
Scheduling
The process of determining which jobs get worked on, when, and
by whom based on the priority, the resources, and asset availability.
The scheduling process should take place before the job is executed. In
short, scheduling defines when the work tasks are executed and by whom.
Turnaround
Planned shutdown of equipment, production line, or process unit
to clean, change catalyst, and make repairs, etc., after a normal
run. Duration is usually in days or weeks; it is the elapsed time
between unit shutdown and putting the unit on-stream/online
again.
Work Management: Planning and Scheduling 87
Work Plan
An information packet, sometimes called a job or work package,
provided to the worker; it contains job specific requirements such
as task descriptions sequenced in steps; job specific instructions;
and safety permits/ procedures, drawings, materials, and tools
required to perform the job effectively.
Figure 4.3 illustrates the work flow and key players in the mainte-
nance work flow process. The following are the key players in this
process:
• Coordinator—Asset / Resource
• Planner
• Scheduler
• Configuration Specialist/Systems Engineer
• Craft Supervisor
• Work Performer
As the work order (WO) gets routed from one stage to another, a WO
status is assigned based on what’s being done to that WO. Figure 4.4 is a
suggested list of Work Order Status Codes. In addition, work type, as sug-
gested in Figure 4.5, is also assigned by the coordinator or the
planner/scheduler. It is a good practice to code the work orders to help
analyze the data for improvements. More will be said about work order
classification in the next section.
Maintenance planners plan the job and create a work plan or job pack-
age that consists of what work needs to be done; how it will be done; what
materials, tools, or special equipment are needed; estimated time; and
skills required. The planners need to identify long-delivery items and work
with stores and purchasing personnel to ensure timely delivery. Planners
may need to work with maintenance / systems engineers and craft
supervisors for technical support to ensure that the work plan is feasible
and has sufficient technical detail.
Maintenance schedulers—in working with the craft supervisor, coor-
dinator, and other support staff—develop weekly, monthly, and rolling
annual long-range plans to execute maintenance work. They are more
concerned with when the job should be executed in order to optimize the
available resources with the work at hand.
Craft supervisors take the weekly schedule and assign who will do the
job on a daily basis. In addition, they also review work plans from an exe-
cution point of view and recommend necessary changes in work plans to
the planner and the scheduler. It is also their responsibility to ensure that
high work quality is maintained and that details of completed work are
documented properly in the system.
Figure 4.6 illustrates a work flow process with its key elements; it
includes an example of a productivity report based on delay hours
reported.
PM—Run-Based
PM—Run-Based Maintenance (RBM) is typically the next step up
from calendar-based maintenance. It involves performing PMs based on
asset cycles or runtime. Intuitively, this approach makes sense. An asset
does not have to be checked repeatedly if it has not been used. Generally
speaking, for some failure modes it is the actual operation of the asset
that wears it down, so it makes sense to check the asset after it has been
working for a specified amount of time, long enough to cause some wear.
At that point, it may be necessary either to adjust or replace the component.
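The run-based trigger can be sketched in a few lines of Python. The function names, the runtime-meter inputs, and the 500-hour interval are hypothetical; an actual CMMS would read these values from its own meter records.

```python
def hours_until_pm(current_runtime, runtime_at_last_pm, pm_interval):
    """Runtime hours remaining before the next run-based PM (negative if overdue)."""
    return pm_interval - (current_runtime - runtime_at_last_pm)


def pm_due(current_runtime, runtime_at_last_pm, pm_interval):
    """True when the asset has run at least one full PM interval since the last PM."""
    return hours_until_pm(current_runtime, runtime_at_last_pm, pm_interval) <= 0


# Illustrative values: PM every 500 running hours, last done at 12,000 hours.
print(pm_due(12_350, 12_000, 500))          # asset has run 350 h since last PM
print(hours_until_pm(12_350, 12_000, 500))  # hours of running left before PM
```

An idle asset accumulates no runtime, so its PM does not come due, which is the advantage over a purely calendar-based trigger.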
PM—Condition-Based
PM—Condition-Based Maintenance (CBM), also known as
Predictive Maintenance (PdM), attempts to evaluate the condition of an
asset by performing periodic or continuous asset monitoring. This
approach is the next level up from runtime-based maintenance. The ulti-
mate goal of CBM is to perform maintenance at a scheduled point in time
when the maintenance activity is most cost effective yet before the asset
fails in-service. The “predictive” component stems from the goal of pre-
dicting the future trend of the asset’s condition. This approach uses prin-
ciples of statistical process control and trend analysis to determine at what
point in the future maintenance activities will be appropriate and cost
effective.
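The trend-analysis idea can be illustrated with a simple straight-line extrapolation in Python. This is only a sketch under stated assumptions: it fits an ordinary least-squares line to periodic readings (for example, vibration amplitude) and estimates when the line reaches an alarm limit. The function name, the readings, and the limit are hypothetical, and practical CBM programs generally use richer statistical models than a single straight line.

```python
def time_to_limit(times, readings, limit):
    """Estimate when a rising condition reading will reach an alarm limit.

    Fits a least-squares straight line to (time, reading) pairs and returns
    the time at which that line crosses `limit`. Returns None when the
    fitted trend is flat or falling, since no crossing can be projected.
    """
    n = len(times)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two matching time/reading pairs")

    mean_t = sum(times) / n
    mean_y = sum(readings) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be identical")

    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, readings)) / sxx
    intercept = mean_y - slope * mean_t

    if slope <= 0:
        return None
    return (limit - intercept) / slope


# Illustrative monthly readings (months 0-3) against an alarm limit of 5.0.
months = [0, 1, 2, 3]
amplitude = [2.0, 2.5, 3.0, 3.5]
print(time_to_limit(months, amplitude, 5.0))
```

The projected crossing time gives the planner a window in which to schedule the work before the asset fails in service.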
PM—Operator-Based
PM—Operator-Based Maintenance (OBM) uses the fact that opera-
tors are often the first line of defense against unplanned asset downtime.
OBM assumes that the operators who are in daily contact with the assets
can use their knowledge and skills to predict and prevent breakdowns and
other losses. OBM is synonymous with autonomous maintenance, one of
the basic pillars of Total Productive Maintenance (TPM). TPM is a
Japanese maintenance philosophy that involves operators performing
some basic maintenance activities. The operators learn the maintenance
skills they need through the training program and use those skills on a
daily basis during operations.
Job Priority
Priority codes allow ranking of work orders to get work accom-
plished in order of importance. Too many organizations neglect the ben-
efits of a clearly-defined prioritization system. Organizational discipline
that comes through communication, education, and management support
is key to the correct usage of priority codes.
Many organizations have more than one prioritization system; however,
most of these systems have been found to be ineffective. The drawbacks of
not clearly defining the priorities include:
The original priority of the work orders needs to be set by the origi-
nator of the work order and should be validated by the coordinator. The
work originator is the most qualified to make an initial assessment of asset
criticality and impact of the work. Listings of major assets and their crit-
icality will help in decision-making for final priority ranking. Lower crit-
icality items or areas will then be easier to recognize. The following cri-
teria can be used to assign asset criticality and work impact (if not correct-
ed), which can then be used to make an objective assessment of overall
job priority.
Asset Criticality
Criticality # Description
WO #2:
Asset Criticality of 4 and Work Impact of 4 gives an overall
job priority of 16.
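The WO #2 arithmetic is consistent with multiplying the two ratings, and that is the assumption behind the short Python sketch below. The function name is hypothetical, and the valid range of each rating scale is left open because the rating tables are not reproduced here.

```python
def job_priority(asset_criticality, work_impact):
    """Overall job priority as the product of the two ratings (assumed rule)."""
    if asset_criticality < 1 or work_impact < 1:
        raise ValueError("ratings are assumed to start at 1")
    return asset_criticality * work_impact


# WO #2 from the text: criticality 4 and work impact 4.
print(job_priority(4, 4))
```

Multiplying the ratings spreads work orders over a wider range of values than either scale alone, which makes ties between work orders less frequent.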
Backlog Management
The combination of work classification and job priority allows an
organization to make sense of its maintenance backlog. A maintenance
backlog is, very simply, the essential maintenance tasks to repair or
prevent equipment failures that have not yet been completed. By
classifying these maintenance tasks into different categories and then
prioritizing within those categories, maintenance backlogs can be
developed from an overall organizational perspective or within smaller
organizational groups or categories (e.g., PM, CM).
Why manage your backlog at all? Why not just work whatever
maintenance tasks come due? The more an organization moves toward
proactive maintenance, the more likely it is, at least in the beginning,
that the organization will identify more maintenance tasks than can
possibly be addressed within the immediate time period (typically that
week). Therefore, to keep from addressing low priority tasks or categories
of work that will not have the largest impact on the overall reliability of
the organization, a backlog management system must be developed. An
effective backlog management system, in turn, requires appropriate work
classification and priority.
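A backlog organized by work classification and then by priority can be sketched as follows in Python. The dictionary keys, the work class labels, and the convention that a larger priority number means more urgent work are all assumptions for illustration; an organization using the opposite convention would simply reverse the sort.

```python
from collections import defaultdict


def build_backlog(work_orders):
    """Group open work orders by work class, most urgent first within each class.

    work_orders: list of dicts with hypothetical keys "id", "work_class",
    and "priority". Assumes a larger priority number means more urgent.
    """
    backlog = defaultdict(list)
    for wo in work_orders:
        backlog[wo["work_class"]].append(wo)
    for orders in backlog.values():
        orders.sort(key=lambda wo: wo["priority"], reverse=True)
    return dict(backlog)


# Illustrative open work orders.
open_orders = [
    {"id": 1, "work_class": "PM", "priority": 6},
    {"id": 2, "work_class": "CM", "priority": 16},
    {"id": 3, "work_class": "PM", "priority": 12},
]

for work_class, orders in build_backlog(open_orders).items():
    print(work_class, [wo["id"] for wo in orders])
```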
Basics of Planning
Planning defines what work will be accomplished and how.
Scheduling identifies when the work will be completed and who will do
it. Planning and scheduling are dependent on one another to be effective.
However, planning is the first step. The ultimate goal of the planning
process is to identify and prepare a maintenance craft person with the
tools and resources to accomplish this work in a timely and efficient man-
ner. In other words, planning provides maintenance craft workers with
everything they need to complete the task efficiently.
Many maintenance engineers and managers consider planning to be
nothing more than job estimating and work scheduling. This is not true.
Planning is the key enabler in reducing waste and nonproductive time,
thereby improving productivity of the maintenance workforce. Many
organizations have started considering planning to be an important
function.
However, they realize that proper planning is not an easy task to do.
It takes time to do it right. The time needed to plan a job properly can be
considerable, but it has a high rate of return. Many studies, including
work by Doc Palmer, noted author of Maintenance Planning and Scheduling
Handbook, as well as the author’s own experience, document that proper
planning can save 1–3 times the resources in job execution. If a
maintenance job is repeatable, as most are, then it is essential to plan
the work properly because it will have a much higher rate of return.
Consider a maintenance shop AB where most of the work is per-
formed on a reactive basis. The shop has no planner or scheduler on the
staff. It has:
• 20 maintenance craft personnel
• 0 planner/schedulers
• 1 supervisor
• Estimated wrench time = 30%
lowing staff:
The XY shop has performed 156 hours (396 – 240) of additional work
with the same number of personnel as AB shop. This equates to a 65%
increase in resources or 13 more people on the staff.
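The arithmetic behind this comparison can be checked with a short Python sketch. The 240-hour figure for the AB shop is consistent with 20 craft personnel working a 40-hour week at 30% wrench time; the 40-hour week is an assumption here, and the 396-hour figure for the XY shop is taken directly from the text.

```python
HOURS_PER_WEEK = 40  # assumption: one 40-hour work week per craft person


def wrench_hours(craft_count, wrench_time, hours_per_week=HOURS_PER_WEEK):
    """Productive (wrench) hours per week for a crew."""
    return craft_count * hours_per_week * wrench_time


ab_hours = wrench_hours(20, 0.30)       # AB shop: 20 craft at 30% wrench time
xy_hours = 396                          # XY shop figure quoted in the text
additional_hours = xy_hours - ab_hours  # extra productive work per week
increase = additional_hours / ab_hours  # relative gain over the AB shop
equivalent_people = increase * 20       # AB-shop staff needed to match the gain

print(f"AB shop wrench hours: {ab_hours:.0f}")
print(f"Additional hours:     {additional_hours:.0f}")
print(f"Increase:             {increase:.0%}")
print(f"Equivalent staff:     {equivalent_people:.0f}")
```

The gain comes entirely from a higher wrench time, not from more people on the staff.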
But as mentioned earlier, planning requires more than just changing
personnel from a craft function to a planner/scheduler function.
Planner/schedulers must have the skills and experience to understand the
different types of work and the various details that need to be organized
and assembled for each specific task (skills and resources, steps and
procedures, parts and tools).
Understanding Work
The work to be performed needs to be clearly understood. If the scope
of the work has not been defined clearly, the maintenance planner must
talk to the requester, visit the job site, and identify what steps, procedures,
specifications, and tools are required to perform the job correctly. If the
job is too large or complicated, it may have to be broken down into small-
er sub-tasks for ease of estimating and planning.
basic understanding and knowledge of their trade and plant assets will
determine the level of detailed steps and work instructions required in the
planning process. Highly-skilled workforces may not need detailed
instructions. Job estimating can become easier and potentially more accu-
rate when the jobs are broken down into smaller elements. Long and com-
plex jobs can be difficult to estimate as a whole.
A job standards database such as Means Standards or other standard
benchmarks can be used to estimate jobs. It is a good practice to build a
labor standards library for specific jobs, e.g., removing/installing motors
(5–50 HP, 100–500 HP), replacing brake shoes on an overhead crane or
forklift, or aligning a pump–motor unit. Predetermined motion times, time
studies, and slotting techniques can be used to develop good estimates if
tasks are repetitive in nature. An estimate should include work content,
travel time, and personal and fatigue allowances.
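A job estimate built from these components might be sketched as follows in Python. How the allowance is applied is an assumption: here the personal and fatigue allowance is a fraction added to the work content only, with travel time added on top. The function name and the sample figures are illustrative.

```python
def estimate_job_hours(work_content_hours, travel_hours, allowance_fraction):
    """Total estimated job hours.

    work_content_hours: time for the hands-on work itself.
    travel_hours: time to get to and from the job site.
    allowance_fraction: personal and fatigue allowance as a fraction of
        work content (e.g., 0.15 for 15%); an assumed convention.
    """
    if min(work_content_hours, travel_hours, allowance_fraction) < 0:
        raise ValueError("inputs must be non-negative")
    return work_content_hours * (1 + allowance_fraction) + travel_hours


# Illustrative job: 4.0 h of work content, 0.5 h travel, 15% allowance.
print(f"{estimate_job_hours(4.0, 0.5, 0.15):.2f} hours")
```

For a large job broken into sub-tasks, the same function can be applied to each sub-task and the results summed.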
The following are essential to good estimating practices:
Craft Type:
Optional Parts:
(These parts are not required but could be needed if they are worn out)
Part# 3311 Bolts, Coupling (9-16 x 3); Location: Free Bin, Shop
Procedure:
Step 1: Lock out / tag out (see attached procedure for details).
Step 2: Disconnect motor, mark/label wires.
Step 3: Unbolt coupling, inspect coupling and remove motor bolts.
Step 4: Remove motor using jib crane available.
Step 5: Install new motor (check motor is rotating freely).
Step 6: Bolt motor and check for soft foot; record and correct any soft
foot findings.
Step 7: Install coupling, bolt motor (torque bolts to xx ft. lbs) and align them
using dial gauge or laser within acceptable range +/- 0.xxx (organiza-
tion standard).
Step 8: Remove lock out / tag out.
Step 9: Connect the motor and check for right rotation.
Step 10: Test run.
Step 11: Clean up and return asset to service.
Step 12: Close out the work order in CMMS detailing what was done.
due dates.
Once a job is on the schedule, the materials list should go to the MRO
store for parts kitting and material staging before the specified schedule
date. In many organizations, the CMMS / EAM system does this work
automatically. In addition, the job work package will be delivered or made
available to the individuals who will execute the job.
When the scheduled time for the job arrives, the maintenance person-
nel will have everything they need for the job:
Attribute | Capital Projects | Turnarounds
Planning & scheduling | Can be planned and scheduled well in advance | Planning & scheduling can’t be finalized until scope is approved
Manpower staffing | Fixed; usually doesn’t change much | Variable; changes a lot during execution due to scope fluctuations
Schedule update | Weekly or bi-monthly | Shift and daily basis
projects because of the loss of production and the expense of the turn-
around itself. They can be complex, especially in terms of shared
resources; as the complexity increases, they become more costly and dif-
ficult to manage. Scheduled shutdowns usually are of a short duration and
high intensity. They can consume an equivalent cost of a yearly mainte-
nance budget in just a few weeks. They also require the greatest percent-
age of the yearly process outage days. Controlling turnaround costs and
duration represents a challenge.
A shutdown always has a negative financial impact. This negative
impact is due to both loss of production revenue and a major cash outlay
for the shutdown expenses. The positive side is not as obvious; therefore,
it is often overlooked. The positive impacts are an increase in asset relia-
bility, continued production integrity, investment in infrastructure, and a
reduction in the risk of unscheduled outages or catastrophic failure.
Scope management is one of the major challenges in a turnaround.
The scope will change, sometimes dramatically, and it will impact the
schedule. Typically, scope is developed based on information gathered
from operating parameters, capital investments, preventive maintenance
actions, and predictive tools. Sometimes, we don’t have a good under-
standing of the scope until an asset or system is opened for inspection. As
an asset is opened, cleaned, and inspected, the extent of required repairs
can be determined and planned.
There are distinct differences between turnaround maintenance work
and capital projects. Work scope is well defined in capital projects;
however, in turnarounds, scope is dynamic and fluctuates a lot. Figure 4.10
lists the major differences between capital projects and turnarounds.
Identifying and appointing a Turnaround Planner well in advance,
maybe six to eight months, is a good practice. This planner helps to devel-
op the scope, integrate the full scope of work including resources, and
assure readiness for execution of the turnaround. Similarly, identifying
and appointing a Turnaround Manager well in advance, maybe three to
four months, is also a good practice. The Turnaround Manager should
have the delegated authority to lead the turnaround effort to a successful
conclusion. In some organizations, new turnaround managers and plan-
ners get appointed just after completion of the last turnaround, as an ongo-
ing process to begin planning for the next turnaround. Lessons learned
from the previous turnaround are then transferred to the planning and exe-
cution of the next turnaround.
4.9 Summary
Q4.2 Explain each role as shown in the workflow chart from Q4.1.
Q4.8 What are the key differences between planning and scheduling
processes?
Q4.10 What are the key differences between capital projects and
turnarounds?
5.1 Introduction
5.2 Key Terms and Definitions
5.3 Types of Inventory
5.4 Physical Layout and Storage Equipment
5.5 Optimizing Tools and Techniques
5.6 Measures of Performance
5.7 Summary
5.8 Self Assessment Questions
5.9 References and Suggested Reading
5.1 Introduction
MRO
Maintenance, Repair, and Operations. Sometimes the “O” refers to
Overhaul.
Store
Maintenance, repair, and operations store; it stocks all the material
and spare parts required to support maintenance and operations.
Spare Parts
Replacement items found on a bill of material and/or in a CMMS /
inventory management system that may or may not be kept in inventory
to prevent excessive downtime in case of a breakdown.
Stratification
A technique that separates data gathered from a variety of sources so
that a pattern can be seen.
Inventory Classifications
Inventories can be classified into three major categories based on their
usage rate:
1. Active inventory
2. Infrequently used inventory
3. Rarely used inventory
Materials, Parts, and Inventory Management 123
[Figure: bar chart of percent (0–120) for number of items, value ($), and number of transactions, comparing active items with infrequently/rarely used items.]
[Figure: ABC curve plotting percentage of dollar value (0–100) against percentage of inventory items (0–100), with regions A, B, and C.]
A items = Value over $1,000 and usage rate less than 6/year
B items = Value of $100–999 and usage rate over 6/year
C items = Value below $100 and usage rate over 12/year
B 10010 Hydraulic Cylinder # xxxx $850.00 6 10 4 $3,440.00 $23,630.00 22% 109 13%
C 10008 “O” rings - misc sizes kit $1.90 200 680 80 $152.00
C 10014 Utility Supply misc. $1.30 600 2000 340 $442.00 $5,678.00 5% 727 86%
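The A, B, and C rules in the legend above can be expressed as a small Python function. This is a sketch only: the function name is hypothetical, the B band is read as $100 up to (but not including) $1,000, and combinations the legend does not cover (for example, a high-value item with a high usage rate) are returned as None rather than forced into a class.

```python
def abc_class(unit_value, annual_usage):
    """Classify a stock item using the value and usage rules from the legend.

    unit_value: item value in dollars.
    annual_usage: number of issues per year.
    Returns "A", "B", "C", or None when no rule in the legend applies.
    """
    if unit_value > 1000 and annual_usage < 6:
        return "A"
    if 100 <= unit_value < 1000 and annual_usage > 6:
        return "B"
    if unit_value < 100 and annual_usage > 12:
        return "C"
    return None


# Illustrative items (values and usage rates are made up).
for name, value, usage in [
    ("large gearbox", 2500.00, 2),
    ("hydraulic cylinder", 850.00, 10),
    ("O-ring kit", 1.90, 200),
]:
    print(name, abc_class(value, usage))
```

Items that come back as None are candidates for a case-by-case review, since the legend alone does not say how to treat them.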
Two different storerooms are shown in Figure 5.5. On the left is a
disorganized storeroom and on the right is clearly a well-maintained
storeroom. It will be difficult to find an item in the one on the left. A typical
material flow in a facility with a storeroom is shown in Figure 5.6. When
designing the storeroom, ensure that material flow is smooth and reduces
travel and procurement time.
Storage Equipment
Parts storage equipment can generally be broken down into two main
categories: man to part and part to man. The first category, man to part,
will be most familiar to personnel and consists of storage standbys like
pallet racks, shelving, and bin storage. In this arrangement, we go to the
part to pick it. This arrangement is very common in small stores.
In the part to man arrangement, the part comes to us. With the advent
of system-directed storage, and particularly when integrating with produc-
tion and distribution storage, part to man systems—such as horizontal and
vertical carousels, and Automated Storage and Retrieval System
(AS/RS)—have become viable. They may offer significant improvements
in parts storage efficiency.
Man to Part
Man to part storage systems are the mainstay of parts storage.
Initially cheaper than automated part to man storage systems, they
can provide dense part storage. Of the two types, man to part is easier to
manage manually. On the downside, this type of storage (by itself) does
not provide part check-in / check-out control and inventory tracking,
which can lead to lower inventory storage accuracy. The three major types
of man to part storage are described below:
Pallet Rack
The big brother to shelving and bin storage, pallet rack is
the second most common type of parts storage. It is used primarily for
parts that are too big or too heavy for shelving. Pallet rack storage has the
common advantage of having a low initial installation cost and virtually
no maintenance; it is very configurable. Negatives include a lower storage
utilization density than shelving or modular drawers. Furthermore, either
rack decking or actual pallets are required for storage on the rack beams.
Part to Man
Part to man storage systems are usually automated storage
devices that offer several advantages over standard man to part methods.
These advantages include controlled access that provides more part pro-
tection and security, check-in / check-out processes that aid in access
supervision and tracking, and ease of access to a greater vertical dimen-
sion. This last feature often results in more effective storage density per
square foot of floor space.
One of the major disadvantages of part to man systems is high initial
cost. Automated storage systems are more difficult to reconfigure than
more traditional storage methods; they also have an on-going mainte-
nance cost associated with their use. Regardless of configurability and
upkeep concerns, the high initial cost of these systems has been most
responsible for the relatively low numbers of automated storage equip-
ment in use for stores. With the advent of integrated systems, and the
resulting ability to combine stores with production and inventory stores,
this investment cost is not specific to maintenance and can be spread
across other department budgets. This ability to spread investment costs
across departments has resulted in an increase in the use of automated
storage systems and warrants their inclusion in any discussion of planned
storage. The three major types of part to man storage are described below:
Inventory Accuracy
Achieving a high level of inventory accuracy is a critical factor in the
success of storeroom operations. Inventory is accurate when the actual
type and quantity of parts in each storeroom location exactly match the
inventory records in the CMMS/EAM system. If a part, quantity, or
location is not correct when
matched against the system, then that location is counted as an error.
Some limited variance can be tolerated in the case of certain supplies such
as nuts and bolts, as they can be considered consumable items.
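The location-based definition above can be sketched as a short calculation. This is a minimal illustration only: the record layout, the names `StockRecord` and `inventory_accuracy`, the sample part numbers, and the optional `tolerance` parameter for consumables are all assumptions made for the example, not part of any CMMS/EAM product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StockRecord:
    """One storeroom location and what it holds (hypothetical layout)."""
    location: str
    part_number: str
    quantity: int


def inventory_accuracy(system, counted, tolerance=0):
    """Return the percentage of system locations confirmed by a physical count.

    A location is counted as an error if it is missing from the count,
    holds a different part, or its quantity differs by more than
    `tolerance` units (an assumed allowance for consumable items).
    """
    counted_by_location = {rec.location: rec for rec in counted}
    correct = 0
    for rec in system:
        found = counted_by_location.get(rec.location)
        if (
            found is not None
            and found.part_number == rec.part_number
            and abs(found.quantity - rec.quantity) <= tolerance
        ):
            correct += 1
    return 100.0 * correct / len(system)


# Illustrative data only: four locations, one with a quantity mismatch.
system_records = [
    StockRecord("A-01", "BRG-6204", 10),
    StockRecord("A-02", "SEAL-35", 4),
    StockRecord("B-01", "BELT-A42", 6),
    StockRecord("B-02", "FLT-100", 12),
]
cycle_count = [
    StockRecord("A-01", "BRG-6204", 10),
    StockRecord("A-02", "SEAL-35", 3),  # one seal short: counted as an error
    StockRecord("B-01", "BELT-A42", 6),
    StockRecord("B-02", "FLT-100", 12),
]

accuracy = inventory_accuracy(system_records, cycle_count)
print(f"Inventory accuracy: {accuracy:.1f}%")
```

With one of four locations in error, the sketch reports 75% accuracy; allowing a one-unit variance (`tolerance=1`) would treat the same count as fully accurate.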
Inventory accuracy is important for several reasons. The conse-
quences of inaccurate inventory are:
Figure 5.9 Kitting Bins
$$\text{EOQ} = \sqrt{\frac{2DS}{H}}$$
where
D = Demand / Usage in units per year
S = Ordering cost per order
H = Inventory carrying cost per unit per year
Annual Demand
Annual demand is the number of units of an item used per year; it is
also called annual usage.
Ordering Cost
Also known as purchase cost, this is the sum of the fixed costs that are
incurred each time an item is ordered. These costs are not associated with
the quantity ordered, but primarily with physical activities required to
process the order. This cost varies widely. We have found order costs
ranging from $20 to $200 per order, depending upon factors such as
organization size. Usually order cost includes the cost to enter the purchase
order or requisition, any approval steps, the cost to process the
receipt, incoming inspection, invoice processing, and vendor payment. In
some cases, a portion of the inbound freight may also be included in order
cost. These costs are associated with the frequency of the orders and not
the quantities ordered.
Carrying Cost
Also called holding cost, carrying cost is the cost associated with hav-
ing inventory on hand. It includes the cost of space to hold and service the
items. Usually this cost varies between 20% and 30% of the item’s value on an
annual basis.
Figure 5.10 graphically portrays the concept of a typical EOQ and
stocking levels. The illustration assumes constant demand (consumption,
failure rate) and a constant lead time. In practice, demand is not
always constant, and the reorder cycle often changes with time.
The next two examples demonstrate how EOQ and total inventory
costs may be computed.
Example #1
A plant buys lubricating oil in 55-gallon drums and its usage rate is an
average of 132 drums of oil in a year. What would be an optimal order
quantity and how many orders per year will be required? What would be
the additional total cost of ordering and holding these drums in store if
ordering cost is increased by $10/order? Plant data indicates that:
Solution:
The economic order quantity (EOQ) is:
Where
Annual usage D = 132
Ordering cost S = $60
Annual carrying cost H = 22% of item cost = 0.22 x $500 = $110
Total annual cost (TC) of ordering and holding oil drums in inventory
Now, if the cost of ordering is increased by $10 to $70 per order, the
new EOQ and total cost (TC) can be calculated as follows:
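The calculations for Example #1 can be sketched in a few lines of Python. This is an illustration under stated assumptions: it uses the standard EOQ relation, EOQ = sqrt(2DS/H), and takes total annual cost as ordering cost (D/Q × S) plus holding cost on an average inventory of Q/2 (Q/2 × H). The function names are hypothetical, and the EOQ is left unrounded, so totals differ slightly if the order quantity is rounded to whole drums.

```python
import math


def eoq(annual_demand, order_cost, carrying_cost):
    """Economic order quantity: sqrt(2DS / H)."""
    return math.sqrt(2 * annual_demand * order_cost / carrying_cost)


def total_annual_cost(order_qty, annual_demand, order_cost, carrying_cost):
    """Annual ordering cost (D/Q * S) plus annual holding cost (Q/2 * H)."""
    ordering = (annual_demand / order_qty) * order_cost
    holding = (order_qty / 2) * carrying_cost
    return ordering + holding


D = 132          # drums used per year
H = 0.22 * 500   # carrying cost: 22% of the $500 item cost = $110

results = {}
for S in (60, 70):  # ordering cost per order, before and after the increase
    q = eoq(D, S, H)
    cost = total_annual_cost(q, D, S, H)
    results[S] = (q, D / q, cost)
    print(f"S = ${S}: EOQ = {q:.2f} drums, "
          f"orders/year = {D / q:.2f}, total cost = ${cost:,.2f}")

extra_cost = results[70][2] - results[60][2]
print(f"Additional annual cost: ${extra_cost:,.2f}")
```

At $60 per order the sketch gives an EOQ of 12 drums, 11 orders per year, and a total annual cost of $1,320. At $70 per order the unrounded EOQ rises to about 13 drums and the total cost increases by roughly $106 per year.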
Example #2
A plant maintenance department consumes an average of 10 pairs of
safety gloves per day. The plant operates 300 days per year. The storage
and handling cost is $3 per pair and it costs $25 to process an order.
Solution:
a) Given,
D = Annual demand = 10 x 300 = 3000 pairs
S = Ordering cost = $25/order
H = Carrying cost = $3/pair
An order quantity of 200 or 250 would give us the same total cost.
Therefore, we could go with an EOQ of 200 gloves/order.
With increased carrying cost from $3.00 to $3.50 per glove, the new
EOQ is still 200, but the total cost of ordering and carrying safety gloves
would increase to $725 per year.
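The comparison in Example #2 can be checked by evaluating the total cost at a few candidate order quantities. The sketch below assumes the same cost model as before (ordering cost D/Q × S plus holding cost Q/2 × H); the candidate quantities and the function name are chosen for illustration only.

```python
def total_annual_cost(order_qty, annual_demand, order_cost, carrying_cost):
    """Annual ordering cost (D/Q * S) plus annual holding cost (Q/2 * H)."""
    return (annual_demand / order_qty) * order_cost + (order_qty / 2) * carrying_cost


D = 10 * 300                        # 10 pairs per day x 300 operating days
S = 25                              # ordering cost per order
candidates = (150, 200, 250, 300)   # illustrative order quantities

best_by_h = {}
for H in (3.00, 3.50):              # carrying cost per pair, before and after
    costs = {q: total_annual_cost(q, D, S, H) for q in candidates}
    best_qty = min(costs, key=costs.get)  # first lowest-cost candidate
    best_by_h[H] = (best_qty, costs[best_qty])
    print(f"H = ${H:.2f}: " +
          ", ".join(f"Q={q}: ${c:,.2f}" for q, c in costs.items()))
    print(f"  lowest-cost candidate: {best_qty} pairs at ${costs[best_qty]:,.2f}")
```

At $3.00 per pair, order quantities of 200 and 250 both cost $675 per year; at $3.50 per pair, 200 remains the cheapest candidate at $725 per year, in line with the figures in the text.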
New Technologies
New technology such as bar codes, Radio Frequency Identification
Device (RFID), and handheld data collectors similar to those used in
supermarkets or FedEx/ inventory systems could effectively help improve
productivity of storeroom operations. The introduction of bar coding, auto
ID (identification) systems, and now RFID technology into storerooms
has resulted in a significant contribution to storeroom productivity, inven-
tory accuracy, and error elimination. Use of this new technology in store-
rooms is a best practice.
Automated ID Technology
No discussion of parts storage would be complete without some dis-
cussion of automated ID technology. Although commonly used in distri-
bution operations for years, the use of automated ID in —tied mainly to
CMMS use—is only now beginning to increase. Whereas a stand-alone
maintenance system (particularly a smaller one) may function well with
manual entry and tracking of parts, integration with manufacturing and
distribution parts storage systems will almost certainly warrant the invest-
ment in and use of some form of automated ID technology. The discus-
sion below covers the two most common automated ID technologies: bar
coding and radio frequency identification (RFID).
the two is that RFID uses radio waves to read the tag data, whereas bar-
code readers use light waves (laser scanners).
Although still in early adoption, RFID offers several distinct advan-
tages over traditional bar coding, including:
• No line of sight required
• Dynamic tag read/write capability
• Simultaneous reading and identification of multiple tags
• Tolerance of harsh environments
5.7 Summary
Q5.4 How will you organize a store room? Discuss the key features
of a small store room you have been asked to design.
Q5.10 What is meant by shelf life? What should be done to improve it?
6.1 Introduction
6.2 Key Terms and Definitions
6.3 Defining and Measuring Reliability and Other Terms
6.4 Designing and Building for Maintenance and Reliability
6.5 Summary
6.6 Self Assessment Questions
6.7 References and Suggested Reading
6.1 Introduction
[Figure: cost versus reliability (availability), showing the total cost curve and its optimum level]
Availability (A)
The probability that an asset is capable of performing its intend-
ed function satisfactorily, when needed, in a stated environment.
Availability is a function of reliability and maintainability.
Failure
Failure is the inability of an asset / component to meet its
expected performance. It does not require the asset to be inoper-
able. The failure could also mean reduced speed, or not meeting
operational or quality requirements.
Failure Rate
The number of failures of an asset over a period of time. Failure
rate is considered constant over the useful life of an asset. It is
normally expressed as the number of failures per unit time.
Denoted by lambda (λ), failure rate is the inverse of Mean
Time Between Failures (MTBF).
Maintainability (M)
The ease and speed with which a maintenance activity can be
carried out on an asset. Maintainability is a function of equip-
ment design and usually is measured by MTTR.
Reliability (R)
The probability that an asset or item will perform its intended
functions for a specific period of time under stated conditions. It
is usually expressed as a percentage and measured by the mean
time between failures (MTBF).
Uptime
Uptime is the time during which an asset or system is either
fully operational or is ready to perform its intended function. It
is the opposite of downtime.
time by the number of failures. Suppose an asset was in operation for 2000
hours (or for 12 months) and during this period there were 10 failures. The
MTBF for this asset is:
$$\text{MTBF} = \frac{2000 \text{ hours}}{10 \text{ failures}} = 200 \text{ hours}$$
with minimal operating time). This type of trend line is essential for track-
ing impact of improvements. Figure 6.2a shows MTBF trend data, which
is increasing. This trend is a good one.
Figure 6.2b shows MTTR trend data, which is increasing. It is going
in the wrong direction. We need to evaluate why MTTR is increasing by
asking: Do we have the right set of skills in our work force? Do we iden-
tify and provide the right materials, tools, and work instructions? What
can we do to reverse the trend?
Figure 6.2c shows MTTR trend data, which is decreasing. In this case,
the trending is in the right direction. To continue this trend, we need to ask
the questions: What caused this to happen? What changes did we make?
Trending of this type of data can help to improve the decision process.
Availability
Availability (A) is a function of reliability and maintainability of the
asset. It is measured by the degree to which an item or asset is in an oper-
able and committable state at the start of the mission when the mission is
called at an unspecified (random) time.
In simple terms, the availability may be stated as the probability that
an asset will be in operating condition when needed. Mathematically, the
availability is defined:
$$\text{Availability } (A) = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}} = \frac{\text{Uptime}}{\text{Uptime} + \text{Downtime}}$$
tations. In some cases, if assets are not very critical, the standard may be
lower. But in case of critical assets such as aero-engines or assets involved
with 24-7 operations, the standard may require 99% or higher availability.
In general, the cost to achieve availability above 95% increases expo-
nentially. Therefore, we need to perform operational analysis to justify
high availability requirements, particularly if it’s over 97 percent.
Example 1
A hydraulic system, which supports a machining center, has operat-
ed 3600 hours in the last two years. The plant’s CMMS system indicated
that there were 12 failures during this period. What is the reliability of this
hydraulic system if it is required to operate for 20 hours or for 100 hours?
$$R(20) = e^{-(0.003334)(20)} = 93.55\%$$
ity of 71.65% is not acceptable. The system needs to have 95% or better
assurance (probability) to meet the customer’s need.
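To meet a reliability requirement of 95% for 100 hours of mission
time, we need to calculate a new failure rate, λ. We use the reliability
equation,
$$R(100) = e^{-\lambda \times 100} = 0.95$$
$$\lambda \times 100 = -\ln(0.95) \approx 0.05$$
Thus,
$$\lambda \approx 0.0005 \text{ failures per hour}$$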
This indicates that the failure rate needs to drop from 0.003334
(or an MTBF of 300 hours) to a new failure rate of 0.0005 (or an MTBF
of 2000 hours). If we consider the same 3600 operating hours, then the
number of failures needs to be reduced from 12 to 1.8. A root cause fail-
ure or FMEA analysis needs to be performed on this hydraulic system to
identify unreliable components. Some components may need to be re-
designed or replaced to achieve the new MTBF of 2000 hours.
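The hydraulic system calculation can be sketched as follows, assuming a constant failure rate so that R(t) = e^(−λt). The function names are illustrative. Note that the sketch uses the exact value of −ln(0.95) rather than the rounded 0.05 used in the text, so the required MTBF comes out slightly below 2000 hours.

```python
import math


def reliability(failure_rate, mission_hours):
    """Probability of surviving the mission: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate * mission_hours)


def required_failure_rate(target_reliability, mission_hours):
    """Failure rate needed to reach a target reliability over the mission."""
    return -math.log(target_reliability) / mission_hours


operating_hours = 3600
failures = 12

mtbf = operating_hours / failures   # 300 hours
failure_rate = 1 / mtbf             # about 0.00333 failures per hour

r20 = reliability(failure_rate, 20)
r100 = reliability(failure_rate, 100)
print(f"MTBF = {mtbf:.0f} h, lambda = {failure_rate:.6f} per hour")
print(f"R(20) = {r20:.2%}, R(100) = {r100:.2%}")

# Failure rate needed for 95% reliability over a 100-hour mission.
lam_req = required_failure_rate(0.95, 100)
mtbf_req = 1 / lam_req
allowed_failures = lam_req * operating_hours
print(f"Required lambda = {lam_req:.6f} per hour (MTBF about {mtbf_req:.0f} h)")
print(f"Allowed failures in {operating_hours} h: {allowed_failures:.1f}")
```

The exact solution gives a required failure rate of about 0.00051 per hour (an MTBF near 1950 hours), which the text rounds to 0.0005 and 2000 hours.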
Example 2
A plant’s air compressor system operated for 1000 hours last year. The
plant’s CMMS system provided the following data on this system:
Figure 6.6 Compressor Failure and Repair Time Data
and took 6 hours to repair; and so forth. The total repair time for 10 fail-
ures is 50 hours.
Calculating Availability
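Earlier, we calculated,
$$\text{MTBF} = \frac{1000}{10} = 100 \text{ hours} \qquad \text{MTTR} = \frac{50}{10} = 5 \text{ hours}$$
Then,
$$A = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}} = \frac{100}{100 + 5} = 0.952$$
or
$$A = \frac{\text{Uptime}}{\text{Uptime} + \text{Downtime}} = \frac{1000}{1000 + 50} = 0.952$$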
This means that the asset is available 95% of the time and is down for 5%
of the time for repair.
Measuring and Designing for Reliability and Maintainability 167
Calculating Reliability
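As calculated earlier for the compressor unit, MTBF = 100 hours, so the
failure rate is λ = 1/MTBF = 0.01 failures per hour and
$$R(t) = e^{-0.01\,t}$$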
This data indicates that the reliability of the air compressor unit in this
example is 90% for 10 hours of operation. However, reliability drops to
37% if we decide to operate the unit for 100 hours. For 20 hours of oper-
ation, reliability is 82%. If this level of reliability is not acceptable, then
we need to perform root cause failure or FMEA analysis to determine
what component needs to be redesigned or changed to reduce the number
of failures, thereby increasing reliability.
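The compressor figures can be reproduced with a short script. It is a sketch only, assuming the totals quoted in the example (1000 operating hours, 10 failures, 50 hours of total repair time) and a constant failure rate; the variable names are illustrative.

```python
import math

operating_hours = 1000      # hours the compressor ran last year
failures = 10               # failures recorded in the CMMS
total_repair_hours = 50     # total repair time for those failures

mtbf = operating_hours / failures        # mean time between failures
mttr = total_repair_hours / failures     # mean time to repair
availability = mtbf / (mtbf + mttr)
failure_rate = 1 / mtbf                  # lambda, failures per hour

print(f"MTBF = {mtbf:.0f} h, MTTR = {mttr:.0f} h")
print(f"Availability = {availability:.1%}")

# Reliability for several mission lengths: R(t) = exp(-lambda * t)
reliability_by_hours = {
    t: math.exp(-failure_rate * t) for t in (10, 20, 100)
}
for hours, r in reliability_by_hours.items():
    print(f"R({hours} h) = {r:.0%}")
```

The output shows why availability and reliability answer different questions: the same unit is available about 95% of the time, yet has only about a 37% chance of running 100 hours without a failure.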
(RBD). This diagram shows logical connections among the system’s com-
ponents and assets. The RBD is not necessarily the same as a schematic
diagram of the system’s functional layout. The system is usually made of
several components and assets which may be in series, parallel, or combi-
nation configurations to provide us the designed (inherent) reliability. The
RBD analysis consists of reducing the system to simple series and paral-
lel component and asset blocks which can be analyzed using the mathe-
matical formulas.
Figure 6.7 shows a simple diagram, using two independent compo-
nents and assets to form a system in series.
$$R_{sys} = R_1 \times R_2 \times R_3 \times R_4 \times \cdots \times R_n$$
And the reliability of system $R_{sys12}$ as shown in Figure 6.7
$$R_{sys12} = R_1 \times R_2$$
or
$$R_{sys12} = e^{-(\lambda_1 + \lambda_2)t}$$
Figure 6.8 An Example of a Parallel System
$$R_{sys34} = 1 - (1 - R_3)(1 - R_4)$$
or
$$R_{sys34} = R_3 + R_4 - (R_3 \times R_4)$$
$$R_{sys\text{-}standby} = e^{-\lambda t} + \lambda t\, e^{-\lambda t}$$
Example 3
In a two-component parallel system in which each component has a
failure rate of 0.1 per hour, what would be the active and standby
reliability of the system for one hour of operation?
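$$R_{active} = R_3 + R_4 - (R_3 R_4)$$
Note that
$$R_3 = R_4 = e^{-\lambda_3 t} = e^{-(0.1)(1)} = 0.9048$$
and therefore
$$R_{active} = 0.9048 + 0.9048 - (0.9048 \times 0.9048) = 0.9909$$
$$R_{standby} = e^{-\lambda t} + \lambda t\, e^{-\lambda t} = R + \lambda t R$$
$$= 0.9048 + (0.1 \times 1 \times 0.9048) = 0.9953$$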
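The series, active-parallel, and standby formulas can be compared side by side in code. The sketch below assumes independent components with constant failure rates, and, for the standby case, two identical units with perfect switching, as the standby formula implies. The function names are illustrative.

```python
import math


def component_reliability(failure_rate, hours):
    """R = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate * hours)


def series_reliability(reliabilities):
    """All components must work: product of the component reliabilities."""
    return math.prod(reliabilities)


def active_parallel_reliability(reliabilities):
    """At least one component must work: 1 minus the product of (1 - R)."""
    return 1 - math.prod(1 - r for r in reliabilities)


def standby_reliability(failure_rate, hours):
    """Two identical units, one on standby: exp(-lt) + l*t*exp(-lt)."""
    r = component_reliability(failure_rate, hours)
    return r + failure_rate * hours * r


lam, t = 0.1, 1.0                       # Example 3: 0.1 failures/hour, 1 hour
r = component_reliability(lam, t)

r_series = series_reliability([r, r])
r_active = active_parallel_reliability([r, r])
r_standby = standby_reliability(lam, t)

print(f"Single component:  {r:.4f}")
print(f"Series (both):     {r_series:.4f}")
print(f"Active parallel:   {r_active:.4f}")
print(f"Standby parallel:  {r_standby:.4f}")
```

For these inputs the standby arrangement (about 0.9953) is slightly more reliable than active redundancy (about 0.9909), because the standby unit accumulates no operating time until it is needed.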
$$R_{sys} = R_1 \times R_2 \times R_B \times R_{10} \times R_C$$
m - # of working elements    Overall System Reliability
1 out of 2                  $R^2 + 2R(1-R) = 1 - (1-R)^2$
2 out of 2                  $R^2$
= (0.512) + (0.384)
= 0.896
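The m-out-of-n expressions in the table are special cases of a binomial sum, which can be sketched as below. The function name is hypothetical, and identical, independent elements are assumed. The 0.512 + 0.384 = 0.896 figures above are consistent with a 2-out-of-3 arrangement at R = 0.8; that reading is an assumption, since the setup for that calculation is not shown here.

```python
from math import comb


def m_out_of_n_reliability(m, n, r):
    """Probability that at least m of n identical, independent elements work.

    Sums the binomial terms C(n, i) * r^i * (1 - r)^(n - i) for i = m..n.
    """
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(m, n + 1))


r = 0.9
one_of_two = m_out_of_n_reliability(1, 2, r)   # equals 1 - (1 - R)^2
two_of_two = m_out_of_n_reliability(2, 2, r)   # equals R^2
print(f"1 out of 2 at R = {r}: {one_of_two:.4f}")
print(f"2 out of 2 at R = {r}: {two_of_two:.4f}")

# Assumed reading of the 0.512 + 0.384 calculation: 2 out of 3 at R = 0.8
two_of_three = m_out_of_n_reliability(2, 3, 0.8)
print(f"2 out of 3 at R = 0.8: {two_of_three:.3f}")
```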
The whole system reliability
Example 4
Figure 6.11 shows a compressor drive system consisting of seven components:
motors, gear boxes, the compressor itself, electrical controls, and a lube
oil system, with failure rates based on four years of data. All components are
assumed to be in series arrangement. Figure 6.12 shows the same compressor
system in RBD format.
The total system failure rate based on the last 4 years of data,
Therefore,
$$R_{sys} = R_1 \times R_2 \times R_3 \times R_4 \times R_5 \times R_6 \times R_7$$
$$R_{X1} = 0.9772 \qquad R_{X2} = 0.8545$$
$$\begin{aligned}
R_{X1} + R_{X2} - (R_{X1} \times R_{X2}) &= 0.9772 + 0.8545 - (0.9772 \times 0.8545) \\
&= 1.8317 - 0.8350 \\
&= 0.9967 \text{ or } 99.67\%
\end{aligned}$$
Figure 6.14 Example of a Reliability Block Diagram at Plant Systems
$$R_{X1} \times R_{X2} = 0.9772 \times 0.8545 = 0.8350$$
So, when only one compressor unit is needed, the system is 99.67%
reliable. However, when both compressors are needed, it is only 83.50%
reliable to meet the customer’s needs. This level may be acceptable. If not,
we may need to redesign or replace some of the components in compres-
sor X2 to make it more reliable.
Similarly, a reliability block diagram for a process, a manufacturing
line, or a plant could be developed, as shown in Figure 6.14. This type of
reliability block diagram can provide the information needed to improve
the reliability of systems in the plant.
• Acquisition Cost
• Design and Development
• Demonstration and Validation (mostly applicable to one-of-a-
kind, unique systems)
• Build and Installation (including commissioning)
• Disposal
For a Typical
DoD System* Industrial
Based on the calculations and data above, we can specify the following
requirements for this new system we are procuring.
• Reliability Analysis
    • Lowers asset and system failures over the long term
    • System reliability depends on robustness of design, as well as
      quality and reliability of its components
• Maintainability Analysis
    • Minimizes downtime and reduces repair time
    • Reduces maintenance costs
• Logistics Analysis
    • Reduces field support cost resulting from poor quality,
      reliability, maintainability, and safety
    • Ensures availability of all documentation, including PM plan,
      spares, and training needs
6.5 Summary
incurred in the future are set during the design and development phase of
the asset. Therefore, we must adequately address the reliability,
maintainability, and safety aspects of the system during the design and
building of the assets in order to reduce their overall life cycle cost.
Q6.9 What is the impact of O&M cost on the total life cycle cost of
an asset?
7.1 Introduction
7.2 Key Terms and Definitions
7.3 The Role of Operations
7.4 Total Productive Maintenance (TPM)
7.5 Workplace Organization: 5S
7.6 Overall Equipment Effectiveness (OEE)
7.7 Measures of Performance
7.8 Summary
7.9 Self Assessment Questions
7.10 References and Suggested Reading
7.1 Introduction
Five S (5S)
5S is a structured program to achieve organization-wide clean-
liness and standardization in the workplace. A well-organized
workplace results in a safer, more efficient, and productive
operation. It consists of five elements: Sort, Set in Order,
Shine, Standardize, and Sustain.
Operator Driven Reliability 191
Utilization Rate
The percentage of time an asset is scheduled to operate divided
by the total available time (which could be 24 hours a day, 365
days a year, etc.).
Visual Workplace
A visual workplace uses visual displays to relay information to
employees and guide their actions. The workplace is set up with
signs, labels, color-coded markings, etc., so that anyone unfamiliar
with the assets or process can readily identify what is going on,
understand the process, and know both what is being done correct-
ly and what is out of place.
Figure 7.1 lists Japanese words and definitions that have a special indus-
trial meaning. Some of these words are becoming part of our routine work-
place terminology.
Right from the start, TPM requires effective leadership and involve-
Benefits of TPM
1. Increased productivity
2. Reduced manufacturing cost
3. Reduced customer complaints
4. Customer needs satisfied 100% by:
• Delivering the right quantity
• At the right time
• With the best required quality
5. Reduced safety incidents and environmental concerns
The employees are empowered and gain a real sense of owning the
assets they operate.
TPM Pillars
TPM consists of eight pillars of activities that impact all areas of the
organization. These pillars are:
The following are six major losses that can become a focus of kaizen
teams to improve effectiveness:
1. Breakdown losses
2. Setup and adjustment losses
3. Idling and minor stoppage losses
4. Speed losses
5. Defective product losses (quality) and rework
6. Equipment design losses
5–6 minutes). They are usually difficult to record. As a result, these loss-
es are usually hidden from production reports. These are built into
machine capabilities, but provide substantial opportunities for improving
production efficiencies.
Speed losses
Sometimes equipment must be slowed down to prevent
quality defects or minor stoppages, resulting in production losses. In most
cases, this loss is not recorded because the equipment continues to oper-
ate.
Quality defect losses
These losses result from out-of-spec production
and defects due to equipment malfunction or poor performance, lead-
ing to output which must be reworked or scrapped as waste.
Equipment design losses
These losses are typical of heavy wear and
tear on equipment due to “non-robust” design, which reduces their
durable and productive life span. Such designs lead to more frequent
equipment modifications and capital improvements.
By using a detailed and thorough analysis, equipment design losses
are reduced or eliminated in a systematic manner using tools such as
Pareto, 5-Why Analysis, and Failure Modes and Effects Analysis
(FMEA). The use of such tools is not limited to production areas, but can
be employed in administrative and service areas as well to eliminate loss-
es or waste. These and other improvement tools are discussed in more
detail in Chapter 11.
• Zero accidents
• Zero health concerns (damage)
• Zero environmental incidents
Implementing TPM
Many successful organizations follow an implementation
plan that includes the following 10 steps:
S1 Sort (Seiri)
S2 Set-in-Order (Seiton)
S3 Shine (Seiso)
S4 Standardize (Seiketsu)
S5 Sustain (Shitsuke)
S1 Sort (Seiri)
Sort is the first step in making a work area tidy. It refers to the act of
throwing away all unwanted, unnecessary, and unrelated materials in the
workplace and freeing up additional space. This step makes it easier for
operators and maintainers to find the things they need. This step requires
keeping only what is necessary. Materials, tools, equipment, and supplies
that are not frequently used should be moved to a separate, common-stor-
age area. Items that are not used should be discarded. Don’t keep things
around just because they might be used someday.
As a result of the sorting process, we will eliminate (or repair) broken
equipment and tools. Obsolete fixtures, molds, jigs, scrap material, waste,
and other unused items and materials are discarded.
People involved in Sort must not feel sorry about having to throw
away things. The idea is to ensure that everything left in the workplace is
related to work. Even the number of necessary items in the workplace
must be kept to its absolute minimum. Because of the Sort concept, the
simplification of tasks, effective use of space, and careful purchase of
items will follow.
S2 Set-in-Order (Seiton)
Set-in-order (also sometimes known as Straighten) or orderliness is
the second step and is all about efficiency. It requires organizing, arrang-
ing, and identifying everything in a work area. Everything is given an
assigned place so that it can be accessed or retrieved quickly, as well as
returned to that same place quickly. If everyone has quick access to spe-
cific items or materials, work flow becomes efficient, and the worker
becomes productive. The correct place, position, or holder for every tool,
item, or material must be chosen carefully in relation to how the work will
be performed and who will use which items. Every single item must be
allocated its own place for safekeeping. Each location must be labeled for
easy identification of its purpose.
Commonly used tools should be readily available. Properly label storage
areas, cabinets, and shelves. Clean and paint floors to make it easier
to spot dirt, waste materials, and dropped parts and tools. Outline areas on
the floor to identify work areas, movement lanes, storage areas, finished
product areas, etc. Put shadows on tool boards, making it easy to quickly
see where each tool belongs.
In an office environment, provide bookshelves for frequently used
manuals, books, and catalogs. Label the shelves and books so that they are
easy to identify and return to their proper place.
Again, the objective in this step is to have a place for everything and
everything in its place, with everything properly identified and labeled.
Many M&R professionals have started calling these practices of using
labels, color-coded markings, etc., a visual workplace. This practice helps
operators and anyone unfamiliar with the asset or process to readily iden-
tify what is going on, understand the process, and know what is to be done
correctly and what is out of place. A visual workplace uses visual displays
to relay information to operators and other employees, and to guide their
actions.
Figure 7.2a indicates safe operating parameters for oil and pressure
levels and also when to tighten the chain or belt. Figure 7.2b displays an
organized tool box and provides examples of labels applied to a switch
box and a Danger Area. Figure 7.2c demonstrates the ASME standard
color code scheme suggested for piping. Figure 7.2d shows examples of
color-coded pipes and hoses.
Figure 7.2a Safe oil and pressure levels and chain tightening
S3 Shine (Seiso)
Shine is all about cleanliness and housekeeping. The Seiso principle
says that everyone is a janitor. The step consists of cleaning up the work-
place and giving it a shine. Cleaning must be done by everyone in the
organization, from operators to managers. It would be a good idea to have
every area of the workplace assigned to a person or group of persons for
cleaning. Everyone should see the workplace through the eyes of a visitor
— always wondering if it is clean enough to make a good impression.
While cleaning, it’s easy to inspect the equipment, machines, tools,
and supplies we work with. Regular cleaning and inspection makes it easy
to spot lubricant leaks, equipment misalignment, breakage, missing tools,
and low levels of supplies. Problems can be identified and fixed when
they are small. If these minor problems are not addressed while small,
they could lead to equipment failure, unplanned outages, or long, unpro-
ductive waits while new supplies are delivered.
When done on a regular, frequent basis, cleaning and inspecting gen-
erally will not take a lot of time. In the long run, they will most likely save
time.
S4 Standardize (Seiketsu)
The fourth step is to simplify and standardize. Seiketsu translates to
standards for all operational activities, including cleanliness. It consists of
S5 Sustain (Shitsuke)
The final step is to sustain the gain by continuing education, training,
and maintaining the standards. In fact, Shitsuke means discipline. It pro-
motes commitment to maintaining orderliness and to practicing the first
four steps as a way of life. The emphasis of Shitsuke is elimination of bad
habits and constant practice of good ones.
Continue to educate people about maintaining standards. When there
are changes such as new equipment, new products, and new work rules
that will affect the 5S program, adjustments will be needed to accommodate
those changes, to update the standards, and to provide training that
addresses those changes.
If your organization is planning to implement Lean Manufacturing,
5S is one of the first activities that need to be carried out once Lean has
been adopted.
Some organizations have added a sixth S to emphasize safety in their
program, calling the program 5S Plus or 6S.
(TEEP), which is based on 24 hours per day and 365 days per year oper-
ations. TEEP also considers equipment utilization.
OEE and TEEP measure the overall utilization of assets and equip-
ment for manufacturing operations, directly indicating the gap between
actual and ideal performance. OEE quantifies how well a manufacturing
unit performs relative to its designed capacity, during the periods when it
is scheduled to run. TEEP measures how well an organization creates
value from its assets by effective utilization based on 24 hours per day,
365 days per year availability.
OEE and TEEP are calculated as:
OEE = Availability X Performance X Quality
TEEP = Utilization X OEE
OEE breaks the performance of an asset into three separate but meas-
urable elements: availability, performance, and quality. Each element
points to an aspect of the process that can be targeted for improvement.
OEE may be applied to any individual asset or to a process. It is unlikely
that any manufacturing process can run at 100% OEE. Many manufactur-
ers benchmark their industry to set a challenging target; 85% is not
uncommon.
Figure 7.3 illustrates the concept of OEE and TEEP and how different
production losses impact productivity.
Calculating OEE
Example 7.1
A given asset, a machining center, experiences the following:
Availability of asset = 88.0%
Asset Performance = 93.0%
Quality it produces = 95.0%
OEE = 88% (Availability) X 93% (Performance) X 95%
(Quality) = 77.7%
Calculating TEEP
TEEP = Utilization X Availability X Performance X Quality
Example 7.2
Whereas OEE measures effectiveness based on scheduled hours,
TEEP measures effectiveness against 24 hours per day, 365 days per
year operation. In the example above, suppose this same asset — the
machining center — operates 20 hours a day, 300 days in a year.
OEE of machining center (calculated above) = 77.7%
Machining Center Utilization
= (20 hours X 300 days) / (24 hours X 365 days) = 68.5%
TEEP = 68.5% (Utilization) X 77.7% (OEE) = 53.2%
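Examples 7.1 and 7.2 can be reproduced with a few helper functions. This is a minimal sketch; the function names are illustrative, and the calendar basis of 24 hours × 365 days follows the TEEP definition in the text.

```python
def oee(availability, performance, quality):
    """Overall Equipment Effectiveness as a fraction (0 to 1)."""
    return availability * performance * quality


def utilization(scheduled_hours, calendar_hours=24 * 365):
    """Scheduled operating time as a fraction of total calendar time."""
    return scheduled_hours / calendar_hours


def teep(utilization_rate, oee_value):
    """Total Effective Equipment Performance = Utilization x OEE."""
    return utilization_rate * oee_value


# Example 7.1: machining center
oee_value = oee(availability=0.88, performance=0.93, quality=0.95)

# Example 7.2: the same asset runs 20 hours a day, 300 days a year
util = utilization(scheduled_hours=20 * 300)
teep_value = teep(util, oee_value)

print(f"OEE         = {oee_value:.1%}")
print(f"Utilization = {util:.1%}")
print(f"TEEP        = {teep_value:.1%}")
```

Because the sketch carries unrounded intermediate values, its TEEP may differ from the text's 53.2% in the last digit; the text multiplies the already rounded 68.5% and 77.7%.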
Example 7.3
The CMMS and operational log of a six-station hammer assembly machine
show the following operational data.
Scheduled Downtime:
5 PMs, each at 1000 operating hours, each requiring 2 people at 8
hours each of downtime (16 man hours per PM)
Unscheduled Downtime:
8 failures resulting in 50 hours of downtime (132 man hours of failure
repair work)
22 setups and tooling changes resulting in 20 hours of downtime
Performance Losses:
Minor stoppages / jams (less than 5 minutes each): 750 instances per
year, averaging 3.2 minutes each
During winter (about 120 days in a year), the system runs slower in
the morning for 30 minutes. This increases the machine cycle time
from 1 minute per unit to 1.5 minutes per unit during this period.
Quality Losses:
On average, every hour the assembly unit produces 57 good quality
units, 2 units needing some repair, and 1 unit scrapped.
Asset Utilization
Ideally, total hours available for production
= 365 days X 24 hours/day = 8,760 hours/year
Idle hours = Hours the asset doesn’t run due to lack of demand or
factors beyond the control of asset/plant.
Asset Availability
Asset availability (%) = Uptime x 100 / (Uptime + Downtime)
Asset Performance
Asset performance – Efficiency in %
= Actual production rate / Designed (best) Production rate x 100
Performance losses
Minor stoppages (Hours/year) = 750 X 3.2 min = 2,400 min = 40 hours
Speed losses (machine running slow) = 120 x 0.5 x [(1.5 - 1) / 1.5] = 20 hours
Performance Efficiency %
= (Uptime hours – Performance losses) / Uptime hours
= (4580-60) / 4580 = 98.7%
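The performance and quality portions of Example 7.3 can be sketched as follows. The uptime of 4580 hours is taken from the calculation above; the variable names are illustrative. The quality figure assumes that only first-pass good units (57 of 60 per hour) count as good, following the definition of quality as good units over total units.

```python
uptime_hours = 4580  # uptime used in the performance-efficiency calculation

# Minor stoppages: 750 instances per year at 3.2 minutes each
minor_stoppage_hours = 750 * 3.2 / 60

# Winter slow running: 120 days x 0.5 hour/day, with cycle time stretched
# from 1.0 to 1.5 minutes per unit, so one third of that time is lost output
winter_days = 120
slow_hours_per_day = 0.5
normal_cycle_min, slow_cycle_min = 1.0, 1.5
speed_loss_hours = (
    winter_days * slow_hours_per_day
    * (slow_cycle_min - normal_cycle_min) / slow_cycle_min
)

performance_loss_hours = minor_stoppage_hours + speed_loss_hours
performance_efficiency = (uptime_hours - performance_loss_hours) / uptime_hours

# Quality: good units as a share of all units produced per hour
good, repaired, scrapped = 57, 2, 1
quality = good / (good + repaired + scrapped)

print(f"Minor stoppages: {minor_stoppage_hours:.0f} h")
print(f"Speed losses:    {speed_loss_hours:.0f} h")
print(f"Performance efficiency: {performance_efficiency:.1%}")
print(f"Quality (first pass):   {quality:.1%}")
```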
Quality Losses
The quality portion of the OEE represents the good units produced
as a percentage of the total units. The quality performance is a pure
measurement of process yield that is designed to exclude the effects
of availability and performance.
• 5 S audit results
• % of assets covered by 5 S plus principles
• Asset — area cleanliness / housekeeping
• Asset condition — visual inspection
• Color coded labels of piping, hoses, valves, etc.
• Check lists — PMs instructions attached to the asset
• Required tools properly placed and labeled
7.8 Summary
1. Autonomous maintenance
2. Focused improvement — Kaizen
3. Planned maintenance
4. Quality maintenance
5. Training and development
6. Design and early equipment management
7. Office improvement
8. Safety, health, and environment
• Safer workplace
• Employee empowerment and improved morale
• Increased production / output
• No or minimum defects
• No or minimum breakdowns
• No or fewer short stoppages
• Decreased waste
• Decreased O&M costs
Q7.2 Define TPM. What are TPM’s various elements — the pillars
of TPM?
8.1 Introduction
8.2 Key Terms and Definitions
8.3 Understanding Failures and Maintenance Strategies
8.4 Maintenance Strategy — RCM
8.5 Maintenance Strategy — CBM
8.6 Other Maintenance Strategies
8.7 Summary
8.8 Self Assessment Questions
8.9 References and Suggested Reading
• What is a failure?
• What is RCM?
• What does it take to implement RCM effectively?
• What CBM technologies are available?
• What are the different maintenance strategies?
• How can you integrate PM and CBM into RCM methodology?
• When would RTF be a good maintenance strategy?
220 Chapter 8
8.1 Introduction
Maintenance has entered the heart of many organizational activities
due to its vital role in the areas of environment preservation, productivity,
quality, system reliability, regulatory compliance, safety, and profitability.
With this new paradigm, new challenges and opportunities are being pre-
sented to maintenance and operations professionals. Central to mainte-
nance is a process called Reliability Centered Maintenance, or RCM.
RCM helps determine how assets can continue to do what their users
require in certain operating contexts. RCM analysis provides a structured
framework for analyzing the functions and potential failures of assets such
as airplanes, manufacturing lines, compressors or turbines, telecommuni-
cation systems, etc. RCM was developed in the commercial aviation
industry in the late 1960s to optimize maintenance and operations activities.
The RCM strategy (some call it a process) can help in developing an
effective maintenance plan by selecting appropriate strategies such as PM,
CBM, or RTF.
Preventive maintenance (PM) is the planned maintenance of assets
designed to improve asset life and avoid unscheduled maintenance activ-
ity. PM includes cleaning, adjusting, and lubricating, as well as minor
component replacement, to extend the life of assets and facilities.
Condition-based maintenance (CBM) is another maintenance opti-
mizing strategy. CBM attempts to evaluate the condition of assets by per-
forming periodic or continuous condition monitoring. The ultimate goal
of CBM is to perform maintenance at a scheduled point in time when the
maintenance activity is most cost-effective and before the asset loses opti-
mum performance.
Recent developments in technology have allowed assets to be instrumented
to provide information regarding their health. With better
tools for analyzing condition data, today’s maintenance personnel are better
able to decide the right time to perform maintenance on assets. Ideally,
CBM allows maintenance personnel to do the right things — minimizing
asset downtime, time spent on maintenance, and spare parts cost. CBM
uses real-time data to prioritize and optimize resources.
Although many would not consider it to be a maintenance-optimizing
strategy, Run-to-Failure (RTF) can be a viable and economical choice for
certain equipment. Selecting RTF should be a deliberate choice because it
will lead to unplanned downtime and increased corrective maintenance
cost for the specific equipment selected for this maintenance strategy.
However, if the facility and personnel risk is low, RTF may be the most
cost-effective strategy for an organization’s overall maintenance program.
Maintenance Optimization 221
Age Exploration
An iterative process used to optimize preventive maintenance
(PM) intervals.
Critical Asset
Assets that have been evaluated and classified as critical due to
their potential impact on safety, environment, quality, produc-
tion/operations, and maintenance if failed.
Emissivity
A fundamental property of a material, emissivity is the ratio of
the rate of radiant energy emission at a given wavelength from
a body with an optically smooth surface, as a consequence of its
temperature only, to the corresponding rate of emission from a
black body at the same temperature and wavelength.
Failure
Failure is the inability of an asset / component to meet its
expected performance.
Failure Cause
The reason something went wrong.
Failure Mode
An event that causes a functional failure; the manner of failure.
Ferrography
An analytical method of assessing machine health by quantify-
ing and examining ferrous wear particles suspended in the
lubricant or hydraulic fluid.
Functional Failure
A state in which an asset / system is unable to perform a specif-
ic function to a level of performance that is acceptable to its
user.
Hidden Failure
A failure mode that will not become evident to a person or the
operating crew under normal circumstances.
Operating Context
The environment in which an asset is expected to be used.
P–F Interval
The interval between the point at which a potential failure
becomes detectable and the point at which it degrades into a
functional failure. It is also sometimes called lead time to failure.
Potential Failure
A condition that indicates a functional failure is either about to
occur or in the process of occurring.
Prognosis
A forecast or prediction of an outcome, such as how long an asset
or component will last (its remaining life).
Run-to-Failure (RTF)
A maintenance strategy (policy) for assets where the cost and
impact of failure are less than the cost of preventive actions. It is
a deliberate decision based on economic effectiveness.
Viscosity
Measurement of a fluid’s resistance to flow. It is also often
referred to as the structural strength of liquid. Viscosity is criti-
cal to oil film control and is a key indicator of condition related
to the oil and the machine.
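The Run-to-Failure definition above rests on a cost comparison. A minimal sketch of that comparison follows, assuming annualized figures. The function name, the inputs, and the dollar values are hypothetical, and the sketch deliberately ignores safety and environmental risk, which must be low before RTF is considered at all.

```python
def rtf_is_economical(failures_per_year, cost_per_failure, annual_preventive_cost):
    """True when the expected yearly cost of failing is below the yearly
    cost of preventive actions (a simplified, assumed decision rule)."""
    expected_failure_cost = failures_per_year * cost_per_failure
    return expected_failure_cost < annual_preventive_cost


# Hypothetical small fan motor: fails once every two years, $400 per failure,
# versus $1,200 per year of preventive tasks.
print(rtf_is_economical(0.5, 400, 1200))    # True
# Hypothetical critical pump: two failures per year at $5,000 each.
print(rtf_is_economical(2, 5000, 1200))     # False
```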
(time) location. Our discussion assumes that these points are fixed in time,
yet this is not the case in practice. They may vary based on the nature of
the defects and the environment. Our goal is to catch any defects before
they shut us down.
The best strategy is to find a defect or any abnormal condition in
Zone B as soon as possible, utilizing condition-based tasks. RBM and/or
PM can be used to identify the sources of these defects and correct them
in their early stages.
Traditional thinking has been that the goal of preventive mainte-
nance (PM) is to preserve assets. On the surface, it makes sense, but the
problem is in that mindset. In fact, that thinking has been proven to be
flawed at its core. The blind quest to preserve assets has produced many
problems, such as being overly conservative with any maintenance
actions that could cause damage due to intrusive actions, thereby increas-
ing the chances of human error. Other flaws include both thinking that all
failures are equal and performing maintenance simply because there is an
opportunity to do so.
In the last few decades many initiatives have been developed in
cost reduction, resource optimization, and bottom line focus of any action
we take. The mentality of preserving assets quickly consumed resources,
put maintenance plans behind schedule, and overwhelmed the most expe-
rienced maintenance personnel. Worse, this mentality sometimes caused
maintenance actions to become totally reactive.
The development of a Reliability Centered Maintenance approach
has provided a fresh perspective in which the purpose of maintenance is
not to preserve assets for the sake of the assets themselves, but rather to
preserve asset functions. At first, this might be a difficult concept to accept
because it is contrary to our ingrained mindset that the sole purpose of
preventive maintenance is preserving equipment operation. But in fact, in
order to develop an effective maintenance strategy, we need to know what
the expected output is and the functions that the asset supports — that is,
the real purpose of having the asset.
and to Air Force F-4J aircraft under a contract with the U.S. Department
of Defense (DOD). In 1975, DOD directed that the MSG concept be
labeled Reliability-Centered Maintenance (RCM) and be applied to all
major military systems. In 1978, United Airlines produced the initial
RCM “bible” under DOD contract.
RCM development has been an evolutionary process. Over 40 years
have passed since its inception during which RCM has become a mature
process. However, industry has yet to fully embrace the RCM methodol-
ogy in spite of its proven track record. In recent years, Anthony (Mac)
Smith and Jack Nicholas have been leaders in creating increased RCM
awareness. Examples discussed in this section are the result of work performed
by Mac Smith, Glenn Hinchcliffe, and the author in optimizing PMs
utilizing RCM methodology.
Effective means that we are sure that this task will be useful and we
are willing to spend resources to do it. In addition, RCM recognizes the
following:
Elements of RCM
The SAE JA1011 standard describes the minimum criteria to which a
process must comply to be called RCM. An RCM Process answers the fol-
RCM shifts the emphasis of maintenance from the idea that all fail-
ures are bad and must be prevented, to a broad understanding of the pur-
pose of maintenance. It seeks the most effective strategy that focuses on
the performance of the organization. It might include not doing something
about a failure or letting failures happen. The RCM approach encourages
us to think of more encompassing ways of managing failures.
• System description
• Functional block diagram
• IN / OUT interfaces
• System work breakdown structure
• Equipment / component history
a selected system and adjacent systems. Figure 8.4 illustrates an FBD with
functional interfaces including sub-systems.
In an actual team setting, it is desirable to have a discussion first
regarding various possibilities that should be considered in creating func-
tional sub-systems. When the FBD is finalized, it will show a decision on
the use of functional subsystems as well as the final representation of the
IN / OUT interfaces.
Listing all components as part of the System Work Breakdown
Structure (SWBS) is very desirable. The SWBS is the compilation of the
line items list for the system. SWBS is a system hierarchy listing parent-
child relationships. In most cases, the SWBS should be what’s in the
CMMS for the system being analyzed. In older plants, where the reference
sources could be out of date, the RCM team should perform a system walk
down to assure accuracy in the final SWBS. This practice is a good one,
even if the system is well documented, to help the team familiarize itself
with the system.
The last item in Step 3 is to collect historical system data. It will be
beneficial for the analysis team to have a history of the past 2–5 years of
component and system failure events. This data should come from corrective
maintenance reports or from the CMMS. Unfortunately, it is
not uncommon to find a scarcity of useful failure event information. In
many plants, the history kept is of very poor quality. Most of the time, the
repair history will simply state “Repaired pump” or “Fixed pump.”
Improving data quality is a challenge for many organizations. If a good
failure history is not available, the team can work together to develop a
list of failure events over the last few years. This list of failures would
help in performing the FMEA analysis in Step 5.
interfaces).
The next step is to specify how much of each function can be lost, i.e.,
functional failures. Most functions have more than one loss condition if
we have done a good job with the system description. For example, the
loss condition can range from total loss, through varying levels of partial loss
that have different levels of plant consequences (and thus priority), to
failure to start on demand. The ultimate objective of an RCM analysis
is to prevent these functional failures and thereby preserve function. In
Step 7, this objective will lead to the selection of preventive maintenance
tasks that will successfully avoid the really serious functional failures.
ures as shown in Figure 8.6. FMEA addresses the second RCM principle,
to “determine the specific component failures that could lead to one or
more of the functional failures.” These are the failures which defeat func-
tions and become the focus of the team’s attention.
In reviewing failure modes, teams can use the following guidelines in
accepting them, rejecting them, or putting them aside for later consideration:
For each failure mode retained for analysis, the team then decides on
its one or two most likely failure causes. A failure cause is, by definition,
a 1–3 word description of why the failure occurred. We limit our judg-
ments to root causes. If the failure mode can occur only due to another
previous failure somewhere in the system or plant, then this is considered
a consequential cause.
Each failure mode retained is now evaluated as to its local effect.
What can it do to the component; what can it do to the system functions;
how can it impact the system / plant output? If safety issues are raised,
they too can become part of the recorded effect. In the failure effects
analysis, assume a single failure scenario. Also allow all facets of redun-
dancy to be employed in arriving at statements of failure effect. Thus,
many single failure modes can have no effect at the plant or system level,
in which case, designate the failure mode as low priority and do not pass
it to Step 6, Logic Tree Analysis. If there is either a system or plant
effect, the failure mode is passed on to Step 6 for further priority evalua-
tion. Those failure modes considered as low priority here are assigned as
candidates for run-to-failure (RTF) and are given a second review in Step
7 for final RTF decision.
RTF does not imply that this component or asset is unimportant;
instead, components that are designated as RTF have no significant consequence
as the result of a failure. It does not matter if failed components
are restored immediately as long as they are repaired to an operable status
in a timely manner.
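The screening rule described above (failure modes with no system or plant effect become RTF candidates; all others go on to Step 6) can be sketched in Python. The record layout, field names, and sample failure modes are hypothetical, chosen only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    component: str
    mode: str
    system_effect: bool   # does the failure affect a system function?
    plant_effect: bool    # does the failure affect plant output?


def screen_failure_modes(modes):
    """Split failure modes into (pass to Step 6, RTF candidates)."""
    to_step_6, rtf_candidates = [], []
    for fm in modes:
        if fm.system_effect or fm.plant_effect:
            to_step_6.append(fm)
        else:
            rtf_candidates.append(fm)   # reviewed again in Step 7
    return to_step_6, rtf_candidates


# Hypothetical FMEA rows, assuming a single-failure scenario with redundancy credited.
modes = [
    FailureMode("Pump A", "bearing seizure", system_effect=True, plant_effect=False),
    FailureMode("Local gauge", "reads low", system_effect=False, plant_effect=False),
]
to_step_6, rtf_candidates = screen_failure_modes(modes)
print([fm.component for fm in to_step_6])
print([fm.component for fm in rtf_candidates])
```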
Thus, every failure mode passed to Step 6 receives one of the follow-
ing labels or categories: A, B, C, D, or any combination of these. Any fail-
ure mode that contains an A in its label is a top priority item; a B is the
second and next significant priority item; a C is essentially a low priority
item that, in the very practical sense, is probably a non-issue in allocating
preventive maintenance resources. All C and D/C failure modes are good
candidates for RTF. Primary attention will be placed on the A and B labels,
which are addressed in Step 7.
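The labeling rule in the preceding paragraph can be written as a small lookup. This is a sketch under stated assumptions: combined labels are written with a slash (as in "D/C"), the function name is illustrative, and a bare "D" is reported as unspecified because the paragraph does not rank it on its own.

```python
def lta_priority(label):
    """Map a Logic Tree Analysis label such as 'A', 'D/B', or 'D/C' to a priority."""
    parts = {p.strip().upper() for p in label.split("/")}
    if not parts <= {"A", "B", "C", "D"}:
        raise ValueError(f"unexpected label: {label!r}")
    if "A" in parts:
        return "top priority"
    if "B" in parts:
        return "second priority"
    if "C" in parts:
        return "low priority (RTF candidate)"
    return "unspecified"   # bare 'D': not ranked by the paragraph above


for label in ["A", "D/B", "C", "D/C"]:
    print(label, "->", lta_priority(label))
```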
• Task selection
• Sanity check
• Task comparison
instructions for each craft group, depending upon union contract require-
ments. However, the coordination between the craft should be part of each
RCM Benefits
• Reliability. The primary goal of RCM is to improve asset relia-
bility and availability cost-effectively. This improvement comes
through constant reappraisal of the existing maintenance program
and improved communication between maintenance supervisors
and managers, operations personnel, maintenance mechanics,
planners, designers, and equipment manufacturers. This improved
communication creates a feedback loop from the maintenance
craft in the field all the way to the equipment manufacturers.
1. Planning (Concept)
2. Design and Build
3. Operations and Maintenance
4. Disposal
Data Collection
Asset condition data is collected basically in two ways:
Vibration Analysis
Vibration monitoring might be considered the “grandfather” of condi-
tion / predictive maintenance, and it provides the foundation for most
facilities’ CBM programs.
Vibration usually indicates trouble in the machine. Machines and structures
vibrate in response to one or more pulsating forces that may be due
to imbalance, misalignment, etc. The magnitude of vibration is dependent
on the force and properties of the system, both of which may depend on
speed.
There are four fundamental characteristics of vibration: frequency,
period, amplitude, and phase. Frequency is the number of cycles per unit
time and is expressed in the number of cycles per minute (CPM) or cycles
per second (Hz). The period is the time required to complete one cycle of
vibration, the reciprocal of frequency. The amplitude is the maximum
value of vibration at a given location of the machine. Phase is the time
relationship between vibrations of the same frequency and is measured in
degrees.
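The relationships among CPM, Hz, and period can be sketched in a few lines of Python. The function names and the 1,800 CPM example are illustrative assumptions.

```python
def cpm_to_hz(cpm):
    """Cycles per minute to cycles per second (Hz)."""
    return cpm / 60.0


def hz_to_cpm(hz):
    """Cycles per second (Hz) to cycles per minute."""
    return hz * 60.0


def period_seconds(hz):
    """Period is the reciprocal of frequency."""
    if hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / hz


# Hypothetical machine vibrating at its running speed of 1,800 CPM.
f_hz = cpm_to_hz(1800)
print(f_hz)                              # 30.0
print(round(period_seconds(f_hz), 4))    # 0.0333
```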
The three key measures used to evaluate the magnitude of vibra-
tions are:
• Displacement
• Velocity
• Acceleration
The units and descriptions of these measures are shown in Figure 8.9.
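For a pure sinusoidal vibration, the three measures are tied together by the frequency: peak velocity is 2πf times peak displacement, and peak acceleration is (2πf)² times peak displacement. The sketch below assumes SI units and a single frequency, since Figure 8.9 is not reproduced here; real machine signals contain many frequencies, so treat it only as an illustration of why displacement dominates at low frequency and acceleration at high frequency.

```python
import math


def peak_velocity(displacement_peak_m, freq_hz):
    """Peak velocity (m/s) of a sinusoid with the given peak displacement (m)."""
    return 2 * math.pi * freq_hz * displacement_peak_m


def peak_acceleration(displacement_peak_m, freq_hz):
    """Peak acceleration (m/s^2) of a sinusoid with the given peak displacement (m)."""
    return (2 * math.pi * freq_hz) ** 2 * displacement_peak_m


# Hypothetical reading: 50 micrometers peak displacement at 30 Hz (1,800 CPM).
d = 50e-6
print(round(peak_velocity(d, 30.0) * 1000, 2), "mm/s")
print(round(peak_acceleration(d, 30.0), 3), "m/s^2")
```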
Displacement measurement is dominant at low frequency and is
caused by stresses in flexible members of the machine. It is typically
ment, e.g., fans and pumps, usually do not deteriorate fast enough to warrant
continual real-time data collection. However, critical and expensive
assets may warrant having a real-time, continuous data collection system.
Spectrum Analysis and Waveform Analysis. Spectrum analysis is the
most commonly-employed analysis method for machinery diagnostics. In
this type of analysis, the vibration technician focuses on analyzing specif-
ic “slices” of the vibration data taken over a certain range of CPM.
Spectrum analysis can be used to identify the majority of all rotating equip-
ment failures (due to mechanical degradation) before failure. Waveform
analysis, or time domain analysis, is another extremely valuable analytical
tool. Although not used as regularly as spectrum analysis, the waveform
often helps the analyst more correctly diagnose the problem.
Shock Pulse Analysis. This type of analysis is used to detect impacts
caused by contact between the surfaces of the ball or roller and the race-
way during rotation of anti-friction bearings. The magnitude of these puls-
es depends on the surface condition and the angular velocity of the bear-
ing (RPM and diameter). Spike energy is similar in theory to shock pulse.
Alignment. Misalignment of shafted equipment will not only cause
equipment malfunctions or breakdowns; it may also be an indicator of
other problems. Checking and adjusting alignment used to be a very slow
procedure. The advent of laser alignment systems has reduced labor time
by more than half and increased accuracy significantly.
Laser alignment is a natural complement to vibration analysis.
Properly aligning shafts eliminates one of the major causes of vibration in
rotating machines and also drastically extends bearing life. For the mini-
mal amount of work involved, the payback is great.
Vibration Equipment. For permanent data collection, vibration
analysis systems include microprocessor-based data collectors, vibration
transducers, equipment-mounted sound discs, and a host personal com-
puter with software for analyzing trends, establishing alert and alarm
points, and assisting in diagnostics. Portable handheld data collectors con-
sist of a data collection device about the size of a palm-top computer and
a magnetized sensing device.
The effectiveness of vibration monitoring depends on sensor mount-
ing, signal resolution, machine complexity, data collection techniques,
and the ability of the analyst. This last factor, the ability of the analyst, is
probably the most important aspect of establishing an effective vibration
monitoring program. The analyst must be someone who possesses a thor-
ough understanding of vibration theory and the extensive field experience
necessary to make the correct diagnosis of the acquired vibration data.
Infrared Thermography
As one of the most versatile condition-based maintenance technolo-
gies available, infrared thermography is used to study everything from
individual components of assets to plant systems, roofs, and even entire
buildings.
Infrared inspections can be qualitative or quantitative. Qualitative
inspection concerns relative differences, hot and cold spots, and devia-
tions from normal or expected temperatures. Quantitative inspection con-
cerns accurate measurement of the temperature of the target. One must be
careful not to put too much emphasis on the quantitative side of infrared
because temperature-based sensors are better for accurate temperature
measurements.
Infrared instruments include an optical system to collect radiant ener-
gy from the object and focus it, a detector to convert the focused energy
pattern to an electrical signal, and an electronic system to amplify the
detector output signal and process it into a form that can be displayed.
Most instruments include the ability to produce an image that can be dis-
played and recorded. These thermographs, as the images are called, can be
interpreted directly by the eye or analyzed by computer to produce addi-
tional detailed information. Mid-wave range instruments detect infrared in
the 2–5 micron range; long-wave range instruments detect the 8–14
micron range. High-end systems can isolate readings for separate points,
calculate average readings for a defined area, produce temperature traces
along a line, and make isothermal images showing thermal contours.
It is essential that infrared studies be conducted by technicians who
are trained in the operation of the equipment and interpretation of the
imagery. Variables that can destroy the accuracy and repeatability of ther-
mal data, for example, must be compensated for each time data is
acquired.
Infrared Thermography (IRT) cameras are non-contact, line-of-sight,
thermal measurement and imaging systems. Because IRT is a non-contact
technique, it is especially attractive for identifying hot and cold spots in
energized electrical equipment, large surface areas such as boilers and
building roofs, and other areas where “stand-off” temperature measure-
ment is necessary. Instruments that perform this function detect electro-
magnetic energy in the short wave (3–5 microns) and long wave (8–15
microns) bands of the electromagnetic spectrum.
Because of the varied inspections (electrical, mechanical, and struc-
tural) encountered, the short wave instrument is the best choice for facil-
ity inspections. However, the short wave instrument is more sensitive than
long wave to solar reflections. Sunlight reflected from shiny surfaces may
make those surfaces appear to be “hotter” than the adjacent surfaces when
they really are not. IRT instruments (cameras) are portable, usually sensitive
to within 0.20°C over a range of temperatures from –100 to +3000°C,
and accurate within ±3 percent. In addition, the instrument can store
images for later analysis.
Quantitative IRT inspections attempt to accurately measure the temperature
of the item of interest. Performing such an inspection requires knowledge and
understanding of the relationship of temperature and radiant power, reflection,
emittance, and environmental factors, as well as the limitations of the
detection instrument. This knowledge must be applied in a methodical
manner to control the imaging system properly and to obtain accurate
temperature measurements.
The qualitative inspections are significantly less time-consuming
because the thermographer is not concerned with highly-accurate temper-
ature measurement. In qualitative inspections, the thermographer obtains
accurate temperature differences (ΔT) between like components. For
example, a typical motor control center will supply three-phase power,
through a circuit breaker and controller, to a motor. Ideally, current flow
through the three-phase circuit should be uniform so the components
within the circuit should have similar temperatures. Any uneven heating,
perhaps due to dirty or loose connections, would quickly be identified
with the IRT imaging system.
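The qualitative comparison of like components can be sketched as a ΔT check across the three phases of a circuit. The 10°C threshold, the phase temperatures, and the function names are hypothetical; an actual program would take its severity criteria from its own standards.

```python
def phase_delta_t(temps_c):
    """ΔT (°C) of each phase relative to the coolest like component."""
    coolest = min(temps_c.values())
    return {phase: temp - coolest for phase, temp in temps_c.items()}


def flag_uneven_heating(temps_c, threshold_c=10.0):
    """Return phases whose ΔT exceeds an assumed alarm threshold."""
    return [phase for phase, dt in phase_delta_t(temps_c).items() if dt > threshold_c]


# Hypothetical readings at a motor control center connection, in °C.
readings = {"A": 41.0, "B": 42.5, "C": 58.0}
print(phase_delta_t(readings))
print(flag_uneven_heating(readings))   # ['C']
```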
IRT can be used very effectively to identify degrading conditions in
facilities’ electrical systems such as transformers, motor control centers,
switchgear, substations, switch yards, or power lines. In mechanical sys-
tems, IRT can identify blocked flow conditions in heat exchangers, con-
densers, transformer cooling radiators, and pipes. IRT can also be used to
verify fluid level in large containers such as fuel storage tanks. IRT can
identify insulation system degradation in building walls and roofs, as well
as refractory in boilers and furnaces. Temperature monitoring, infrared
thermography in particular, is a reliable technique for finding the mois-
ture-induced temperature effects that characterize roof leaks, and for
determining the thermal efficiency of heat exchangers, boilers, building
envelopes, etc.
Deep-probe temperature analysis can detect buried pipe energy loss
and leakage by examining the temperature of the surrounding soil. This
technique can be used to quantify ground energy losses of pipes. IRT can
also be used as a damage control tool to locate mishaps such as fires and
leaks.
Ultrasonic Testing
Ultrasonic testing is extremely useful in the diagnosis of mechanical
and electrical problems. Testing instruments are usually portable handheld
devices. Their electronic circuitry converts a narrow band of ultrasound
(between 20 and 100 kHz) into the audible range so that a user can
recognize the qualitative sounds of operating equipment through headphones.
Signal intensity is also displayed on the instrument.
Ultrasonic instruments (scanners) are most often used to detect gas, liquid,
or vacuum leaks.
Ultrasonic detectors are somewhat limited in their use. For example,
they may help identify the presence of suspicious vibrations within a
machine, but they are generally not sufficient for isolating the sources or
causes of those vibrations.
On the plus side, ultrasonic monitoring is easy, it requires minimal
training, and the instruments are inexpensive. Airborne ultrasonic devices
are highly sensitive listening “guns” (similar in size to the radar speed
guns used by police at speed traps). They provide a convenient, non-intru-
of these areas. The three areas are not unrelated. Changes in lubricant con-
dition and contamination, if not corrected, will lead to machine wear.
Lubricant Condition. Bad lubricating oil is either discarded or reconditioned
through filtering or by replacing additives. Analyzing the oil to
determine the lubricant condition is, therefore, driven by costs. Small
machines with small oil reservoirs have the oil changed on an operating
time basis. An automobile is the most common example of time-based
lubricating oil maintenance. In this example, the cost of an automobile
oil change (which includes the replacement oil, labor to change the
oil, and disposal costs) is lower than the cost to analyze the oil, e.g., the
cost of sample materials, labor to collect the sample, and the analysis. In
the case of automobile oil, time-based replacement is cheaper than analy-
sis due to competition and the economies of scale that have been created
to meet the consumer need for replacing automobile oil.
In an industrial set-up, lubricating oil can become contaminated due
to the machine’s operating environment, improper filling procedures, or
through the mixing of different lubricants in the same machine. If a
machine is “topped off” with oil frequently, we should periodically send
the oil out for analysis to check the machine for any serious problems.
The full benefit of oil analysis can be achieved only by taking fre-
quent samples and trending the data for each asset in the program. The
length of the sampling intervals varies with different types of equipment
and operating conditions. Based on the results of the analyses, lubricants
can be changed or upgraded to meet the specific operating requirements.
It cannot be overemphasized that the sampling technique is critical to
meaningful oil analysis. Sampling locations must be carefully selected to
provide a representative sample and sampling conditions should be uni-
form so that accurate comparisons can be made.
Standard Analytical Test Types. Lubricating oil and hydraulic fluid
analysis should proceed from simple, subjective techniques such as visu-
al and odor examination through more sophisticated techniques. The more
sophisticated tests should be performed when conditions indicate the need
for additional information and based on asset criticality.
Visual and Odor. Simple inspections can be performed weekly by the
equipment operator to look at and smell the lubricating oil. A visual
inspection looks for changes in color, haziness or cloudiness, and parti-
cles. This test is very subjective, but can be an indicator of recent water or
dirt contamination and advancing oxidation. A small sample of fresh
lubricating oil in a sealed, clear bottle can be kept on hand for visual com-
parison. A burned smell may indicate oxidation of the oil. Other odors
cost. Analyze more frequently for machines that are indicating emerging
problems and less frequently for machines that operate under the same
conditions and are not run on a continuous basis. A new baseline analysis
is recommended following machine repair or oil change out.
Grease is usually not analyzed on a regular basis. Although most of
the testing that is done on oil can also be done on grease, getting a repre-
sentative sample is usually difficult. The machine may have to be disas-
sembled in order to get a good sample that is a homogeneous mixture of
the grease, contaminants, and wear.
Oil Contamination Program. A concern common to all machines
with lubricating oil systems is keeping dirt and moisture out of the system.
Common components of dirt, such as silica, are abrasive and naturally
promote wear of contact surfaces. In hydraulic systems, particles can
block and abrade the close tolerances of moving parts. Water in oil pro-
motes oxidation and reacts with additives to degrade the performance of
the lubrication system. Ideally, there would be no dirt or moisture in the
lubricant; this, of course, is not possible. The lubricant analysis program
must therefore monitor and control contaminants.
Oil analysis is a reliable predictive maintenance tool and is very effec-
tive in detecting contaminants in oil that are a result of ingressed dirt or
internal wear debris generated by the effect of machine degradation and
wear. An increase in contaminant levels accelerates the wear-out process
of all components in industrial machine applications. Oil cleanliness is
measured according to the ISO 4406 standard. For each numerical increase in the ISO
contaminant code, the amount of contaminants in the oil almost doubles.
If the standard is 16/14/11, then oil rated 22/21/17 is about 64 times
dirtier than the standard.
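The doubling rule can be checked with a short Python sketch that treats each one-step increase in a code number as a factor of two. The function name is an assumption, and because the doubling is approximate the results are order-of-magnitude figures rather than exact particle counts.

```python
def iso4406_ratio(reference_code, measured_code):
    """Approximate contamination ratio for each number of an ISO 4406 code,
    assuming the particle count doubles with every one-step increase."""
    ref = [int(x) for x in reference_code.split("/")]
    meas = [int(x) for x in measured_code.split("/")]
    if len(ref) != len(meas):
        raise ValueError("codes must have the same number of parts")
    return [2 ** (m - r) for r, m in zip(ref, meas)]


print(iso4406_ratio("16/14/11", "22/21/17"))   # [64, 128, 64]
```

The first and third numbers each rise by six steps, giving the factor of about 64 quoted above; the middle number rises by seven steps, so that size range is closer to 128 times dirtier.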
Contaminants in oils can be prevented. Good filtration on the return
side of hydraulic power units will help in taking out dirt and other
ingressed particles. Usually 3-micron filtration with a 200 beta ratio is the
standard set for most machinery.
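A filter's beta ratio is the number of particles at or above a given size upstream of the filter divided by the number downstream, and it converts directly to a capture efficiency of (1 - 1/beta). The sketch below uses hypothetical particle counts; the function names are assumptions.

```python
def beta_ratio(upstream_count, downstream_count):
    """Beta ratio: upstream particles / downstream particles at a given size."""
    if downstream_count <= 0:
        raise ValueError("downstream_count must be positive")
    return upstream_count / downstream_count


def capture_efficiency_percent(beta):
    """Percent of particles at the rated size that the filter removes."""
    if beta < 1:
        raise ValueError("beta ratio must be at least 1")
    return (1 - 1 / beta) * 100.0


# Hypothetical counts of particles 3 microns and larger.
beta = beta_ratio(20000, 100)
print(beta)                                        # 200.0
print(round(capture_efficiency_percent(beta), 2))  # 99.5
```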
Large systems with filters will have steady-state levels of contaminants.
Increases in contaminants indicate a breakdown in the system’s
integrity (leaks in seals, doors, heat exchangers, etc.) or degradation of the
filter. Use of “Air Breathers” in gearboxes and hydraulic system tanks is
a best practice to control contaminants. Unfiltered systems can exhibit
steady increases during operation. Operators can perform a weekly visual
and odor check of lubricating systems and provide a first alert of contam-
ination. Some bearing lubricating systems may have such a small amount
of oil that a weekly check may not be cost effective.
1. Establish the target fluid cleanliness levels for each machine fluid
system.
2. Select and install filtration equipment (or upgrade current filter
rating) and contaminant exclusion techniques to achieve target
cleanliness levels.
3. Monitor fluid cleanliness at regular intervals to achieve target
cleanliness levels.
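The monitoring step in the list above can be sketched as a comparison of each measured ISO 4406 code against its target. The machine names, codes, and function name are hypothetical.

```python
def exceeds_target(target_code, measured_code):
    """True if any number in the measured ISO 4406 code is above its target."""
    target = [int(x) for x in target_code.split("/")]
    measured = [int(x) for x in measured_code.split("/")]
    if len(target) != len(measured):
        raise ValueError("codes must have the same number of parts")
    return any(m > t for t, m in zip(target, measured))


# Hypothetical sample results: machine -> (target code, measured code).
samples = {
    "Hydraulic unit 1": ("16/14/11", "15/13/10"),
    "Gearbox 3": ("17/15/12", "19/16/12"),
}
out_of_target = [name for name, (target, measured) in samples.items()
                 if exceeds_target(target, measured)]
print(out_of_target)   # ['Gearbox 3']
```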
Technologies Limitations
The technologies discussed earlier can be divided into two categories:
inspection does not produce the harmful radiation experienced with radi-
ography. Ultrasonic inspection is based on the difference in the wave
reflecting properties of defects and the surrounding material. An ultrason-
ic signal is applied through a transducer into the material being inspected.
The speed and intensity with which the signal is transmitted or reflected
to a transducer provide a graphic representation of defects or discontinuities
within the material.
Due to the time and effort involved in surface preparation and testing,
ultrasonic inspections are often conducted on representative samples of
materials subjected to high stress levels, high corrosion areas, and large
welds.
Magnetic Particle Testing (MPT) Magnetic Particle Testing uses
magnetic particles to detect shallow sub-surface defects. It is a very
useful technique for localized inspections of weld areas and specific areas
of high stress or fatigue loading. Two electrodes are placed several inches
apart on the surface of the material to be inspected. An electric current is
passed between the electrodes, producing magnetic lines of flux. While the
current is applied, iron ink or powder is sprinkled in the area of interest.
The iron aligns with the lines of flux. Any defect in the area of interest
will cause distortions in the lines of magnetic flux, which will be visible
through the alignment of the powder. Surface preparation is important because
the powder is sprinkled directly onto the metal surface and major surface
defects will interfere with sub-surface defect indications. Also, good
electrode contact and placement are important to ensure consistent strength
in the lines of magnetic flux.
The major advantages of MPT are its portability and speed of testing. The
handheld electrodes allow the orientation of the test to be changed in
seconds. This allows for inspection of defects in multiple axes of orientation.
Multiple sites can be inspected quickly without interrupting work in the
vicinity. The equipment is portable and is preferred for onsite or in-place
applications. The results of MPT inspections can be recorded with
high-quality photographs.
Hydrostatic Testing Hydrostatic Testing is another NDT method for
detecting defects that completely penetrate pressure boundaries.
Hydrostatic Testing is typically conducted prior to the delivery or opera-
tion of completed systems or subsystems that act as pressure boundaries.
During the hydrostatic test, the system to be tested is filled with water or
the operating fluid. The system is then sealed and the pressure is increased
to approximately 1.5 times operating pressure.
This pressure is held for a defined period. During the test, inspections
are conducted to find visible leaks as well as monitor pressure drop and
make-up water additions. If the pressure drop is out of specification, any
leaks must be located and repaired. The principle of hydrostatic testing
can also be used with compressed gases. This type of test is typically
called an air drop test and is often used to test the integrity of high pres-
sure air or gas systems.
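The hydrostatic test arithmetic can be sketched as follows. The 1.5 factor comes from the text; the pressure values, the allowable-drop figure, and the function names are hypothetical, since the actual acceptance limits come from the applicable specification.

```python
def hydro_test_pressure(operating_pressure: float, factor: float = 1.5) -> float:
    """Test pressure as a multiple of operating pressure (about 1.5x per the text)."""
    return operating_pressure * factor


def pressure_drop_ok(start: float, end: float, max_drop: float) -> bool:
    """True if the drop over the hold period is within the allowed limit.

    `max_drop` is a placeholder for whatever the governing
    specification permits; it is not a value given in the text.
    """
    return (start - end) <= max_drop


test_pressure = hydro_test_pressure(100.0)          # hypothetical 100 psi system
print(test_pressure)                                # 150.0
print(pressure_drop_ok(test_pressure, 148.0, 3.0))  # True
print(pressure_drop_ok(test_pressure, 140.0, 3.0))  # False
```

A failing result like the last line would mean that leaks must be located and repaired before the test is repeated.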
Eddy Current Testing Eddy current testing, also known as electromagnetic
induction testing, provides a portable and consistent method for detecting
surface and shallow subsurface defects. This technique provides the
capability of inspecting metal components quickly for defects or
homogeneity. By applying rapidly varying AC signals through coils near
the surface of the test material, eddy currents are induced into conducting
materials. Any discontinuity that affects the material’s electrical conduc-
tivity or magnetic permeability will influence the results of this test.
Component geometry must also be taken into account when analyzing
results from this test.
Run-to-Failure (RTF)
cific maintenance strategy.) For assets where the cost and impact of
failure are less than the cost of preventive (PM and CBM) actions, RTF may
be an appropriate maintenance strategy. It is a deliberate decision based
on economic effectiveness.
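The economic comparison behind an RTF decision can be sketched as below. This is a simplified model under stated assumptions: failure cost is taken as expected failures per year times the repair and downtime cost per failure, and all figures and function names are hypothetical.

```python
def expected_annual_failure_cost(failures_per_year: float,
                                 repair_cost: float,
                                 downtime_cost: float) -> float:
    """Expected yearly cost of letting the asset run to failure."""
    return failures_per_year * (repair_cost + downtime_cost)


def rtf_is_candidate(annual_failure_cost: float,
                     annual_preventive_cost: float) -> bool:
    """RTF is worth considering when failing costs less than preventing."""
    return annual_failure_cost < annual_preventive_cost


# Hypothetical non-critical asset: one failure every two years.
failure_cost = expected_annual_failure_cost(0.5, repair_cost=400.0,
                                            downtime_cost=200.0)
print(failure_cost)                            # 300.0
print(rtf_is_candidate(failure_cost, 1200.0))  # True
```

A result of True only flags the asset as a candidate; safety and other impacts of failure still have to be weighed before RTF is chosen.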
Many times, we do consider and accept RTF for specific non-critical
assets or components. However, we usually fail to document this fact in
our CMMS. It is imperative that we document that RTF was chosen on
purpose and what the criteria or basis was for this decision. Additionally,
we must have a plan to repair the failure, if and when it happens. An
example is a spare parts program for the specifically selected RTF
assets/components that allows for minimal downtime.
Documentation minimizes the excitement when an RTF failure occurs. We
need to understand that it was a deliberate economic decision not to
have a PM program for that asset.
This maintenance strategy can be a valid stand-alone maintenance
strategy, especially in today’s budget-constrained environment. However,
the RTF strategy needs more discussion throughout the maintenance community
regarding how to establish it more formally as an effective piece
of your overall maintenance strategy. Such a discussion will allow this
strategy to stretch beyond the near-reactive way some organizations
apply it.
8.7 Summary
Reliability-Centered Maintenance, often known as RCM, is a main-
tenance improvement approach focused on identifying and establishing
the operational, maintenance, and design improvement strategies that will
manage the risks of asset failure most effectively. The technical standard
SAE JA1011 has established evaluation criteria for RCM, which specifies
that RCM address, at a minimum, the following seven questions:
Q8.1 What is RCM? How did it get its start? Tell a little about RCM’s
history.
Q8.8 What is meant by CBM and PdM? What methods are used to
perform these?
9.1 Introduction
9.2 Key Terms and Definitions
9.3 Identifying Performance Measures
9.4 Data Collection and Data Quality
9.5 Benchmarking and Benchmarks
9.6 Summary
9.7 Self Assessment Questions
9.8 References and Suggested Reading
284 Chapter 9
9.1 Introduction
Accountability
Well-designed performance measures document progress toward
achievement of goals and objectives, thereby motivating and catalyzing
organizations to fulfill their obligations to their employees, stakeholders,
and customers.
Benchmark
A standard measurement or reference that forms the basis for
comparison; this performance level is recognized as the stan-
dard of excellence for a specific business process.
Benchmarking
The American Productivity and Quality Center (APQC) defines
benchmarking as the process of identifying, learning, and adapt-
ing outstanding practices and processes from any organization,
anywhere in the world, to help an organization improve its per-
formance. Benchmarking gathers the tacit knowledge—the
know-how, judgments, and enablers.
Benchmarking Gap
The difference in performance between the benchmark for a
particular activity and the level of other organizations. It is the
measured performance advantage of the benchmark organiza-
tion over other organizations.
Best-in-Class
Outstanding process performance within an industry; words
used as synonyms are best practice and best-of-breed.
Best Practice
A method or technique that has been found to be the most effective
and has consistently achieved superior results compared to other
means (e.g., current practices) while minimizing the use of an
organization’s resources. This practice becomes a benchmark.
Managing Performance 287
Generic Benchmarking
Benchmarking process that compares a particular business func-
tion or process with other organizations, independent of their
industries.
Goals
An observable and measurable end result having objectives that
will be achieved within a more or less fixed time frame. Goals
indicate the strategic direction of an organization.
Internal Benchmarking
Benchmarking process that is performed within an organization
by comparing similar business units or business processes.
Metric
A metric is a standard measure to assess performance in a spe-
cific area. Metrics are at the heart of a good, customer-focused
process management system and any program directed at con-
tinuous improvement.
Networking
A practice of building up informal relationships with people who
share a common set of values and interests, which can help both
parties professionally.
Objective
The set of results to be achieved that will turn a vision
into reality.
Performance
The results of activities of an organization over a given peri-
od of time.
Performance Measurement
A quantifiable indicator used to assess how well an organization
or business is achieving its desired objectives.
Vision
The achievable dream of what an organization wants to do and
where it wants to go.
World-Class
Practices that are ranked by the customers and industry experts
to be among the best of the best. An exemplary performance
achieved independent of industry, function, or location.
It is often said that “what gets measured gets done.” Getting things
done, through people, is what management is all about. Measuring things
that get done and the results of their effort is an essential part of success-
ful management. But too much emphasis on measurements or the wrong
kinds of measurements may not be in the best interest of the organization.
A few vital indicators that are important for evaluating process
performance are called Key Performance Indicators (KPIs). KPIs are an
important management tool; they measure business performance, including
maintenance. There are only a few “hard” measures of maintenance
output, and the metrics that are commonly used are often easy to manipulate.
Maintenance and operations KPIs must be integrated to make them
effective and balanced. There are three other criteria that should be
considered when deciding what aspects of maintenance to measure:
1. The performance measures should encourage the right behavior.
2. They should be difficult to manipulate to “look good.”
3. They should not require a lot of effort to measure.
• % Schedule compliance
• % Planned work
• % PM / CBM work compliance (completed on time)
• Work order cycle time
• % Rework
• Planner to craft workers ratio
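Several of the percentage metrics listed above reduce to the same ratio calculation. The sketch below assumes commonly used definitions (for example, schedule compliance as scheduled hours completed divided by total scheduled hours); the text does not give formulas, and all figures and names are hypothetical.

```python
def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage, or 0.0 when there is no base."""
    if whole == 0:
        return 0.0
    return 100.0 * part / whole


# Hypothetical figures for one week of maintenance work.
schedule_compliance = percent(340, 400)  # scheduled hours completed / scheduled
planned_work = percent(300, 400)         # planned hours / total hours worked
rework = percent(8, 200)                 # work orders redone / work orders closed

print(schedule_compliance, planned_work, rework)  # 85.0 75.0 4.0
```

Whatever definitions an organization adopts, they should be written down so that the numerator and denominator are collected the same way every period.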
Lagging indicators are results that occur after the fact. They monitor
the output of a process. They measure the results of how well we have
managed an asset, process, or overall maintenance business. Some exam-
ples of M&R-related lagging metrics are:
Balanced Scorecard
Most of the time, we measure what’s important from the financial and
productivity perspective of a process or an organization. The balanced
scorecard suggests that we view the process or organization from four
perspectives. We should also develop metrics, collect data, and analyze
the data relative to each of these perspectives to balance out any bias.
The Balanced Scorecard is a strategic management approach devel-
oped in the early 1990s by Dr. Robert Kaplan of Harvard Business School,
even though the current financial picture may look good. For maintenance
organizations, their customers are the operations. If they are not happy
with the service due to increasing failure rate and downtime, they could
outsource the maintenance.
In developing metrics for satisfaction, customers should be analyzed
in terms of the kinds of customers served and the processes through which
the organization provides a product or service to those groups.
M&R-related examples of this perspective include:
• Percent downtime
• Percent availability
• Percent delivery on time (asset/system back to operation as
promised)
• Customer satisfaction with the services maintenance provides,
such as turn-around (no cost overruns, worked right the first time,
asset operates at 100% performance, etc.)
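Percent downtime and percent availability from the list above can be sketched as complementary ratios. The sketch assumes both are measured against scheduled operating time, which is one common convention and not a definition given in the text; the hours and function names are hypothetical.

```python
def percent_downtime(downtime_hours: float, scheduled_hours: float) -> float:
    """Downtime as a percentage of scheduled operating time."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled_hours must be positive")
    return 100.0 * downtime_hours / scheduled_hours


def percent_availability(downtime_hours: float, scheduled_hours: float) -> float:
    """Availability as the complement of percent downtime."""
    return 100.0 - percent_downtime(downtime_hours, scheduled_hours)


# Hypothetical month: 720 scheduled hours with 36 hours of downtime.
print(percent_downtime(36, 720))      # 5.0
print(percent_availability(36, 720))  # 95.0
```

Because the two numbers always sum to 100 under this convention, reporting both mainly helps different audiences read the same result.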
How good are the metrics? The following questions may serve as a
checklist to determine the quality of metrics and to develop a plan for
improvement:
• Do the metrics make sense? Are they objectively measurable?
• Are they accepted by and meaningful to the customer?
• Have those who are responsible for the performance being meas-
ured been fully involved in the development of this metric?
What Is Benchmarking?
Benchmarking is the process of identifying, sharing, and using knowl-
edge and best practices. It focuses on how to improve any given business
process by exploiting topnotch approaches rather than merely measuring
the best performance. Finding, studying, and implementing best practices
provide the greatest opportunity for gaining a strategic, operational, and
financial advantage.
Informally, benchmarking could be defined as the practice of being
modest enough to admit that others are better at something, and wise
enough to try to learn how to match, and even surpass them.
Benchmarking is commonly misperceived as simply number crunching,
site briefings and industrial tourism, copying, or spying. It should not be
taken as a quick and easy process. Rather, benchmarking should be con-
sidered an ongoing process as a part of continuous improvement.
Benchmarking initiatives help blend continuous improvement initiatives
and breakthrough improvements into a single change management sys-
tem. Although benchmarking readily integrates with strategic initiatives
Types of Benchmarking
Generally there are two types of benchmarking activities. They
include:
1. Internal
2. External
a. Similar industry
b. Best Practice
Internal Benchmarking
Internal benchmarking typically involves different processes or
departments within a plant or organization. This type of benchmarking has
some advantages, such as ease of data collection and comparison, because
some of the enablers, such as employees’ skill levels and culture, would
generally be similar. However, the major disadvantage of internal
benchmarking is that it is unlikely to result in any major breakthrough in
improvements.
External Benchmarking
External benchmarking is performed outside of an organization and
compares similar business processes or the best in any industry.
Similar industry benchmarking uses external partners in a similar
industry or with similar processes; the partners share their practices and
data. This process may be difficult in some industries, but many
organizations are open to sharing non-proprietary information. This type of
benchmarking initiative usually focuses on meeting a numerical standard
rather than improving any specific business process. Small or incremental
improvements have been observed in this type of benchmarking.
Best Practices benchmarking focuses on finding the best performer or
leader in a specific process and partnering with them to compare practices
and data.
Benchmarking Methodology
One of the essential elements of a successful benchmarking initiative
is to follow a standardized process. Choosing an optimal benchmarking
partner requires a deep understanding of the process being studied and of
the benchmarking process itself. Such understanding is also needed to
Legal
Don’t enter into discussions or act in any way that could be construed
as illegal, either for you or your partner. Potential illegal activities could
include a simple act of discussing costs or prices, if that discussion could
lead to allegations of price fixing or market rigging.
Be Open
Early in your discussion, it helps to fully disclose your level of expec-
tations with regard to the exchange.
Confidentiality
Treat the information you receive from your partners with the same
degree of care that you would for information that is proprietary to your
organization. You may want to consider entering a non-disclosure agree-
ment with your benchmarking partner.
Use of Information
Don’t use the benchmarking information you receive from a partner
for any purpose other than what you have agreed to.
Benchmarks
A benchmark refers to a measure of best practice performance where-
as benchmarking refers to the actual search for the best practices.
Emphasis is then given to how we can apply the process to achieve supe-
rior results. Thus, a benchmark is a standard, or a set of standards, used as
a point of reference for evaluating performance or quality level.
Benchmarks may be drawn from an organization’s own experience, from
Figure 9.4 List of SMRP Metrics

#   Metric                                                  SMRP Pillar
1   Actual Cost to Planning Estimates                       #5 - Work Management
2   Availability                                            #2 - Process Reliability
3   Condition Based Maintenance Cost                        #5 - Work Management
4   Condition Based Maintenance Hour                        #5 - Work Management
5   Continuous Improvement Man Hours                        #5 - Work Management
6   Contractor Manpower                                     #5 - Work Management
7   Corrective Maintenance Cost                             #5 - Work Management
8   Corrective Maintenance Hours                            #5 - Work Management
9   Craft Workers on Shift                                  #5 - Work Management
10  Emergency Purchase Orders                               #5 - Work Management
11  Idle Time                                               #2 - Process Reliability
12  Inactive Stock                                          #5 - Work Management
13  Indirect Maintenance Personnel Cost                     #5 - Work Management
14  Indirect Stock                                          #5 - Work Management
15  External Maintenance Personnel Cost                     #5 - Work Management
16  Maintenance Cost as % of Asset Replacement Value (RAV)  #1 - Business Management
17  Maintenance Cost per Unit of Production                 #1 - Business Management
18  Maintenance Margin                                      #1 - Business Management
19  Maintenance Material Cost                               #5 - Work Management
20  Maintenance Shutdown Cost                               #5 - Work Management
36  Planner to Craft Ratio                                  #5 - Work Management
37  Planning Variance Index                                 #5 - Work Management
38  PM & PcM (CBM) Compliance                               #5 - Work Management
39  PM & PcM (CBM) Effectiveness                            #5 - Work Management
40  PM & PcM (CBM) Work Order Backlog                       #5 - Work Management
41  PM & PcM (CBM) Work Order Overdue                       #5 - Work Management
42  PM & PcM (CBM) Yield                                    #5 - Work Management
43  Preventive Maintenance (PM) Cost                        #5 - Work Management
44  Preventive Maintenance (PM) Hour                        #5 - Work Management
45  Proactive Work                                          #5 - Work Management
46  Ratio of Indirect to Direct Maintenance Personnel       #5 - Work Management
47  RAV $ per Maintenance Craft Head Count                  #1 - Business Management
48  Reactive Work                                           #5 - Work Management
49  Ready Backlog                                           #5 - Work Management
50  Schedule Compliance                                     #5 - Work Management
51  Schedule Compliance - Work Orders                       #5 - Work Management
52  Schedule Downtime                                       #3 - Equipment Reliability
53  Standing Work Orders                                    #5 - Work Management
54  Stock Outs                                              #5 - Work Management
55  Store Value as % of RAV                                 #1 - Business Management
9.6 Summary
Q9.10 List five metrics that can be used to measure overall plant
level performance of maintenance activities. Discuss the rea-
son for your selection.
10.1 Introduction
10.2 Key Terms and Definitions
10.3 Employee Life Cycle
10.4 Understanding the Generation Gap
10.5 Communication Skills
10.6 People Development
10.7 Resource Management and Organization Structure
10.8 Measures of Performance
10.9 Summary
10.10 Self Assessment Questions
10.11 References and Suggested Reading
310 Chapter 10
10.1 Introduction
People make it happen. They get things done. We may have great
plans and the best processes, but if we don’t have the people available
with the right skills, these plans and processes can’t be implemented or
carried out effectively. Developing people—the workforce—and empow-
ering them to give their best is key to defining the difference between an
ordinary company and a great organization. In addition, the right process-
es and technology must be in place to nurture and harness the potential of
human capital.
Maintenance and reliability processes are no different than any other
processes in the workplace. Organizations that are considered to be the
“Best of the Best” or “World Class” use many of the same key principles.
Dr. W. Edwards Deming, a world-renowned expert in the field of quality,
gave us the following 14 principles, which any organization can use to
improve its effectiveness.
7. Institute leadership.
There is a distinction between leadership and mere supervision. The
latter is quota and target based. The aim of supervision should be to help
people, machines, and gadgets to perform a better job.
Baby boomers
The generation born following World War II. They are idealists
in nature and usually very loyal to their organization. They feel
a sense of belonging and dedication based on their history. They
are motivated by power and prestige, learning opportunities, and
long-term benefits.
Certification
The result of meeting the established criteria set by an accredit-
ing or certificate-granting organization.
Communication
Effective transfer of information from one party to another;
exchanging information between individuals through a common
system of symbols, signs, or behavior.
Diversity (Workplace)
Workplace diversity refers to the variety of differences between
people in an organization. Diversity encompasses race, gender,
ethnic group, age, personality, cognitive style, tenure, organiza-
tional function, education, background, and more.
Employee Involvement
A practice within an organization whereby employees regularly
participate in making decisions on how their work areas operate,
including making suggestions for improvement, planning, goal
setting, and monitoring performance.
Generation Xers
People born primarily from the mid-1960s through the 1970s.
They are very ambitious and independent in nature and strive to
balance the competing demands of work, family, and personal
life.
Generation Yers
People born after 1980, also known as the millennial generation.
They are technologically savvy with a positive, can-do attitude.
Qualified People
People who, because of their knowledge, training, qualifications,
certification, or experience, are competent to perform the duties
of their job.
Succession Planning
A key element of the workforce development process. It identi-
fies and prepares suitable employees through mentoring, train-
ing, and job rotation to replace key players in the organization.
• Hire
• Inspire
• Admire
• Retire
Hire
This first step is probably the most important. It is important to hire
the best people we can find. This is not a time to be cheap. The cost of
replacing a bad hire far exceeds the marginal additional cost of salary or
expenses needed to hire the best person in the first place. Hire talent, not
just trainable skills. Skills can be taught to a talented employee.
Make your organization a place people want to come to and work for.
An organization’s culture can be a powerful recruiting tool. Ensure that
the new hire understands the goals your department or organization wants
to achieve.
Inspire
Once we have recruited the best employees to come to work on our
team, the hard part begins. We have to inspire them to perform to their
best abilities. We have to challenge and motivate them. That is where we
will get their best effort and creativity that will help the organization
excel.
Make them welcome. Make them feel like part of the team from day
one. Set goals for them that are hard, but achievable. Be a leader, not just
a manager.
Mentoring is a good practice to implement during the early stages for new
employees. Assign a senior person in the area as a mentor to help them
get acquainted with, and adjust to, the new environment.
Admire
Once we have hired the best employees and have challenged and
motivated them, our job to inspire them continues. The biggest mistake is
made when a manager ignores them. As soon as we start to slack off, their
satisfaction and motivation decrease. If we don’t do something, they will
become disenchanted and will leave. They will become part of the
“employee turnover” statistics that need to be avoided.
We want TGIM (thank goodness, it’s Monday) employees, not TGIF
(thank goodness, it’s Friday) ones.
Give employees positive feedback as much as possible, even if it’s
just a few good words. Provide appropriate rewards and recognition for
jobs well done. Provide them additional training to develop new skill sets.
Retire
The time when somebody retires after a long service is when we know
that we have been successful. When employees see the organization as the
employer of choice, they will come and work for us. When they recognize
us as a good boss and a real leader, they will stay around. As long as we
continue to inspire, motivate, and challenge them, they will continue to
contribute at the high levels the organization needs in order to beat the
competition. They will be long-term employees, even staying with the
organization until they retire. They will refer other quality employees,
including their relatives. Organizations will attract and retain second and
even third generation loyal employees.
Along the way, the organization will have had some of the most cre-
ative and productive employees with the lowest employee costs in the
market.
Workforce Management 317
Silent
The Silent generation is adaptive in nature and has a very intense
sense of loyalty and dedication. They are also known as Traditionalists.
Many of them have been puppets, paupers, or pirates. As a result, they
have a wealth of knowledge to contribute to any organization. Some in
this group are not leaving the workforce yet. It is important to note that
some can’t leave due to financial reasons. Others love their work, are still
in good health, and want to continue to contribute to the organization.
The work ethic of the Silent Generation is built on commitment,
responsibility, and conformity as tickets to success. Most came of age too
late to be the heroes of World War II, yet too early to participate in the
Baby Boomers
Boomers, idealists in nature, have always been seen as loyal to their
organizations. They feel a sense of belonging and dedication based on
their history. The Baby Boom Generation changed the physical and psy-
chological landscape forever. As products of “the Wonder Years,” they
were influenced by the can-do optimism of John F. Kennedy and the hope
of the post-World War II dream. But the intense social and political
upheaval of Vietnam, assassinations, and civil rights led them to rebel
against conformity and to carve a perfectionist lifestyle based on person-
al values and spiritual growth. They welcome team-based work, especial-
ly as an anti-authoritarian declaration to “The Silents” ahead of them, but
they can become very political when their turf is threatened. Rocked by
years of reorganizing, reengineering, and relentless change, they now long
to stabilize their careers.
One of the most common complaints Boomers make about Gen Xers
and Gen Yers is that “they don’t have the same work ethic.” This does not
mean that they are not hardworking employees, but it does mean that they
place a different value and priority on work.
as they seek employment that offers them both better benefits and more
opportunity for professional growth as well as personal fulfillment. They
want, and expect, their employers to hear what they have to say. They
have an interest in understanding the “big picture” for the organization
and how this influences their employment and growth. They are less like-
ly to accept a “because I said so” attitude from a supervisor.
Some of the ways to motivate Gen Xers to maximize productivity:
Encourage growth.
Provide feedback on the employee’s performance. Make sure the
employee understands expectations. Involve them in the decision-
making process whenever possible.
Provide Training.
Pay for employees to attend workshops and seminars. Provide on-site
classes where employees can learn new skills or improve upon old
ones. Challenge them.
Encourage Teamwork.
Provide opportunities for collaboration and teamwork. This genera-
tion “fuels their fire” through teamwork.
Create Ownership.
Help employees understand how the business operates. They need to
experience a sense of ownership. Encourage this by providing them
with information about new products or services, advertising campaigns,
strategies for competing, etc. Let each employee see how he
or she fits into the plan and how meeting his or her goals contributes to
meeting the organization’s objectives.
Build morale.
Have an open work environment, encourage initiative, and welcome
new ideas. Don’t be afraid to spend a few dollars for such things as
free coffee or tea for employees, or ordering a meal for employees
who have to work overtime. Take time to speak with an employee’s
spouse or family when you meet them and let them know you appre-
ciate the employee. Gen Xers look for more than just fair pay; they
need and want personal acknowledgment and job satisfaction.
Communication Process
Problems with communication can occur at every stage of the
communication process (see Figure 10.1):
• Sender (source)
• Encoding
• Channel—media
• Decoding
• Receiver
• Feedback
Importance of Listening
Listening is one of the most important skills we can have. How well
we listen has a major impact on our job effectiveness, and on the quality
of our relationships with others. We listen to:
• Obtain information
• Understand
• Enjoy
• Learn
1. Pay attention.
Give the speaker undivided attention and acknowledge the message.
• Look at the speaker directly.
• Put aside distracting thoughts.
• “Listen” to the speaker’s body language.
• Refrain from side conversations and avoid being distracted by
environmental factors.
• Nod occasionally.
• Smile and use other facial expressions.
• Encourage the speaker to continue with small verbal comments
like “yes” and “uh huh.”
3. Provide feedback.
Our personal filters, assumptions, judgments, and beliefs can distort
what we hear. As a listener, our role is to understand what is being said.
This may require us to reflect what is being said and ask questions.
• Reflect what has been said by paraphrasing. “What I’m hearing
is…” and “Sounds like you are saying…” are great ways to reflect
back.
• Ask questions to clarify certain points. “What do you mean when
you say…?” or “Is this what you mean?”
4. Defer judgment.
Interrupting is a waste of time. It frustrates the speaker and limits full
understanding of the message. If you don’t agree with the speaker’s view,
wait until you have heard the full story, or until the question-and-answer
period at the end, to present your viewpoint. It’s the speaker who is speaking.
• Issue an agenda.
• Start the discussion and encourage active participation.
• Work to keep the meeting at a comfortable pace—not moving too
fast or too slow.
• Summarize the discussion and the recommendations at the end of
each logical section.
• Ensure all participants receive minutes promptly.
Managing Meetings
Choosing the right participants is key to the success of any meeting.
Make sure all participants can contribute; choose good decision-makers
and problem-solvers. Try to keep the number of participants to a maxi-
mum of 12, preferably fewer. Make sure the people with the necessary
information for the items listed in the meeting agenda are the ones who
are invited.
As a meeting leader, work diligently to ensure everyone’s thoughts
and ideas are heard by guiding the meeting so that there is a free flow of
debate with no individual dominating and no extensive discussions
between two people. As time dwindles for each item on the distributed
• Start on time.
• Don’t recap what you’ve covered if someone comes in late. It
sends the message that it is OK to be late for meetings, and it
wastes everyone else’s valuable time.
• State a finish time for the meeting and don’t over-run. If you need
to over-run, ensure everyone is available to stay for a stated period.
• To help stick to the stated finish time, arrange your agenda in
order of importance so that if you have to omit or rush items at the
end to make the finish time, you don’t omit or skimp on impor-
tant items.
• Finish the meeting before the stated finish time if you have
achieved everything you need to.
Minutes record the decisions of the meeting and the actions agreed.
They provide a record of the meeting and, more important, they provide a
review document for use at the next meeting so that progress can be
measured. This makes them a useful disciplining technique, because
individuals’ performance and non-performance of agreed actions are given
high visibility.
ple resources? There are several ways to help increase personal satisfac-
tion and productivity that will benefit the employees and the organization.
Are people performing their jobs to the best of their abilities? What
additional training would allow particular employees to do their job bet-
ter? (Answers may include personal development, skills training, or both.)
Can job rotations or on-the-job training (OJT) with experienced coworkers
enhance employees’ skills and awareness of how their work fits into
the organization’s overall goals? Could their tasks be automated, allowing
people to grow in other areas? These are some of the questions we need
to consider while developing a plan to enhance the knowledge and skill sets
of our people. Job task analysis is one of the techniques that could be used
to identify specific training needs, ensuring that employees have appropri-
ate knowledge and skill sets to do their jobs effectively.
Performance Review
Job analysis can be used in the performance review to identify or
develop:
• Goals and objectives
• Performance standards
• Evaluation criteria
• Job duties to be evaluated
• Time period between evaluations
Environment
The work environment may include unpleasant conditions such as
temperature extremes, offensive odors, and physical limitations (or constraints)
that can hinder job performance. There may also be definite
risks such as noxious fumes, radioactive exposures such as X-rays, and
dangerous explosives.
Relationships
Is supervision required or not? What kind of interaction does this job
require—internally with fellow employees or externally with others out-
side the organization?
Requirements
Knowledge, skills, and abilities (KSAs) are required to perform the
job. Although an incumbent may have higher KSAs than those required
for the job, a job analysis typically states only the minimum requirements
to perform the job.
to see the results. Figure 10.2 lists typical benchmark data, based on
reviewing several benchmark studies and my personal discussions with
many M&R and Training Managers.
1. Asset/System Level
2. M&R Technologies
3. M&R Professionals/Managers
4. Plant/Facility Level
Asset/System Level
Assets and systems are getting much more complex. Many organiza-
tions have started to qualify operators and maintainers for critical and
complex assets. Organizations need to assure that people who are going to
operate and maintain these assets have the appropriate skills. Operators and maintainers
are required to go through a training curriculum specifically designed to
teach key aspects of that asset and are then tested to ensure that they have
comprehended the knowledge. Usually this type of certification or quali-
fication requirements are developed and administered within the organi-
M&R Technologies
Several technology and general maintenance-related certifications
are available such as oil analysis, machinery vibration analysis, infrared
(IR) thermography, ultrasonic testing, motor current testing, hydraulics,
and pressure vessels. These certifications are valuable to both employees
and organizations. They test knowledge in their respective area of main-
tenance technologies. Training for most of these certifications is provided
by the key suppliers of these technologies or related professional soci-
eties, who also administer the test.
M&R Professionals/Managers
In this category, maintenance and reliability engineers, capital project
engineers, designers, managers, and other professionals working in the
M&R field are tested for their broad knowledge of M&R.
Two key certifications are available in this category. One is Certified
Plant Engineer (CPE)/Certified Maintenance Manager (CMM) offered by
the Association of Facility Engineers (AFE). The other is Certified
Maintenance and Reliability Professional (CMRP) conducted by the
Society of Maintenance and Reliability Professionals Certifying
Organization (SMRPCO). AFE and some industrial training companies
provide the necessary training for the test.
The CMRP certification process is accredited. Under the accreditation
requirements, SMRPCO is prohibited from providing any specific training;
the aim is to keep the certification and testing process separate from trainers.
However, SMRPCO has a study guide that identifies the M&R body of
knowledge, available in hard copy from SMRP headquarters. This
guide can also be downloaded free of charge from their website,
www.smrp.org/certification.
CMRP certification was initiated in 2000 and is now recognized as the
standard of M&R certification by many organizations worldwide. The
certification process evaluates an individual’s skills in the five pillars of
knowledge defined by SMRP: Business and Management,
Manufacturing/Operations Processes Reliability, Equipment Reliability, Organization &
Leadership (People Skills), and Work Management. Many organizations
have now started using the CMRP certification to assess their employees’
knowledge and then to develop appropriate training programs that help
338 Chapter 10
Plant/Facility Level
There are no plant certifications available. However, two awards
recognize best plants/facilities. One is the North American
Maintenance Excellence (NAME) Award, given to a plant based on
NAME-established criteria. The other is the “Best Plant” award conducted
by the publishers of Industry Week. This award is not specific to M&R,
but recognizes overall aspects of manufacturing plant operations, e.g.,
quality, productivity, meeting customer needs on a timely basis, and
inventory levels.
Managing Diversity
Over 80% of new entrants into the workforce today are women and
minorities. This changing workforce is one of the major challenges facing
businesses. Organizations that recognize the need to fully develop all
members of the workforce demographic are forced to manage diversity.
Definitions of “diversity” range from narrow to very broad. Narrow
definitions tend to reflect Equal Employment Opportunity law, and define
diversity in terms of race, gender, ethnicity, age, national origin, religion,
and disability. Broad definitions may include sexual orientation, values,
personality characteristics, education, language, physical appearance,
marital status, lifestyle, beliefs, and background characteristics.
In the near future the labor market will become more and more of a
seller’s market. The shrinking workforce and the shortage of skilled labor
will force employers to compete to attract and retain all available employ-
ees, including previously under-represented groups. These demographic
changes have led many organizations to begin changing their cultures in
order to value and manage diversity better.
According to 2004 U.S. Bureau of the Census projections (comparing
2020 to 2000), the percentage of workers aged 20–44 will decline from
36.9% to 32.3%; the percentage of workers aged 45–64 will increase from
22.1% to 24.9%; and the percentage of workers aged 65–84 will increase
from 10.9% to 14.1%.
Such gray-haired demographics aren’t limited to the United States
either. The number of workers aged 20–44 will decrease and workers aged
45–84 will increase in other countries such as the United Kingdom,
Germany, Japan, and China.
Many boomers say they plan to balance work and leisure in retire-
ment. They don’t plan to stop working at age 65, instead opting for a
“working retirement.” The reasons are both financial and personal.
Several studies have found that many corporate policies hinder efforts
to adapt to the aging workforce. Still, most organizations say they hire for
ability and willingness to work. A few employers say they’re “hiring wis-
dom” when hiring older workers. Gray-haired workers are viewed as reli-
able, settled, compassionate, and honest. Some organizations have set up
a Casual Worker Program that allows them to hire or reemploy workers
who would receive limited benefits and no pension.
Any time we start discussing organization structure and skills, the first
question that arises is which type of organization structure would be more
beneficial: a centralized or decentralized structure (see Figure 10.3).
Typically, a centralized organization structure is characterized by greater
specialization and standardization. The decentralized structure provides
stronger ownership and responsiveness.
Many experts have advocated shifting from one structure to another,
centralized to decentralized, and vice versa. The primary reason for this
recommendation is not that one type is necessarily better than the other,
but to bring about change. Shaking things up is necessary to break old,
traditional practices. A hybrid structure, containing the best characteristics of
both centralized and decentralized structures, is often the optimal solution.
In a hybrid organization, an individual production unit or area may
have its own maintenance technician assigned to take care of area
breakdowns and minor repairs. Operators become valuable resources as
Centralized
• Advantages: Standardized practices; enterprise focus (objectives aligned with organization); efficient use of resources
• Disadvantages: Less responsive to individual units/departments; lack of ownership
Decentralized
• Advantages: Strong ownership; very responsive to individual area
• Disadvantages: Difficult to build specialized skills; difficult to prioritize by facility/department; sub-optimum use of tools
Outsourcing Maintenance
We are under constantly increasing pressure to reduce our labor cost.
If constraints are placed on the number of employees we can have,
we can evaluate outsourcing or shared services
for some specific skills in order to obtain a cost-effective solution.
Large variations in workload may lead to poor utilization of resources
and overstaffing. Outsourcing some of the maintenance activities is a
Succession Planning
Succession planning is another key element of the workforce devel-
opment process. It is the process of identifying and preparing suitable
employees—through mentoring, training, and job rotation—to replace
key players in the organization. It involves having management periodi-
cally review their key personnel and those in the next lower level to deter-
mine several backups for key positions. This is important because it often
takes years of grooming to develop effective managers and leaders.
A careful and considered action plan ensures the least possible dis-
ruption to the person’s responsibilities and to the organization’s effec-
tiveness if a key player suddenly becomes unavailable to perform his or
her role. Some sudden actions that may take place include:
10.9 Summary
11.1 Introduction
11.2 Key Terms and Definitions
11.3 Maintenance Root Cause Analysis Tools
11.4 Six Sigma and Quality Maintenance Tools
11.5 Lean Maintenance Tools
11.6 Other Analysis and Improvement Tools
11.7 Summary
11.8 Self Assessment Questions
11.9 References and Suggested Reading
352 Chapter 11
11.1 Introduction
There are many tools available to us. To streamline our
discussion, they have been classified into four major categories. These are:
5 Whys
A problem-solving technique for discovering the root cause of
a problem. This technique helps users to get to the root of the
problem quickly by simply asking “why” a number of times
until the root cause becomes evident.
Barrier Analysis
A technique often used, particularly in process industries,
based on tracing energy flows. It has a focus on barriers to
those flows, and helps to identify how and why the barriers did
not prevent the energy flows from causing damage.
Cause-and-Effect Analysis
Also called an Ishikawa or fishbone diagram. It identifies many possible
causes for an effect or problem, and then sorts ideas into
useful categories to help in developing appropriate corrective
actions.
Cause Mapping
A simple, but effective method of analyzing, documenting,
communicating, and solving a problem to show how individual
cause-and-effect relationships are inter-connected.
Checklists
A generic tool that can be developed for a wide variety of pur-
poses. It is a structured, pre-prepared form for collecting,
recording, and analyzing data as the work progresses. Some
examples are operator’s start-up checklist, PM checklist, and
maintainability checklist used by designers.
Control Charts
A graph used to display how a process changes over time.
Comparing current data to historical control limits indicates
whether the process variation is in control or out of
control.
Fault Tree
This analysis tool is constructed starting with the final failure
(or event) and progressively tracing each cause that led to the
previous cause. This continues till the trail can be traced back
Maintenance Analysis and Improvement Tools 355
Flow Chart
A graphical summary of the process steps (such as production,
storage, transportation) and flows (movement of information
and materials) that make up a procedure or process from
beginning to end. This information is used in defining, docu-
menting, studying, and improving the system. Also called flow
diagram, flow process chart, or network diagram.
Mistake Proofing
Mistake proofing, also known as Poka-Yoke (Japanese equiva-
lent), is the use of any automatic device or method that either
makes it impossible for an error to occur or makes the error
immediately obvious once it has occurred.
Muda
Japanese lean word for waste; non-value-added work.
Mura
Japanese lean word for unevenness; inconsistency.
Muri
Japanese lean word for overburden; unreasonable work.
Pareto Analysis
This bar graph displays variances by the number of their
occurrences. Variances are shown in their descending order to
identify the largest opportunities for improvement, and to sepa-
rate the critical few from the trivial many. The concept is also
known as the 80/20 Principle.
Scatter Diagram
A diagram that graphs pairs of numerical data, one variable on
each axis, to look for a trend or a relationship.
Six Sigma
This methodology systematically analyzes processes to reduce
process variations and also to eliminate wastes. Six Sigma is
also used to further drive productivity and quality improve-
ments in any type of organization. DMAIC (Define – Measure
– Analyze – Improve – Control) represents the steps used to
guide implementation of the Six Sigma process.
Standard Deviation
Standard deviation measures variations of values from the
mean. It is denoted by the Greek letter sigma (σ) and is calculated
using the following formula:
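The formula itself is not reproduced in this copy of the text. As a minimal sketch, assuming the usual population definition (the square root of the average squared deviation from the mean; a sample version would divide by N - 1 instead of N), the calculation can be illustrated in Python. The bag weights are made-up values used only for illustration.

```python
import math
import statistics

def population_std_dev(values):
    """Standard deviation about the mean, dividing by N (population form)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / len(values))

# Hypothetical bag weights in grams (illustrative only, not from the text)
weights = [190, 195, 200, 205, 210]
sigma = population_std_dev(weights)

# Cross-check against the standard library's population standard deviation
library_sigma = statistics.pstdev(weights)
print(f"sigma = {sigma:.2f} g")
```

The hand-written function and `statistics.pstdev` should agree; the library call is what one would normally use in practice.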
Stratification
A technique that separates data gathered from a variety of
sources so that a pattern can be seen.
Scenario #1 The Plant Manager walked into the plant and found a
puddle of oil on the floor near a tube assembly machine. The manager
instructed the area supervisor to have the oil cleaned up immediately. The
next day, while in the same area of the plant, the Plant Manager again
found oil on the floor and asked the area supervisor to get the oil cleaned
up from the floor. In fact, the manager was a little upset with the supervi-
sor for not following directions given the day before. His parting words
were either to get the oil cleaned up or he would find someone who would.
Scenario #2 The Plant Manager walked into the plant and found a
puddle of oil on the floor near a tube assembly machine. The manager
asked the area supervisor why there was oil on the floor. The supervisor
indicated that it was due to a leaky oil seal in the hydraulic pipe joint
above. The Plant Manager then asked when the oil seal had been replaced.
The supervisor responded that Maintenance had installed 5 or 6 oil seals
over the past few weeks and each one seemed to leak. The supervisor also
indicated that Maintenance had been talking to Purchasing about the seals
because it seemed they were all failing — leaking prematurely.
The Plant Manager then went to talk to Purchasing about the situation
with the seals. The Purchasing Manager indicated that they had in fact
received a bad batch of oil seals from the supplier as reported by the main-
tenance department. The Purchasing Manager also indicated that they had
been trying for the past month or so to get the supplier to make good on
the last order of 50 seals that all seemed to be bad.
The Plant Manager then asked the Purchasing Manager why they had
purchased from this supplier if their quality was poor. The Purchasing
Manager replied that the supplier was the lowest bidder when quotes were
received from various suppliers. When the Plant Manager asked the
Purchasing Manager why they went with the lowest bidder without considering
the bidder’s quality issues, the Purchasing Manager explained that he
was directed by the Finance Manager to reduce cost.
Next, the Plant Manager went to talk to the Finance Manager about
the situation. The Finance Manager noted that his direction to Purchasing
to always take the lowest bid was in response to the Plant Manager’s
memo telling them to be as cost conscious as possible and only purchase
from the lowest bidder, thus saving money. The Plant Manager was horri-
fied to realize that he was the reason there was oil on the plant floor. What
a discovery!!
We may find Scenario #2 somewhat funny, and even laugh when the
problem comes full circle. We have found that most of the time everyone
in the organization tries to do their best and to do the right things. But,
sometimes things don’t work out the way we envision. The root cause of
this whole situation is sub-optimization with no overall vision. Scenario
#2 also provides a good example of how to proceed with a root
cause analysis. We need to continue asking “Why?” until a pattern
emerges and the cause of the difficult situation becomes rather obvious.
When we have a problem, how do we approach it for a solution? Do
we jump in and start treating the symptoms, much like continually clean-
ing oil as in Scenario #1? If we only fix the symptoms, based on what we
see on the surface, the problem will almost certainly happen again. Then
we will keep fixing the problem, again and again, but never solving it.
The practice of RCA is predicated on the belief that problems are best
solved by attempting to correct or eliminate root causes, as opposed to
merely addressing the immediately obvious symptoms. By directing cor-
rective measures at root causes, the likelihood of problem recurrence will
be minimized. In many cases, complete prevention of recurrence through
a single intervention is unlikely. Therefore, RCA is often considered to be
an iterative process; it is frequently viewed as a part of a continuous
improvement tool box.
Root cause analysis is not a single, defined methodology; there are
several types or philosophies of RCA in existence. Most of these can be
classified into four, very broadly defined categories based on their field of
application: safety-based, production-based, process-based, and asset fail-
ure-based.
• What happened?
• What were the specific symptoms?
e.g., operators, maintainers, and others who are familiar with the situation.
People most familiar with the problem can help lead to a better under-
standing of the issues. Identify issues that contributed to the problem col-
lectively. The details of the problem–failure can be organized by using the
‘3W2H’ (what, when, where, how, how much) tool.
Do not make any assumptions when examining a problem. No two
problems are exactly the same in nature and cause. Actually, it is rare for
the exact same failure to occur twice. Each problem should be reviewed
as if you are looking at the situation for the first time. It may be that the
two failure phenomena appear to be the same, but the causes can be different.
• 5 Whys
• Cause and effect diagram or fishbone diagram
• Failure mode and effects analysis (FMEA)
• Pareto analysis — 80/20 rule
• Fault tree analysis
• Barrier analysis
• Cause mapping
Asset / Machine
• Incorrect asset/machine used
• Incorrect tool selected
• Poorly maintained or inadequate maintenance
• Poor design
• Poor machine installation
• Defective machine or tool
Environment
• Poorly maintained workplace
• Inadequate job design or layout of work
• Surfaces poorly maintained
• Physical demands of the task
• Forces of nature or Act of God
Methods
• No or poor procedures
• Not following the procedures
• Poor communication
5 Whys Analysis
The 5 Whys is a simple problem-solving technique that helps users get
to the root of the problem quickly. Made popular in the 1970s by the
Toyota Production System, the 5 Whys analysis involves looking at any
problem and asking: “Why?” and “What caused this problem?” Quite
often, the answer to the first why will prompt another why and the answer
to the second why will prompt another, and so on — hence the name the
5 Whys analysis.
Example
The following example shows the effectiveness of the 5 Whys
analysis as a problem-solving technique:
1. Why is our customer (operations department xyz) unhappy?
• Because we did not deliver our services (fixing the
asset) when we said we would.
2. Why were we unable to meet the agreed-upon schedule for
delivery?
• The job took much longer than we thought it would.
3. Why did it take so much longer?
• Because we underestimated the work requirements.
4. Why did we underestimate the work?
• Because we made a quick estimate of the time needed to
complete it, and did not list the individual steps needed to
complete the total job. In short, we did not do work plan-
ning for this job.
5. Why didn’t we do planning — detailed analysis — for this job?
• Because we were running behind on other projects and were
short on planner resources. Our customer (operations depart-
ment xyz) was forcing us to do the job quickly. Clearly, we
needed better planning, including improved time estimations
and steps needed to complete the job efficiently.
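The chain of questions above can also be recorded as data, which makes it easy to document an analysis and review it later. The following is only a sketch: the `Why` class and the `candidate_root_cause` helper are hypothetical names introduced for illustration, and the answers are shortened from the example.

```python
from dataclasses import dataclass

@dataclass
class Why:
    question: str
    answer: str

def candidate_root_cause(chain):
    """Return the answer to the last 'why' asked: the deepest cause found so far."""
    if not chain:
        raise ValueError("ask at least one 'why' first")
    return chain[-1].answer

chain = [
    Why("Why is our customer unhappy?",
        "We did not deliver our services when we said we would."),
    Why("Why were we unable to meet the agreed-upon schedule?",
        "The job took much longer than we thought it would."),
    Why("Why did it take so much longer?",
        "We underestimated the work requirements."),
    Why("Why did we underestimate the work?",
        "We made a quick estimate and did no work planning for this job."),
    Why("Why didn't we do planning for this job?",
        "We were behind on other projects and short on planner resources."),
]

for depth, step in enumerate(chain, start=1):
    print(f"{depth}. {step.question} -> {step.answer}")
print("Candidate root cause:", candidate_root_cause(chain))
```

Five is a convention, not a rule; the list can hold as many questions as it takes for the cause to become evident.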
The 5 Whys analysis is an effective tool for uncovering the root causes
of a problem. Because it is so elementary in nature, it can be adapted quick-
ly and applied to almost any problem. Remember, if it doesn’t prompt an
intuitive answer, other problem-solving techniques may need to be applied.
The first step is fairly simple and straightforward. Define the problem
for which the root cause needs to be identified. Usually the maintenance /
reliability engineer or technical leader chooses the problem that needs a
permanent fix, and that is worth brainstorming with the team.
After the problem is identified, the team leader can start constructing
the Fishbone diagram. The leader defines the problem in a square box to
the right side of a page or worksheet. A straight line is drawn from the left
to the problem box with an arrow pointing towards the box. The problem
box now becomes the fish head and its bones are to be filled in during the
steps that follow. Figure 11.1 provides an example of a Hydraulic Pump
analysis. In this example, a hydraulic pump that is not pumping the desired
output (oil — specified pressure and volume) has become a problem.
The next step is to start identifying major components and suspected
causes of this failure, e.g., bearing failure, motor failure, seal failure, or
shaft failure. All major causes are identified and connected as parts (the
bones) of the Fishbone diagram. Causes of bearing failures are also listed
in this example. The next step is to refine the major causes to find the sec-
ondary causes and other causes occurring under each of the major cate-
gories.
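The structure described above (a problem at the head, major causes as bones, secondary causes under each bone) can be captured as a simple nested mapping while the team brainstorms. This is a sketch only: the four major causes come from the hydraulic pump example, but the secondary causes listed under bearing failure are hypothetical placeholders, since the figure's actual entries are not reproduced here.

```python
# Fishbone as nested data: problem (fish head) -> major causes (bones) -> secondary causes
fishbone = {
    "problem": "Hydraulic pump not delivering specified pressure and volume",
    "bones": {
        # Secondary causes below are hypothetical placeholders, not taken from Figure 11.1
        "Bearing failure": ["Inadequate lubrication", "Misalignment"],
        "Motor failure": [],
        "Seal failure": [],
        "Shaft failure": [],
    },
}

def render(diagram):
    """Return an indented text outline of the fishbone diagram."""
    lines = [f"Problem (fish head): {diagram['problem']}"]
    for major_cause, secondary_causes in diagram["bones"].items():
        lines.append(f"  - {major_cause}")
        for cause in secondary_causes:
            lines.append(f"      * {cause}")
    return "\n".join(lines)

print(render(fishbone))
```

Empty lists mark bones that still need to be filled in during the later brainstorming steps.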
In general, the following steps are taken to draw the fishbone dia-
gram:
1. List the problem/issue to be investigated in the “head of the
fish”.
2. Label each “bone” of the “fish”. Major categories typically
include:
• The 4 Ms:
Methods, Machines, Materials, and Manpower
• The 4 Ps:
Place, Procedure, People, and Policies
• The 4 Ss:
Surroundings, Suppliers, Systems, and Skills
• The 6 Ms:
Machine, Method, Materials, Measurement, Man, and
Mother Nature (Environment)
• The 6 EPMs:
Equipment/Asset, Process, People, Materials,
Environment, and Management.
a. The team may use one of the categories suggested above, com-
bine them in any manner, or make up others as needed. The cat-
egories are to help organize the ideas.
b. Use an idea-generating technique (e.g., brainstorming) to identi-
fy the factors within each category that may be affecting the
problem or effect being studied. The team should ask, “What is the
issue, and what are its causes and effects?”
c. Repeat this procedure with each factor under the category to
produce sub-factors. Continue asking, “Why is this happening?”
and put additional segments under each factor and subsequently
Although the purpose, terminology, and other details can vary accord-
ing to type (e.g., Process FMEA, Design FMEA), the basic methodology
is similar for all.
Figure 11.4 FMEA Steps
• SAE J1739
• AIAG FMEA-3
• MIL-STD-1629A (out of print / cancelled)
i. Items (components)
ii. Functions
iii. Failure modes
iv. Effects of failure
v. Causes of failure
Figure 11.5 SAE/AIAG FMEA Guidelines
Figure 11.6a Documenting FMEA with a Spreadsheet
FMEA Procedure
The basic steps for performing a FMEA are outlined below:
1. Establish the objective of the FMEA and identify the analysis team,
which should consist of representatives of key stakeholders,
e.g., design, operations, maintenance, and materials. If
the FMEA is design-based, the designer should lead this effort.
2. Describe the asset or process and its functions. A good under-
standing of the asset or process under consideration is important
as this understanding simplifies the analytical process.
3. Create a block diagram of the asset or process. In this diagram,
major components or process steps are represented by blocks
connected together by lines that indicate how the components or
steps are related. The diagram shows the logical relationships of
components and establishes a structure around which the FMEA
can be developed. Establish a coding system to identify system
elements.
4. Create a spreadsheet or a form and list items or components and
the functions they provide. The list should be organized in a log-
ical manner according to the subsystem and sub-assemblies,
based on the block diagram.
5. Identify which components could fail, how each will fail (failure
modes), and why it will fail (possible
causes). A failure mode is defined as the manner in which an
item/component could potentially fail to meet the design
intent. Examples of potential failure modes include:
• Broken / Fractured
• Corrosion
• Deformation
• Clogged / Contamination
• Excess Vibration
• Electrical Short or Open
• Eroded / Worn
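Steps 4 and 5 amount to building a table with one row per item and failure mode. As a rough sketch of such a worksheet, the column names below follow the item, function, failure mode, effect, and cause headings used earlier; the `FmeaRow` class, the helper function, and all row contents are hypothetical examples, not data from the text.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class FmeaRow:
    item: str
    function: str
    failure_mode: str
    effect: str
    cause: str

# Hypothetical worksheet rows for a hydraulic pump (illustrative only)
rows = [
    FmeaRow("Pump seal", "Contain hydraulic oil", "Eroded / Worn",
            "Oil leak", "Abrasive contamination"),
    FmeaRow("Pump bearing", "Support shaft", "Excess Vibration",
            "Reduced output", "Misalignment"),
    FmeaRow("Pump bearing", "Support shaft", "Corrosion",
            "Seizure", "Moisture ingress"),
]

def failure_modes_by_item(worksheet):
    """Group failure modes under the item they belong to, in worksheet order."""
    grouped = defaultdict(list)
    for row in worksheet:
        grouped[row.item].append(row.failure_mode)
    return dict(grouped)

for item, modes in failure_modes_by_item(rows).items():
    print(f"{item}: {', '.join(modes)}")
```

Keeping one row per failure mode, rather than one row per item, makes it straightforward to add further columns as the analysis proceeds.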
Benefits of FMEA
FMEA helps designers and engineers to improve the reliability of
assets and systems to produce quality products. FMEA analysis helps to
incorporate reliability and maintainability features into the asset design to
eliminate or reduce failures, thereby reducing overall life cycle cost.
Properly performed, FMEA provides several benefits. These include:
Fault Tree
A fault tree is constructed starting with the final failure. It progres-
sively traces each cause that led to the previous cause. This continues until
the trail can be traced back no further. Each result of a cause must clearly
flow from its predecessor. If it becomes evident that a step is missing
between causes, it is added and the need for it is explained.
Once the fault tree is completed and checked for logical flow, the
investigating team determines what changes to make to prevent or break
the sequence of causes and consequences from occurring again. It is not
necessary to prevent the first, or root cause, from happening. It is merely
necessary to break the chain of events at any point so that the final failure
cannot occur. Often the fault tree leads to an initial design problem. In
such a case, redesign becomes necessary. Where the fault tree leads to a
failure of procedures, it is necessary either to address the procedural
weakness or to install a method to protect against the damage caused by
the procedural failure.
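The idea of tracing causes backward and then breaking the chain at any point can be sketched in a few lines of Python. The events below are paraphrased from Scenario #2 earlier in this chapter. This is a simplification: it models a single linear trail, whereas a real fault tree usually branches and combines causes, and the function names are hypothetical.

```python
# Each event maps to the cause that led to it; None marks the end of the trail.
caused_by = {
    "Oil on the plant floor": "Leaking oil seal",
    "Leaking oil seal": "Bad batch of seals from supplier",
    "Bad batch of seals from supplier": "Supplier chosen on lowest bid alone",
    "Supplier chosen on lowest bid alone": "Directive to always take the lowest bid",
    "Directive to always take the lowest bid": None,
}

def trace(final_failure, links):
    """Walk back from the final failure until the trail can be traced no further."""
    trail = [final_failure]
    while links.get(trail[-1]) is not None:
        trail.append(links[trail[-1]])
    return trail

def failure_occurs(final_failure, links, prevented):
    """The final failure occurs only if no event on its trail has been prevented."""
    return not any(event in prevented for event in trace(final_failure, links))

trail = trace("Oil on the plant floor", caused_by)
print(" <- ".join(trail))

# Breaking the chain at any single point is enough to stop the final failure.
print(failure_occurs("Oil on the plant floor", caused_by, prevented=set()))
print(failure_occurs("Oil on the plant floor", caused_by,
                     prevented={"Bad batch of seals from supplier"}))
```

In this model, preventing an intermediate event (for example, through incoming inspection of seals) stops the final failure just as preventing the root cause would.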
line that best approximates the mathematical function that really describes
what is going on in the process. Let’s take a normal curve as an example
of one that we might decide to use. If our histogram can be closely
described by a normal distribution, then 68% of the bags we measure will
weigh within one standard deviation, or one sigma, of the mean value. If
the mean is 200 grams, and the standard deviation is 10 grams, then 68%
of the bags would weigh between 190 and 210 grams. Again, if this is a
normal distribution, we would find that 95.5% of the bags weighed with-
in 2 sigma, or 20 grams of the mean. If we consider 3 sigma, or 30 grams,
we would find that 99.7% of all bags would weigh between 170 grams
and 230 grams. A very few, just 0.3% of all bags, would weigh less than
170 grams or more than 230 grams.
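These percentages follow directly from the normal distribution and can be checked with the error function in Python's standard library. The sketch below uses the mean of 200 grams and standard deviation of 10 grams from the example; the function name is an illustrative choice.

```python
import math

def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

mean, sigma = 200, 10  # grams, as in the bag-weight example

for k in (1, 2, 3):
    low, high = mean - k * sigma, mean + k * sigma
    print(f"within {k} sigma ({low} to {high} g): {fraction_within(k):.2%}")
```

The results come out at roughly 68%, 95%, and 99.7%, matching the figures quoted above to within rounding.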
If this is the case, our process is wider than our specification limits.
Some of the bags weigh less than 180 grams, and some weigh more than
220 grams. Does that matter? The answer depends on product or process.
If we were measuring something requiring fine tolerances, or something
expensive, or where precise temperatures or mixture compositions were
critical, we wouldn’t want to find instances where our process was pro-
ducing outputs out of the specification limits. In that case, we need to
reduce the process variation. In the example above, if we can control the
process variation and reduce sigma (σ) to, say, 3 grams, then we can be
sure to meet the specification requirements (200 grams +/- 10
grams) at least 99.7% of the time.
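The effect of reducing sigma can be checked numerically. The sketch below assumes a normal distribution centered at 200 grams and compares the share of bags inside the 200 +/- 10 gram specification for a sigma of 10 grams (the earlier example) and a sigma of 3 grams; the function name is hypothetical.

```python
import math

def fraction_in_spec(mean, sigma, lower, upper):
    """Share of a normal population that falls between the lower and upper limits."""
    def cdf(x):
        return 0.5 * (1 + math.erf((x - mean) / (sigma * math.sqrt(2))))
    return cdf(upper) - cdf(lower)

before = fraction_in_spec(mean=200, sigma=10, lower=190, upper=210)
after = fraction_in_spec(mean=200, sigma=3, lower=190, upper=210)

print(f"sigma = 10 g: {before:.1%} of bags within specification")
print(f"sigma = 3 g:  {after:.1%} of bags within specification")
```

With sigma at 3 grams the limits sit a little more than three standard deviations from the mean, so the in-specification share is slightly above 99.7%.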
The Six Sigma quality movement takes process variation very much
to heart. In fact, Six Sigma advocates believe that for many processes,
there should be six standard deviations (six sigma) between the mean and the specification
limits, so that the process makes only about 3.4 defective parts per million.
Of course, by relaxing the specifications, we can meet Six Sigma require-
ments, but that isn’t usually the way to please customers. Instead, the vari-
ation in the process needs to be driven towards zero, so that the histogram
gets narrower, and fits more comfortably inside the specification limits.
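The 3.4 parts per million figure can be reproduced with a short calculation. One assumption is needed that the text does not state: by the usual Six Sigma convention, the process mean is allowed to drift by 1.5 standard deviations over the long term, so the nearer specification limit ends up 4.5 standard deviations away. The sketch below computes the one-sided normal tail under that assumption.

```python
import math

def upper_tail(z):
    """Probability that a standard normal value exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

SHIFT = 1.5  # assumed long-term drift of the mean, in sigmas (convention, not from the text)

shifted_ppm = upper_tail(6 - SHIFT) * 1_000_000   # nearer limit is 4.5 sigma away
centered_ppm = 2 * upper_tail(6) * 1_000_000      # both tails, perfectly centered process

print(f"with 1.5 sigma shift: {shifted_ppm:.1f} defects per million")
print(f"perfectly centered:   {centered_ppm:.4f} defects per million")
```

A perfectly centered process with six sigma to each limit would produce far fewer defects than 3.4 per million; the quoted figure builds in the allowance for drift.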
Six Sigma methodologies are not new. They combine elements of sta-
tistical quality control, breakthrough thinking, and management science
— all valuable, powerful disciplines. The application of quality tools and
process improvement can help in achieving excellent results.
classic Deming’s PDCA (Plan, Do, Check, and Act) cycle is used to plan
improvements, implement and test them, evaluate if they worked, and
then standardize if they did. However, for problem solving in Six Sigma,
the PDCA cycle has been modified slightly to have a five-phase method-
ology called DMAIC.
DMAIC represents the five phases in the Six Sigma methodology:
• D — Define
• M — Measure
• A — Analyze
• I — Improve
• C — Control
Recognize that the numbers don’t have to be exactly 20% and 80%.
The key point is that most things in life, effort, reward, output, etc., are
not evenly distributed; some contribute more than others. The number
20% could be anything from 10–30%; similarly 80% could be 60–90%.
What we need to remember: 20% is the vital few; the remaining are the
others. Thus, 20% of the assets could create 60, 80, or 90% of the failures.
Pareto analysis helps us to prioritize and focus resources to gain maxi-
mum benefit.
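A Pareto analysis of this kind is easy to carry out in a few lines of Python: rank the assets by failure count, accumulate the share of total failures, and pick out the leading group that accounts for most of them. The asset names and failure counts below are invented for illustration, and the 80% threshold is a parameter, not a fixed rule.

```python
def pareto(failures_by_asset):
    """Rank assets by failure count and attach the cumulative share of all failures."""
    total = sum(failures_by_asset.values())
    ranked = sorted(failures_by_asset.items(), key=lambda pair: pair[1], reverse=True)
    table, running = [], 0
    for asset, count in ranked:
        running += count
        table.append((asset, count, running / total))
    return table

def vital_few(table, threshold=0.8):
    """Smallest leading group of assets whose cumulative share reaches the threshold."""
    chosen = []
    for asset, _count, cumulative in table:
        chosen.append(asset)
        if cumulative >= threshold:
            break
    return chosen

# Hypothetical failure counts for ten plant assets (illustrative only)
failures = {
    "Pump A": 45, "Conveyor 3": 25, "Compressor 1": 13, "Fan 7": 5, "Mixer 2": 4,
    "Press 4": 3, "Chiller 1": 2, "Crane 2": 1, "Boiler 1": 1, "Dryer 5": 1,
}

table = pareto(failures)
for asset, count, cumulative in table:
    print(f"{asset:<13}{count:>4}{cumulative:>8.0%}")
print("Vital few:", vital_few(table))
```

In this made-up data set, three of the ten assets (30%) account for 83% of the failures, which is consistent with the point that the split need not be exactly 80/20.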
Figure 11.10 shows an example of Pareto analysis of plant assets.
The value of the Pareto Principle is that it reminds us to focus on the
20% that are important. Of the things we do during our daily routine, only
20% of our activities really matter — they produce 80% of our benefits.
These are the activities we must identify and emphasize.
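The Pareto ranking described above can be sketched in a few lines. This is a minimal illustration, not the method behind Figure 11.10: the asset names, the failure counts, and the 80% cutoff are all hypothetical values chosen for the example.

```python
def pareto_vital_few(counts, threshold=0.8):
    """Return the top contributors that together reach the threshold share.

    counts: mapping of item name -> number of occurrences (e.g., failures).
    threshold: cumulative share that defines the "vital few" (assumed 0.8).
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one occurrence")
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    vital, running = [], 0
    for name, count in ranked:
        if running >= threshold * total:
            break  # the items collected so far already reach the cutoff
        vital.append(name)
        running += count
    return vital, running / total

# Hypothetical failure counts for ten plant assets over one year.
failures = {
    "Pump P-101": 42, "Conveyor C-3": 31, "Compressor K-2": 9,
    "Fan F-7": 5, "Mixer M-1": 4, "Valve V-12": 3, "Chiller CH-1": 2,
    "Motor MTR-9": 2, "Boiler B-1": 1, "Crane CR-2": 1,
}

vital_few, share = pareto_vital_few(failures)
print(f"{len(vital_few)} of {len(failures)} assets cause {share:.0%} of failures:")
for name in vital_few:
    print("  ", name)
```

With this made-up data, three of the ten assets account for 82% of the failures, consistent with the point that the split need not be exactly 20/80.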
essary to be efficient and effective, and to ultimately create value for our
customers. In fact, many subject matter experts and authors are repeating
the same mantra: get rid of the waste.
For example, Kevin S. Smith, President of TPG Productivity, Inc.,
states, “Lean is a concept, a methodology, a way of working; it’s any
activity that reduces the waste inherent in any business process.”
In their famous book Lean Thinking, James P. Womack and Daniel T.
Jones write that the critical starting point for lean thinking is value. Value
can only be defined by the ultimate customer. It’s only meaningful when
expressed in terms of a specific product (a good or a service, and often
both at once) that meets the customer’s needs at a specific price at a spe-
cific time.
Lean Background
Lean philosophy or thinking is not new. Early in the 20th century,
Henry Ford, founder of the Ford Motor Company, was implementing lean
philosophy. Of course, he didn’t use the word lean at that time.
John Krafcik, a Massachusetts Institute of Technology (MIT)
researcher in the late 1980s, coined the term Lean Manufacturing while
involved in a study of best practices in automobile manufacturing. The
Maintenance Analysis and Improvement Tools 389
1. Muri: Overburden
2. Mura: Unevenness
3. Muda: Waste, non-value-added work
duction process with the right part, at the right time, in the right amount,
and first-in, first-out component flow. Just in Time systems create a “pull
system” in which each sub-process withdraws its needs from the preced-
ing sub-processes, and ultimately from an outside supplier. When a pre-
ceding process does not receive a request or withdrawal, it does not make
more parts. This type of system is designed to maximize productivity by
minimizing storage overhead.
Muda is a traditional Japanese term for activity that is wasteful and
doesn’t add value or is unproductive.
The original seven muda are:
Lean Maintenance
Much has been written and talked about lean concepts in manufactur-
ing, but what about lean maintenance? Is it merely a subset of lean man-
ufacturing? Is it a natural spinoff from adopting lean manufacturing prac-
tices? Lean maintenance is neither a subset nor a spinoff. Instead, it is a
prerequisite for success as a lean organization. Can we imagine lean JIT
concepts working without reliable assets or good maintenance
practices? Of course, we want maintenance to be lean — efficient and
effective — without waste. Lean maintenance has nothing to do with thin-
ning out warm bodies, or more directly, reducing maintenance resources.
Rather, it has to do with enhancing the value-added nature of our mainte-
nance and reliability efforts.
In maintenance, our customers are inherently internal to our organiza-
tion — they are our operations / production departments. One of the pri-
mary responsibilities of maintenance is to provide plant capacity to its
customers. Let’s face a fundamental truth: We can’t be successful with
Lean manufacturing if we don’t have reliable assets, reliable machines.
Lean maintenance is not performing lean (less) corrective or preventive
actions. It is not about facilitating a poor maintenance program.
Maintenance customers expect maintenance and reliability programs to be
optimized — effective and efficient, and fully supporting the need to oper-
ate at designed or required capacity reliably.
The majority of maintenance activities revolve around systems and
the processes that bring people, material, and machines together, such as
preventive maintenance programs, predictive maintenance programs,
planning and scheduling, computerized maintenance management sys-
tems, and store room and work order systems. We need to apply the prin-
ciples of Lean to these maintenance programs and processes to drive out
the non-value-added activities.
Value stream mapping should be performed for key maintenance processes
to identify non-value-added activities. It is good practice to
create current and future states of the maintenance processes in order to
develop a plan to reduce and eliminate wasteful activities. In developing
the current- and future-state maps of the maintenance process, we must also
assess the skills and knowledge of our maintenance personnel. A poorly-
skilled person operating within a great system will produce poor results.
Likewise, if we have a good preventive maintenance program, yet our
PMs are poorly structured and designed, our PMs will achieve poor
results. Therefore, we need to optimize PMs using tools such as FMEA/RCM
to give new life to our efforts.
392 Chapter 11
and fish, it also carries other elements which may not be adding any
value, or instead may be harming it. Similarly, our processes have many
non-value-added and wasteful activities which need to be identified and
eliminated. Value Stream Mapping identifies waste and helps to stream-
line the process for higher productivity.
VSM is a tool commonly used in continuous improvement programs
to help understand and improve the material and information flow within
processes and organizations. As part of the lean methodology, VSM cap-
tures the current issues and presents a realistic picture of the whole
process from end to end in a way that is easy to understand. To some,
VSM is a paper and pencil tool that helps us to see and understand the
flow of information and material as it makes its way through the value
stream. It helps to identify steps that are not adding any value — they are
waste and need to be removed from the process or improved.
VSM is a structured process and is usually carried out in eight steps.
Process Flow Types
There are four primary process flow types in the TOC lexicon. The four
types can be combined in many ways in larger facilities.
Although affinity diagrams are not complicated, getting the most from
them takes a little practice. For example:
• Make sure that ideas and issues that have been captured are
understood. Usually brainstorming sessions have a habit of sim-
plifying issues or agreeing without understanding the concepts
being discussed.
• Do not place the notes in any order. Furthermore, do not deter-
mine categories or headings in advance.
• Allow plenty of time for grouping the ideas. Maybe post the
randomly-arranged notes in a public place and allow grouping to
happen over a few days, if needed.
• Use an appropriate number of groups within the diagram. Too
many can become confusing and unmanageable.
can be used to establish the type of barriers that should have been in place
to prevent the incident, or could be installed to increase system safety.
Therefore, Barrier Analysis can be used either proactively, to help to
design effective barriers and control measures, or reactively, to clarify
which barriers failed and why.
Barrier analysis offers a structured way to visualize the events related
to system failure. Barriers can be physical, human action, or system con-
trolled. Barriers can be classified as:
• Physical barriers
• Natural barriers
• Human action barriers
• Administrative barriers
11.7 Summary
Chapter 12
Current Trends and Practices
12.1 Introduction
12.2 Key Terms and Definitions
12.3 Energy Management, Sustainability, and the Green Initiative
12.4 Personnel, Facility, and Arc Flash Safety
12.5 Risk Management
12.6 Corrosion Control
12.7 Systems Engineering and Configuration Management
12.8 Standards and Standardization
12.9 Summary
12.10 Self Assessment Questions
12.11 References and Suggested Reading
12.1 Introduction
Configuration
The arrangement and contour of the physical and functional
characteristics of a system, equipment, and related hardware or
software. It also includes controlling and documenting changes
made to the functional characteristics and layout.
Configuration Management
A discipline applying technical and administrative direction and
surveillance to identify and document the functional and physi-
cal characteristics of an asset / system called a configuration
item; control changes to those characteristics; and record and
report changes.
Corrosion
Gradual alteration, degradation, or eating away of a metal due to
a chemical or electrochemical reaction between it and its envi-
ronment.
Green Energy
A type of energy that is considered to be environmentally friend-
ly and non-polluting, such as hydro, geothermal, wind, and solar
power.
Hazard
A condition that is prerequisite to a mishap and a contributor to
the effects of the mishap.
Hazardous Material
Any substance that is listed as corrosive, harmful, irritant, reac-
tive, toxic, or highly toxic.
LEED
Leadership in Energy and Environmental Design.
Mishap
An unplanned event or series of events resulting in death, injury,
occupational illness, or damage to or loss of equipment or prop-
erty, or damage to the environment.
Mitigation
A method that eliminates or reduces the consequences, likeli-
hood, or effects of a hazard or failure mode; a hazard control.
Risk
A future event that has some uncertainty of occurrence and neg-
ative consequence if it were to occur.
Risk Assessment
The determination of the quantitative or qualitative value of risk
related to a concrete situation and a recognized threat (also
called hazard). Quantitative risk assessment requires calcula-
tions of two components of risk: consequence, the magnitude of
the potential loss, and likelihood, the probability that the loss
will occur.
Risk Disposition
One of several different ways to address identified risk.
Risk Management
A continuous process that is accomplished throughout the life
cycle of a system to:
• Identify and measure the unknowns.
• Develop mitigation options.
• Select, plan, and implement appropriate risk mitigations.
• Track the implementation of risk mitigations to ensure
successful risk reduction.
Risk Type
One of several risk attributes: residual, transferred, assumed,
avoided.
Sustainability
The ability to maintain a certain status or process in existing
systems; in general refers to the property of being sustainable;
capacity to endure.
Validation
The act of determining that a product or process, as constituted,
will fulfill its desired purpose.
Verification
The process of assuring that a product or process, as constitut-
ed, complies with the requirements specified for it.
What is Sustainability?
Sustainability in general refers to the property of being sustainable.
The widely accepted definition of sustainability or sustainable develop-
ment was given by the World Commission on Environment and
Development in 1987. It defined sustainable development as “forms of
progress that meet the needs of the present without compromising the
ability of future generations to meet their needs.”
Over the past 25 years, the concept of sustainability has evolved to
reflect perspectives of both the public and private sectors. A public policy
perspective would define sustainability as the satisfaction of basic eco-
nomic, social, and security needs now and in the future without undermin-
ing the natural resource base and environmental quality on which life
depends. From a business perspective, the goal of sustainability is to
increase long-term shareholder and social value, while decreasing indus-
try’s use of materials and reducing negative impacts on the environment.
Common to both the public policy and business perspectives is the
recognition of the need to support a growing economy while reducing the
social and economic costs of economic growth. Sustainable development
can foster policies that integrate environmental, economic, and social val-
ues in decision making. From a business perspective, sustainable develop-
ment favors an approach based on capturing system dynamics, building
resilient and adaptive systems, anticipating and managing variability and
risk, and earning a profit. Sustainable development reflects not the trade-
off between business and the environment but the synergy between them.
Practically, sustainability refers to three broad themes, also called pil-
lars, of sustainability: economic, social and environmental. They must all
be coordinated and addressed to ensure the long-term viability of our
community and the planet. The sustainability issue has emerged as a result
of significant concerns about the unintended social, environmental, and
economic consequences of rapid population growth, economic growth,
and the increasing consumption of our natural resources. When consider-
ing existing or new individual, business, industrial and community prac-
tices or projects, one must ensure that economic, social, and environmen-
tal benefits are achieved. Each person, business, and industry has a role
and a responsibility to ensure their individual and collective actions sup-
port the sustainability of our community.
This also means that we must preserve our resources in such a way
that human beings in the future can enjoy them as well. To achieve this,
we must regenerate our resources at a rate that is equal to or faster than
our consumption.
Social sustainability stems from the fact that multiple cultures and
societies all share and inhabit the same planet. These cultures may be dif-
ferent in their histories, backgrounds, and beliefs, but each brings a differ-
ent perspective to the world around them. Therefore, considering the
social side of resources (whether they be land or other physical resources)
must play a part in the overall sustainability equation.
Environmental sustainability is important because it involves natural
resources that human beings need for economic or manufactured capital.
Materials taken from nature are used for solutions that address human
needs. If nature is depleted faster than it can regenerate, human beings
will be left without raw materials. Furthermore, environmental sustain-
ability also involves ensuring that waste emissions are at volumes that
nature can handle. If not, many humans and other living things on Earth
may be harmed to the point of extinction.
Sustainability is really based on a simple principle: Everything that
we need for our survival and well-being depends, either directly or indi-
rectly, on our natural environment. Sustainability creates and maintains
the conditions under which humans and nature can exist in productive har-
mony, that permit fulfilling the social, economic, and other requirements
of present and future generations.
egorized by the Department of Energy (DOE) in much the same way:
1. Steam
2. Process Heat
3. Motors, pumps and fans
4. Compressed air
Steam
Over 45% of all the fuel burned is consumed to raise steam. Steam is
used to heat raw materials and treat semi-finished products. It is also a
power source for equipment, as well as for building heat and electricity
generation. Many manufacturing facilities can recapture energy through
the installation of more efficient steam equipment and processes. The
whole system must be considered to optimize energy and cost savings.
Process Heating
Process heating is vital to nearly all manufacturing processes, supply-
ing heat needed to produce basic materials and commodities. Heating
processes consume nearly 20% of all industrial energy use. Advanced
technologies and operating practices offer significant savings opportuni-
ties to reduce process heating costs.
Compressed Air
Compressed air systems account for an estimated $5 billion per
year in energy costs in the U.S. industrial sector. Many industries use
compressed air systems as power sources for tools and equipment used for
pressurizing, atomizing, agitating, and mixing applications. The major
source of waste for this type of energy is leakage. Many users at the shop
floor believe the myth that air is free or costs very little. Optimization of
compressed air systems can provide energy efficiency improvements of
20%–50%.
Although the predominant energy sources used in industry are natural
gas and electricity, industry also uses other energy sources, such as fuel
oil, for producing heat. Some facilities have on-site co-generation, where
they combust a fuel (e.g., natural gas, waste oil, or scraps) to produce heat
and electricity. Understanding the energy end use (what work we use the
energy to do) reveals more useful information for identifying
opportunities to improve efficiency and reduce costs. In an office
setting, end uses primarily include heating, ventilating, and air
conditioning (HVAC),
lighting, and operation of appliances and computers. In an industrial plant,
end uses primarily include process equipment operation, process heating
and cooling, transportation, HVAC, and lighting.
Understanding the costs of energy use can raise awareness of the
potential value of identifying and eliminating energy waste. The costs of
energy use are not always visible to production/operations managers
because they are rolled up into facility overhead costs, rather than
assigned to production areas. Explicitly tracking costs associated with
individual processes or equipment can encourage energy conservation.
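One simple way to make energy costs visible per production area, as the paragraph above suggests, is to split the facility energy bill in proportion to metered consumption. The sketch below assumes each area has its own sub-meter; the area names, the kWh readings, and the bill amount are hypothetical.

```python
def allocate_energy_cost(total_bill, metered_kwh):
    """Split a facility energy bill across areas in proportion to metered kWh.

    total_bill: total energy cost for the period (any currency).
    metered_kwh: mapping of area name -> sub-metered consumption in kWh.
    """
    total_kwh = sum(metered_kwh.values())
    if total_kwh <= 0:
        raise ValueError("total metered consumption must be positive")
    return {area: total_bill * kwh / total_kwh
            for area, kwh in metered_kwh.items()}

# Hypothetical monthly readings from four sub-meters.
readings = {"Stamping": 5000, "Paint line": 3000, "Assembly": 1500, "Offices": 500}
costs = allocate_energy_cost(total_bill=10000.0, metered_kwh=readings)

for area, cost in sorted(costs.items(), key=lambda item: item[1], reverse=True):
    print(f"{area:<12} {cost:>10.2f}")
```

Reporting these figures to each area manager, instead of rolling them into overhead, is the kind of explicit tracking the text has in mind; a real allocation would also need to handle unmetered loads and demand charges.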
Compressed Air
Lighting
It’s simple. STOP! Get somebody to take care of this potential hazard
before proceeding to the meeting. Yes, we will be late. It’s OK. Apologize
to the meeting group and tell them the truth — the reason for being late.
Also, on the way back, ensure that the hazard has been taken care of and
somebody is finding out its root cause. That’s safety culture.
Instituting a safety culture must begin at the top of the organization.
However, all employees have a responsibility to follow procedures and
think about how they do their work. Usually, we start with an organization
in the reactive stage, where employees are reacting to incidents instead of
thinking about how to prevent or eliminate them. Once employees begin
to view safety as something important to them and something which they
value, they move to the independent stage. This is where they are practic-
ing safety because they want to do it, not because they are being told to
do it. The ultimate goal is the interdependent stage when every employee
is looking out for the other. It’s a “brother’s keeper” mentality. At this
stage, any employee should be comfortable to call out a safety issue to the
point where they will stop a production line if they see a problem, or chal-
lenge a manager who, for example, isn’t wearing a hard hat.
The State of Montana has done a unique thing to create a safety cul-
ture. A Safety Culture Act was enacted in 1993 by the Montana state leg-
islature to encourage workers and employers to come together to create
and implement a workplace safety philosophy. The intent of this act is to
raise workplace safety awareness to a preeminent position in the minds of
all of Montana’s workers and employers. It becomes the responsibility
and duty of the employers to participate in the development and imple-
mentation of safety programs that will meet the specific needs of their
workplace — thereby establishing a safety culture that will help to create
a safe work environment for all future generations of Montanans.
Bart is very passionate about this subject. It’s evident when he talks
about it and in his actions on the programs he runs for ATA-Jacobs. He
speaks from the heart because of personal experience. He lost his 23-year-
old brother in an industrial accident and he knows first-hand how an
injury affects a widespread net of friends and family for the rest of their
lives.
According to DuPont’s Rosanne Danner, the most common reasons
organizations fail to develop a safety culture are:
intense radiant heat produced by the arc. The metal plasma arc produces
tremendous amounts of light energy from far infrared to ultraviolet.
Surfaces of nearby people and objects absorb this energy and are instant-
ly heated to vaporizing temperatures. The effects of this can be seen on
adjacent walls and equipment, which are often ablated and eroded from
the radiant effects.
Risk is the potential that a chosen action or activity will lead to a loss,
an undesirable event or outcome. We all take risks in our everyday life.
When we do any work or activity at work, home or in our personal life,
such as driving to work, repairing a machine, engaging in a new venture
or assignment or project, we accept a certain level of risk. Unconsciously,
we evaluate the risk and its benefits and, based on that information, we
act because we believe the benefits outweigh the risks. For example, we
know that although driving can be dangerous,
it gets us to work or places we want to go in less time. Also, we know from
historical data that the probability of having an automobile accident is rea-
sonably low if we follow the road rules/regulations, such as wearing safe-
ty belts and staying within speed limits. However, sometimes we don’t
follow the rules, underestimate or misunderstand the risk involved, or
simply ignore it for many reasons, resulting in undesirable outcomes.
Risks can also come from uncertainty in project failures (at any
phase in asset/system development, production or sustainment life-
cycles), legal liabilities, operational accidents, natural causes and disasters
as well as deliberate attack from an adversary or events of uncertain root
cause. Risk is officially defined as the combination of the probability of
an event and its consequences (ISO/IEC Guide 73). In all types of tasks
we undertake, there is the potential for events and consequences that con-
stitute opportunities for benefits or threats to success.
Questions arise about how we manage these uncertainties (risks).
Several risk management guidelines and standards have been developed
including those from the Project Management Institute (PMI), actuarial
societies, and ISO standards. Methods, definitions, and goals vary widely
according to whether the risk management method is in the context of
project management, security, engineering, industrial processes, financial
portfolios, actuarial assessments, or public health and safety. Risk
Management is increasingly recognized as a technique that considers
both positive and negative aspects of risk. In the safety field, risk is known
as a hazard, and it is generally recognized that consequences are only
negative; therefore, the management of safety risk is focused on prevention
and mitigation of these hazards.
Purpose
The purpose of risk management is to prevent, reduce, or control
future impacts of unfavorable events as opposed to reacting to unwanted
events after they have already occurred. The mitigation of every plausible
risk may not be possible and is rather impractical due to resource limita-
tions. Hence, effective risk management requires a process to determine
which risks are actionable and can be mitigated, and which risks are non-
actionable or residual and cannot be mitigated. These risks must be con-
trolled instead (if identified early enough), watched, or transferred while
being accepted by the appropriate authority.
Risk Assessment
The fundamental difficulty in risk assessment is determining the rate
of occurrence because statistical information may not be available on all
categories of past incidents. Furthermore, evaluating the severity of the
1. List all of the likely risks that the asset/system or project faces.
Make the list as comprehensive as possible.
2. Assess the probability of each risk occurring, and assign it a rat-
ing. For example, use a scale of 1 to 5. Assign a score of 1 when
a risk is extremely unlikely to occur, and use a score of 5 when
the risk is extremely likely to occur.
3. Estimate the impact on the asset/system or project if the risk
occurs. Again, do this for each and every risk on the list. Using
a 1–5 scale, assign it a 1 for little impact and a 5 for a huge, cat-
astrophic impact.
4. Map out the ratings on the Risk Impact/Probability Chart.
5. Develop a response to each risk, according to its position in the
chart. Remember, risks in the bottom left corner can often be
ignored whereas those in the top right corner need a great deal
of time and attention.
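The five steps above can be sketched as a small risk register. This is only an illustration under stated assumptions: the zone boundaries (ratings of 2 or less on both scales for the bottom left corner, 4 or more on both for the top right corner), the ranking by probability times impact, and the example risks are all hypothetical choices, not values given in the text.

```python
def rate_risk(probability, impact):
    """Place a risk on a 1-5 by 1-5 impact/probability chart.

    Returns "low" for the bottom left corner, "high" for the top right
    corner, and "medium" elsewhere. The corner boundaries are assumptions.
    """
    for value in (probability, impact):
        if value not in range(1, 6):
            raise ValueError("ratings must be integers from 1 to 5")
    if probability <= 2 and impact <= 2:
        return "low"
    if probability >= 4 and impact >= 4:
        return "high"
    return "medium"

# Steps 1-3: hypothetical risks as (name, probability rating, impact rating).
risks = [
    ("Gearbox bearing failure", 4, 5),
    ("Spare part delivery delay", 3, 3),
    ("Label printer outage", 1, 2),
]

# Steps 4-5: map each risk to a zone and rank by probability x impact
# so that the risks needing the most attention come first.
register = sorted(
    ((name, p, i, rate_risk(p, i)) for name, p, i in risks),
    key=lambda row: row[1] * row[2],
    reverse=True,
)

for name, p, i, zone in register:
    print(f"{name:<28} P={p} I={i} zone={zone}")
```

Multiplying the two ratings is a common shortcut for ordering the list, but the chart position itself is what drives the response in step 5.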
Risk Mitigation
Once risks have been identified and assessed, all techniques to manage
the risk fall into one or more of these four major categories, known
as ACAT: Avoid, Control, Accept, and Transfer.
2. Performance risk
The degree to which the proposed system or process design is capa-
3. Cost risk
The ability of the system to achieve the program’s life-cycle cost
objectives, including the effects of budget and affordability deci-
sions and the effects of inherent errors in the cost estimating tech-
nique(s) used, given that the system requirements were properly
defined
4. Schedule risk
The adequacy of the time allocated for performing the defined tasks,
including development, production, and testing as well as the effects
of programmatic schedule decisions, the inherent errors in the
schedule estimating technique used, and external physical con-
straints
5. Technology risk
Degree to which the technology proposed for the system has been
demonstrated as capable of meeting all of the project’s objectives
What Is Corrosion?
One specific risk that has become better known is in the area of
Corrosion Control.
Corrosion is a naturally occurring phenomenon commonly defined as
the deterioration of a substance, usually a metal, or its properties because
of a reaction with its environment. In other words, corrosion is the wear-
ing-away of metals due to a chemical reaction.
A better scientific definition of Corrosion is the disintegration of an
Protection:
Imagine that a craft person we have sent to repair an asset finds out
that the new spare won’t fit or the new motor has a different footprint
(frame size) from what’s documented in the CMMS. Suppose we
have ordered a special purpose machine and, upon installation, it does not
do what it is supposed to do. In both cases, the system requirements or
configurations were not documented properly or they were misinterpret-
ed. Has this happened in your plant? If we had followed systems engineer-
ing and configuration management practices appropriately, we would
have minimized such issues.
Systems Engineering (SE) is an interdisciplinary engineering man-
agement process that evolves and verifies an integrated, life-cycle bal-
anced set of system solutions that satisfy customer needs. Systems engi-
neering management is accomplished by integrating three major cate-
gories:
1. System descriptions
2. Drawings
3. Special studies and reports including safety inspections or
investigations
4. Operations and maintenance procedures, guidelines, and
acceptance criteria
5. Instrument and control set points
6. Quality assurance and quality control documents
7. Vendor/supplier manuals
8. Regulatory requirements, codes, and standards
9. Modification including capital project packages
10. Component and part lists
11. Specifications and purchase order information for major and
critical assets
What Is a Standard?
A standard is defined by the National Standards Policy Advisory
Committee as:
History of Standards
Standards are known to have existed as early as 7000 B.C. when
cylindrical stones were used as units of weight in Egypt. One of the first
known attempts at standardization in the Western world occurred in 1120.
King Henry I of England ordered that the ell, the ancient yard, should be
the exact length of his forearm, and that it should be used as the standard
unit of length in his kingdom.
History also notes that, in 1689, the Boston city fathers recognized the
need for standardization when they passed a law making it a civic crime
to manufacture bricks in any size other than 9x4x4. The city had just been
destroyed by fire, and the city fathers decided that standards would assure
rebuilding in the most economic and fastest way possible.
With the advent of the Industrial Revolution in the 19th century, the
increased demand to transport goods from place to place led to advanced
modes of transportation. The railroad provided a fast, economical,
and effective means of sending products cross-country. This feat
was made possible by the standardization of the railroad gauge, which
established the uniform distance between two rails on a track. Imagine the
chaos and wasted time if a train starting out in New York had to be
unloaded in St. Louis because the railroad tracks did not line up with the
train’s wheels. Early train travel in America was hampered by this phe-
nomenon. The government worked with the railroads to promote use of
the most common railroad gauge in the United States at the time, which
measured 4 feet, 8 1/2 inches, a track size that originated in England. This
gauge was mandated for use in the Transcontinental Railroad in 1864 and
by 1886 had become the U.S. standard.
In 1904, a fire broke out in the basement of the John E. Hurst &
Company Building in Baltimore. After taking hold of the entire structure,
it leaped from building to building until it engulfed an 80-block area of
the city. To help combat the flames, reinforcements from New York,
Philadelphia, and Washington, D.C. responded immediately, but to no
avail. Their fire hoses could not connect to Baltimore’s hydrants because
the couplings did not fit. The firefighters were forced to watch
helplessly as the flames spread; the fire destroyed approximately 2,500
buildings and burned for more than 30 hours.
It was evident that a new national standard had to be developed to pre-
vent a similar occurrence in the future. Up until that time, each municipal-
ity had its own unique set of standards for firefighting equipment. As a
result, research was conducted on more than 600 fire hose couplings
from around the country and one year later a national standard was creat-
ed to ensure uniform fire safety equipment and the safety of Americans
nationwide.
This was the beginning of standardization and standards development
in the 20th century to support interchangeability of parts, components, and
safety. In 1918, the American National Standards Institute (ANSI), a
not-for-profit organization, was founded with the support of several
professional societies, such as ASME, IEEE, ASCE, and ASTM, to support
the development of standards.
ISO 9001:2008 is the key standard, which contains the requirements. This
standard contains the following key sections:
• Section 1: Scope
• Section 2: Normative Reference
• Section 3: Terms and definitions (specific to ISO 9001, not
specified in ISO 9000)
• Section 4: Quality Management System
• Section 5: Management Responsibility
• Section 6: Resource Management
• Section 7: Product Realization
• Section 8: Measurement, analysis and improvement
In effect, users need to address all Sections 1 through 8, but only Sections
4 through 8 need implementing within a quality management system.
Although ISO 9001 is known as the Quality Management System standard,
it can be applied, with some tailoring, to any process such as
logistics–supply chain, design, and asset management. Some organizations,
such as Aerospace Testing Alliance (ATA) at Arnold Engineering
Development (Test) Center (AEDC) and Jacobs, have successfully applied
ISO 9001 to all of their work processes, including asset management.
However, many experts in the maintenance and asset management field
worldwide believe there is a gap in the area of asset management standards.
An international effort is underway to support and develop an international
standard for asset management known as ISO 55000. This family of stan-
dards has the following three standards:
12.9 Summary
ship team. One specific area of safety worth mentioning is arc
flash, which can damage a facility and endanger its personnel; prevention
measures should be evaluated and implemented as necessary.
Risk is defined as the potential that a chosen action or activity will
lead to a loss or to an undesirable event or outcome. We all take risks
in our everyday life, whether at work or in our personal lives. Risk
Management is increasingly recognized as a technique that considers both
the positive and negative aspects of risk. It should be a continuously
developing process that runs throughout the organization’s strategy and
the implementation of that strategy, integrated into the culture of the
organization with an effective policy. Risk management approaches can be
divided into four major categories: Avoidance (eliminate, or not do that
activity), Control (optimize, mitigate, or reduce risk), Accept (accept and
budget/plan), and Transfer (risk share or outsource). A risk management
plan should be developed to propose applicable and effective security
controls for managing the risks.
Corrosion is a naturally occurring phenomenon commonly defined as
the deterioration of a substance, usually a metal, or its properties because
of a reaction with its environment. Corrosion can cause dangerous and
expensive damage to nearly everything, and it is so prevalent and takes
so many forms that it will never be completely eliminated. However, studies estimate that
25 to 30% of annual corrosion costs could be saved if optimum corrosion
management practices were employed. There are four basic methods for
corrosion control and protection: corrosion-resistant materials,
protective coatings, cathodic protection, and corrosion inhibitors that
modify the operating environment. In most cases, effective corrosion
control is obtained by combining two or more of these methods, and
corrosion control should also be considered at the design stage of a
given facility or system. The methods
selected must be appropriate for the materials used, for the configurations,
and for the types and forms of corrosion which must be controlled.
Systems Engineering and Configuration Management are techniques
that should be considered not only for the products that are manufactured
but also for the assets/systems maintained by an organization. Systems
Engineering (SE) is an interdisciplinary engineering management process
that evolves and verifies an integrated, life-cycle balanced set of system
solutions that satisfy customer needs. Configuration management (CM), a
component of SE, is a critical discipline in delivering products that meet
customer requirements and that are built according to approved design
documentation. In addition, it tracks and keeps updated all appropriate
system documentation.
Standards are rules or requirements that are determined by a consen-
sus opinion of users and that prescribe the accepted and (theoretically) the
best criteria for a product, process, test, or procedure. The general bene-
fits of a standard are safety, quality, reliability, interchangeability of parts
or systems, and consistency across international borders. Most standards
are developed by committees of volunteers, which can include members
of industry, government, and the public. Effective standardization pro-
motes forceful competition and enhances profitability, enabling a business
to take a leading role in shaping the industry itself. An asset management
specific standard will aid maintenance and reliability organizations in
establishing standards for their processes and in becoming leaders in
this industry as well.
Further analysis of how each of these applies to your specific business
or industry is necessary to understand how to apply these trends and prac-
tices, if your company is engaged in activities to which each specifically
applies.
Q. 1 a Q. 31 b
Q. 2 b Q. 32 a
Q. 3 a Q. 33 b
Q. 4 a Q. 34 a
Q. 5 a Q. 35 b
Q. 6 a Q. 36 a
Q. 7 a Q. 37 a
Q. 8 a Q. 38 b
Q. 9 a Q. 39 b
Q. 10 a Q. 40 a
Q. 11 a Q. 41 b
Q. 12 a Q. 42 b
Q. 13 a Q. 43 a
Q. 14 a Q. 44 a
Q. 15 a Q. 45 a
Q. 16 a Q. 46 b
Q. 17 a Q. 47 b
Q. 18 b Q. 48 c
Q. 19 a Q. 49 c
Q. 20 b Q. 50 b
Q. 21 b Q. 51 b
Q. 22 a Q. 52 d
Q. 23 a Q. 53 b
Q. 24 b Q. 54 c
Q. 25 b Q. 55 a
Q. 26 c
Q. 27 b
Q. 28 a
Q. 29 a
Q. 30 a
452 Appendix
Answer: a — True
A best practice is a business function, a practice, or a process, that is
considered superior to all other known methods. It’s a documented strate-
gy and approach used by the most respected, competitive, and profitable
organizations. A best practice when implemented appropriately should
improve performance and efficiency in a specific area. See more details in
Chapter 1.
Answer: b — False
Maintainability is defined as ease of maintenance; it’s primarily
measured by Mean Time to Repair (MTTR). See more details in Chapter
6.
Answer: a — True
All maintenance personnel’s time should be counted and document-
ed in CMMS to ensure all repair and maintenance costs are accurate.
See more details in Chapters 3 and 4.
Answer: a — True
OEE is calculated as Availability × Performance × Quality.
Operations and Maintenance both impact this metric and need to work
together as a team to achieve higher OEE. See more details in Chapter 7.
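As a rough illustration of the OEE formula above, the following Python sketch multiplies the three factors. The function name `oee` and the sample values are assumptions made for illustration only; they are not figures from this book.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Return OEE as the product of its three factors (each a fraction from 0 to 1)."""
    factors = {
        "availability": availability,
        "performance": performance,
        "quality": quality,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return availability * performance * quality


# Hypothetical asset: 90% availability, 95% performance, 99% quality.
example_oee = oee(0.90, 0.95, 0.99)
print(f"OEE = {example_oee:.3f}")
```

Because the three factors are multiplied, a shortfall in any one of them pulls the overall figure down, which is why Operations and Maintenance both influence the result.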
Answer: a — True
It’s good practice to plan 90% or more of the work. Planned work costs
2–3 times less than reactive work. See more details in Chapter 4.
Answer: a — True
All PM / PdM tasks should be developed using FMEA / RCM
methodology. This ensures cost-effective and correct tasks to mitigate
certain risks and to find faults before assets fail. See more details in
Chapters 3 and 8.
Answer: a — True
Assets cost money to procure and maintain. They should be utilized
98% or better to get high ROI. Of course, our M&R task is to ensure
their availability; we need some time to perform maintenance too. See
more details in Chapters 4 and 9.
Answer: a — True
100% of maintenance personnel, specifically craft available hours,
should be scheduled. Scheduling compliance analysis should provide
opportunity to reduce/eliminate waste and improve productivity. See
more details in Chapter 4.
Answer: a – True
It’s a good and cost effective practice to do more run/cycle-based
and condition-based PM. It’s good practice to have calendar-based PMs
20% or less. If assets are operating 24/7, calendar-based PM could be a
higher percentage. See more details in Chapters 3, 4, and 8.
Answer: a — True
This rule implies that a time-based PM must be completed within 10%
of its scheduled interval or it is out of compliance. Many organizations
use “PM Compliance” as a measure of their maintenance department’s
performance, and it is a good metric. However, we need to ensure that
critical assets are maintained properly at the right time, within 10% of
the scheduled interval. See more details in Chapters 3 and 4.
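The 10% rule can be sketched as a simple date check. This is a minimal sketch that assumes the tolerance window applies on both sides of the due date; the function name `pm_in_compliance`, its parameters, and the sample dates are hypothetical and are not taken from any CMMS.

```python
from datetime import date, timedelta


def pm_in_compliance(due: date, completed: date, interval_days: int,
                     tolerance: float = 0.10) -> bool:
    """Return True if the PM was completed within tolerance * interval of its due date.

    Assumption: the window is symmetric (early or late by up to 10% of the interval).
    """
    window = timedelta(days=interval_days * tolerance)
    return abs(completed - due) <= window


# Hypothetical 30-day PM due on 1 June: the window is 3 days either side.
due_date = date(2024, 6, 1)
print(pm_in_compliance(due_date, date(2024, 6, 4), interval_days=30))  # 3 days late
print(pm_in_compliance(due_date, date(2024, 6, 5), interval_days=30))  # 4 days late
```

Under these assumptions the first call is in compliance and the second is not. An organization that counts only late completions against the rule would drop the `abs` and compare `completed - due` directly.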
Answer: a — True
Most emergency work orders should be written by production
operators. Operators are on the shop floor all the time and should
know what needs to be fixed to meet the production schedule. However,
maintenance should also write WOs if emergencies arise. Emergency and
unscheduled work costs many times more than routine scheduled work.
See more details in Chapters 4 and 7.
Answer: a — True
Answer: a — True
Yes, the primary objective is to detect a fault, or find the start of
one, and correct it before the asset fails. Detection can be visual or
by using predictive technologies. See more details in Chapter 8.
Answer: a — True
Yes, the primary objective is to detect a fault, or find the start of
one, and correct it before the asset fails. The PM interval should be
shorter than the P-F interval. See more details in Chapter 8.
Answer: a — True
Yes, Reliability is measured by MTBF, which is operating time
divided by the number of failures, or downtime events. See more details
in Chapter 6.
Answer: a — True
It’s true. The goal is to get all work completed as scheduled. See
more details in Chapter 4.
Answer: b — False
Uniform wear will not create any unbalance, hence no vibration. See
more details in Chapter 8.
19. Understanding the known and likely causes of failures can help
design a maintenance strategy for an asset to prevent or predict
failure.
a. True
b. False
Answer: a — True
If we understand the failure mechanism (how a part or component
can fail), we can develop a mitigating maintenance strategy to prevent
failure. See more details in Chapters 8 and 11.
Answer: b — False
Reliability is a design attribute. It means that reliability is based on
how an asset is designed — with what type of components and their
Answer: b — 85%
Proactive work is defined as all work minus
unscheduled/unplanned work. We know that planned and scheduled
work costs less than unscheduled, reactive work. We have found that it’s
good or best practice to have proactive work at 85% or more. See more
details in Chapters 3 and 4.
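The definition above (all work minus unscheduled/unplanned work) can be turned into a percentage as follows. This is a minimal sketch; measuring work in labor hours, the function name `proactive_work_percent`, and the sample figures are assumptions made for illustration.

```python
def proactive_work_percent(total_hours: float, unplanned_hours: float) -> float:
    """Return proactive work as a percentage of all work.

    Proactive work = all work minus unscheduled/unplanned work.
    """
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= unplanned_hours <= total_hours:
        raise ValueError("unplanned_hours must be between 0 and total_hours")
    return 100.0 * (total_hours - unplanned_hours) / total_hours


# Hypothetical month: 1,000 labor hours in total, 120 of them unplanned.
share = proactive_work_percent(1000, 120)
print(f"Proactive work: {share:.1f}%")
print("Meets the 85% guideline" if share >= 85 else "Below the 85% guideline")
```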
Answer: a — True
MTBF is mean time between failures. It is calculated by dividing
operating time by the number of failures, or downtime events. See more
details in Chapter 6.
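The MTBF calculation described above is a single division. The sketch below assumes operating time is recorded in hours; the function name `mtbf` and the sample numbers are hypothetical.

```python
def mtbf(operating_hours: float, failures: int) -> float:
    """Return mean time between failures: operating time divided by number of failures."""
    if operating_hours < 0:
        raise ValueError("operating_hours cannot be negative")
    if failures <= 0:
        raise ValueError("MTBF is undefined when no failures have been recorded")
    return operating_hours / failures


# Hypothetical asset: 4,000 operating hours with 8 failures (downtime events).
print(f"MTBF = {mtbf(4000, 8):.0f} hours")
```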
Answer: a — True
Initially, when we are starting an M&R improvement plan, mainte-
nance cost may go up, but eventually it should come down as reliability
increases. With an increase in reliability or reduction in the number of
failures, assets are more available to perform their function. See more
details in Chapters 3 and 6.
24. The “F” on the P–F Interval indicates that equipment is still
functioning.
a. True
b. False
Answer: b — False
In the P–F interval curve, F stands for failure and P stands for
potential failure. See more details in Chapters 6 and 8.
Answer: b — 15
On average, an experienced planner should be able to plan work for
about 15 ± 5 craft people, depending upon the type of work. Usually 15 is a
good number. See more details in Chapter 4.
Answer: b — MTBF
Reliability is measured by MTBF, which is operating time divided
by the number of failures. MTTR is a measure of maintainability. See
more details in Chapters 6 and 9.
Answer: a — True
P&S should benefit all maintenance work. Planned and scheduled
work costs much less, and work gets accomplished in a timely manner.
See more details in Chapter 4.
Answer: a — True
The leading indicators are process indicators and they lead to the
results. For example, PM compliance and backlog are leading indicators.
See more details in Chapter 9.
30. The 6th S in the 6 S (also called 5 S plus) process stands for
safety.
a. True
b. False
Answer: a — True
The original Five S (5 S: sort, set in order, shine, standardize, and
sustain) is a basic, systematic process for improving productivity,
quality, and housekeeping. Lately, a 6th S has been added to focus on
safety too. 5 S originated in Japan. See more details in Chapter 7.
Answer: a — True
The objective of RCM is to preserve function. See more details in
Chapter 8.
Answer: b — False
The FMEA/RCM analysis should provide us with details of failure
modes and which parts we need to stock. However, in some cases, it may
be cost effective not to stock a part if it can be procured or made
available locally in a couple of hours. See more details in Chapter 5.
34. The inventory turnover ratio for MRO store should be:
a. Less than 2
b. Between 4–6
c. Over 6
Answer: b — Lagging/Leading
PM compliance is a leading indicator for availability and a lagging
indicator for work execution. See more details in Chapter 9.
Answer: a — True
OEE = Availability × Performance × Quality. See more details in
Chapter 7.
Answer: a — True
Reliability and Maintainability are design attributes. It means that
reliability and maintainability depend upon how the asset is designed
and with what type of components and configurations. A maintenance
strategy can’t change the basic (inherent) reliability. However, training
the workforce in repair techniques and providing the right tools will
improve the asset availability. See more details in Chapter 6.
Answer: b — False
The Karl Fischer method is used for determining water content, in parts
per million (PPM), in an oil sample. See more details in Chapter 8.
Answer: a — True
IR thermography windows are being used effectively to detect any
hot spots or potential problems in electrical cabinets, switchgears, etc.,
and help in meeting NFPA 70E arc flash requirements. See more
details in Chapters 8 and 12.
Answer: b — False
FMEA can apply to any asset whether in use or not. In fact, it is a
good application in new systems being designed or developed to identify
potential failure modes. See more details in Chapters 8 and 11.
Answer: b — False
RCM can be used on new or “in use” systems. In fact, it’s a good
application for new systems under development to use RCM methodolo-
gy to identify potential failure modes and to develop an effective PM
plan. See more details in Chapter 8.
43. Properly training the M&R workforce can increase asset and
plant availability.
a. True
b. False
Answer: a — True
Training the M&R workforce in application of new tools/techniques
will reduce repair time, resulting in higher availability. See more details
in Chapters 10 and 11.
Answer: a — True
Total Productive Maintenance (TPM) is another maintenance strate-
gy where an operator does some maintenance, sometimes called first
level, e.g., changing filters, minor adjustments, etc., and becomes part of
the maintenance crew in supporting major repairs. See more details in
Chapter 7.
Answer: a — True
Lagging indicators are the results. For example, Maintenance cost
and availability are lagging indicators. See more details in Chapter 9.
Answer: b — False
EOQ (Economic Order Quantity) calculates the optimum order
quantity to optimize inventory cost. It does not impact the inventory
turnover ratio. See more details in Chapter 5.
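For readers who want to see an EOQ calculation, the sketch below uses the classic square-root (Wilson) formula, EOQ = sqrt(2 × annual demand × cost per order / annual holding cost per unit). Treating this as the formula intended here is an assumption, and the function name and sample values are hypothetical.

```python
import math


def economic_order_quantity(annual_demand: float, order_cost: float,
                            holding_cost_per_unit: float) -> float:
    """Return the classic EOQ: sqrt(2 * D * S / H).

    D = annual demand (units), S = cost per order, H = annual holding cost per unit.
    """
    if annual_demand <= 0 or order_cost <= 0 or holding_cost_per_unit <= 0:
        raise ValueError("all inputs must be positive")
    return math.sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)


# Hypothetical spare part: 1,200 units per year, $50 per order, $3 per unit per year to hold.
quantity = economic_order_quantity(1200, 50, 3)
print(f"EOQ = {quantity:.0f} units per order")
```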
47. New incoming oil from the supplier is always clean and
ready to be used.
a. True
b. False
Answer: b — False
It has been found that the incoming oil is not usually clean and does
not meet cleanliness requirements. The best organizations are establish-
ing oil cleaning systems to clean all incoming oil to ensure that new oil
meets cleanliness requirements. See more details in Chapter 8.
48. Which phase of asset life cycle has the highest cost?
a. Design
b. Acquisition
c. O & M
Answer: c — O&M
The O&M phase usually has the highest cost in an asset’s life cycle. See
more details in Chapter 6.
Answer: c — Design
Most maintenance costs get fixed during the design phase. See more
details in Chapter 6.
Answer: b — Design
RCM should be used in the design phase to get maximum benefits.
See more details in Chapters 6 and 8.
Answer: b — MTTR
MTTR is a measure of how soon we can bring the asset back into
operation. See more details in Chapter 6.
Answer: b — MTBF
Failure rate is the inverse of MTBF. See more details in Chapter 5.
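The inverse relationship between failure rate and MTBF can be shown in a few lines. This sketch assumes MTBF is expressed in hours, so the result is in failures per hour; the function name `failure_rate` and the sample value are hypothetical.

```python
def failure_rate(mtbf_hours: float) -> float:
    """Return the failure rate (failures per hour) as the inverse of MTBF."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return 1.0 / mtbf_hours


# Hypothetical asset with an MTBF of 500 hours.
rate = failure_rate(500)
print(f"Failure rate = {rate:.4f} failures per hour")
```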
Answer: c — design
FMEA should be performed during the design phase to identify failure
modes, including those that could be eliminated or whose impact could be
reduced or mitigated cost effectively. See more details in Chapters 6 and 11.
Answer: a — True
High (95% or better) PM schedule compliance will catch potential
failures before they happen, thereby reducing the unexpected break-
downs. See more details in Chapters 3 and 4.
INDEX
ABC 127
AC high potential testing (HiPot) 268
acceleration 248
accountability 285
accuracy 135–137
acoustics 55
acquisition 15 176
action plans 29
active inventory 123
active redundancy 168–170
admire 316
affinity analysis 396–397
affordability 70
age exploration 221
agent 38
alarms 62
alignment 253
American National Standards Institute 402
analysis 351–399
analytical ferrography 263–264
ANSI 402
arc flash hazards 423–426
AS/RS 133–134
ASME 206
assessment 43 76–78 404
asset 3 5 10
12 38 39
51 211–212 221
363 453 464
465
critical 11 97
life cycle 15 176–178
audits 412–413
Austin, Nancy 26
automated ID 143
automated storage 133
autonomous maintenance 60 197–198
availability 16 28 156
157–161 166 190
209 212 462
465
awareness 37
calendar-based PM 93
capital project 52 108
capital project maintenance, see CPM
carousels 133
carrying costs 139–140
category codes 91
causal factors 361
cause mapping 353
cause-and-effects analysis 353 366-370 390
CBM 39 52 55–57
58 66 77
84 87 94
220 221 245–274
426
centralized 341–342
certification 313 335–338
certified maintenance & reliability professional, see CMRP
chains 205
change 23 28 321–322
change agent 38
change management 21 36–38 43
charisma 28
checklist 103 354
classification 91–98 122–124 145–146
cleaning 61
CM 52 59 87
93 94–95 153
CMMS / EAM 41 51 63–74
77–78 84 85
96 106 120
134–135 136 437
CMRP 30 336–338
collaboration 43
collect data 360–361
cost 6 12 38
69 137–138 176–178
230 244 408–409
434 457 464
Coulometric Titration Method 14 461
Covey, Stephen 20 33
CPM 52
craft supervisor 88 91
creating, culture 39–42
critical assets 11 221 454
456
criticality 97
cultural change 21
culture 14 19–47 345–346
461
customer perspective 295–296
customers 154
failure 12 39 73–74
156 165 222
224–225 274–275 277
456
failure mode 222 227–228
Failure Mode and Effect Analysis, see FMEA
failure rate 16 156 164
166 465
fans 410
fault tree 354–355 380–381
feedback 325
ferrography 222 263–264
financial perspective 296
Harari, Oren 24 26
harmonic distortion 270
hazard 403
hazardous material 403
health 201
Heap, Harold 226
heating 412
hidden failure 222
hire 315
Howe, Neil 317
human error 237
hybrid organization 341–342
hydrostatic testing 272–273
impact 97
impeller wear 12
implementation 241–242 362
improvement 76–78 202
improvement cycle 355
inactive inventory 145
infrared 426
infrared spectroscopy 263
infrared thermography 55 248 254–256
462
infrequently used inventory 123
inherent reliability 5
initiatives 43
inspire 315–316
internal benchmarking 299
inventory 14 117–149
costs 126
shrinkage 146
stratification 124–128
types 122–128
inventory turnover ratio 120 460
kaizen 198–199
kanban 390
Kaplan, Robert 292
key performance indicators, see KPIs
kitting 87 113 136–137
knowledge 3 9–16 44
kobetsu 198
Kouzes, James M. 42
KPIs 13–15 43 68
76–77 288 292
459 460 463
Krafcik, John 388
labeling 205
labor 189
lagging indicators 291–292
layout 128–134 413–415
leaders 27
leadership 42 19–47 345–346
419
attributes 27
secrets 24–26
Leadership Practices Inventory 42
leading indicators 291–292
leak detection 258–260
lean maintenance 387–393
lean manufacturing 190
learning 319
learning and growth perspective 294–295
LEED 403 416–417
life cycle 69 176–178 244–245
464
asset 15
lighting 412
limits 247
Lincoln, Abraham 26 83
liquid handling 111–112
listening 325–327
logic tree analysis (LTA) 238–240
logistics 182
loss 8–9 198–199 212–213
lubricant analysis 248 260–266
lubricating 61
M&R 6-16 31 39
M&R analysis 351–399
magnetic particle testing (MPT) 272
maintainability 10 14 151–186
452 461
maintenance 4 31 49–80
453
approaches 53–59
assessment 76–78
autonomous 60
backlog 52
capital project 52
condition based 52 55–57
corrective 52 59
cost 6–7 12 15
38
improvement 76–78
metrics 290
operator based 52 60–63
predictive 52
preventive 53 57–58 200
proactive 53 58–59
productive 195
quality 74–76
reactive 53
run-to-failure 53
maintenance & reliability, see M&R
maintenance department 34
maintenance plan 5
man to part 130–132
management by wandering around 6
materials 117–149 363
Matteson, Tom 226
MBWA 26
mean time between failures, see MTBF
mean time to repair, see MTTR
measures 42–44 151–186
meetings 328–329
megohmmeter testing 269
Mentzer, Bill 226
metrics 287 289 290
mishap 403
mission 21 28
mission statement 21 32–35
mistake proofing 355 413
mitigation 403 432–433
mobile technology 70
modular drawers 131
moisture 262
morale 321
motivation 3
motor current readings 268
motor current signature analysis (MCSA) 268
motors 410 411
m-out-of-n reliability 170–172
MRO 14 460
MTBF 12 13 16
68 156–158 163
166 173 179
457 459 464
465
MTBMA 16 464
MTTR 13 16 156
158–159 180 459
464 465
muda 355 390
multiple component system 171
mura 355 389–390
muri 355 389
O&M 15 177
objectives 28 287
OBM 52 60–63 94
OEE 10 14 191
208–214 417 452
461
OEM 277
office improvement 201
Ohno, Taiichi 389
oil 463
oil analysis 56
oil contamination 265-266
oil levels 205
operating context 223
operating environment 5 180
operations 187–217
operator based maintenance, see OBM
operator driven reliability 191 193
operators 61 454
optimization 77 134–145 219–282
455
organizational initiatives 43
ordering 139
organization 203–208
organizational culture 21–22
personnel 10 11 107
329–338 419–420 452
453
Peters, Tom 26
P-F interval 11 12 223
224 455 458
Phillips, Donald 26
physical plant 34
pillars of TPM 197-201
planned maintenance 199
planned work 8 10 85
113
planner 85 88 90
458
planning 13 83–86 458
459
capabilities 102–105
checklist 104
ineffective 102
process 98–105
plant 38 42 230
413
PM 10 11 14
16 53 57–58
66 75 77
84 86 87
93 153 195
219–282 453 465
calendar-based 93
condition based 94
operator based 94
run-based 94
time-based 454
poka-yoke 390
Posner, Barry Z. 42
potential failure 223
Powell, Gen. Colin 24–26
power factor 270 319
PPE 106 111 403
423
predictive maintenance, see PdM
pressure 56 61 247
prestige 319
preventive maintenance, see PM
prioritization 91-98
proactive maintenance 12 36 53
58-59 99–100 457
procedures 61–62 101
process heating 410
process improvements 28
product data 434
production/operations 8–9
productive maintenance 195
productivity 67–68 244
professional success 43
prognosis 223
pumps 410
reactive thinking 36
reactive work 99
redundancy 168–170
regulations 277
relationships 333
reliability 4 11 12
53 63 151–186
455–457 461
culture 14
design 181–184
failure distribution 162–163
inherent 5
metrics 290
models 170–172
requirements 178–179
specifications 178–181
reliability block diagram, see RBD
reliability culture 38–42 461
repairs 41 62 95
165 166
replacement asset value, see RAV
reporting 68–69
reputation 154
resistance 37
resources 28 100–101 285
338–347 455
retire 316
retrieval system 130 133–134
rewards 189
rework 113
risk 12 75 403–404
risk management 427–434
RM 53
roles 37 87–91
root cause analysis 356 357–381
rotating bomb oxidation test (RBOT) 264
RTF 53 220 223
238 241 278–279
run-based PM 94
run-to-failure maintenance, see RTF
rust prevention 264
seiketsu 207–208
seiri 203
seiso 207
seiton 204–207
service stock 121
set-in-order 204–207
setup 198
shelf life 135
shelves 131
shine 207
shock pulse method 56 253
shutdowns 108–112
signature analysis 252
silent generation 317–318
six sigma 354 356 381–385
459
skill levels 100–101
skills development 333–334
SMART test 289
Smith, Audrey 43
SMRP 30 34–35 83
336–337
SMRP Initiative 301–303
Society of Maintenance & Reliability Professionals, see SMRP
solid waste 111–112
solutions 362
sort 203
spare parts 121
spectrometric metals analysis 263
sustainability 404–408
sustaining, culture 39–42
system affordability 70
system boundary 231–232
system selection 230–232
systems engineer 88 437–439
validation 404
value stream mapping (VSM) 357 390 392–393
values 29
variance 146
velocity 248
vendors 72 146
verification 404
vibration 247–253 456
vibration analysis 55
vibration monitoring 12
viscosity 223 262
vision 21 26 28–32
287
vision statement 21 30
visual and odor tests 261–262
visual controls 413
visual workplace 192
workforce 15 62 309–350
world-class 288 310
wrench time 107