Operating a More
Reliable Cloud Through
Proactive Incident and
Problem Management
#vmworldittran
Disclaimer
Session Executive Summary
Cloud Transformation
What is Incident and Problem Management?
Why Proactive Incident and Problem Management
Current State
Evolution from Reactive to Proactive Incident and Problem Management
Operational Benefits
Key Performance Indicators
A New Operating Model for the Cloud Era
People, Culture & Organization
Processes & Control
Software Technology & Architecture
Five Capabilities Which Unlock Cloud Benefits
Automated Provisioning & Deployment
Automated provisioning, release and deployment of infrastructure, platform and end-user compute services
Processes: Request fulfillment, Application development, Release and deployment management
Proactive Incident & Problem Management
Monitoring and filtering of events, automatic incident resolution, and problem diagnosis
Processes: Incident management, Request fulfillment, Event management, Application development
Let's Agree Upon a Definition…
Incident Management
Focuses on how to handle performance problems or outages. The
primary focus of Incident Management is to manage the incident until it is
resolved.
Problem Management
Focuses on identifying the root causes of repetitive and high-priority incidents.
Once root causes have been identified, a plan of action is generated
that will ideally repair the underlying problem. If the problem can’t be fixed,
additional monitoring and event management handling may be
implemented in an attempt to minimize or eliminate future occurrences of
the problem.
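The split between the two processes can be sketched in code. The example below is only an illustration, not part of the session material: it assumes a hypothetical incident record with a `component` and a `priority` field, and flags a component for problem management when its incidents repeat or when any of them is high priority. The thresholds are placeholders.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Incident:
    incident_id: str
    component: str  # hypothetical field: the affected service or system
    priority: int   # hypothetical scale: 1 is the highest priority


def problem_candidates(incidents, min_repeats=3, high_priority=1):
    """Return components whose incidents justify a root-cause investigation."""
    counts = Counter(i.component for i in incidents)
    # Repetitive incidents: the same component shows up again and again.
    repeated = {c for c, n in counts.items() if n >= min_repeats}
    # High-priority incidents qualify even if they happened only once.
    urgent = {i.component for i in incidents if i.priority <= high_priority}
    return sorted(repeated | urgent)


incidents = [
    Incident("INC-1", "db-cluster", 3),
    Incident("INC-2", "db-cluster", 3),
    Incident("INC-3", "db-cluster", 2),
    Incident("INC-4", "web-frontend", 1),
    Incident("INC-5", "mail-gateway", 4),
]
print(problem_candidates(incidents))  # ['db-cluster', 'web-frontend']
```

Incident management would still work each of the five tickets to resolution; only the two flagged components would be handed to problem management.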
Cloud Operations unlocks the benefits of Cloud
Free up as much as 25% of labor operating costs through standardization, automation, and streamlining operations [1]
Reduce time to market for new innovations and increase flexibility
Ensure data and application security, compliance, availability, and recoverability to employees and customers
Why Proactive Incident and Problem Management
Typical Process Flow Today
How Did We Get Here?
IT architectures were a simple 3-tier hierarchy
[Diagram: Reactive Incident Management, Help Desk, Administrators, Reactive Problem Management]
How Did We Get Here?
IT architectures morphed to a full mesh
[Diagram: Reactive Incident Management, Level 1 Support, Level 2 Support, Level 3 Support, Reactive Problem Management]
How Did We Get Here?
Complexity exploded
[Diagram: Automated Workflows, Interactive Workflows, Reactive Incident Management, Reactive Problem Management, Level 1 Support, Level 2 Support, Level 3 Support]
Cloud Ops
Automating incident and problem management in the data
center is the key to becoming proactive.
Intelligent analytics and control continuously:
• Assess the thousands of performance metrics and available
capacity across the entire IT stack
• Consider all business and physical constraints
• Drive the necessary actions to tune and maintain the environment
in an optimal operating state
Instead of alerting you when problems occur, or are about to occur,
this approach:
• Optimizes performance, maximizes infrastructure
efficiencies and reduces operational costs
• Prevents events/alerts from happening
• Keeps the environment in a “healthy” state
Intelligently prioritize resources and automatically scale up or down
as performance and business demand fluctuate.
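The assess, consider, drive loop described above can be sketched as a single decision function. This is a minimal illustration under stated assumptions, not a description of any VMware product: the function name, the CPU thresholds, and the node limits are all hypothetical, and a real controller would weigh far more metrics than average CPU.

```python
def scaling_decision(cpu_samples, current_nodes, min_nodes=2, max_nodes=10,
                     high=0.75, low=0.30):
    """Return the node count to run next, kept within the allowed limits."""
    # Assess: summarize recent performance metrics (here, only average CPU).
    avg = sum(cpu_samples) / len(cpu_samples)

    # Drive: choose an action that moves the environment toward a healthy state.
    if avg > high:
        target = current_nodes + 1
    elif avg < low:
        target = current_nodes - 1
    else:
        target = current_nodes

    # Consider: business and physical constraints cap the result.
    return max(min_nodes, min(max_nodes, target))


print(scaling_decision([0.82, 0.91, 0.88], current_nodes=4))  # 5
print(scaling_decision([0.10, 0.15, 0.12], current_nodes=2))  # 2
```

In the second call the load is low enough to scale down, but the assumed minimum of two nodes holds the count where it is.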
What is Proactive Incident and Problem Management
Reactive: Monitor, Slow performance, Problem
Proactive: Plan, Utilization / forecast, Maintenance
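One way to picture the proactive side (plan from a utilization forecast, then schedule maintenance) is a simple trend projection. The sketch below is an assumption for illustration only: it fits a least-squares line to hypothetical daily utilization figures and estimates how many days remain before an assumed 80% threshold is reached.

```python
def days_until_threshold(daily_utilization, threshold=0.80):
    """Estimate days until a linear utilization trend reaches the threshold.

    Returns 0.0 if the trend is already at or above the threshold and
    None if utilization is flat or falling.
    """
    n = len(daily_utilization)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    # Ordinary least-squares fit of utilization against the day index.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_utilization) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in enumerate(daily_utilization))
    slope = sxy / sxx
    latest = mean_y + slope * ((n - 1) - mean_x)  # fitted value for the last day

    if latest >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - latest) / slope


history = [0.50, 0.52, 0.54, 0.56, 0.58]  # hypothetical daily samples
print(days_until_threshold(history))      # roughly 11 days
```

A reactive process would only notice this system once performance slowed; the projection leaves time to plan maintenance or add capacity first.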
Operational Benefits Across Three Dimensions
[Table: Dimensions, Benefits, Examples]
Proactive Incident and Problem Management OPEX Savings
Source: Reducing Operational Expense with Virtualization and Systems Management - Enterprise Management Associates
Business Impact and KPIs
… and on-call costs for support), business unit staff costs (including time lost when systems are down), and lost revenue from downtime
… productivity from idle business unit staff. Fewer IT resources are being diverted from strategic project work to break-fix activity.
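The cost components named above (support, idle business unit staff, and lost revenue) add up as simple arithmetic. The sketch below is illustrative only; every figure is a made-up placeholder, not data from the session.

```python
def downtime_cost(hours_down, support_cost_per_hour, idle_staff,
                  staff_cost_per_hour, revenue_per_hour):
    """Break an outage's cost into support, idle staff, and lost revenue."""
    support = hours_down * support_cost_per_hour
    staff = hours_down * idle_staff * staff_cost_per_hour
    revenue = hours_down * revenue_per_hour
    return {
        "support": support,
        "staff": staff,
        "revenue": revenue,
        "total": support + staff + revenue,
    }


# Hypothetical two-hour outage affecting 40 business unit staff.
cost = downtime_cost(hours_down=2, support_cost_per_hour=150, idle_staff=40,
                     staff_cost_per_hour=50, revenue_per_hour=5000)
print(cost["total"])  # 14300
```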
Business Impact and KPIs
IMPACT
Zero-downtime migration eliminates both application …
Provisioning is easier with …
In Closing
Identified savings opportunities in IT operating expenses, while improving
cloud services availability and quality
Learn more about VMware Cloud Solutions
vmware.com/cloud
Additional IT Transformation Tracks
SESSION ID TITLE DAY TIME
ITT1918 Is My Organization Ready to Reap the Benefits of the Cloud? Monday 12:30 PM
ITT3245 VMware on VMware: Our Journey to the Cloud (Part 1) Monday 05:00 PM
ITT3242 Managing Cloud Security, Compliance, and Risk Management Tuesday 12:30 PM
ITT1953 Advice From Your Peers: How to Best Run and Manage a Cloud Environment Tuesday 02:00 PM
ITT3238 Taking Your Workloads to the Cloud: Why, How, and When? Wednesday 08:30 AM
ITT3239 On-Demand IT: Leveraging Cloud for Efficient Self-Service IT Wednesday 04:00 PM
ITT3240 From Weeks to Hours: Automated Provisioning and Deployment Thursday 12:30 PM