
Troubleshooting Hardware Problems



When your computer is behaving strangely, first shut it down. Click on the Start button, select the Shut Down option, and then choose Shut Down. Leave the PC off for 1 to 2 minutes, then turn it back on.

An Unresponsive PC
First check the power cable. Unplug it from the computer and the outlet, plug both ends back in, and try booting again.

Check the wall outlet. Plug something else into the outlet and see if it works.

Turn the system off, wait 30 seconds, and then try again.

Reach behind the machine and feel for air blowing out of the power supply. If you feel it, you know the machine is getting some power.

Watch the indicator lights on the keyboard to see whether they light up as the machine boots.

Sometimes the monitor is the source of the problem. Unplug the monitor's power cord from the monitor and the wall and plug it back in. Unplug the cable that runs from the computer to the monitor and reconnect it. Try rebooting.

Listen for a series of beeps during startup. If you hear one, note the pattern so you can report it to technical support. Send all comments to the Help Desk.

Monitor Troubleshooting
Symptom
The monitor screen is black

Diagnosis
Check to see if the computer is turned on. There is a light on the CPU. If the computer is on, it will be lit.

Check to see if the monitor is getting power. If no lights appear on the front of the monitor at all, it is not getting any power from the power source.

Check to see if ALL plugs are secure: the power cord from the computer to the power strip, and the power strip to the wall socket.

Check to see if the power strip is turned on. There is a light on the strip. If the strip is on, the indicator light will be on.

Check to see if the monitor is getting a signal from the computer. There is a light on the monitor. If the monitor is on, it will be lit. If it is turned on, check the contrast and brightness buttons to see if they have been changed. A green light on the front of the monitor indicates that it is getting a signal from the computer. An orange light indicates that there is no signal from the computer. Make sure the computer is on and you see lights on the front of it. Check the cable that runs from the monitor to the computer to see if it has worked loose.

Check to see if the brightness has been turned all the way down. Make sure you check the brightness and contrast buttons or settings on the monitor.

Check to see if the computer is in Power Save or Sleep mode. Move the mouse or press any key on the keyboard to see if the computer will "wake up."

Check to see if all peripherals are plugged in. Verify that all cables and cords leading into and out of your computer are tight and not disconnected. Secure the following to the computer: monitor, mouse, keyboard, printer, and network cable (blue) to both the computer and the wall.

Check to see if the monitor goes black just as Windows is loaded. This could indicate a problem with the video card driver or settings in Windows. Since you can't see to get to the settings, this is difficult to fix without a visit from technical support.
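The indicator-light checks above amount to a small decision table. The following Python sketch is only an illustration of that logic; the function name and the returned wording are assumptions, not part of any Help Desk tool.

```python
def diagnose_black_screen(monitor_light, computer_light_on):
    """Suggest the next check for a black screen (hypothetical helper).

    monitor_light: "none", "orange", or "green", as seen on the monitor's front.
    computer_light_on: True if the light on the CPU is lit.
    """
    light = monitor_light.lower()
    if light == "none":
        # No lights at all means the monitor is not getting power.
        return "Monitor has no power: check its power cord, the power strip, and the wall outlet."
    if not computer_light_on:
        return "Computer appears to be off: turn it on and check its power connections."
    if light == "orange":
        # Orange means powered but no signal from the computer.
        return "No signal: reseat the monitor cable, then move the mouse or press a key to wake the computer."
    if light == "green":
        # Green means the monitor is receiving a signal.
        return "Signal present: check brightness and contrast, then suspect the video driver or settings."
    raise ValueError(f"Unknown light colour: {monitor_light!r}")


if __name__ == "__main__":
    print(diagnose_black_screen("orange", True))
```

The order of the tests mirrors the page: power first, then signal, then display settings.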

Symptom
The screen is not synchronized

Diagnosis
Check to make sure the signal cable is firmly connected in the socket. Check that the output level matches the input level of your computer. Make sure the signal timing of the computer system is within the specification of the monitor.

Symptom
The screen is too bright or too dark

Diagnosis
Check if the brightness or contrast control is at an appropriate position, not at the maximum or minimum. Check if the specified voltage is applied. Check if the signal timing of the computer system is within the specification of the monitor. In particular, check the horizontal frequency.

Symptom
The screen is shaking

Diagnosis

Move all objects that emit a magnetic field, such as a motor or transformer, away from the monitor. Check if the specified voltage is applied. Check if the signal timing of the computer system is within the specification of the monitor.

Printer Troubleshooting
Symptom
The Printer is not printing

Diagnosis
Check to see if the printer is getting power
If there are no lights or no display on the front of the printer, the printer is not getting power. Check to make sure the power cord is plugged in both to the wall or power strip and to the back of the printer. Wiggle the power cord where it plugs into the back of the printer to make sure it is not loose. Some DeskJet models have a two-part power cord. In this case, check along the length of the power cord to make sure both parts are plugged together. If the printer is still not getting power, plug the power cord into a different outlet on the power strip. If this does not work, try plugging the printer into a different wall outlet.

Check to see if you can print a Windows test page
The Windows test page is a basic communication test between your computer and the printer. To print a Windows test page:
1. Left mouse-click on the Start button.
2. Go to Settings and then select Printers. Inside the printer window, you should see a small printer icon with the name of the printer you are trying to print to.
3. Place your mouse arrow on the small printer icon and right mouse-click. A small gray window should appear; the last choice in the box is Properties.
4. Left mouse-click on Properties. A printer window with several tabs should appear.
5. On the General tab, there is a Print Test Page button in the lower right corner. Left mouse-click on the button.
You may click on the "Yes" button on your screen, but the real question is: did anything print from the printer? If the answer is no, please call or e-mail the Help Desk at x1112. If you can print a Windows test page, try to print from a different program. If the document does not print from that program, the printing problem has to do with that program.

Check to see if there is paper in the printer. Is there a paper jam?
If the printer has paper in the paper tray, the paper may be jammed or not feeding properly. Take the paper out of the paper tray and check that the top sheet is not crinkled or bent. If the printer is a DeskJet, lift open the front cover and look to see if a sheet of paper is halfway fed through. If it is, gently remove the paper from the top and close the cover. If the printer is a LaserJet, open the top of the printer and check for paper underneath the toner cartridge. If there is paper there, gently remove it and replace the toner cartridge.

Check to see if the computer is getting a signal from the printer.

The computer and the printer must be communicating before the printer will print. When you send a document to print, does a small printer icon appear on the Windows taskbar (down by the time)? If this icon appears on the taskbar, the computer thinks the printer is receiving the job. At this point, the printer should blink its lights (if a DeskJet) or show "printing" or "receiving" on the display (if a laser printer). If the printer is not receiving the communication from the computer, try restarting the computer. After you have logged in, see if you can print now.

Check to see if the printer is offline or paused
If the printer is offline or paused, the print jobs will just stack up in the print queue and nothing prints.
1. Left mouse-click on the Start button.
2. Go to Settings.
3. Choose Printers.
4. The Printer folder should open and display the printers installed on this PC.
5. Place your mouse arrow on the printer you are checking and right mouse-click. A dialog box should open.
6. If the printer is paused or offline, you will see a black check mark next to the words "Pause Printing" or "Printer Offline."
7. Left mouse-click on the black check mark and see if you can "uncheck" it.
8. If the check mark will not go away, try restarting the PC (Start, Shut Down, Restart). Then repeat steps 1 through 7.
If the printer is still not printing, please contact the Help Desk at x1112.

Check to see if there are multiple jobs in the print queue
If the printer is a local printer (i.e., there is a cable running directly from the printer to the computer you are printing from), power off the PC, power off the printer, count to 10, and then turn both the printer and the computer back on again. Sometimes this will allow the printer to start printing again.

Symptom
The printer is printing streaks on the page

Diagnosis
If the printer is a DeskJet, go to the HP DeskJet Utilities menu in the Programs menu. Choose the "Clean the Print Cartridges" option. If this does not work, try replacing the ink cartridge. If the new cartridge does not help the streaking, place the new cartridge back inside its original packaging and save it until the other cartridge has been used up. This simply tests whether the ink cartridge is defective.

If the printer is a LaserJet, try changing the toner cartridge. If the new toner cartridge does not improve the streaking problem, return the old cartridge to the printer and place the new toner cartridge back in its original packaging for later use.

If neither option works, please visit the printer maintenance vendor list to schedule printer service.

The printer is still not printing?
Turn the PC off. If the printer is a local printer, i.e., has a direct cable hookup to the PC, turn the printer off as well so that both the PC and the printer are off at the same time. After 30 seconds, turn the PC and the printer back on again. Try to print a Windows test page. If the printer still does not print, please contact the Help Desk at x1112.

CD Troubleshooting
Symptom
The computer won't read the CD

Diagnosis
Check to see if the label side of the CD is facing up.

Check to see if the CD can be read from the CD-ROM drive of another computer
If the CD can be read from another computer's CD-ROM drive, the original CD-ROM drive may be bad and need to be replaced. The CD-ROM drive may also have dirt or debris inside. Try cleaning the drive with a standard audio CD player cleaning kit. After cleaning the drive, try to read the CD again.

Check to see if the CD is scratched or dirty
CD, CD-R, and CD-RW drives read discs by shining a laser onto the CD and then measuring the amount of light that gets reflected back. Most of the time a small scratch won't matter. If the CD is dirty, you can clean it with a CD cleaning kit, or wash it with warm water and a mild detergent, such as dish soap, and dry it with a soft cloth. Once the CD is fully dry, insert it into the CD-ROM drive and try to read it. If the CD is not dirty, you can try to clean the CD-ROM drive using a professional CD cleaner kit.

Check to see if the CD is a CD-R or CD-RW that was burned
A number of older CD drives cannot read some types of CD-R discs. Try using a different CD-R disc with a different dye under the reflective layer. You may have noticed that some CD-R discs are blue, gold, green, or even silver colored. Some of the colors have a lower light reflectivity value, and an older CD-ROM drive may have difficulty reading that brand of CD-R media.

Keyboard Troubleshooting

Symptom
Keyboard doesn't respond and gives off a constant beeping noise when booting up

Diagnosis
Check the plug to make sure it's connected securely. Try unplugging it and plugging it back in. If there is no response, check the indicator light on the keyboard. Is it on? Do the lights respond when you press the Caps Lock or the Num Lock key? If not, your keyboard may be broken.

Check to see if there is a key stuck. Gently pry off the key cover and clean it with alcohol. Make sure the keyboard is not connected to your machine when you are cleaning it. The space bar frequently comes off track. Gently pry it off, noting which way the bar lies in your particular keyboard so you can replace it properly.

Mouse Troubleshooting
The mouse is not working

Symptom
The mouse is acting erratically

Diagnosis
Reboot the computer and see if that corrects the problem. If not, check to see if there is insufficient memory.

Symptom
The mouse will only move one way, either vertically or horizontally

Diagnosis
Clean the mouse
Shut down your machine and unplug your mouse from the computer. Open the underside of the mouse and remove the ball. If the ball is a rubber ball, do not clean it with alcohol. Clean it with a soft cloth. There should be no lubricant placed on a mouse ball. Clean the rollers in the body of the mouse with a cotton swab that is slightly damp with alcohol. Replace the ball when the rollers are dry and replace the bottom portion.

Networking Troubleshooting
Symptom
My PC is not working on the Network

Diagnosis
Programs that require network drives to run or operate properly: SIS, HR, FRS, PROD ALPHA, Network Shares, and some school applications. You would also need a network connection to print to the network laser or color laser printers within CCRI.

Symptom
Message "No Domain Server Available" or there are no Network drives (like the S drive).

Diagnosis

Video/Screen Troubleshooting
Symptom
The Monitor is Black

Diagnosis

Symptom
The desktop icons are too large or too small

Diagnosis
Usually this is due to the Display Settings. The standard video setting for most College software is 800x600. To check the video display settings:
1. Left mouse-click on the Start button (lower left-hand corner of the screen).
2. Go to Settings.
3. Go to Control Panel.
4. Once in the Control Panel, look for the Display icon. Double left mouse-click on the Display icon.
5. In the Display Properties box, left mouse-click on the Settings tab.
6. Place your mouse arrow on the slider, hold down the left mouse button, and move the arrow until the number changes to the desired setting.
A 640x480 screen resolution has fewer pixels, so everything on the screen appears larger. A 1024x768 screen resolution has more pixels in the same screen area, so everything appears smaller.
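To make the pixel comparison concrete, here is a short Python sketch that multiplies out the three resolutions mentioned above. The function name is an illustrative assumption; the arithmetic is simply width times height.

```python
def pixel_count(width, height):
    """Total number of pixels for a given screen resolution."""
    return width * height


# The three resolutions discussed on this page.
for width, height in [(640, 480), (800, 600), (1024, 768)]:
    print(f"{width}x{height}: {pixel_count(width, height):,} pixels")
```

On the same physical screen, 1024x768 packs roughly 2.5 times as many pixels as 640x480, which is why icons and text drawn with a fixed number of pixels look smaller.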

Symptom
The Screen goes black if not used for a few minutes

Diagnosis
The power saver or energy saver features may be turned on. To correct this problem, you can turn off the feature.
1. Left mouse-click on the Start button (lower left-hand corner of the screen).
2. Go to Settings.
3. Go to Control Panel.
4. Once in the Control Panel, look for the Display icon. Double left mouse-click on the Display icon.
5. Left mouse-click on the Screen Saver tab.
6. Left mouse-click on the Power or Settings button (depends on your version of Windows) in the lower left corner.
7. On the Power schemes tab, you should see where it says "Turn off monitor:" with an amount of time next to it. Left mouse-click on the drop-down arrow.
8. Change the time to "Never."
9. Left mouse-click on the Apply button in the lower right-hand corner.
10. Left mouse-click on the OK button.
11. Left mouse-click on the next OK button.

Sound Troubleshooting
Symptom
The computer has no sound

Diagnosis

Symptom
No sound is heard from audio (music) CDs

Diagnosis
Look for the Volume icon in the system tray in the lower right-hand corner of the Windows desktop. Place the mouse arrow on this icon and double left mouse-click. The Volume Control dialog box should appear on the computer screen.

Place the mouse arrow on the "slider" button and slide it up to increase or down to decrease the volume. If the "Mute all" check box is checked, there will be no sound. To enable the sound again, uncheck the box. Ensure the speakers are properly connected to the audio card's output connector and turned on.

Symptom
There is no volume icon in the lower right corner

Diagnosis

To place the volume icon in the system tray in the lower right of the desktop:
1. Place the mouse arrow on the Start button in the lower left corner.
2. Left mouse-click on Settings.
3. Left mouse-click on the Control Panel.
4. Place the mouse arrow on the Multimedia icon. (In Windows XP, look for the Sounds and Audio Devices icon.) Double left mouse-click.
5. Left mouse-click on the Audio tab.
6. Towards the bottom, look for the check box that reads "Show volume control on taskbar." Make sure the box is checked to activate the icon.

Startup Troubleshooting
If your computer is making noise or attempting to start up, but there is no video or no display on the monitor.

Symptom
No power lights on the monitor/computer

Diagnosis

Hard Drive Troubleshooting


Symptom
The cursor is stuck on the hourglass

Diagnosis

Open Task Manager
Simultaneously press [Ctrl] [Alt] [Delete]. You will see a list of all tasks (programs) currently running. You may notice one program has "Not Responding" instead of "Running" listed next to it. Select this task and click the End Task button. Another dialog box will open stating that the program is not responding. Choose End Now to close the program.

Reboot your computer (Warm Boot)
A warm boot resets a computer that is already turned on. Press [Ctrl] [Alt] [Delete] once to open the Task Manager. Press [Ctrl] [Alt] [Delete] again to restart the computer.

Shut down your computer (Cold Boot)
A cold boot starts a computer from a powered-down state. If you restart your computer and the problem isn't resolved, shut the computer down completely by pressing the power button. Let it sit for 15-30 seconds, then restart the computer.

Symptom
You have run out of disk space on your computer. Music files, movies, digital pictures, and other big data files can fill up your hard drive.

Diagnosis
To check for disk space: Open My Computer. Right click on the C: drive and select Properties from the shortcut menu. A pie chart will appear telling you the used and free space.
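The same used and free figures shown in the pie chart can also be read from a script. The sketch below uses Python's standard `shutil.disk_usage`; the helper name `disk_report` and the choice of the current drive's root are assumptions made for illustration.

```python
import os
import shutil


def disk_report(path):
    """Return total, used, and free space for the drive containing `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    gib = 1024 ** 3
    percent_free = (usage.free / usage.total * 100) if usage.total else 0.0
    return {
        "total_gib": usage.total / gib,
        "used_gib": usage.used / gib,
        "free_gib": usage.free / gib,
        "percent_free": percent_free,
    }


if __name__ == "__main__":
    # Root of the current drive (C:\ on a typical Windows PC).
    root = os.path.abspath(os.sep)
    report = disk_report(root)
    print(f"{root}: {report['free_gib']:.1f} GiB free of "
          f"{report['total_gib']:.1f} GiB ({report['percent_free']:.0f}% free)")
```

A low free percentage is the cue to move on to a cleanup step.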

Try running the Disk Cleanup Wizard. This utility can tell you whether you are running out of room and help you clear away some space. Click the Start button and choose Programs | Accessories | System Tools | Disk Cleanup. Choose the disk to clean up (C:) and let the wizard do the work.

Troubleshooting Questions
When did your computer last work properly? If your computer was working satisfactorily yesterday, or the last time you were logged on, but you are now having trouble, try to identify everything that has changed recently.

Did the trouble begin shortly after you installed a new program, added a new piece of hardware, or updated a device driver?

Do you receive a consistent error message? If so, write down the exact error message that appears on the screen, word for word.

Can you reproduce the trouble with specific steps? If you can identify a specific set of actions that consistently causes the trouble to occur, the Technical Support Specialist can follow your outlined steps to determine the problem. Write down the precise sequence of actions.

Does the problem only occur after you have been using your computer for a while? If your computer runs fine first thing in the morning but crashes after several hours, the problem could be heat related.
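One way to make sure every question above gets answered before contacting the Help Desk is to record the answers in a fixed structure. This Python sketch is purely illustrative; the class name `TroubleReport` and its fields are assumptions, not a format the Help Desk requires.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TroubleReport:
    """Answers to the troubleshooting questions (hypothetical record)."""
    last_worked: str                                          # when it last worked properly
    recent_changes: List[str] = field(default_factory=list)   # new programs, hardware, drivers
    error_message: str = ""                                   # exact text, word for word
    steps_to_reproduce: List[str] = field(default_factory=list)
    only_after_long_use: bool = False                         # hints at a heat-related problem

    def summary(self) -> str:
        lines = [f"Last worked properly: {self.last_worked}"]
        if self.recent_changes:
            lines.append("Recent changes: " + "; ".join(self.recent_changes))
        if self.error_message:
            lines.append(f'Error message: "{self.error_message}"')
        for number, step in enumerate(self.steps_to_reproduce, start=1):
            lines.append(f"Step {number}: {step}")
        if self.only_after_long_use:
            lines.append("Occurs only after extended use (possibly heat related).")
        return "\n".join(lines)


if __name__ == "__main__":
    report = TroubleReport(
        last_worked="yesterday afternoon",
        recent_changes=["updated the printer driver"],
        steps_to_reproduce=["Open a document", "Choose Print"],
    )
    print(report.summary())
```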

This page developed and maintained by the Information Technology Department at CCRI. Send comments and suggestions to edowling@ccri.edu. 2012, Community College of Rhode Island.

IT THOUGHT OF THE DAY



The Lost Art of Information Technology Troubleshooting


Whatever happened to the expertise of troubleshooting within the Information Technology Profession? The engineering, scientific and electrical worlds have many formalized methods and concepts for troubleshooting. I feel that somewhere along the way, this valuable skill born from the fundamental pieces of the information technology profession failed to make the transition to the information age.

Have you tried what passes for troubleshooting when you call a company for support with a product these days? It is usually something like: "Try This", Yes/No, "Try That", if not then "Buy a new one". Try going to a consumer website like Apple.com or Dell.com and play with what passes for troubleshooting tools to see what I mean.

At work, is it really any better? I am frustrated daily by the slipshod, unprofessional, undisciplined activities that I see take place in the name of "troubleshooting". My recent questioning of people at work tells me that schools don't teach it anymore, there is no certification program that validates it, and the rapid turnover of personnel in the IT profession prevents any real on-the-job training that would transfer it from the experienced to the new.

Our profession significantly undervalues troubleshooting and pays the price for it every day. The price is slow trouble resolution, secondary damage from uncontrolled troubleshooting actions, extended loss of system availability, increased chance of transient vulnerability exploitation, loss of configuration management, increased management frustration, and a failure to increase the level of proficiency and knowledge of inexperienced workers.

We all know that all engineered systems will have problems. IT Pros are really all computer and network engineers. Troubleshooting and correcting these problems needs to be one of their core competencies.

So what are the fundamentals of troubleshooting that should be adopted by Information Technology Professionals everywhere? Here are a few.

First, we need to agree that troubleshooting is an IT discipline. Wikipedia's generalized description of it is accurate for our uses.

"Troubleshooting is a form of problem solving most often applied to repair of failed products or processes. It is a logical, systematic search for the source of a problem so that it can be solved, and so the product or process can be made operational again. Troubleshooting is needed to develop and maintain complex systems where the symptoms of a problem can have many possible causes."

Second, there are four basic elements of all effective troubleshooting efforts:
1. They are based upon half-splitting. That means that you are always trying to divide the world into two states: "known good" and "known bad". When this effort is truly complete, you have at least isolated the things that need to change to get you back to good.
2. You need to eliminate possibilities through testing. You always start with a complete list of the possible causes and eliminate them one by one on hard evidence. Many a troubleshooter has found that the real problem was one that they discounted from experience rather than through checking. This is where the question, "Is the computer plugged into the wall?", comes from when you call the Help Desk.
3. The KISS (Keep It Simple, Stupid) principle is always best. Your day is already complicated enough if you are really troubleshooting. Never let your tests become more complicated than the problem you are trying to solve.
4. Always document everything that you thought, did, and will do. The tendency to just stop when you have found the fault and not document the challenge or solution nearly always leads to pain down the road. The job is not done until the paperwork is, including trouble tickets, configuration management documents, operating manuals, etc.

There are many formal methodologies for troubleshooting that we could borrow from other disciplines. The advantage of a formal method is that all participants, local and remote, familiar and unknown, senior and junior, are synchronized on what you are doing and where you are going next. Best of all, a formal method provides the framework for management decision-making in support of the troubleshooting and allows for formalized training programs to be built. From my time working in the nuclear energy field, where formality and standardization are prerequisites, I remain a huge fan of formal methods of technical investigation. Here are some leads for a formal method to employ:

Steve Litt has written guides on what he calls The Universal Troubleshooting Process (UTP). Its 10 steps are:

1. Prepare
2. Make damage control plan
3. Get a complete and accurate symptom description
4. Reproduce the symptom
5. Do the appropriate corrective maintenance
6. Narrow it down to the root cause
7. Repair or replace the defective component
8. Test
9. Take pride in your solution
10. Prevent future occurrence of this problem

Another favorite list found on the web follows. Troubleshooting Steps:

1. Establish symptoms.
2. Identify the affected area.
3. Establish what has changed.
4. Select the most probable cause.
5. Implement a solution.
6. Test the result.
7. Recognize potential effects of the solution.
8. Document the solution.

MaintenanceWorld.com has an excellent article on Electrical Troubleshooting in Seven Steps. This is the list that I learned many years ago and use as the basis for my own approach.

1. Gather information
2. Understand the malfunction
3. Identify which parameters need to be evaluated
4. Identify the source of the problem
5. Correct/repair the component
6. Verify the repair
7. Perform root cause analysis

As an exception to most information processing companies, Cisco does produce a Cisco Internetwork Troubleshooting (CIT) course. They teach an 8 step troubleshooting process for their equipment that is probably most relevant for information technology pros. Their steps are:

1. Define the problem.
2. Gather detailed information.
3. Consider probable cause for the failure.
4. Devise a plan to solve the problem.
5. Implement the plan.
6. Observe the results of the implementation.
7. Repeat the process if the plan does not resolve the problem.
8. Document the changes made to solve the problem.
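Whichever step list you adopt, the half-splitting element described earlier in this post is the part that translates most directly into code. The Python sketch below is only one possible illustration: it assumes an ordered chain of components with a single fault, and a hypothetical test function `works_up_to(i)` that reports whether everything is still good after stage `i`.

```python
def half_split(components, works_up_to):
    """Find the first faulty stage in an ordered chain by half-splitting.

    components: ordered list of stage names, from source to destination.
    works_up_to(i): hypothetical test; True if the output of stage i is good.
    Assumes exactly one fault: stages before it test good, stages from it on test bad.
    """
    low, high = 0, len(components) - 1   # the fault lies somewhere in [low, high]
    while low < high:
        mid = (low + high) // 2
        if works_up_to(mid):
            low = mid + 1                # known good through mid: fault is later
        else:
            high = mid                   # known bad at mid: fault is here or earlier
    return components[low]


if __name__ == "__main__":
    chain = ["wall outlet", "power strip", "power cord", "power supply",
             "motherboard", "video cable", "monitor"]
    faulty_index = 3                     # pretend the power supply has failed
    print(half_split(chain, lambda i: i < faulty_index))
```

With seven stages the loop needs at most three tests, where checking each stage in order could take up to seven.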

I propose that it is high time that we all rediscover the Lost Art of Information Technology Troubleshooting for ourselves, our organizations, and our users. This will lead to a much happier Information Age for one and all.

What do you think? Are we any good at troubleshooting? Are there standardized methods that are accepted by the IT Pro community? Tell me what you think, please.

That is my Information Technology Thought of the Day (ITTOD) for July 31, 2009.

Scott Coughlin

This entry was posted by Scott Coughlin on July 31, 2009 at 4:59 am, and is filed under Business of IT, Hardware, Human Resources, Information Age, Information Technology.

Comments

#1 written by Becki True 3 years ago
Thanks for posting this, Scott. There is a real lack of troubleshooting skills out there. I wonder if it is related to how we are raised. I grew up in a rural area without a lot of money, so we had to fix our own bikes or make bikes out of spare parts. That taught us to learn how things work. I see that troubleshooting today consists of reboot first, troubleshoot later. Another common thing is "oh, it was x last time, so let me do that." Sorry for the rant, but this is an area we really need to improve in our industry.

#2 written by Scott Coughlin 3 years ago
Exactly my thoughts, Becki. Your idea that it has something to do with upbringing is really fascinating. I think that you might be on to something there. Thanks a lot for reading my post. -Scott


Troubleshooting
From Wikipedia, the free encyclopedia

Troubleshooting is a form of problem solving, often applied to repair failed products or processes. It is a logical, systematic search for the source of a problem so that it can be solved, and so the product or process can be made operational again. Troubleshooting is needed to develop and maintain complex systems where the symptoms of a problem can have many possible causes. Troubleshooting is used in many fields such as engineering, system administration, electronics, automotive repair, and diagnostic medicine. Troubleshooting requires identification of the malfunction(s) or symptoms within a system. Then, experience is commonly used to generate possible causes of the symptoms. Determining which cause is most likely is often a process of elimination: ruling out potential causes of a problem one at a time. Finally, troubleshooting requires confirmation that the solution restores the product or process to its working state.

In general, troubleshooting is the identification of, or diagnosis of, "trouble" in the management flow of a corporation or a system caused by a failure of some kind. The problem is initially described as symptoms of malfunction, and troubleshooting is the process of determining and remedying the causes of these symptoms.

A system can be described in terms of its expected, desired or intended behavior (usually, for artificial systems, its purpose). Events or inputs to the system are expected to generate specific results or outputs. (For example, selecting the "print" option from various computer applications is intended to result in a hardcopy emerging from some specific device.) Any unexpected or undesirable behavior is a symptom. Troubleshooting is the process of isolating the specific cause or causes of the symptom. Frequently the symptom is a failure of the product or process to produce any results. (Nothing was printed, for example.)

The methods of forensic engineering are especially useful in tracing problems in products or processes, and a wide range of analytical techniques are available to determine the cause or causes of specific failures. Corrective action can then be taken to prevent further failures of a similar kind. Preventative action is possible using failure mode and effects analysis (FMEA) and fault tree analysis (FTA) before full scale production, and these methods can also be used for failure analysis.

Aspects
Most discussion of troubleshooting, and especially training in formal troubleshooting procedures, tends to be domain specific, even though the basic principles are universally applicable.

Usually troubleshooting is applied to something that has suddenly stopped working, since its previously working state forms the expectations about its continued behavior. So the initial focus is often on recent changes to the system or to the environment in which it exists. (For example, a printer that "was working when it was plugged in over there".) However, there is a well known principle that correlation does not imply causality. (For example, the failure of a device shortly after it's been plugged into a different outlet doesn't necessarily mean that the events were related. The failure could have been a matter of coincidence.) Therefore troubleshooting demands critical thinking rather than magical thinking.

It's useful to consider the common experiences we have with light bulbs. Light bulbs "burn out" more or less at random; eventually the repeated heating and cooling of a bulb's filament, and fluctuations in the power supplied to it, cause the filament to crack or vaporize. The same principle applies to most other electronic devices, and similar principles apply to mechanical devices. Some failures are part of the normal wear-and-tear of components in a system.

A basic principle in troubleshooting is to start from the simplest and most probable possible problems first. This is illustrated by the old saying "When you see hoof prints, look for horses, not zebras", or to use another maxim, use the KISS principle. This principle results in the common complaint about help desks or manuals, that they sometimes first ask: "Is it plugged in and does that receptacle have power?", but this should not be taken as an affront; rather, it should serve as a reminder or conditioning to always check the simple things first before calling for help.

A troubleshooter could check each component in a system one by one, substituting known good components for each potentially suspect one. However, this process of "serial substitution" can be considered degenerate when components are substituted without regard to a hypothesis concerning how their failure could result in the symptoms being diagnosed.

Simple and intermediate systems are characterized by lists or trees of dependencies among their components or subsystems. More complex systems contain cyclical dependencies or interactions (feedback loops). Such systems are less amenable to "bisection" troubleshooting techniques.

It also helps to start from a known good state, the best example being a computer reboot. A cognitive walkthrough is also a good thing to try. Comprehensive documentation produced by proficient technical writers is very helpful, especially if it provides a theory of operation for the subject device or system.

A common cause of problems is bad design, for example bad human factors design, where a device could be inserted backward or upside down due to the lack of an appropriate forcing function (behavior-shaping constraint), or a lack of error-tolerant design. This is especially bad if accompanied by habituation, where the user just doesn't notice the incorrect usage, for instance if two parts have different functions but share a common case so that it isn't apparent on a casual inspection which part is being used.

Troubleshooting can also take the form of a systematic checklist, troubleshooting procedure, flowchart or table that is made before a problem occurs. Developing troubleshooting procedures in advance allows sufficient thought about the steps to take in troubleshooting and organizing the troubleshooting into the most efficient troubleshooting process. Troubleshooting tables can be computerized to make them more efficient for users.
Some computerized troubleshooting services (such as Primefax, later renamed Maxserve) immediately show the top 10 solutions with the highest probability of fixing the underlying problem. The technician can either answer additional questions to advance through the troubleshooting procedure, each step narrowing the list of solutions, or immediately implement the solution he feels will fix the problem. These services give a rebate if the technician takes an additional step after the problem is solved: report back the solution that actually fixed the problem. The computer uses these reports to update its estimates of which solutions have the highest probability of fixing that particular set of symptoms.[1]
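The feedback loop described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Primefax or any real service worked: the class name `SolutionRanker`, its methods, and the sample symptom and fixes are all hypothetical, and the "probability" here is simply the share of past reports for a symptom.

```python
from collections import defaultdict


class SolutionRanker:
    """Hypothetical sketch: rank fixes for a symptom by reported success counts."""

    def __init__(self):
        # counts[symptom][solution] = number of reports that this solution worked
        self.counts = defaultdict(lambda: defaultdict(int))

    def report_fix(self, symptom, solution):
        """Record a technician's report that `solution` fixed `symptom`."""
        self.counts[symptom][solution] += 1

    def top_solutions(self, symptom, n=10):
        """Return up to n (solution, estimated probability) pairs, best first."""
        fixes = self.counts.get(symptom, {})
        total = sum(fixes.values())
        if total == 0:
            return []
        # Sort by descending count; break ties alphabetically for stable output.
        ranked = sorted(fixes.items(), key=lambda kv: (-kv[1], kv[0]))
        return [(solution, count / total) for solution, count in ranked[:n]]


ranker = SolutionRanker()
for fix in ["replace fuse", "reseat cable", "reseat cable",
            "replace power supply", "reseat cable"]:
    ranker.report_fix("no power", fix)

for solution, probability in ranker.top_solutions("no power", n=2):
    print(f"{solution}: {probability:.0%}")
```

Each new report shifts the estimates, so the ranking improves as more technicians report back, which is presumably why such services rewarded the extra step.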

Half-splitting
Efficient methodical troubleshooting starts with a clear understanding of the expected behavior of the system and the symptoms being observed. From there the troubleshooter forms hypotheses on potential causes, and devises (or perhaps references a standardized checklist of) tests to eliminate these prospective causes. This approach is often called "Divide and Conquer".

Two strategies are common. The first is to check for frequently encountered or easily tested conditions (for example, checking that a printer's light is on and that its cable is firmly seated at both ends). This is often referred to as "milking the front panel."[2] The second is to "bisect" the system (for example, in a network printing system, checking whether the job reached the server to determine whether the problem lies in the subsystems "towards" the user's end or "towards" the device). This latter technique can be particularly efficient in systems with long chains of serialized dependencies or interactions among their components. It is simply the application of a binary search across the range of dependencies and is often referred to as "half-splitting".[3]
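As a rough illustration of half-splitting, the sketch below runs a binary search over a serial chain of stages. The function name `half_split`, the stage names, and the `stage_ok` check are hypothetical; the sketch also assumes a single fault in a strictly serial chain, so that every stage downstream of the fault fails its check too.

```python
def half_split(stages, stage_ok):
    """Return the index of the first failing stage, or None if all stages pass.

    Assumes a serial chain with one fault: every stage before the fault
    passes its check and every stage from the fault onward fails it.
    """
    lo, hi = 0, len(stages)  # the first failing stage, if any, is in [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if stage_ok(stages[mid]):
            lo = mid + 1   # fault lies downstream of mid
        else:
            hi = mid       # fault is at mid or upstream of it
    return lo if lo < len(stages) else None


# Hypothetical network printing chain, ordered from the user to the device.
chain = ["application", "print spooler", "network",
         "print server", "printer queue", "printer"]

checked = []

def job_seen_at(stage):
    """Pretend check: the job is visible up to the network, but not beyond."""
    checked.append(stage)
    return chain.index(stage) < 3

first_bad = half_split(chain, job_seen_at)
print(chain[first_bad], "after", len(checked), "checks")
```

With six stages the search needs at most three checks, where checking each stage in order could need six; the saving grows with the length of the chain.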

Reproducing symptoms


One of the core principles of troubleshooting is that reproducible problems can be reliably isolated and resolved. Often considerable effort and emphasis in troubleshooting is placed on reproducibility, that is, on finding a procedure to reliably induce the symptom to occur. Once this is done, systematic strategies can be employed to isolate the cause or causes of a problem; the resolution generally involves repairing or replacing those components which are at fault.

Intermittent symptoms


Some of the most difficult troubleshooting issues relate to symptoms that are only intermittent. In electronics this often is the result of components that are thermally sensitive (since resistance of a circuit varies with the temperature of the conductors in it). Compressed air can be used to cool specific spots on a circuit board and a heat gun can be used to raise the temperatures; thus troubleshooting of electronics systems frequently entails applying these tools in order to reproduce a problem.

In computer programming, race conditions often lead to intermittent symptoms which are extremely difficult to reproduce; various techniques can be used to force the particular function or module to be called more rapidly than it would be in normal operation (analogous to "heating up" a component in a hardware circuit), while other techniques can be used to introduce greater delays in, or force synchronization among, other modules or interacting processes.

Intermittent issues can be thus defined:
An intermittent is a problem for which there is no known procedure to consistently reproduce its symptom. Steven Litt, [1]

In particular he asserts that there is a distinction between frequency of occurrence and a "known procedure to consistently reproduce" an issue. For example, knowing that an intermittent problem occurs "within" an hour of a particular stimulus or event (but that sometimes it happens in five minutes and other times it takes almost an hour) does not constitute a "known procedure", even if the stimulus does increase the frequency of observable exhibitions of the symptom.

Nevertheless, sometimes troubleshooters must resort to statistical methods, and can only find procedures to increase the symptom's occurrence to a point at which serial substitution or some other technique is feasible. In such cases, even when the symptom seems to disappear for significantly longer periods, there is low confidence that the root cause has been found and that the problem is truly solved. Also, tests may be run to stress certain components to determine if those components have failed.[4]
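One way to put a number on that confidence is sketched below. It rests on a simplifying assumption that is not in the text above: each trial (for example, one hour of operation after the stimulus) shows the symptom independently with a fixed probability `p_fail` while the fault is still present. Real intermittents often violate this, so the result is at best a rough guide. The function names are hypothetical.

```python
import math


def chance_of_clean_run(p_fail, trials):
    """Probability of seeing no symptom in `trials` trials if the fault is still there."""
    return (1.0 - p_fail) ** trials


def trials_needed(p_fail, confidence=0.95):
    """Clean trials required before a still-present fault would have shown up
    with probability at least `confidence`."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


# If the symptom used to appear in about 1 trial out of 10, how many clean
# trials are needed after a fix before we are 95% sure it is not just luck?
print(trials_needed(0.10, 0.95))
print(chance_of_clean_run(0.10, 10))
```

The rarer the symptom, the longer the clean run must be, which is why a fix for a very infrequent intermittent can rarely be declared final.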

Multiple problems


Isolating single component failures which cause reproducible symptoms is relatively straightforward. However, many problems only occur as a result of multiple failures or errors. This is particularly true of fault-tolerant systems, or those with built-in redundancy. Features which add redundancy, fault detection and failover to a system may also be subject to failure, and enough different component failures in any system will "take it down."

Even in simple systems the troubleshooter must always consider the possibility that there is more than one fault. (Replacing each component, using serial substitution, and then swapping each new component back out for the old one when the symptom is found to persist, can fail to resolve such cases. More importantly, the replacement of any component with a defective one can actually increase the number of problems rather than eliminating them.)

Note that, while we talk about "replacing components", the resolution of many problems involves adjustments or tuning rather than "replacement." For example, intermittent breaks in conductors, or "dirty or loose contacts", might simply need to be cleaned and/or tightened. All discussion of "replacement" should be taken to mean "replacement or adjustment or other maintenance."

See also

Bathtub curve
Cause and effect
Forensic engineering
Problem solving
Root cause analysis
5 Whys
Debugging
No Trouble Found
RPR Problem Diagnosis

References
1. Persson, Nils Conrad. "Troubleshooting at your fingertips". Electronics Servicing and Technology magazine, June 1982.
2. "Hewlett Packard Bench Briefs". Hewlett Packard. http://www.hparchive.com/Bench_Briefs/HP-Bench-Briefs-1982-01-02.pdf. Retrieved 14 October 2011.
3. Sullivan, Mike (Nov 15, 2000). "Secrets of a super geek: Use half splitting to solve difficult problems". TechRepublic. http://articles.techrepublic.com.com/5100-10878_11-5029507.html. Retrieved 22 October 2010.
4. http://www.ocf.berkeley.edu/~joyoung/trouble/page1.shtml

Retrieved from "http://en.wikipedia.org/w/index.php?title=Troubleshooting&oldid=524244895"

Bathtub curve
From Wikipedia, the free encyclopedia

The "bathtub" curve hazard function

The bathtub curve is widely used in reliability engineering. It describes a particular form of the hazard function which comprises three parts:

The first part is a decreasing failure rate, known as early failures.
The second part is a constant failure rate, known as random failures.
The third part is an increasing failure rate, known as wear-out failures.

The name is derived from the cross-sectional shape of a bathtub.

The bathtub curve is generated by mapping the rate of early "infant mortality" failures when a product is first introduced, the rate of random failures with constant failure rate during its "useful life", and finally the rate of "wear out" failures as the product exceeds its design lifetime.

In less technical terms, in the early life of a product adhering to the bathtub curve, the failure rate is high but rapidly decreasing as defective products are identified and discarded, and early sources of potential failure such as handling and installation error are surmounted. In the mid-life of a product (generally, once it reaches consumers) the failure rate is low and constant. In the late life of the product, the failure rate increases, as age and wear take their toll on the product. Many consumer products strongly reflect the bathtub curve, such as computer processors.

While the bathtub curve is useful, not every product or system follows a bathtub curve hazard function. For example, if units are retired or have decreased use during or before the onset of the wear-out period, they will show fewer failures per unit calendar time (not per unit use time) than the bathtub curve predicts.

The term "Military Specification" is often used to describe systems in which the infant mortality section of the bathtub curve has been burned out or removed. This is done mainly for life-critical or system-critical applications, as it greatly reduces the possibility of the system failing early in its life. Manufacturers will do this at some cost, generally by means similar to accelerated stress testing.

In reliability engineering, the cumulative distribution function corresponding to a bathtub curve may be analysed using a Weibull chart.
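A common way to picture the three regions is to add three Weibull hazard terms, one for each kind of failure. The sketch below does this in Python. It is an illustrative model only: the shape and scale values are made up, time is in arbitrary units, and nothing here is fitted to real product data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three hazard terms; all parameter values are illustrative assumptions."""
    early = weibull_hazard(t, shape=0.5, scale=100.0)     # shape < 1: decreasing rate
    random_ = weibull_hazard(t, shape=1.0, scale=100.0)   # shape = 1: constant rate
    wear_out = weibull_hazard(t, shape=5.0, scale=20.0)   # shape > 1: increasing rate
    return early + random_ + wear_out


# Early life, mid life, late life (t must be positive: the early term diverges at 0).
for t in (0.1, 5.0, 25.0):
    print(f"t = {t:5.1f}  hazard = {bathtub_hazard(t):.4f}")
```

The printed hazard is high at first, lower in mid-life, and high again late in life, which is the bathtub shape. Removing the early term models the "burned in" case described above.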

Critics: Invalid concept for modern complex systems


Some investigations in the aerospace and other industries have discovered that most failures do not comply with the bathtub curve. It is argued that the bathtub curve is an old concept and should not be used as a stand-alone guide to reliability. Most interesting in these investigations was the conclusion that wear-out issues in complex systems account for only about 4%[citation needed] of all failures (refer to Reliability centered maintenance (RCM); Boeing 747 - MSG2 and MSG3 investigations). According to "The RCM approach", about 6 different types of failure rate curves can be distinguished. It is also remarkable that the highest contribution to failures appears to come from failures that have a constant failure rate character. This mainly applies to complex, highly integrated systems.


How to develop troubleshooting skills and become a good server admin


January 7th, 2010

10 points to increase troubleshooting skills and become a good Server Admin

1) Be clear with the concepts
You will be learning a lot of things at random in your day-to-day work, but essentially what you need to remember are the basic concepts of each technology you have come across, starting from basic commands, associated services, related applications, tools etc. Before you enter the system admin scenario, join the popular discussion boards and mailing lists relating to your domain. Believe me, a lot of quick tips can be gathered just by reading what others in the business have to say.

2) Build your knowledgebase
It's always a good habit to write down somewhere whatever you have newly learned in your day's work. It will definitely come in handy in the future, cos you might have got it after rigorous research and you need not do it a second time. The notes that I always keep have saved me plenty of time on many of the issues that creep up. Also be willing to ask your peers or seniors or the forums when you find yourself stuck, cos experience is one thing you can never garner in a little time. Remember, you can return the favor when they come looking for you.

3) Try to relate with the technologies
In my early days I wasted plenty of hours trying to find a fix without understanding the cause, but that has changed. You have to keep in mind that there will be some relation between the services or applications you are working with. So if something is not working, the cause might be some other service that is related to it. Knowing which one is causing the problem is the measure of your troubleshooting skill. So think cleverly and identify the possible causes.

4) Troubleshooting skill
Nobody can train you to become a good admin. It's just like any other skill: hard work, persistence, experience and your methodical approach towards the issues that you encounter make you what you become. At first you may take hours to fix the same issue that your peers with a bit of experience fix in a few minutes, but do not let that deter you. You have to think in different directions really quickly and tap your knowledge base at the right moment.

5) Backup
You might wonder why I have mentioned it here, but think of a situation when you have to say there is no backup for the data lost! Your client would definitely not want to hear such an excuse. Make it a point to let the client know about the importance of doing backups, and introduce a backup process into your administration life cycle even though it may incur additional operating costs and time.

6) Be cordial with customers
If you have to deal with the end customers, please update them constantly about the status of critical or major issues like server down, migration, network down etc. If you have to manage a huge customer base, it's always wise to have a forum outside your network or datacentre where your customers can view the status of any work being done. Trust me, you do not want heated-up customers calling you while the issue is upon your head, and the customers will be half relieved to know that you are working to fix the issue.

7) Communication with your boss
Your boss wants quick results but he might not be willing to understand the technical difficulties involved. So you will have to be patient and accept your boss's apprehensions even though he might be jumping the gun most of the time. If you are still unhappy then try this: go to an empty room, imagine your boss is tied to a chair there, shout at him all you want, and feel content that you have given him a piece of your mind. You will find this tip weird but it's effective, at least it works for me. There are many other situations that I have imagined, but considering the wider audience, I find them unsuitable to describe here!

8) Write articles
You will see a lot of ways to do an install or configuration of an application or a service. But when you get time, sit down and write an article about it in your own words. I guarantee that it will give you a lot of satisfaction and at the same time will boost your confidence. You can put the articles on your blog or on discussion boards where you are active. The feedback from readers will definitely help you improve, and it also earns you repute as an admin.

9) Learn something new
You shouldn't be tied down with a monotonous routine; be innovative whenever you have free time and try to learn something new. Remember, in this industry your knowledge is your wealth. Keeping yourself updated and familiar with every aspect, like new issues and their fixes, the latest tools and apps, best practices and security standards, is of great importance. Even though I can confidently say that I am a successful admin, I still learn things every day. I consider this an imperative trait for my success.

10) Be a good team player, avoid ego clashes
Always be ready to help your team members when they need it. Assist them on issues, give them clues and pointers, guide them, but do not fix the issue they are facing; if you do so you are preventing them from learning. You should always be ready to share, and sharing makes you more authoritative about the concepts. Well, if you have an ego, then I tell you one thing: keep it to yourself before you sit before the machine, cos I do the same!

Hope these steps will help you to improve in your day-to-day work.

Troubleshooting 201: Ask the Right Questions


Effective troubleshooting is a multifaceted exercise in diagnosis and deliberation, analysis and action.

Stephanie Krieger
There are two rules that always apply, whether you're troubleshooting hardware or software:

Troubleshooting is a process of elimination
The most important assumption you can make, no matter how much you know about the technology, is that you could be wrong

If that first rule seems obvious, then consider this: Troubleshooting (or any problem-solving process) is clearly a process of elimination. However, it's not that simple.

Your success or failure lies in what you choose to eliminate, and more importantly, why. It's a game of Pick Up Sticks where you evaluate, reason, then remove any obstacles that get you closer to resolving the problem without breaking anything else. How you make those choices depends entirely on the questions you ask and how you interpret the answers.

As for the second point, the assumptions you make lead to the questions you ask and the way you interpret responses, whether you're asking a person, a document, a piece of hardware, a software package or a network infrastructure. When you assume you could be wrong, no matter what your level of experience, you keep an open mind that helps you see simple solutions you may never have expected.

These are some of the common pitfalls you can run into while troubleshooting technological problems, as well as tips for asking questions that can lead you to the simple, effective solution every time.

If You Don't Know Why It Works, It Isn't Fixed


While teaching a document-troubleshooting training course, I asked the class if they were familiar with the Microsoft Word bug by which Word randomly changes the type of section break in long documents for no reason at all. They excitedly replied that they had been plagued by this bug, but one person in the class had found the solution.

As you may already know, I had set them up. There's no such bug. The section start type is often misunderstood. It does what it does for good reason and not at all randomly. So as you might expect, their solution was not ideal.

After telling me they didn't know why the fix worked, but that it did work most of the time, they explained their solution. They recommended adding several next page section breaks before and after the break that changes, then removing them one at a time (undoing your actions when the result is undesirable) until you're left with the break type you want. Whether you're familiar with this feature of Word or not, a troubleshooter should know this isn't a viable solution, and here's why:

If you don't know why a fix works, it probably doesn't. It may appear to work by coincidence, but a workaround is not a fix
If the fix doesn't work consistently, it most likely doesn't work at all
Whether you're working with software or hardware, computer technology is rooted in logic. If a fix seems unruly or overcomplicated, like the challenges in reality TV shows, there's probably a better way. In this case, there's a simple, consistent solution. You just have to change one setting in a dialog box (you'll find the details of this particular fix at the end of this article).

The path to effective problem solving broke down here when the troubleshooters assumed the behavior was a bug because they didn't understand it. They looked for any possible workaround rather than a simple, logical solution. It's common and understandable for users to blame the software or hardware when something frustrating happens that they don't understand. For a troubleshooter to do the same, however, is an almost certain setup for failure.

The job of troubleshooting begins when you don't already know the answer. You can't fix something if you don't know why it's broken. So how do you get to the why when you don't know the how? You start by gathering information, and that means asking questions of the user and of the technology itself.

Ask, Narrow and Verify


This three-tiered approach to troubleshooting is both simple and effective. Here's one example: a networking troubleshooter in a large corporation was speaking with a user who couldn't log in to one internal application. The user had contacted the help desk to request login credentials. He learned that anyone in the organization should be able to log in.

Ask: Through a series of basic questions, the troubleshooter determined that the user worked remotely. However, the user was able to access both internal and external sites, as well as other internal applications. Everything appeared to be working normally, and the user had never had connectivity issues before.

Narrow: The troubleshooter was not an expert in that particular application, so she started from what she knew: networking. Based on the fact that the user could log in to other applications and was working remotely, the troubleshooter hypothesized there had to be something about his connection that was a problem for this particular application. She researched the system requirements for the application and then connected remotely to his computer.

Verify: When connected remotely, the troubleshooter saw a network setting she believed might be causing the issue. She changed the setting and the user was able to log in, but she didn't leave it there. The troubleshooter had the user verify other connectivity and found that the change prevented him from accessing certain Web sites. She tried a different change to the same setting that let the user log in without disrupting other connections.

You could easily apply the steps this troubleshooter took to our Word document scenario. If you don't know what's wrong, start with what you do know and work from there. If you're working with a user, listen to what he has to say and value the information he gives you. Use any related knowledge you have to interpret the answers. In the case of the networking issue, the user assumed that because he never had a connection problem before, his connection couldn't be an issue. The way he answered the question was exactly what made the troubleshooter believe she should, in fact, check his connection.

A good troubleshooter takes the information she's given and applies to it what she knows, always confirming the validity of a hunch before taking action. For example, this troubleshooter researched the system requirements of the application in question before connecting to the user's computer. Similarly, if you're troubleshooting the first scenario from this article and you aren't familiar with section breaks in Word, use the help functionality in the program to find out what they are and why they're used. That way, you can begin to understand the behavior.

Get your hands dirty. If you're not an expert in the specific problem, approach it with the same logic you would apply to a technology you know well. This might mean connecting to a user's machine and interacting with the technology in a way that's familiar to you. Be specific, start simply and look for concrete information that can help you narrow the possibilities. For example, you might double-click a section break to see if anything happens. That small action opens the Page Setup dialog box. That dialog box contains many related features, including the section start type, so you just got much closer to finding a real solution.

Measure twice, cut once. When you think you have the answer, test it. Make sure that it fixes the issue without doing other harm. Test it to confirm that it solves the problem consistently. And most importantly, be sure that you understand why the fix worked, or you can't be sure that you've fixed it at all.

Open Minds, Simple Solutions


Consider one more example of troubleshooting a Microsoft Word document: A troubleshooter received a document that was crashing frequently. He began by opening the document using the Open and Repair feature in Word. Open and Repair indicated there was a corrupt shape in the document. However, he saw no embedded graphics. Was Open and Repair wrong? No, it was absolutely right. Closer examination revealed shapes off the page, as well as in a header that was currently turned off. (See the end of this article for more information on troubleshooting this particular issue.)

Whether you ask a question directly of a user (as in the previous networking scenario) or of the technology itself, as in this case, trust the information you receive and interpret it based on everything else that you know.

There's no substitute for technical knowledge of the problem area to help you ask the right questions. When you work in IT, though, learning new skills is a constant part of most job descriptions. Good troubleshooting skills are a constant and a necessity that's entirely separate from technical knowledge.

Good troubleshooting means applying logic so you can take concrete steps to effectively narrow the possibilities. It also means keeping an open mind and calling on any related knowledge (including how to find help and research the problem) to help you reach the simple solution. Before you tell a user that he needs new hardware, needs to reinstall software or needs to recreate a document from scratch, consider the most likely possibility: If the answer appears to be that complicated, you may not have asked the right questions.

Sidebar: Sensible Solutions

Solution to Word Section Break Issue


The answer is that the break type changed because a user removed an adjacent section break. While a section break stores formatting for the section that precedes the break, the type of break (such as next page or continuous) refers to how the following section starts.

To change the break type, click into the section that follows the break. Then, on the Page Layout tab of the Ribbon, in the Page Setup group, click the dialog box launcher. On the Layout tab of the Page Setup dialog box, change the Section Start value.

The only case in which this won't work is if the section requires a specific break type. For example, you can't have a continuous section break between sections with different page orientation, because a single piece of paper can't be both portrait and landscape.

Solution to Word Open and Repair Issue


One easy way to track down issues where Open and Repair indicates that something is wrong, but you can't see the object in question, is to use Microsoft Visual Basic for Applications (VBA). You don't have to be a programmer to quickly learn some simple VBA that can be a troubleshooter's best friend when working in any Microsoft Office documents. See the MSDN Library article "Troubleshooting Word 2007 Documents More Easily Using VBA" for detailed help and more information. This article was written for Word 2007, but also applies to Word 2010.

Stephanie Krieger is a Microsoft Office MVP, as well as the author of "Advanced Microsoft Office Documents 2007 Edition Inside Out" and "Microsoft Office Document Designer", both from Microsoft Press. Krieger writes and creates content for several pages on the Microsoft web site. Visit her blog Arouet Dot Net for Microsoft Office tips, and information about new and upcoming publications.


ITIL - Incident Management for Beginners
Abhishek Agnihotry
For any queries, mail me at [email_address]

2. Contents
- Incident Management objective
- Incident Management activities
- Incident Management input and output
- Benefits
- Roadblocks
- Key performance indicators
- Roles and responsibilities
- Relationships
- Tool requirements
- Summary

3. Incident Management: Objective
The primary objective of Incident Management is to restore normal service operation, within Service Level Agreement limits, as quickly as possible after an incident has occurred to that service, and to minimize the adverse impact on business operations.
Goals of Incident Management:
- Restore the service as quickly as possible
- Minimum disruption to users' work
- Management of an incident during its entire lifecycle
- Support of operational activities

4. Incident Management: Activities

5. Incident Lifecycle

6. Incident Management: Activities

7. Incident Management: Identification and Registration of Incidents

8. Incident Management: Classification
Find the service affected, match against the SLA, and assign a priority.

9. Incident Management: Priority order for handling incidents is primarily defined by impact and urgency

10. Incident Management: Each priority is related to a certain recovery time

11. Incident Management: Escalation
The objective of an escalation or escalation procedure is the avoidance or minimizing of material or immaterial damage.
The definition of an escalation comprises:
- escalation trigger
- escalation measures
- escalation levels
The combination of the three keys results in an escalation matrix with escalation paths. The escalation procedures should be clearly agreed between all involved parties. Escalation can usefully improve service provision only if it is accepted by all parties. Escalation should not be misused as a proof of guilt.

12. Incident Management: Types of Escalation

13. Incident Management: Input and Output

14. Incident Management: Benefits
- Reduced business impact of incidents by timely resolution
- Proactive identification of possible enhancements
- Management information related to business-focused SLA
- Improved monitoring
- Improved management information related to aspects of service
- Better staff utilization: no more interruption-based handling of incidents
- Elimination of lost incidents and service requests
- Better and more accurate CMDB information
- Better user/customer satisfaction

15. Incident Management: Roadblocks
- No management/staff commitment, hence no resources
- Lack of clarity of business needs
- Badly defined service goals
- Working practices not reviewed or changed
- No service levels defined with customer
- Lack of knowledge on resolving incidents
- The quality of the Configuration Database
- No integration with other processes
- Resistance to using process

16. Key Performance Indicators
- Total number of incidents
- Average time to restore service from point of first call
- Percentage of incidents handled within agreed response time
- Number of incidents broken down by priority and category
- Number of incidents escalated by service desk
- Number of incidents re-opened
- Number of incidents bypassing service desk
- Number of incidents incorrectly escalated

17. Incident Management: Roles and Responsibilities
Incident Manager
- driving the efficiency and effectiveness of the Incident Management process
- producing management information
- managing the work of Incident support staff (first- and second-line)
- monitoring the effectiveness of Incident Management and making recommendations for improvement
- developing and maintaining the Incident Management systems
Incident Analyst (first line)
- Incident registration
- routing service requests to support groups when Incidents are not closed
- initial support and classification
- ownership, monitoring, tracking and communication
- resolution and recovery of Incidents not assigned to second-line support
- closure of Incidents

18. Incident Management: Roles and Responsibilities
Incident Analyst (second line)
- handling service requests
- monitoring Incident details, including the Configuration Items affected
- Incident investigation and diagnosis (including resolution where possible)
- detection of possible Problems and the assignment of them to the Problem Management team for them to raise Problem records
- the resolution and recovery of assigned Incidents
Subject Matter Expert
- analyzes incidents to identify service restoration actions to be taken
- takes Incident resolution actions to restore services to customers
- assists Incident Management staff with identifying the impact of the incidents

19. Key Relationships
- Problem Management
- Configuration Management
- Change Management
- Service Level Management
- Availability Management
- Capacity Management

20. Tool Requirements
- Tool for incident logging and recording
- Automatic escalation facilities
- Automatic extraction of configuration data from CMDB
- ACD systems for automatically registering names and phone numbers of users

21. Incident Management: Summary
The goal of Incident Management is to restore normal service operation, within Service Level Agreement limits, as quickly as possible after an incident has occurred to that service, and to minimize the adverse impact on business operations.
Incident Management process activities:
- Incident detection and recording
- Classification and initial support
- Investigation and diagnosis
- Resolution and recovery
- Incident closure
Prioritization is primarily determined by impact on business and the urgency with which a resolution or workaround is needed; correct prioritization enables optimum staffing and use of other resources to customer satisfaction.
Escalation (functional and hierarchical)
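
The slides state that priority is primarily defined by impact and urgency, and that each priority is related to a certain recovery time, but the actual matrix and times are not in the transcript. The Python sketch below shows one common way such a mapping could look. The three-step scale, the 1 to 5 priority range, the names Level, priority and recovery_target, and every recovery time in it are assumptions for illustration; real values would come from the agreed SLA.

```python
from datetime import timedelta
from enum import IntEnum


class Level(IntEnum):
    """Assumed three-step scale, used for both impact and urgency."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Hypothetical recovery-time targets per priority (1 = most critical).
# These numbers are placeholders, not values taken from the slides.
RECOVERY_TARGET = {
    1: timedelta(hours=1),
    2: timedelta(hours=4),
    3: timedelta(hours=8),
    4: timedelta(hours=24),
    5: timedelta(hours=72),
}


def priority(impact: Level, urgency: Level) -> int:
    """Combine impact and urgency into a priority from 1 (highest) to 5."""
    return int(impact) + int(urgency) - 1


def recovery_target(impact: Level, urgency: Level) -> timedelta:
    """Look up the assumed recovery-time target for an incident."""
    return RECOVERY_TARGET[priority(impact, urgency)]


if __name__ == "__main__":
    # A high-impact, medium-urgency incident under the assumed matrix.
    p = priority(Level.HIGH, Level.MEDIUM)
    print(p, recovery_target(Level.HIGH, Level.MEDIUM))
```

A simple additive rule keeps the matrix symmetric in impact and urgency. An organization that weights impact more heavily than urgency would replace the priority function with an explicit lookup table.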

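
Several of the key performance indicators listed on slide 16 can be derived directly from incident records. The Python sketch below computes a few of them. The Incident record layout and the kpi_report function are hypothetical and only serve the illustration; an actual incident logging tool would supply its own fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record layout; field names are illustrative only.
    opened: datetime               # time of first call
    restored: datetime             # time service was restored
    priority: int
    within_response_target: bool   # handled within agreed response time
    escalated: bool = False
    reopened: bool = False


def kpi_report(incidents):
    """Compute a subset of the KPIs from slide 16 for a list of incidents."""
    total = len(incidents)
    if total == 0:
        return {"total": 0}

    # Sum of restore durations, starting from a zero timedelta.
    restore_time = sum((i.restored - i.opened for i in incidents), timedelta())

    by_priority = {}
    for i in incidents:
        by_priority[i.priority] = by_priority.get(i.priority, 0) + 1

    handled_in_time = sum(i.within_response_target for i in incidents)
    return {
        "total": total,
        "avg_time_to_restore": restore_time / total,
        "pct_within_response_target": 100.0 * handled_in_time / total,
        "by_priority": by_priority,
        "escalated": sum(i.escalated for i in incidents),
        "reopened": sum(i.reopened for i in incidents),
    }
```

The remaining indicators (incidents bypassing the service desk, incorrectly escalated incidents) would need extra fields that say how the incident arrived and whether the escalation was justified.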