IEEE -SWEBOK Guide V3 focuses on computer science foundations. "Software engineering builds on top of computer science," says Frezza. The importance of computer science to Software Engineering is not sufficiently acknowledged.
IEEE -SWEBOK Guide V3 focuses on computer science foundations. "Software engineering builds on top of computer science," says Frezza. The importance of computer science to Software Engineering is not sufficiently acknowledged.
IEEE -SWEBOK Guide V3 focuses on computer science foundations. "Software engineering builds on top of computer science," says Frezza. The importance of computer science to Software Engineering is not sufficiently acknowledged.
ACRONYMS 3 AOP: Aspect-Oriented Programming ALU: Arithmetic and Logic Unit API: Application Programming Interface ATM: Asynchronous Transfer Mode B/S: Browser-Server C/S: Client-Server CS: Computer Science DBMS: Data Base Management System FPU: Float Point Unit I/O: Input and Output ISA: Instruction Set Architecture ISO: International Organization for Standardization ISP: Internet Service Providers KA: Knowledge Area LAN: Local Area Network MUX: Multiplexer NIC: Network Interface Card OOP: Object-Oriented Programming OS: Operating Systems PDA: Personal Digital Assistant PDC: Parallel and Distributed Computing PL: Programming Languages PPP: Point-to-Point Protocol RFID: Radio Frequency Identification SCSI Small Computer System Interface SQL: Structured Query Language VPN: Virtual Private Network WAN: Wide Area Network Tutodut`ou 4 TheComputingFoundationsKnowledgeArea(KA)is 5 concerned with the computer science foundations 6 that support the design and construction of 7 software products. Moreover, it also includes 8 knowledgeaboutthetransformationofa(software) 9 design into a (software) implementation, the tools 10 used during this process, and the various software 11 developmentmethods. 12 It is generally accepted that software engineering 13 builds on top of computer science. For example, 14 Software Engineering 2004: Curriculum Guidelines 15 for Undergraduate Degree Programs in Software 16 Engineering clearly states, One particularly 17 importantaspectisthatsoftwareengineeringbuilds 18 on computer science and mathematics (italics 19 added). 20 Whilesomemaynotfeelverycomfortablewiththe 21 claimthatsoftwareengineeringbuildsoncomputer 22 science (Frezza) [2], not many will deny the 23 important role computer science plays in the 24 development of software engineering both as a 25 disciplineandasabodyofknowledge.However,the 26 importance of computer science to software 27 engineering is not sufficiently acknowledged within 28 the software industry; thus, this KA guideline is 29 beingwritten. 30 Although the term "Computing Foundations" is not 31 precisely defined and may have different meanings 32 or connotations to different people, most would 33 agree that the scope encompassed by the term 34 includes at least the development and operational 35 environment in which software evolves and 36 executes. Because no software can exist in a 37 vacuumorrunwithoutacomputer,thecoreofsuch 38 an environment is the computer and its various 39 components. To some extent, knowledge about the 40 computer and its underlying principles of hardware 41 and software serves as a framework on which 42 softwareengineeringisanchored.Thus,allsoftware 43 engineers must have good understanding of the 44 ComputingFoundationsKnowledgeArea. 45 The majority of topics discussed in the Computing 46 FoundationsKAarealsotopicsofdiscussioninbasic 47 courses given in computer science undergraduate 48 and graduate programs. Such courses include 49 programming, data structure, algorithms, computer 50 organization, operating systems, compiler, 51 databases, networking, distributed systems, and so 52 forth. Thus, when breaking down topics, it can be 53 temptingtodecomposetheComputingFoundations 54 KA according to these oftenfound divisions in 55 relevantcourses. 56 However, a pure coursebased division of topics 57 suffersseriousdrawbacks.Forone,notallcoursesin 58 computer science are related or equally important 59 to software engineering. Thus, some topics that 60 would otherwise be covered in a computer science 61 course are not covered in this guideline. For 62 example, computer graphicswhile an important 63 course in a computer science degree programis 64 notincludedinthisguideline. 65 IEEE SWEBOK Guide V3 Second, some topics discussed in this guideline do 66 notexistasstandalonecoursesinundergraduateor 67 graduate computer science programs. 68 Consequently, such topics may not be adequately 69 covered in a pure coursebased breakdown. For 70 example, abstraction is a topic mentioned/covered 71 in several different computer science courses; it is 72 unclear which course abstraction should belong to 73 inacoursebasedbreakdownoftopics. 74 For the aforementioned reasons, we do not use a 75 coursebased division; instead, we break the 76 ComputingFoundationsKAintoseventeendifferent 77 topics. A topics direct usefulness to software 78 engineers is the criterion we used for selecting 79 topics for inclusion in this KA. (Figure 1 depicts the 80 breakdownoftopicsfortheComputingFoundations 81 KA.)Theadvantageofthistopicbasedbreakdownis 82 its foundation on the belief that Computing 83 Foundationsifitistobegraspedfirmlymustbe 84 considered as a collection of logically connected 85 topics undergirding software engineering in general 86 andsoftwareconstructioninparticular. 87 The Computing Foundations KA is relatedclosely to 88 the Software Design, Software Construction, 89 Software Testing, Software Maintenance, Software 90 Quality,andtheMathematicalFoundationsKAs. 91 bll/lDO"\ Ol TOlTCS lOl COHl|TT\0 92 lO|\D/TTO\S 93 The breakdown of topics for the Computing 94 FoundationsKAisshowninFigure1. 95 1. Problem Solving Techniques 96 [3*c3, s4.1, s4.24*c5] 97 The concepts, notions, and terminology introduced 98 hereformanunderlyingbasisforunderstandingthe 99 roleandscopeofproblemsolvingtechniques. 100 1.1DefinitionofProblemSolving 101 Problem solving refers to the thinkingand activities 102 conducted to answer or derive a solution to a 103 problem. There are many ways to approach a 104 problem, andeach way employs different tools and 105 uses different processes. These different ways of 106 approaching problems gradually expand and define 107 themselves and finally give rise to different 108 disciplines. For example, software engineering 109 focuses on solving problems using computers and 110 software. 111 Whiledifferentproblemswarrantdifferentsolutions 112 andmayrequiredifferenttoolsandprocessestobe 113 used, the methodology and techniques used in 114 solvingproblemsdofollowsomegeneralguidelines 115 and can often be generalized as problem solving 116 techniques. For example, a general guideline for 117 solving a generic engineering problem is to use the 118 threestepprocessgivenbelow. 119 z Formulate the real problem 120 z Analyze the problem 121 z Design a solution search strategy 122 1.2FormulatingtheRealProblem 123 Gerard Voland writes, It is important to recognize 124 thataspecificproblemshouldbeformulatedifone 125 istodevelopaspecificsolution[3*p88].Tosolvea 126 problem, one must first define the problem. In 127 softwarejargon,thisdefinitioniscalledtheproblem 128 statement, which explicitly specifies both what the 129 problemandthedesiredoutcomeare. 130 Although there is no specific way of stating a 131 problem, in general a problem should be expressed 132 in such a way as to facilitate the development of 133 solutions for solving it. Some general techniques to 134 help one formulate the real problem include 135 statementrestatement,determiningthesourceand 136 thecause,revisingthestatement,analyzingpresent 137 anddesiredstate,andusingthefresheyeapproach. 138 IEEE SWEBOK Guide V3 139 Figure1.BreakdownoftopicsfortheComputing 140 FoundationsKA 141 1.3AnalyzetheProblem 142 Once the problem statement is available, the next 143 step is to analyze the problem statement or 144 situationtohelpstructureoursearchforasolution. 145 Four types of analysis include situation analysis, in 146 which the most urgent or critical aspects of a 147 situation are identified first; problem analysis, in 148 which the cause of the problem must be 149 determined;decisionanalysis,inwhichtheaction(s) 150 needed to correct the problem or eliminate its 151 cause must be determined; and potential problem 152 analysis, in which the action(s) needed to prevent 153 any reoccurrences of the problem or the 154 developmentofnewproblemsmustbedetermined. 155 1.4DesignaSolutionSearchStrategy 156 Once the problem analysis is complete, we can 157 focus on structuring a search strategy to find the 158 solution. Voland writes, In most problem solving 159 situations, one should not seek to generate and 160 evaluate every possible solution to a problem 161 [3*p117]. In order to find the best solution, we 162 need to eliminate paths that do not lead to viable 163 solutions, design tasks in a way that provides the 164 most guidance in searching for a solution, and use 165 variousattributesofthefinalsolutionstatetoguide 166 ourchoicesintheproblemsolvingprocess. 167 1.5ProblemSolvingusingPrograms 168 The uniqueness of computer software gives 169 problemsolvingaflavorthatisdistinctfromgeneral 170 engineering problem solving. For example, the 171 techniques employed in solving a programming 172 problemaresomewhatdifferentfromthoseusedin 173 solving a general engineering problem: to solve a 174 problem using computers, we must answer the 175 followingquestions. 176 z How do we figure out what to tell the computer to 177 do? 178 z How do we convert the problem statement into an 179 algorithm? 180 z How do we convert the algorithm into machine 181 instructions? 182 Thefirsttaskinsolvingaproblemusingacomputer 183 is to determine what to tell the computer to do. 184 There may be many ways to tell the story, but all 185 shouldtaketheperspectiveofacomputersuchthat 186 the problem can eventually be solved by the 187 computer. In general, a problem should be 188 expressed in such a way as to facilitate the 189 development of algorithms and data structures for 190 solvingit. 191 The result of the first task is a problem statement. 192 The next step is to convert the problem statement 193 into algorithms that solve the problem. Once an 194 algorithm is found, the final step converts the 195 algorithm into machine instructions that form the 196 IEEE SWEBOK Guide V3 finalsolution:softwarethatsolvestheproblem. 197 Abstractly speaking, problem solving using a 198 computer can be considered as a process of 199 problem transformationin other words, the step 200 bysteptransformationofaproblemstatementinto 201 a problem solution. To the discipline of software 202 engineering, the ultimate objective of problem 203 solving is to transform a problem expressed in 204 natural language into electrons running around a 205 circuit.Ingeneral,thistransformationcanbebroken 206 intothreephases: 207 z Systematic derivation of algorithms from the 208 problem statement. 209 z Systematic application of algorithms to the 210 problem. 211 z Transformation of algorithms to program code. 212 The conversion of a problem statement into 213 algorithms and algorithms into program codes 214 usually follows a socalled stepwise refinement 215 (a.k.a. systematic decomposition) in which we start 216 with a problem statement, rewrite it as a task, and 217 recursively decompose the task into a few simpler 218 subtasksuntilthetaskissosimplethatsolutionsto 219 itarestraightforward.Therearethreebasicwaysof 220 decomposing:sequential,conditional,anditerative. 221 2. Abstraction 222 [4*s5.2-5.4] 223 Abstractionisanindispensibletechniqueassociated 224 with problem solving. It refers to both the process 225 and result of generalization by reducing the 226 information of a concept, a problem, or an 227 observable phenomenon so that one can focus on 228 thebigpicture.Oneofthemostimportantskillsin 229 anyengineeringundertakingisframingthelevelsof 230 abstractionappropriately. 231 Through abstraction, according to Voland, we 232 view the problem and its possible solution paths 233 fromahigherlevelofconceptualunderstanding.As 234 a result, we may become better prepared to 235 recognize possible relationships between different 236 aspects of the problem and thereby generate more 237 creative design solutions [3*p232]. This is 238 particularly true in computer science in general 239 (such as hardware vs. software) and in software 240 engineering in particular (software requirement vs. 241 softwareconstruction,andsoforth). 242 2.1 Levels of Abstraction 243 When abstracting, we concentrate on one level of the 244 big picture at a time with confidence that we can then 245 connect effectively with levels above and below. 246 Although we focus on one level, abstraction does not 247 mean knowing nothing about the neighboring levels. 248 Abstraction levels do not necessarily correspond to 249 discrete components in reality or in the problem 250 domain, but to well-defined standard interfaces such as 251 programming APIs. The advantages that standard 252 interfaces provide include portability, easier 253 software/hardware integration and wider usage. 254 Usually, the definition or divisions of abstraction levels 255 are arbitrary to some extent; there are often other ways 256 to draw the lines. 257 2.2Encapsulation 258 Encapsulation is a mechanism used to implement 259 abstraction. When we are dealing with one level of 260 abstraction, the information concerning the levels 261 below and above that level is encapsulated. This 262 information can be the concept, problem, or 263 observable phenomenon; or it may be the 264 permissible operations on these relevant entities. 265 Encapsulation usually comes with some degree of 266 information hiding in which some or all of the 267 underlying details are hidden from the level above 268 the interface provided by the abstraction. To an 269 object, information hiding means we don't need to 270 knowthedetailsofhowtheobjectisrepresentedor 271 how the operations on those objects are 272 implemented. 273 2.3Hierarchy 274 When we use abstraction in our problem 275 formulation and solution, we may use different 276 abstractions at different timesin other words, we 277 work on different levels of abstraction as the 278 situation calls. Most of the time, these different 279 levels of abstraction are organized in a hierarchy. 280 This hierarchical structure resembles the 281 organizationalstructuresinhumansocietyandthus 282 facilitates easier understanding by humans. There 283 are many ways to structure a particular hierarchy 284 and the criteria used in determining the specific 285 content of each layer in the hierarchy varies 286 dependingontheindividualsperformingthework. 287 IEEE SWEBOK Guide V3 Mostofthetime,ahierarchyofabstractionlevelsis 288 sequential, which means that each layer has one 289 andonlyonepredecessor(lower)layerandoneand 290 only one successor (upper) layerexcept the 291 upmost layer (which has no successor) and the 292 bottommost layer (which has no predecessor). 293 Sometimes, the hierarchy is organized in a treelike 294 structure, which means each layer can have more 295 than one predecessor layer but only one successor 296 layer. Occasionally, a hierarchy can have a manyto 297 many structure, in which each layer can have 298 multiple predecessors and successors. At no time, 299 shalltherebeanyloopinahierarchy. 300 A hierarchy often forms naturally in task 301 decomposition. Often, a task analysis can be 302 decomposed in a hierarchical fashion, starting with 303 the larger tasks and goals of the organization and 304 breaking each of them down into smaller subtasks 305 that can again be further subdivided This 306 continuousdivisionoftasksintosmalleroneswould 307 produceahierarchicalstructureoftaskssubtasks. 308 3. Programming Fundamentals 309 [4*c6-19] 310 Programming (often called coding) is composed of 311 the methodologies or activities for creating 312 computer programs that perform a desired 313 function. It is an indispensible part in software 314 construction. In general, programming can be 315 considered as the process of designing, writing, 316 testing, debugging, and maintaining the source 317 code. This source code is written in some 318 programminglanguage. 319 The process of writing source code often requires 320 expertiseinmanydifferentsubjectareasincluding 321 knowledge of the application domain, appropriate 322 data structures, specialized algorithms, various 323 language constructs, good programming 324 techniques,andsoftwareengineering. 325 3.1TheProgrammingProcess 326 Programming involves design, writing, testing, 327 debugging, and maintenance. Design is the 328 conception or invention of a scheme for turning a 329 customer requirement for computer software into 330 operational software. It is the activity that links 331 application requirement to coding and debugging. 332 Writing is the actual coding of the design in an 333 appropriate programming language. Testing is the 334 activity to verify that the code one writes actually 335 does what it is supposed to do. Debugging is the 336 activity to find and fix bugs (errors) in the source 337 code (or design). Maintenance is the activity to 338 update, correct, and enhance existing programs. 339 Each of these activities is a huge topic and often 340 warrants the explanation of an entire Knowledge 341 AreaintheSWEBOKGuideandmanybooks. 342 3.2ProgrammingParadigms 343 Programming is highly creative and thus somewhat 344 personal. Different people often write different 345 programs for the same requirements. This diversity 346 of programming causes much difficulty in the 347 construction and maintenance of large complex 348 software. Various programming paradigms have 349 been developed over the years to put some 350 standardization into this highly creative and 351 personalactivity.Whenoneprograms,heorshecan 352 use one of many programming paradigms to write 353 the code. The major types of programming 354 paradigmsarediscussedbelow. 355 Unstructured Programming: In unstructured 356 programming, a programmer follows his/her hunch 357 to write the code in whatever way he/she likes as 358 long as the function is operational. Often, the 359 practice is to write code to fulfill a specific utility 360 without regard to anything else. Programs written 361 this way exhibit no particular structurethus the 362 name unstructured programming. Unstructured 363 programming is also sometimes called ad hoc 364 programming. 365 Structured/Procedural Programming: Programs are 366 structured as procedures (or functions) with each 367 procedure performing a specific task. Standard 368 interfaces exist between procedures to facilitate 369 correct and smooth calling operations of the 370 programs. Under structured programming, 371 programmersoftenfollowestablishedprotocolsand 372 rules of thumb when writing code. These protocols 373 and rules can be numerous and cover almost the 374 entire scope of programmingranging from the 375 simplest issue (such as how to name variables, 376 functions, procedures, and so forth) to more 377 complex issues (such as how to structure an 378 interface,howtouseexceptions,andsoforth). 379 IEEE SWEBOK Guide V3 ObjectOriented Programming: While procedural 380 programming organizes programs around 381 procedures, objectoriented programming (OOP) 382 organize a program around socalled objects, which 383 areabstractdatastructuresthatcombinebothdata 384 andmethodsusedtoaccessormanipulatethedata. 385 The primary features of OOP are that objects 386 representing various abstract and concrete entities 387 are created and these objects interact with each 388 othertocollectivelyfulfillthedesiredfunctions. 389 AspectOriented Programming: Aspectoriented 390 programming (AOP) is a programming paradigm 391 that is built on top of OOP. AOP aims to isolate 392 secondary or supporting functions from the main 393 program's business logic by focusing on the so 394 called cross sections (concerns) of the objects. The 395 primary motivation for AOP is to resolve the so 396 calledobjecttanglingandscatteringassociatedwith 397 OOP, in which the interactions among objects 398 become very complex. The essence of AOP is the 399 greatly emphasized separation of concerns, which 400 separatesnoncorefunctionalconcernsorlogicinto 401 variousaspects. 402 3.3DefensiveProgramming 403 Intheconstructionoflargecomplexsoftware,many 404 people work as a team with each person being 405 responsible for part of the software. Because 406 different people have different styles in writing 407 programs,itisimperativethatpeoplewritecodein 408 such a way as to avoid conflicts or problems with 409 codewrittenbyotherpeople.Thisthinkingleadsto 410 the emergence of the socalled defensive 411 programming. 412 Abstractlyspeaking,defensiveprogrammingismore 413 a programming attitude than a style. It takes 414 responsibility for protecting ones own code even 415 when someone else is not doing his/her work 416 properly. The main idea is that if someone does 417 something badsuch as provide a bad inputthe 418 codedoesnotbreak.Inapracticalsense,defensive 419 programming can be seen as a collection of 420 programming techniques that, when used, produce 421 more robust programs. Techniques for defensive 422 programming include assertions, errorhandling, 423 exception, and barricades. Defensive programming 424 oftenproduceshigherqualitycode. 425 4. Programming Language Basics 426 [6*c6] 427 Using computers to solve problems involves 428 programmingwriting and organizing instructions 429 telling the computer what to do at each step. 430 Programs must be written in some programming 431 language with which and through which we 432 describe necessary computations. In other words, 433 we use the facilities provided by a programming 434 languagetodescribeproblems,developalgorithms, 435 and reason about problem solutions. To write any 436 program, one must understand at least one 437 programminglanguage. 438 4.1ProgrammingLanguageOverview 439 A programming language is designed to express 440 computations that can be performed by a computer. In 441 a practical sense, a programming language is a notation 442 for writing programs and thus should be able to express 443 most data structures and algorithms. Some, but not all, 444 people restrict the term "programming language" to 445 those languages that can express all possible 446 algorithms. 447 Not all languages have the same importance and 448 popularity. The most popular ones are often defined by 449 a specification document established by a well-known 450 and respected organization. For example, the C 451 programming language is specified by an ISO standard 452 named . Other languages, such as Perl 453 and Python, do not enjoy such treatment and often have 454 a dominant implementation that is used as a reference. 455 4.2 Syntax and Semantics of Programming Languages 456 Just like natural languages, many programming 457 languages have some form of written specification of 458 their syntax (form) and semantics (meaning). Such 459 specifications include, for example, specific 460 requirements for the definition of variables and 461 constants (in other words, declaration and types) and 462 format requirements for the instructions themselves. 463 In general, a programming language supports such 464 constructs as variables, data types, constants, literals, 465 assignment statements, control statements, procedures, 466 functions, and comments. The syntax and semantics of 467 each construct must be clearly specified. 468 4.3LowLevelProgrammingLanguages 469 IEEE SWEBOK Guide V3 Programming language can be classified into two 470 classes: lowlevel languages and highlevel 471 languages. Lowlevel languages can be understood 472 by a computer with no or minimal assistance and 473 typically include machine languages and assembly 474 languages.Amachinelanguageusesonesandzeros 475 to represent instructions and variables, and is 476 directlyunderstandablebyacomputer.Anassembly 477 language contains the same instructions as a 478 machinelanguagebuttheinstructionsandvariables 479 have symbolic names that are easier for humans to 480 remember. 481 Assembly languages cannot be directly understood 482 by a computer and must be translated into a 483 machine language by a utility program called an 484 assembler. There often exists a correspondence 485 between the instructions of an assembly language 486 and a machine language, and the translation from 487 assembly code to machine code is straightforward. 488 For example, add r1, r2, r3 is an assembly 489 instructionforaddingthecontentofregisterr2and 490 r3 and storing the sum into register r1. This 491 instruction can be easily translated into machine 492 code 0001001010000011 (assume the operation 493 codeforadditionis0001.) 494 One common trait shared by these two types of 495 languageistheircloseassociationwiththespecifics 496 ofatypeofcomputerorinstructionsetarchitecture 497 (ISA). 498 4.4HighLevelProgrammingLanguages 499 A high-level programming language has a strong 500 abstraction from the details of the computers ISA. In 501 comparison to low-level programming languages, it 502 often uses natural-language elements and is thus much 503 easier for humans to understand. Such languages allow 504 symbolic naming of variables, provide expressiveness, 505 and enable abstraction of the underlying hardware. For 506 example, while each microprocessor has its own ISA, 507 code written in a high-level programming language is 508 usually portable between many different hardware 509 platforms. For this reason, most programmers use 510 and most software are written in high-level 511 programming languages. Examples of high-level 512 programming languages include C, C++, C#, and 513 Java, 514 4.5 Declarative vs. Imperative Programming 515 Languages 516 Most programming languages (high-level or low-level) 517 allow programmers to specify the individual 518 instructions that a computer is to execute. Such 519 programming languages are called imperative 520 programming languages because one has to specify 521 every step clearly to the computer. But some 522 programming languages allow programmers to only 523 describe the function to be performed without 524 specifying the exact instruction sequences to be 525 executed. Such programming languages are called 526 declarative programming languages. Declarative 527 languages are high-level languages. The actual 528 implementation of the computation written in such a 529 language is hidden from the programmers and thus is 530 not a concern for them. 531 The key point to note is that declarative programming 532 only describes what the program should accomplish 533 without describing how to accomplish it. For this 534 reason, many people believe declarative programming 535 facilitates easier software development. Declarative 536 programming languages include Lisp and Prolog, while 537 imperative programming languages include C, C++, 538 and JAVA. 539 5. Debugging Tools and Techniques 540 [4*c23] 541 Once a program is coded, the next step is 542 debugging,whichisamethodicalprocessoffinding 543 and reducing the number of bugs or errors in a 544 program. The purpose of debugging is to find out 545 whyaprogramdoesntwork.Exceptforverysimple 546 programs,debuggingisalwaysnecessary. 547 5.1TypesofErrors 548 Usually, a program does not work because it 549 contains bugs or errors that can be either syntactic 550 errors,logicalerrors,ordataerrors. 551 Syntax errors are typing errors that result in illegal 552 operations. This type of error only applies to high 553 level programming languages and not to machine 554 languages. In a machine language, any bit pattern 555 corresponds to some legal instruction. In highlevel 556 programming languages, syntax errors are often 557 caught during the compilation or translation from 558 the highlevel language into machine code. For 559 example, in the C/C++ programming language, the 560 statement 123=constant; contains a syntax error 561 that will be caught by the compiler during 562 compilation. 563 IEEE SWEBOK Guide V3 Logic errors are semantic errors that result in 564 incorrect computations or program behaviors. Your 565 program is legal, but wrong! So the results do not 566 matchtheproblemstatementoruserexpectations. 567 For example, in the C/C++ programming language, 568 the inline function int f(int x) {return f(x1);} for 569 computingfactorialx!islegalbutlogicallyincorrect. 570 This type of error cannot be caught by a compiler 571 during compilation and is often discovered through 572 tracingtheexecutionoftheprogram. 573 Data errors are input errors that result either in 574 input data that is different from what the program 575 expects or in the processing of wrong data. This 576 type of error can be discovered by testing the 577 programwithawidevarietyofinputs. 578 5.2DebuggingTechniques 579 Debugging involves many activities and can be static, 580 dynamic, or post-mortem. Static debugging usually 581 takes the form of code review, while dynamic 582 debugging usually takes the form of tracing and is 583 closely associated with testing. Post-mortem debugging 584 is the act of debugging the core dump (memory dump) 585 of a process. Core dumps are often generated after a 586 process has terminated due to an unhandled exception. 587 All three techniques are used at various stages of 588 program development, but (to most programmers) 589 dynamic debugging is the norm. 590 The main activity of dynamic debugging is tracing, 591 which is executing the program one piece at a time, 592 examining the contents of registers and memory, in 593 order to examine the results at each step. There are 594 three ways to trace a program. 595 z Single-stepping: execute one instruction at a time 596 to make sure each instruction is executed 597 correctly. This method is tedious but useful in 598 verifying each step of a program. 599 z Breakpoints: tell the program to stop executing 600 when it reaches a specific instruction. This 601 technique lets one quickly execute selected code 602 sequences to get a high-level overview of the 603 execution behavior. 604 z Watch points: tell the program to stop when a 605 register or memory location changes or when it 606 equals to a specific value. This technique is useful 607 when one doesnt know where or when a value is 608 changed and when this value change likely causes 609 the error. 610 5.3DebuggingTools 611 Debugging can be complex, difficult, and tedious. Like 612 programming, debugging is also highly creative 613 (sometimes more creative than programming). Thus 614 some help from tools is in order. For dynamic 615 debugging, debuggers are widely used and enable the 616 programmer to monitor the execution of a program, 617 stop the execution, re-start the execution, set 618 breakpoints, change values in memory, and even, in 619 some cases, go back in time. 620 For static debugging, there are many so-called static 621 code analysis tools, which look for a specific set of 622 known problems within the source code. Both 623 commercial and free tools exist in various languages. 624 These tools can be extremely useful when checking 625 very large source trees, where it is impractical to do 626 code walkthroughs. The UNIX lint program is an early 627 example. 628 6. Data Structure and Representation 629 [7*s2.1-2.6] 630 Programsworkondata.Butdatamustbeexpressed 631 and organized within computers before being 632 processed by programs. This organization and 633 expression of data for programs use is the subject 634 of data structure and representation. Simply put, a 635 data structure tries to store and organize data in a 636 computer in such a way that the data can be used 637 efficiently. There are many types of data structures 638 andeachtypeofstructureissuitableforsomekinds 639 of applications. For example, B/B+ trees are well 640 suited for implementing massive file systems and 641 databases. 642 6.1DataStructureOverview 643 Data structures are computer representations of data. 644 Data structures are used in almost every program. In a 645 sense, no meaningful program can be constructed 646 without the use of some sort of data structure. Some 647 design methods and programming languages even 648 organize an entire software system around data 649 structures. Fundamentally, data structures are 650 abstractions defined on a collection of data and its 651 associated operations. 652 Often, data structures are designed for improving 653 program or algorithm efficiency. Examples of such 654 data structures include stacks, queues, and heaps. At 655 other times, data structures are used for conceptual 656 unity (abstract data type), such as the name and address 657 of a person. Often, a data structure can determine 658 whether a program runs in a few seconds or in a few 659 hours or even a few days. 660 IEEE SWEBOK Guide V3 From the perspective of physical and logical ordering, 661 a data structure is either linear or non-linear. Other 662 perspectives give rise to different classifications that 663 include homogeneous vs. heterogeneous, static vs. 664 dynamic, persistent vs. transit, external vs. internal, 665 primitive vs. aggregate, recursive vs. non-recursive; 666 passive vs. active; and stateful vs. stateless structures. 667 6.2 Types of Data Structure 668 As mentioned above, different perspectives can be used 669 to classify data structures. However, the predominate 670 perspective used in classification centers on physical 671 and logical ordering between data items. This 672 classification divides data structures into linear and 673 non-linear structures. Linear structures organize data 674 items in a single dimension in which each data entry 675 has one (physical or logical) predecessor and one 676 successor with the exception of the first and last entry. 677 The first entry has no predecessor and the last entry has 678 no successor. Non-linear structures organize data items 679 in two or more dimensions, in which case one entry can 680 have multiple predecessors and successors. Examples 681 of linear structures include lists, stacks, and queues. 682 Examples of non-linear structures include heaps, hash 683 tables, and trees (such as binary trees, balance trees, B- 684 trees, and so forth). 685 Another type of data structure that is often encountered 686 in programming is the so-called compound structure. A 687 compound data structure builds on top of other (more 688 primitive) data structures and, in some way, can be 689 viewed as the same structure as the underlying 690 structure. Examples of compound structures include 691 sets, graphs, and partitions. For example, a partition 692 can be viewed as a set of sets. 693 6.3OperationsonDataStructures 694 All data structures support some operations that 695 produce a specific structure and ordering, or 696 retrieve from the structure relevant data, or store 697 data into the structure. Basic operations supported 698 byalldatastructuresinclude: 699 z Build an initial structure (or initialize the 700 structure). 701 z Insert a data entry into the structure. 702 z Retrieve a data entry from the structure. 703 z Remove a data entry from the structure. 704 Some data structure also support additional 705 operations: 706 z Find a particular element in the structure. 707 z Sort all elements according to some ordering. 708 z Traverse all elements in some specific order. 709 z Reorganize or rebalance the structure. 710 Different structures support different operations 711 with different efficiencies. The difference between 712 operationefficiencycanbesignificant.Forexample, 713 it is easy to retrieve the last item inserted into a 714 stack,butfindingaparticularelementwithinastack 715 isratherslowandtedious. 716 7. Algorithms and Complexity 717 [7* s1.1-1.3, s3.3-3.6, s4.1-4.8, s5.1-5.7, s6.1-6.3, 718 s7.1-7.6, s11.1, s12.1] 719 Programs are not random pieces of code: they are 720 meticulously written to perform userexpected 721 actions. The guide one uses to compose programs 722 are algorithms, which organize various functions 723 intoaseriesofstepsandtakeintoconsiderationthe 724 application domain, the solution strategy, and the 725 data structures being used. An algorithm can be 726 verysimpleorverycomplex. 727 7.1OverviewofAlgorithms 728 Abstractly speaking, algorithms guide the 729 operations of computers and consist of an abstract 730 sequence of actions composed to solve a problem. 731 Alternative definitions include but are not limited 732 to: 733 z An algorithm is any well-defined computational 734 procedure that takes some value or set of values 735 as input and produces some value or set of values 736 as output. 737 z An algorithm is a sequence of computational steps 738 that transform the input into the output. 739 z An algorithm is a tool for solving a well-specified 740 computation problem. 741 Of course, different definitions are favored by 742 different people. Though there is no universally 743 accepted definition, some agreement exists that an 744 algorithm needs to be correct, finite (in other 745 words,terminateeventually),andunambiguous. 746 7.2AttributesofAlgorithms 747 The attributes of algorithms are many and often 748 include modularity, correctness, maintainability, 749 functionality, robustness, userfriendliness, 750 IEEE SWEBOK Guide V3 programmer time, simplicity, and extensibility. But 751 the most important attribute is the socalled 752 performance or efficiency, by which we mean 753 both time and resourceusage efficiency while 754 emphasizing the time axis. To some degree, 755 efficiency determines if an algorithm is feasible or 756 impractical. For example, an algorithm that takes 757 one hundred years to terminate is virtually useless 758 andisevenconsideredincorrect. 759 7.3AlgorithmicAnalysis 760 Analysis of algorithms is the theoretical study of 761 computerprogram performance and resource 762 usage; it determines the goodness of an algorithm. 763 Such analysis usually abstracts away the particular 764 detailsofaspecificcomputerandfocusesontheso 765 calledasymptotic,machineindependentanalysis. 766 There are three basic types of analysis. In worst 767 caseanalysis,onedeterminesthemaximumtimeor 768 resourcesrequiredbythealgorithmonanyinputof 769 sizen.Inaveragecaseanalysis,onedeterminesthe 770 expected time or resources required by the 771 algorithm over all inputs of size n; in performing 772 averagecase analysis, one often needs to make 773 assumptionsonthestatisticaldistributionofinputs. 774 The third type of analysis is the socalled bestcase 775 analysis, in which one determines the minimum 776 time or resources required by the algorithm on any 777 input of size n. Among the three types of analysis, 778 averagecaseanalysisisthemostusefulbutalsothe 779 mostdifficulttoperform. 780 Besides the basic analysis methods, there are also 781 theamortizedanalysis,inwhichonedeterminesthe 782 maximum time required by an algorithm over a 783 sequence of operations; and the competitive 784 analysis, in which one determines the relative 785 performance merit of an algorithm against the 786 optimalalgorithm(whichmaynotbeknown)inthe 787 samecategory(forthesameoperations). 788 7.4AlgorithmicDesignStrategies 789 The design of algorithms generally follows one of 790 the following strategies: brute force, divide and 791 conquer, dynamic programming, and greedy 792 selection. The brute force strategy is actually a no 793 strategy. It exhaustively tries every possible way to 794 tackle a problem. If a problem has a solution, this 795 strategy is guaranteed to find it; however, the time 796 expense may be too high. The divide and conquer 797 strategy improves on the brute force strategy by 798 dividing a big problem into smaller, homogeneous 799 problems. It solves the big problem by recursively 800 solving the smaller problems and combing the 801 solutions to the smaller problems to form the 802 solution to the big problem. The underlying 803 assumption for divide and conquer is that smaller 804 problemsareeasiertosolve. 805 The dynamic programming strategy improves on 806 the divide and conquer strategy by recognizing that 807 someofthesubproblemsproducedbydivisionmay 808 be the same and thus avoids solving the same 809 problems again and again. This elimination of 810 redundant subproblems can dramatically improve 811 efficiency. 812 The greedy selection strategy further improves on 813 dynamicprogrammingbyrecognizingthatnotallof 814 the subproblems contribute to the solution of the 815 big problem. By eliminating all but one sub 816 problem,thegreedyselectionstrategyachievesthe 817 highest efficiency among all algorithm design 818 strategies.Sometimestheuseofrandomizationcan 819 improve on the greedy selection strategy by 820 eliminating the complexity in determining the 821 greedy choice through coin flipping or 822 randomization. 823 7.5AlgorithmicAnalysisStrategies 824 The analysis strategies of algorithms include basic 825 counting analysis, in which one actually counts the 826 number of steps an algorithm takes to complete its 827 task; asymptotic analysis, in which one only 828 considers the order of magnitude of the number of 829 steps an algorithm takes to complete its task; 830 probabilistic analysis, in which one makes use of 831 probabilities in analyzing the average performance 832 of an algorithm; amortized analysis, in which one 833 uses the methods of aggregation, potential, and 834 accounting to analyze the worst performance of an 835 algorithm on a sequence of operations; and 836 competitive analysis, in which one uses methods 837 such as potential and accounting to analyze the 838 relativeperformanceofanalgorithmtotheoptimal 839 algorithm. 840 For complex problems and algorithms, one may 841 IEEE SWEBOK Guide V3 need to use a combination of the aforementioned 842 analysisstrategies. 843 8. Basic Concept of a System 844 [8*c2] 845 Ian Summerville writes that a system is a 846 purposeful collection of interrelated components 847 that work together to achieve some objective 848 [8*p21]. A system can be very simple and include 849 only a few components, like an ink pen, or rather 850 complex, like an aircraft. Depending on whether 851 humans are part of the system, systems can be 852 divided into technical computerbased systems and 853 sociotechnical systems, with the former excluding 854 the human factor and the latter including the 855 human factor. Examples of technical, computer 856 based systems include televisions, mobile phones, 857 andmostpersonalcomputersoftware.Examplesof 858 sociotechnical systems include manned space 859 vehicles, chips embedded inside a human, and so 860 forth. 861 8.1EmergentSystemProperties 862 A system is more than simply the sum of its parts. 863 Thus, the properties of a system are not simply the 864 sum of the properties of its components. Instead, a 865 systemoftenexhibitspropertiesthatareproperties 866 of the system as a whole. These properties are 867 called emergent properties because they emerge 868 only once the system components have been 869 integrated [8*p23]. Emergent system properties 870 can be either functional or nonfunctional. 871 Functional properties describe the functions that 872 can be achieved by the system. For example, an 873 aircrafts functional properties include flotation on 874 air, carrying people or cargo, and use as a weapon 875 of mass destruction. Nonfunctional properties 876 relate to the behavior of the system in its 877 operational environment and can include such 878 things as weight, volume, reliability, security, 879 usability,etc.[8*p23]. 880 8.2SystemEngineering 881 System engineering is the activity of specifying, 882 designing, implementing, validating, deploying, and 883 maintaining systems. The phases of system 884 engineering vary depending on the system being 885 builtbut,ingeneral,includerequirementdefinition, 886 system design, subsystem development, system 887 integration, system installation, system evolution, 888 andsystemdecommissioning. 889 Many practical guidelines have been produced in 890 thepasttoaidpeopleinperformingtheactivitiesof 891 each phase. For example, system design can be 892 brokenintosmallertasksofrequirementdefinition, 893 identification of subsystems, assignment of 894 requirements to subsystems, specification of sub 895 system functionality, definition of subsystem 896 interfaces,andsoforth. 897 8.3OverviewofaComputerSystem 898 Among all the systems, one that is obviously 899 relevant to the software engineering community is 900 thecomputersystem.Acomputerisamachinethat 901 executes programs or software. It consists of a 902 purposeful collection of mechanical, electrical, and 903 electronic components with each component 904 performing a preset function. Jointly, these 905 components are able to perform the computations 906 thataregivenbyaprogram. 907 Abstractly speaking, a computer receives some 908 input, stores and manipulates some data, and 909 provides some output. The most distinct feature of 910 a computer is its ability to store and execute 911 sequences of instructions called programs. An 912 interesting phenomenon concerning the computer 913 is the universal equivalence in functionality. 914 According to Turing, all computers with a certain 915 minimumcapabilityareequivalentintheirabilityto 916 perform computation tasks. In other words, given 917 enough time and memory, all computersranging 918 fromanetbooktoasupercomputerarecapableof 919 computing exactly the same things, irrespective of 920 speed,size,cost,oranythingelse. 921 Most computer systems have a structure that is 922 known as the von Neumann model, which 923 consists of five components: a memory for storing 924 instructions and data, a processing unit for 925 performing arithmetic and logical operations, a 926 control unit for sequencing and interpreting 927 instructions, input for getting external information 928 into the memory, and output for producing results 929 for the user. The basic components of a computer 930 system based on the von Neumann model is 931 depictedinFigure2. 932 IEEE SWEBOK Guide V3 933 Figure2.Basiccomponentsofacomputersystem 934 basedonthevonNeumannmodel 935 9. Computer Organization 936 [9*c1-c4] 937 From the perspective of a computer, a wide 938 semantic gap exists between its intended behavior 939 and the workings of the underlying electronic 940 devices that actually do the work within the 941 computer. This gap is bridged through computer 942 organization, which meshes various electrical, 943 electronic, and mechanical devices into one device 944 that forms a computer. The objects that computer 945 organization deals with are the devices, 946 connections, and controls. The abstraction built in 947 computerorganizationisthecomputer. 948 9.1ComputerOrganizationOverview 949 A computer generally consists of a CPU, memory, 950 input devices, and output devices. Abstractly 951 speaking, the organization of a computer can be 952 divided into four levels (Figure 3). The macro 953 architecture level is the formal specification of all 954 thefunctionsaparticularmachinecancarryoutand 955 is known as the instruction set architecture (ISA). 956 The micro architecture level is the implementation 957 of the ISA in a specific CPUin other words, the 958 way in which the ISAs specifications are actually 959 carriedout.Thelogiccircuitslevelisthelevelwhere 960 each functional component of the micro 961 architecture is built up of circuits that make 962 decisionsbasedonsimplerules.Thedeviceslevelis 963 the level where, finally, each logic circuit is actually 964 built of electronic devices such as complementary 965 metaloxide semiconductors (CMOS), nchannel 966 metal oxide semiconductors (NMOS), or gallium 967 arsenide(GaAs)transistors,andsoforth. 968 Macro Architecture Level (ISA) Micro Architecture Level Logic Circuits Level Devices Level Figure3.Machinearchitecturelevels 969 Eachlevelprovidesanabstractiontothelevelabove 970 and is dependent on the level below. To a 971 programmer, the most important abstraction is the 972 ISA, which specifies such things as the native data 973 types,instructions,registers,addressingmodes,the 974 memory architecture, interrupt and exception 975 handling, and the external I/Os. Overall, the ISA 976 specifies theability of a computer andwhat can be 977 doneonthecomputerwithprogramming. 978 9.2 Digital Systems 979 Atthelowestlevel,computationsarecarriedoutby 980 the electrical and electronic devices within a 981 computer. These electrical/electronic devices 982 operateinoneoftwomodes:onandoff.Whenwe 983 encode the on and off state with digital one and 984 zero, respectively, a computer becomes a digital 985 system. Everythingincluding instruction and 986 datais expressed or encoded using digital zeros 987 and ones. For example, decimal value 6 can be 988 encoded as 110, the addition instruction may be 989 encodedas0001,andsoforth. 990 9.3 Digital Logic 991 Obviously,logicsareneededtomanipulatedataand 992 to control the operation of computers. This logic, 993 which is behind a computers proper function, is 994 called digital logic because it deals with the 995 operations of digital zeros and ones. Digital logic 996 specifies the rules both for building various digital 997 devices from the simplest elements (such as 998 transistors) and for governing the operation of 999 digital devices. For example, digital logic spells out 1000 what the value will be if a zero and one is ANDed, 1001 ORed, or exclusively ORedtogether. It also specifies 1002 how to build decoders, multiplexers (MUX), 1003 memory, and adders that are used to assemble the 1004 computer. 1005 9.4ComputerExpressionofData 1006 IEEE SWEBOK Guide V3 As mentioned before, a computer expresses data 1007 with electrical signals or digital zero and one. Since 1008 there are only two different digits used in data 1009 expression, such a system is called a binary 1010 expression system. Due to the inherent nature of a 1011 binary system, the maximum numerical value 1012 expressible by an nbits binary code is 2 n 1. 1013 Specifically, binary number a n a n1 a 1 a 0 corresponds 1014 toa n 2 n +a n1 2 n1 ++a 1 2 1 +a 0 2 0 .Thus,thenumerical 1015 value of the binary expression of 1011 is 1016 18+0x4+1x2+11=11. To express a nonnumerical 1017 value, we need to decide the number of zeros and 1018 onestouseandtheorderinwhichthosezerosand 1019 onesarearranged. 1020 Of course, there are different ways to do the 1021 encoding, and this gives rise to different data 1022 expressionschemesandsubschemes.Forexample, 1023 integers can be expressed in the form of unsigned, 1024 ones complement, or twos complement. For 1025 characters, there are ASCII, Unicode, and IBMs 1026 EBCDICstandards.Forfloatingpointnumbers,there 1027 areIEEE754FP1,2,and3standards. 1028 9.5TheCentralProcessingUnit(CPU) 1029 The central processing unit is the place where 1030 instructions (or programs) are actually executed. 1031 The execution usually takes several steps, including 1032 fetching the program instruction, decoding the 1033 instruction, fetching operands, performing 1034 arithmetic and logical operations on the operands, 1035 and storing the result. The main components of a 1036 CPUconsistofregisterswhereinstructionsanddata 1037 are often read from and written to, the arithmetic 1038 and logic unit (ALU) that performs the actual 1039 arithmetic (such as addition, subtraction, 1040 multiplication,anddivision)andlogic(suchasAND, 1041 OR, shift, and so forth) operations, the control unit 1042 that is responsible for producing proper signals to 1043 control the operations, and various (data, address, 1044 and control) buses that link the components 1045 together and transport data to and from these 1046 components. 1047 9.6MemorySystemOrganization 1048 Memory is the storage unit of a computer. It 1049 concerns the assembling of a largescale memory 1050 system from smaller and singledigit storage units. 1051 The main topics covered by memory system 1052 architectureincludethefollowing: 1053 z Memory cells and chips 1054 z Memory boards and modules 1055 z Memory hierarchy and cache 1056 z Memory as a sub-system of the computer 1057 Memory cells and chips deal with singledigital 1058 storageandtheassemblingofsingledigitunitsinto 1059 onedimensional memory arrays as well as the 1060 assembling of onedimensional storage arrays into 1061 multidimensional storage memory chips. Memory 1062 boards and modules concern the assembling of 1063 memorychipsintomemorysystems,withthefocus 1064 being on the organization, operation, and 1065 management of the individual chips in the system. 1066 Memory hierarchy and cache are used to support 1067 efficient memory operations. Memory as a sub 1068 system deals with the interface between the 1069 memorysystemandotherpartsofthecomputer. 1070 9.7 Input and Output (I/O) 1071 A computer is useless without I/O. Common input 1072 devices include the keyboard and mouse; common 1073 output devices include the disk, the screen, the 1074 printer, and speakers. Different I/O devices operate 1075 at different data rates and reliabilities. How 1076 computers connect and manage various input and 1077 output devices to facilitate the interaction between 1078 computers and humans (or other computers) is the 1079 focusoftopicsinI/O.Themainissuesthatmustbe 1080 resolved in input and output are the ways I/O can 1081 andshouldbeperformed. 1082 In general, I/O is performed at both hardware and 1083 software levels. Hardware I/O can be performed in 1084 anyofthreeways.DedicatedI/OdedicatestheCPU 1085 to the actual input and output operations during 1086 I/O; memorymapped I/O treats I/O operations as 1087 memory operations; and hybrid I/O combines 1088 dedicatedI/OandmemorymappedI/Ointoasingle 1089 holisticI/Ooperationmode. 1090 Coincidentally, software I/O can also be performed 1091 inoneofthreeways.ProgrammedI/OletstheCPU 1092 wait while the I/O device is doing I/O; interrupt 1093 driven I/O lets the CPUs handling of I/O be driven 1094 bytheI/Odevice;anddirectmemoryaccess(DMA) 1095 lets I/O be handled by a secondary CPU embedded 1096 in a socalled DMA device (or channel). (Except 1097 IEEE SWEBOK Guide V3 during the initial setup, the main CPU is not 1098 disturbedduringaDMAI/Ooperation.) 1099 Regardless of the types of I/O scheme being used, 1100 the main issues involved in I/O include I/O 1101 addressing (which deals with the issue of how to 1102 identify the I/O device for a specific I/O operation), 1103 synchronization (which deals with the issue of how 1104 to make the CPU and I/O device work in harmony 1105 during I/O), and error detection and correction 1106 (which deals with the occurrence of transmission 1107 errors). 1108 10. Compiler Basics 1109 [6*s6.4 9*s8.4] 1110 10.1CompilerOverview 1111 To be understood by a computer, programs written 1112 in highlevel programming languages must be 1113 translatedintothespecificmachinelanguageofthe 1114 computer under consideration. This translation is 1115 usually performed by a piece of software called a 1116 compiler. This process of translation from a high 1117 level language to a machine language is called 1118 compilation(or,sometimes,interpretation). 1119 The primary tasks of a compiler may include 1120 preprocessing, lexical analysis, parsing, semantic 1121 analysis, code generation, and code optimization. 1122 Program faults caused by incorrect compiler 1123 behaviorcanbeverydifficulttotrackdown.Forthis 1124 reason, compiler implementers invest a lot of time 1125 ensuringthecorrectnessoftheirsoftware. 1126 10.2 Interpretation and Compilation 1127 There are two ways to translate a program written 1128 in a higherlevel language into machine code: 1129 interpretation and compilation. Interpretation 1130 translates the source code one piece at a time into 1131 machinelanguage,executesitonthespot,andthen 1132 goes back for another piece. Both the highlevel 1133 language source code and the interpreter are 1134 requiredeverytimetheprogramisrun. 1135 Compilation translates the highlevellanguage 1136 source code into an entire machinelanguage 1137 program (an executable image) by a programcalled 1138 a compiler. After compilation, only the executable 1139 image is needed to run the program. Most 1140 applicationsoftwareissoldinthisform. 1141 10.3 The Compilation Process 1142 Compilation is a complex task. Most compilers 1143 divide the compilation process into many phases. A 1144 typicalbreakdownisasfollows: 1145 z Lexical Analysis 1146 z Syntax Analysis or Parsing 1147 z Semantic Analysis 1148 z Code Generation 1149 Lexicalanalysispartitionstheinputtext(thesource 1150 code), which is a sequence of characters, into 1151 separate comments, which are to be ignored in 1152 subsequent actions, and basic symbols, which have 1153 lexical meanings. These basic symbols must 1154 correspond to some terminal symbols of the 1155 grammaroftheparticularprogramminglanguage. 1156 Syntaxanalysisisbasedontheresultsofthelexical 1157 analysis and discovers the structure in the program 1158 and determines whether or not a text conforms to 1159 an expected format. "Is this a textually correct C++ 1160 program?" or "Is this entry textually correct?" are 1161 typical questions that can be answered by syntax 1162 analysis. Syntax analysis determines if the source 1163 code of a program is correct and converts it into a 1164 more structured representation (parse tree) for 1165 semanticanalysisortransformation. 1166 Semanticanalysisaddssemanticinformationtothe 1167 parsetreebuiltduringthesyntaxanalysisandbuilds 1168 the symbol table. It performs various semantic 1169 checks that include type checking, object binding 1170 (associating variable and function references with 1171 theirdefinitions),anddefiniteassignment(requiring 1172 all local variables to be initialized before use). If 1173 mistakes are found, the incorrect programs are 1174 rejected. 1175 Once semantic analysis is complete, the phase of 1176 code generation begins and transforms the 1177 intermediate code produced in the previous phases 1178 into the native machine language of the computer 1179 under consideration. This involves resource and 1180 storagedecisionssuchasdecidingwhichvariables 1181 to fit into registers and memory and the selection 1182 andschedulingofappropriatemachineinstructions, 1183 alongwiththeirassociatedaddressingmodes. 1184 IEEE SWEBOK Guide V3 It is often possible to combine multiple phases into 1185 one pass over the code in a compiler 1186 implementation. Some compilers also have a 1187 preprocessing phase at the beginning or after the 1188 lexical analysis that does necessary housekeeping 1189 work, such as processing the program instructions 1190 for the compiler (directives). Some compilers 1191 provide an optional optimization phase at the end 1192 oftheentirecompilationtooptimizethecode(such 1193 as the rearrangement of instruction sequence) for 1194 efficiency and other desirable objectives requested 1195 bytheusers. 1196 11. Operating Systems Basics 1197 [6*c3] 1198 Every system of meaningful complexity needs to be 1199 managed. A computer, as a rather complex 1200 electricalmechanical system, needs its own 1201 manager for managing the resources and activities 1202 occurringonit.Thatmanageriscalledanoperating 1203 system(OS). 1204 11.1 Operating Systems Overview 1205 Operating systems manage and beautify computers 1206 such that they can be correctly and easily used by 1207 humans. Conceptually, an operating system is a 1208 computer program that manages the hardware 1209 resourcesandmakesiteasiertousebyapplications 1210 by presenting nice abstractions. This nice 1211 abstraction is often called the virtual machine and 1212 includes such things as processes, virtual memory, 1213 and file systems. An OS hides the complexity of the 1214 underlying hardware and is found on almost any 1215 moderncomputer. 1216 The principal roles played by OSs are management 1217 and illusion. Management refers to the OSs 1218 management (allocation and recovery) of physical 1219 resources among multiple competing 1220 users/applications/tasks. Illusion refers to the nice 1221 abstractionstheOSprovides. 1222 11.2TasksofanOperatingSystem 1223 The tasks of an operating system differ significantly 1224 depending on the machine and time of its invention. 1225 However, modern operating systems have come to 1226 agreement as to the tasks that must be performed by an 1227 OS. These tasks include CPU management, memory 1228 management, disk management (file system), I/O 1229 device management, and security and protection. Each 1230 OS task manages one type of physical resource. 1231 Specifically, CPU management deals with the 1232 allocation and releases of the CPU among competing 1233 programs (called processes/threads in OS jargon), 1234 including the operating system itself. The main 1235 abstraction provided by CPU management is the 1236 process/thread model. Memory management deals with 1237 the allocation and release of memory space among 1238 competing processes, and the main abstraction 1239 provided by memory management is virtual memory. 1240 Disk management deals with the sharing of magnetic 1241 disks among multiple programs/users and its main 1242 abstraction is file system. I/O device management deals 1243 with the allocation and releases of various I/O devices 1244 among competing processes. Security and protection 1245 deal with the protection of computer resources from 1246 illegal use. 1247 11.3OperatingSystemAbstractions 1248 The arsenal of OSs is abstraction. Corresponding to the 1249 five physical tasks, OSs use five abstractions: 1250 process/thread, virtual memory, file systems, 1251 input/output, and protection domains. The overall OS 1252 abstraction is the virtual machine. 1253 For each task area of OS, there is both a physical 1254 reality and a conceptual abstraction. The physical 1255 reality refers to the hardware resource under 1256 management; the conceptual abstraction refers to 1257 theinterfacetheOSpresentstotheusers/programs 1258 above.Forexample,inthethreadmodeloftheOS, 1259 thephysicalrealityistheCPUandtheabstractionis 1260 multiple CPUs. Thus, a user doesnt have to worry 1261 aboutsharingtheCPUwithotherswhenworkingon 1262 the abstraction provided by an OS. In the virtual 1263 memoryabstractionofanOS,thephysicalrealityis 1264 the physical RAM or ROM (whatever), the 1265 abstraction is multiple unlimited memory space. 1266 Thus, a user doesnt have to worry about sharing 1267 physical memory with others or about limited 1268 physicalmemorysize. 1269 11.4OperatingSystemsClassification 1270 Different operating systems can have different 1271 functionality implementation. In the early days of 1272 thecomputerera,operatingsystemswererelatively 1273 simple. As time goes on, the complexity and 1274 sophistication of operating systems increases 1275 significantly.Fromahistoricalperspective,operating 1276 IEEE SWEBOK Guide V3 systemcanbeclassifiedasoneofthefollowing. 1277 z Batching OS: organizes and processes work in 1278 batches. Examples of such OSs include IBMs 1279 FMS, IBSYS, and University of Michigans 1280 UMES. 1281 z Multi-programmed batching OS: adds multi-task 1282 capability into earlier simple batching OSs. An 1283 example of such an OS is IBMs OS/360. 1284 z Time-sharing OS: adds multi-task and interactive 1285 capabilities into the OS. Examples of such OSs 1286 include UNIX, Linux, and NT. 1287 z Real-time OS: adds timing predictability into the 1288 OS by scheduling individual tasks according to 1289 each tasks completion deadlines. Examples of 1290 such OS include VxWorks (WindRiver) and 1291 DART (EMC). 1292 z Distributed OS: adds the capability of managing a 1293 network of computers into the OS. 1294 z Embedded OS: has limited functionality and is 1295 used for embedded systems such as cars and 1296 PDAs. Examples of such OSs include Palm OS, 1297 Windows CE, and TOPPER. 1298 Alternatively, an OS can be classified by its 1299 applicable target machine/environment into the 1300 following. 1301 z Mainframe OS: runs on the mainframe computers 1302 and include OS/360, OS/390, AS/400, MVS, and 1303 VM. 1304 z Server OS: runs on workstations or servers and 1305 includes such systems as UNIX, Windows, Linux, 1306 and VMS. 1307 z Multi-computer OS: runs on multiple computers 1308 and include such examples as Novell Netware. 1309 z Personal computers OS: runs on personal 1310 computers and include such examples as DOS, 1311 Windows, Mac OS, and Linux. 1312 The most popular modern operating systems today 1313 include Microsoft Windows, UNIX, Linux, and Mac 1314 OSX. 1315 12. Database Basics and Data Management 1316 [6*c9] 1317 A database consists of an organized collection of data 1318 for one or more uses. In a sense, a database is a 1319 generalization and expansion of data structures. But the 1320 difference is that a database is usually external to 1321 individual programs and permanent in existence 1322 compared to data structures. Databases are used when 1323 the data volume is large or logical relations between 1324 data items are important. The factors considered in 1325 database design include performance, concurrency, 1326 integrity, and recovery from hardware failures. 1327 12.1EntityandSchema 1328 The things a database tries to model and store are 1329 called entities. Entities can be realworld objects 1330 such as persons, cars, houses, and so forth,. or 1331 abstract concepts such as human, salary, names, 1332 and so forth. An entity can be primitive such as a 1333 name or composite (such as an employee that 1334 consists of a name, identification number, salary, 1335 address,andsoforth). 1336 The single most important concept in a database is 1337 the socalled schema, which is a description of the 1338 entire database structure from which all other 1339 database activities are built. A schema defines the 1340 relationships between the various entities that 1341 compose a database. For example, a schema for a 1342 company payroll system would consist of such 1343 things as employee ID, name, salary rate, address, 1344 and so forth. Database software maintains the 1345 databaseaccordingtotheschema. 1346 Another important concept in database is the so-called 1347 database models that describe the type of relationship 1348 among various entities. The commonly used models 1349 include relational, network, and object models. 1350 12.2 Database Management Systems (DBMS) 1351 The software that manages the database is called 1352 thedatabasemanagementsystem(DBMS).ADBMS 1353 controls the creation, maintenance, and use of the 1354 databaseandisusuallycategorizedaccordingtothe 1355 database model it supportssuch as the relational, 1356 network, or objectmodel.For example,a relational 1357 databasemanagementsystem(RDBMS)implements 1358 featuresoftherelationalmodel.Anobjectdatabase 1359 managementsystem (ODBMS) implements features 1360 oftheobjectmodel. 1361 12.3DatabaseQueryLanguage 1362 Users/applicationsinteractwithadatabasethrough 1363 a database query language, which is a specialized 1364 programming language tailored to database use. 1365 The database model tends to determine the query 1366 languagesthatareavailabletoaccessthedatabase. 1367 One commonly used query language for the 1368 IEEE SWEBOK Guide V3 relational database is the structured query 1369 language, more commonly abbreviated as SQL. A 1370 commonquerylanguageforobjectdatabasesisthe 1371 object query language (abbreviated as OQL), 1372 although not all vendors implement this. An 1373 exampleofanSQLquerymaylooklike: 1374 SELECTComponent_No,Quantity 1375 FROMCOMPONENT 1376 WHEREItem_No=100 1377 TheabovequeryselectsalltheComponent_Noand 1378 its corresponding quantity from a database table 1379 called COMPONENT, where the Item_No equals to 1380 100. 1381 12.4TasksofDBMSPackages 1382 ADBMSsystemprovidesthefollowingcapabilities: 1383 z Database development is used to define and 1384 organize the content, relationships, and structure 1385 of the data needed to build a database. 1386 z Database interrogation is used for accessing the 1387 data in a database for information retrieval and 1388 report generation. End users can selectively 1389 retrieve and display information and produce 1390 printed reports. This is the operation that most 1391 users know about databases. 1392 z Database Maintenance is used to add, delete, 1393 update, correct, and protect the data in a database. 1394 z Application Development is used to develop 1395 prototypes of data entry screens, queries, forms, 1396 reports, tables, and labels for a prototyped 1397 application. It also refers to the use of 4GL (or 4th 1398 Generation Language) or application generators to 1399 develop or generate program code. 1400 12.5DataManagement 1401 A database must manage the data stored in it. This 1402 management includes both organization and 1403 storage. 1404 The organization of the actual data in a database 1405 depends on the database model. In a relational 1406 model, data are organized as tables with different 1407 tables representing different entities or relations 1408 among a set of entities. The storage of data deals 1409 with the storage of these database tables on disks. 1410 The common ways for achieving this is to use files. 1411 Sequential, indexed, and hash files are all used in 1412 this purpose with different file structures providing 1413 differentaccessperformanceandconvenience. 1414 12.6DataMining 1415 One often has to know what to look for before 1416 querying a database. This type of access does not 1417 make full use of the vast amount of information 1418 stored in the database, and in fact reduces the 1419 database into a collection of discrete records that 1420 are not related with each other. To take full 1421 advantageofadatabase,onecanperformstatistical 1422 analysis and pattern discovery on the content of a 1423 database using a technique called data mining. 1424 Suchoperationscanbeusedtosupportanumberof 1425 business activities that include, but are not limited 1426 to,marketing,frauddetection,andtrendanalysis. 1427 Numerous ways for performing data mining have 1428 been invented in the past decade and include such 1429 common techniques as class description, class 1430 discrimination,clusteranalysis,associationanalysis, 1431 andoutlieranalysis. 1432 13. Network Communication Basics 1433 [9*c12] 1434 A computer network connects a collection of 1435 computers and allows users of different computers 1436 to share resources with other users. A network 1437 facilitates the communications between all the 1438 connectedcomputersandmaygivetheillusionofa 1439 single, omnipotent computer. Every computer or 1440 device connected to a network is called a network 1441 node. 1442 Anumberofcomputingparadigmshaveemergedto 1443 benefitfromthefunctionsandcapabilitiesprovided 1444 by computer networks. These paradigms include 1445 distributed computing, grid computing, Internet 1446 computing,andcloudcomputing. 1447 13.1TypesofNetwork 1448 Computer networks are not all the same and may 1449 be classified according to a wide variety of 1450 characteristics, including the networks connection 1451 method, wired technologies, wireless technologies, 1452 scale, network topology, functions, and speed. But 1453 theclassificationthatisfamiliartomostisbasedon 1454 thescaleofnetworking. 1455 z Personal Area Network/Home Network is a 1456 computer network used for communication 1457 IEEE SWEBOK Guide V3 among computer(s) and different information 1458 technological devices close to one person. The 1459 devices connected to such a network may include 1460 PCs, faxes, PDAs, and TVs. This is the base on 1461 which the Internet of Things is built. 1462 z Local Area Network (LAN) connects computers 1463 and devices in a limited geographical area, such 1464 as a school campus, computer laboratory, office 1465 building, or closely positioned group of buildings. 1466 z Campus Network is a computer network made up 1467 of an interconnection of local area networks 1468 (LANs) within a limited geographical area. 1469 z Wide area network (WAN) is a computer network 1470 that covers a large geographic area, such as a city 1471 or country or even across intercontinental 1472 distances. A WAN limited to a city is sometimes 1473 called a Metropolitan Area Network. 1474 z Internet is the global network that connects 1475 computers located in many (perhaps all) 1476 countries. 1477 Other classifications may divide networks into 1478 control networks, storage networks, virtual private 1479 networks (VPN), wireless networks, pointtopoint 1480 networks,andInternetofThings. 1481 13.2BasicNetworkComponents 1482 All networks are made up of the same basic hardware 1483 components, including computers, network interface 1484 cards (NICs), bridges, hubs, switches, and routers. All 1485 these components are called nodes in the jargon of 1486 networking. Each component performs a distinctive 1487 function that is essential for the packaging, connection, 1488 transmission, amplification, controlling, unpacking, 1489 and interpretation of the data. For example, a repeater 1490 amplifies the signals, a switch performs many-to-many 1491 connections, a hub performs one-to-many connections, 1492 an interface card is attached to the computer and 1493 performs data packing and transmission, a bridge 1494 connects one network with another, and a router is a 1495 computer itself and performs data analysis and flow 1496 control to regulate the data from the network. 1497 The functions performed by various network 1498 components correspond to the functions specified by 1499 one or more levels of the seven-layer OSI networking 1500 model depicted in Figure 4. 1501 13.3NetworkingProtocolsandStandards 1502 Computers talk with each other using protocols, 1503 which specify the format and regulations used to 1504 pack and unpack data. To facilitate easier 1505 communication and better structure, network 1506 protocolsaredividedintodifferentlayerswitheach 1507 layer dealing with one aspect of the 1508 communication. For example, the physical layers 1509 deal with the physical connection between the 1510 partiesthataretocommunicate,thedatalinklayer 1511 deals with the raw data transmission and flow 1512 control, and the network layer deals with the 1513 packing and unpacking of data into a particular 1514 format that is understandable by the relevant 1515 parties. The most commonly used OSI networking 1516 model organizes network protocols into seven 1517 layers,asdepictedinFigure4. 1518 Figure 4. The seven-layer OSI networking model 1519 Application Layer Presentation Layer Session Layer Transport Layer Network Layer Data link Layer Physical Layer One thing to note is that not all network protocols 1520 implementalllayersoftheOSImodel.Forexample, 1521 the TCP/IP protocol implements neither the 1522 presentationlayernorthesessionlayer. 1523 Therecanbemorethanoneprotocolforeachlayer. 1524 Forexample,UDPandIPbothworkonthenetwork 1525 layer. Physical layer protocols include token ring, 1526 Ethernet, fast Ethernet, gigabit Ethernet, and 1527 wireless. Data link layer protocols include frame 1528 relay, asynchronous transfer mode (ATM), and 1529 PointtoPoint Protocol (PPP). Application layer 1530 protocols include Fibre channel, Small Computer 1531 System Interface (SCSI), and Bluetooth. For each 1532 layeroreveneachindividualprotocol,theremaybe 1533 standards established by national or international 1534 organizations to guide the design and development 1535 ofthecorrespondingprotocols. 1536 13.4TheInternet 1537 The Internet is a global system of interconnected 1538 governmental, academic, corporate, public, and 1539 private computer networks. Most people gain 1540 access to the Internet through the service of so 1541 called Internet service providers (ISP). The ISP 1542 maintains one or more switching centers called a 1543 point of presence, which is often known to users 1544 through their domain names such 1545 as.com,.gov,.org,.net,and.cn. 1546 IEEE SWEBOK Guide V3 13.5InternetofThings 1547 The Internet of Things refers to the networking of 1548 everyday objectssuch as cars, cell phones, PDAs, 1549 TVs, refrigerators, and even buildingsusing wired 1550 or wireless networking technologies. The function 1551 andpurposeofInternetofThingsistointerconnect 1552 allthingstofacilitateautonomousandbetterliving. 1553 Technologies used in the Internet of Things include 1554 RFID, wireless and wired networking, sensor 1555 technology, and much software of course. As the 1556 paradigm of Internet of Things is still shaping up, 1557 many works are needed for Internet of Things to 1558 actuallygainwidespreadacceptance. 1559 13.6VirtualPrivateNetwork(VPN) 1560 A Virtual Private Network is a computer network in 1561 which some or all of the links between the nodes 1562 arecarriedbyopenconnectionsorvirtualcircuitsin 1563 some larger network (for example, the Internet). 1564 VPNs are most often used to separate the traffic of 1565 differentusercommunitiesandoftenprovidebetter 1566 securitytousers. 1567 14. Parallel and Distributed Computing 1568 [9*c9] 1569 Parallel computing is a computing paradigm that 1570 emerges with the development of multi-functional units 1571 within a computer. The main objective of parallel 1572 computing is to execute several tasks simultaneously 1573 on different functional units and thus improve 1574 throughput or response or both. With the emergence of 1575 computer networks, parallel computing has also taken 1576 on a new perspective in which the parallelism occurs 1577 on a greater scale. Distributed computing, on the other 1578 hand, is a computing paradigm that emerges with the 1579 development of computer networks. Its main objective 1580 is to either make use of multiple computers in the 1581 network to accomplish things otherwise not possible 1582 within a single computer or improve computation 1583 efficiency by harnessing the power of multiple 1584 computers. 1585 14.1 Parallel and Distributed Computing Overview 1586 Traditionally, parallel computing investigates ways to 1587 maximize concurrency (the simultaneous execution of 1588 multiple tasks) within the boundary of a computer. 1589 Distributed computing studies distributed systems, 1590 which consists of multiple autonomous computers that 1591 communicate through a computer network. 1592 Alternatively, distributed computing can also refer to 1593 the use of distributed systems to solve computational or 1594 transactional problems. In the former definition, 1595 distributed computing investigates the protocols, 1596 mechanisms, and strategies that provide the foundation 1597 for distributed computation; in the latter definition, 1598 distributed computing studies the ways of dividing a 1599 problem into many tasks and assigning such tasks to 1600 various computers involved in the computation. 1601 Fundamentally, distributed computing is another 1602 form of parallel computing, albeit on a grander 1603 scale.Indistributedcomputing,thefunctionalunits 1604 are not ALU, FPU, or separate cores, but individual 1605 computers. For this reason, some people regard 1606 distributedcomputingasbeingthesameasparallel 1607 computing. Because both distributed and parallel 1608 computing involve some form of concurrency, they 1609 arebothalsocalledconcurrentcomputing. 1610 14.2 Difference between Parallel and Distributed 1611 Computing 1612 Though parallel and distributed computing 1613 resemble each other on the surface, there is a 1614 subtle but real distinction between them: parallel 1615 computing does not necessarily refer to the 1616 execution of programs on different computers 1617 instead, they can be run on different processors 1618 within a single computer. In fact, consensus among 1619 computing professionals limits the scope of parallel 1620 computing to the case where a shared memory is 1621 used by all processors involved in the computing, 1622 while distributed computing refers to computations 1623 where private memory exists for each processor 1624 involvedinthecomputations. 1625 Thus,itispossibletoclassifyconcurrentsystemsas 1626 being "parallel" or "distributed" based on the 1627 existence or nonexistence of shared memory 1628 among all the processors. Thus, parallel computing 1629 deals with computations within a single computer; 1630 distributed computing deals with computations 1631 within a set of computers. According to this view, 1632 multicore computing is a form of parallel 1633 computing. 1634 14.3ParallelandDistributedComputingModels 1635 Since multiple computers/processors/cores are 1636 involved in distributed/parallel computing, some 1637 coordination among the involved parties is 1638 necessarytoensurecorrectbehaviorofthesystem. 1639 IEEE SWEBOK Guide V3 Different ways of coordination give rise to different 1640 computing models. The most common models in 1641 this regard are the shared memory (parallel) model 1642 andthemessagepassing(distributed)model. 1643 In a shared memory (parallel) model, all computers 1644 have access to a shared memory. The algorithm 1645 designerchoosestheprogramforexecutionbyeach 1646 computer. Access to the memory can be 1647 synchronous or asynchronous, and must be 1648 coordinated such that coherency is maintained. 1649 Different access models have been invented for 1650 suchapurpose. 1651 In a messagepassing (distributed) model, all 1652 computersrunthesameprogram.Thesystemmust 1653 work correctly regardless of the structure of the 1654 network. This model can be further classified into 1655 clientserver (CS), browserserver (BS), and many 1656 layers (ML) models. In the CS model, the server 1657 provides services and the client requests services 1658 from the server. In the socalled BS model, the 1659 server provides services and the client is the 1660 browser. In the ML model, each layer provides 1661 services to the layer immediately above it and 1662 requestsservicesfromthelayerimmediatelybelow 1663 it. In fact, the ML model can be seen as a chain of 1664 clientserver models. Often, the layers between the 1665 bottommost layer and the topmost layer are called 1666 middleware, which is a distinct subject of study in 1667 itsownright. 1668 14.4 Main Issues in Distributed Computing 1669 Coordination among all the components in a 1670 distributed computing environment is often 1671 complex and timeconsuming. As the number of 1672 cores/CPUs/computers increases, the complexity of 1673 distributed computing also increases. Among the 1674 many issues faced, memory coherency and 1675 consensus among all computers are the most 1676 difficult ones. Many computation paradigms have 1677 been invented to solve these problems and are the 1678 main discussion issues in distributed/parallel 1679 computing. 1680 15. Basic User Human Factors 1681 [4*c8, 5*c5] 1682 Software is developed to meet human desires or 1683 needs. Thus, all software design and development 1684 must take into consideration humanuser factors 1685 suchashowpeopleusesoftware,howpeopleview 1686 software, and what humans expect from software. 1687 There are numerous factors in the humanmachine 1688 interaction, but the basic humanuser factors 1689 considered here include input/output, the handling 1690 of error messages, and the robustness of the 1691 softwareingeneral. 1692 15.1InputandOutput 1693 Input and output are the interfaces between users 1694 andsoftware.Softwareisuselesswithoutinputand 1695 output. Humans design software to process some 1696 input and produce desirable output. All software 1697 engineers must consider input and output as an 1698 integral part of the software product they engineer 1699 ordevelop. Issuesconsideredforinputinclude(but 1700 arenotlimitedto): 1701 z What input is required? 1702 z How is the input passed from users to computers? 1703 z What is the most convenient way for users to 1704 enter input? 1705 z What format does the computer require of the 1706 input data? 1707 For output, we need to consider what the users 1708 wishtosee: 1709 z In what format would users like to see output? 1710 z What is the most pleasing way to display output? 1711 Furthermore,ifthepartyreceivingtheoutputisnot 1712 human but another computer or control system, 1713 then we need to consider the output type and 1714 format that the software should produce to ensure 1715 properdatafeedintoanothersystem. 1716 There are many rules of thumb for developers to 1717 follow to produce good input/output for a system. 1718 These rules of thumb include simple and natural 1719 dialogue, speaking users language, minimizing user 1720 memoryload,andconsistency. 1721 15.2ErrorMessages 1722 Itisunderstandablethatsoftwarecontainsbugsand 1723 failsfromtimetotime.Butusersshouldbenotified 1724 if there is anything that impedes the smooth 1725 execution of the program. Nothing is more 1726 IEEE SWEBOK Guide V3 frustrating than an unexpected termination or 1727 behavioral deviation of software without any 1728 warning or explanation. To be user friendly, the 1729 software should report all error conditions to the 1730 users or upperlevel applications so that some 1731 measure can be taken to rectify the situation or to 1732 exit gracefully. There are several guidelines that 1733 definewhatconstitutesagooderrormessage:error 1734 messagesshouldbeclear,tothepoint,andtimely. 1735 First, error messages should clearly explain what is 1736 happening so that users know what is going on in 1737 thesystem.Second,errormessagesshouldpinpoint 1738 the cause of the error, if at all possible, so that 1739 proper actions can be taken. Third, error messages 1740 should be displayed right when the error condition 1741 occurs. According to Jakob Nielsen, Good error 1742 messagesshouldbeexpressedinplainlanguage(no 1743 codes), precisely indicate the problem, and 1744 constructivelysuggestasolution[5*p20]. 1745 15.3SoftwareRobustness 1746 Softwarerobustnessreferstotheabilityofsoftware 1747 to tolerate erroneous inputs. Software is said to be 1748 robust if it continues to function even when 1749 erroneousinputsaregiven.Thus,itisunacceptable 1750 for software to simply crash when encountering an 1751 input problem as this breaks the abstraction and 1752 may cause unexpected consequences, such as the 1753 loss of valuable data. Software that exhibits such 1754 behaviorisconsideredtolackrobustness. 1755 Nielsen gives a simpler (though not very accurate) 1756 description of software robustness: The software 1757 shouldhavealowerrorrate,sothatusersmakefew 1758 errors during the use of the system and so that if 1759 they do make errors they can easily recover from 1760 them. Further, catastrophic errors must not occur 1761 [5*p26]. 1762 There are many ways to evaluate the robustness of 1763 software and just as many ways to make software 1764 more robust. For example, to improve robustness, 1765 one should always check the validity of the inputs 1766 and return values before progressing further; one 1767 should always throw an exception when something 1768 unexpected occurs, and one should never quit a 1769 program without first giving users/applications a 1770 chancetocorrectthecondition.. 1771 16. Basic Developer Human Factors 1772 [4*c31-32] 1773 Developerhumanfactorsrefertotheconsiderations 1774 of human factors taken when developing software. 1775 Softwareisdevelopedbyhumans,readbyhumans, 1776 and maintained by humans. If anything is wrong, 1777 humans are responsible for correcting those 1778 wrongs. Thus, it is essential to write software in a 1779 way that is easily understandable by humans or, at 1780 the very least, by other software developers. A 1781 program that is easy to read and understand 1782 exhibitsasocalledreadability. 1783 The means to ensure that software meet this 1784 objective are numerous and range from proper 1785 architecture at the macro level to the particular 1786 coding style and variable usage at the micro level. 1787 But the two prominent factors are structure (or 1788 programlayouts)andcomments(documentation). 1789 16.1Structure 1790 Most people prefer structure to chaos. It is human 1791 nature to prefer a wellstructured program to 1792 randomly organized sections of code. Structure not 1793 only looks good but also makes programs easier to 1794 understand and maintain. It is said that a well 1795 structured program is selfexplanatory. But if a 1796 program is poorly structured, then no amount of 1797 explanation or comments is sufficient to make it 1798 understandable.Thewaystoorganizeaprogramare 1799 numerous and range from the proper use of white 1800 space, indentation, and parentheses to nice 1801 arrangements of groupings, blank lines, and braces. 1802 Whateverstyleonechooses,itshouldbeconsistent 1803 acrosstheentireprogram. 1804 16.2Comments 1805 To most people, programming is coding. These 1806 people do not realize that programming also 1807 includes writing comments and that comments are 1808 an integral part of programming. True, comments 1809 are not used by the computer and certainly do not 1810 constitute final instructions for the computer, but 1811 they improve the readability of the programs by 1812 explaining the meaning and logic of the statements 1813 or sections of code. It should be remembered that 1814 programs are not only meant for computers, they 1815 arealsoread,written,andmodifiedbyhumans. 1816 IEEE SWEBOK Guide V3 The types of comments include repeat of the code, 1817 explanation of the code, marker of the code, 1818 summary of the code, description of the codes 1819 intent, and information that cannot possibly be 1820 expressedbythecodeitself.Thebestcommentsare 1821 selfdocumentingcode.Ifthecodeiswritteninsuch 1822 a clear and precise manner that its meaning is self 1823 proclaimed,thennocommentisneeded.Butthisis 1824 easier said than done. Most programs are not self 1825 explanatory and are often hard to read and 1826 understandifnocommentsaregiven. 1827 Here are some general guidelines for writing good 1828 comments: 1829 z Comments should be consistent across the entire 1830 software. 1831 z Each function should be associated with 1832 comments that explain the purpose of the function 1833 and its role in the overall software. 1834 z Within a function, comments should be given for 1835 each logical section of coding to explain the 1836 meaning and purpose (intention) of the section. 1837 z Comments are seldom required for individual 1838 statements. If a statement needs comments, one 1839 should reconsider the statement. 1840 17. Secure Coding 1841 [10*c29] 1842 Due to increasing malicious activities targeted at 1843 computer systems, security has become a significant 1844 issue in the development of software systems. In 1845 addition to the usual correctness and reliability, 1846 software developers must also pay attention to the 1847 security of the software they develop. Secure coding is 1848 one of the ways to ensure security. 1849 17.1TwoAspectsofSecureCoding 1850 Secure coding means different things for different 1851 people. It can mean the way a specific function is 1852 coded,suchthatthecodingitselfissecure,oritcan 1853 meanthecodingofsecurityintosoftwaresystems. 1854 Most people entangle the two together without 1855 distinction. One reason for such entanglement is 1856 that it is not clear how one can make sure that a 1857 specific coding is secure. For example, in C 1858 programminglanguage,theexpressionofi<<1(shift 1859 the value of variable i to the left by 1 bit) and 2*i 1860 (multiplythevalueofvariableibyconstant2)mean 1861 the same thing semantically, but do they have the 1862 same security ramification? Due to this lack of 1863 understanding,securecodinginitscurrentstateof 1864 existencemostly refers to the second aspect 1865 mentioned above: the coding of security into 1866 softwaresystem. 1867 17.2CodingSecurityintoSoftware 1868 A generally accepted view concerning software 1869 security is that it is much better to design security 1870 into software than to patch it in after software is 1871 developed. To design security into software, one 1872 must take into consideration most, if not every, 1873 stage of the software development life cycle. In 1874 particular, secure coding involves requirement 1875 security, design security, and implementation 1876 security, which are described in the next few 1877 sections. 1878 17.3RequirementSecurity 1879 Requirement security deals with the clarification 1880 and specification of security policy and objectives 1881 into software requirements, which lays the 1882 foundation for all future security consideration in 1883 the software development process. Factors to 1884 consider in this phase include requirements and 1885 threats. The former refers to the specific functions 1886 that arerequired for the sake of security; the latter 1887 refers to the possible ways that the security of 1888 softwareisthreatened. 1889 17.4DesignSecurity 1890 Design security deals with the construction of 1891 software modules that fit together to meet the 1892 security objective specified in the security 1893 requirements. This step clarifies the details of 1894 security considerations and develops the specific 1895 steps for implementation. Factors considered may 1896 include frameworks and access modes that set up 1897 the overall security monitoring/enforcement 1898 strategies, as well as the individual policy 1899 enforcementmechanisms. 1900 17.5ImplementationSecurity 1901 Implementationsecurityisdirectlyrelatedtocoding 1902 and is the most relevant to secure coding among 1903 the three phases of coding security into software. 1904 All secure requirements and designs are useless if 1905 IEEE SWEBOK Guide V3 the implementation is not secure. Implementation 1906 securityconcernsitselfwiththequestionofhowto 1907 write actual codes for specific situations such that 1908 securityconsiderationsaretakencareof. 1909 Implementation security can be achieved by 1910 following some recommended rules. A few such 1911 rulesfollow[10*pp880910]: 1912 z Structure the process so that all sections requiring 1913 extra privileges are modules. The modules should 1914 be as small as possible and should perform only 1915 those tasks that require those privileges. 1916 z Ensure that any assumptions in the program are 1917 validated. If this is not possible, document them 1918 for the installers and maintainers so they know the 1919 assumptions that attackers will try to invalidate. 1920 z Ensure that the program does not share objects in 1921 memory with any other program. 1922 z The error status of every function must be 1923 checked. Do not try to recover unless neither the 1924 cause of the error nor its effects affect any 1925 security considerations. The program should 1926 restore the state of the system to the state it had 1927 before the process began, and then terminate. 1928 [1] The Joint Task Force on Computing 1929 Curricula, et al., "Software Engineering 1930 2004: Curriculum Guidelines for 1931 Undergraduate Degree Programs in 1932 SoftwareEngineering,"2004. 1933 [2] S. T. Frezza, "Computer Science: Is It Really 1934 the Scientific Foundation for Software 1935 Engineering,"Computer,vol.43,no.8(Aug. 1936 2010),pp.98101. 1937 [3*] G. Voland, Engineering by Design, 2nd ed. 1938 UpperSaddleRiver,NJ:PrenticeHall,2003. 1939 [4*] S. McConnell, Code Complete, 2nd ed. 1940 Redmond,WA:MicrosoftPress,2004. 1941 [5*] J. Nielsen, Usability Engineering, 1st ed. 1942 Boston:MorganKaufmann,1993. 1943 [6*] J. G. Brookshear, Computer Science: An 1944 Overview, 10th ed. Boston: Addison 1945 Wesley,2008. 1946 [7*] E. Horowitz, et al., Computer Algorithms, 1947 2nded.Summit,NJ:SiliconPress,2007. 1948 [8*] I. Sommerville, Software Engineering, 8th 1949 ed.NewYork:AddisonWesley,2006. 1950 [9*] L. Null and J. Lobur, The Essentials of 1951 Computer Organization and Architecture, 1952 2nd ed. Sudbury, MA: Jones and Bartlett 1953 Publishers,2006. 1954 [10*] M. Bishop, Computer Security: Art and 1955 Science.Boston:AddisonWesley,2002. 1956 1957 1958 IEEE SWEBOK Guide V3 Matrix of Topics vs. References for the Computing Foundations KA 1959 1960 [1*] [2*] [3*] [4*] [5*] [6*] [7*] [8*] 1. Problem Solving Techniques c3, s4.1-4.2 -- 35 pp 1.1 Definition of Problem Solving s3.1 1.2 Formulating the Real Problem s3.2 1.3 Analyze the Problem s3.3 1.4 Design a Solution Search Strategy s4.2 1.5 Problem Solving using Programs c5 2. Abstraction s5.2-5.4 -- 41 pp
3.1 The Programming Process c6-c19 3.2 Programming Paradigms c6-c19 3.3 Defensive Programming c8 4. Programming Language Basics c6 -- 60 pp
4.1 Programming Language Overview s6.1 4.2 Syntax and Semantics of Programming Language S6.2 4.3 Low Level Programming Language s6.5-6.7 4.4 High Level Programing Language s6.5-6.7 4.5 Declarative vs. Imperative Programming Language s6.5-6.7 5. Debugging Tools and Techniques c23 -- 28 pp 5.1 Types of Errors s23.1 5.2 Debugging Techniques: s23.2 5.3 Debugging Tools s23.5 6. Data Structure and Representation s2.1-2.6 -- 47 pp 6.1 Data Structure Overview s2.1-2.6 6.2 Types of Data Structure s2.1-2.6 6.3 Operations on Data Structures s2.1-2.6 7. Algorithms and Complexity s1.1-1.3, s3.3-3.6, s4.1- 4.8, s5.1-5.7, s6.1-6.3, s7.1-7.6, s11.1, s12.1 --
8.1 Emergent System Properties s2.1 8.2 System Engineering s2.2 8.3 Overview of a Computer System 9. Computer Organization c1-4 -- 96 pp
9.1 Computer Organization Overview s1.1-1.2 9.2 Digital Systems c3 9.3 Digital Logic c3 9.3 Computer Expression of Data c2 9.4 The Central Processing Unit (CPU) s4.1-4.2 9.5 Memory System Organization s4.6 9.6 Input and Output (I/O) s4.5 10. Compiler Basics s6.4 s8.4 -- 11 pp 10.1 Compiler Overview s8.4 10.2 Interpretation and Compilation s8.4-8.5 10.3 The Compilation Process s6.4 s8.4 11. Operating Systems Basics c3 -- 26 pp
11.1 Operating Systems Overview s3.2 11.2 Tasks of Operating System s3.3 11.3 Operating System Abstractions s3.2 11.4 Operating Systems Classification s3.1 12. Database Basics and Data Management c9 -- 44 pp
12.1 Entity and Schema s9.1 12.2 Database Management Systems (DBMS) s9.1 12.3 Database Query Language s9.2 12.4 Tasks of DBMS Packages s9.2 IEEE SWEBOK Guide V3 12.5 Data Management s9.5 12.6 Data Mining s9.6 13. Network Communication Basics c12 -- 70 pp 13.1 Types of Network s12.2-12.3 13.2 Basic Network Components s12.6 13.3Networking Protocols and Standards s12.4-12.5 13.4 The Internet 13.5Internet of Things s12.8 13.6 Virtual Private Network 14. Parallel and Distributed Computing c9 -- 65 pp 14.1 Parallel and Distributed Computing Overview s9.4.1-9.4.3 14.2 Difference between Parallel and Distributed Computing s9.4.4-9.4.5 14.3 Parallel and Distributed Computing Models s9.4.4-9.4.5 14.4 Main Issues in Distributed Comp. 15. Basic User Human Factors c8 28 pp c5 50 pp 15.1 Input and Output s5.1, s5.3 15.2 Error Messages s5.2, s5.8 15.3 Software Robustness s5.5-5.6 16. Basic Developer Human Factors c31-32 -- 90 pp
17.1 Two Aspects of Secure Coding s29.1 17.2 Coding Security into Software s29.4 17.3 Requirement Security s29.2 17.4 Design Security s29.3 17.5 Implementation Security s29.5 Total Pages Referenced in the Computing Foundations KA 1367 pages 1961 Recommended references for the Knowledge Area 1962 [1*] M. Bishop, Computer Security: Art and Science. Boston: Addison-Wesley, 2002. 1963 [2*] J. G. Brookshear, Computer Science: An Overview, 10th ed. Boston: Addison-Wesley, 2008. 1964 [3*] E. Horowitz, et al., Computer Algorithms, 2nd ed. Summit, NJ: Silicon Press, 2007. 1965 [4*] S. McConnell, Code Complete, 2nd ed. Redmond, WA: Microsoft Press, 2004. 1966 IEEE SWEBOK Guide V3 [5*] J. Nielsen, Usability Engineering, 1st ed. Boston: Morgan Kaufmann, 1993. 1967 [6*] L. Null and J. Lobur, The Essentials of Computer Organization and Architecture, 2nd ed. Sudbury, MA: Jones and Bartlett Publishers, 2006. 1968 [7*] I. Sommerville, Software Engineering, 8th ed. New York: Addison-Wesley, 2006. 1969 [8*] G. Voland, Engineering by Design, 2nd ed. Upper Saddle River, NJ: Prentice Hall, 2003. 1970 1971 IEEE SWEBOK Guide V3 Further Readings List for the Computing Foundations KA [17] 1972 1973 1. A.V.Aho,R.Sethi,andJ.Ullman,Compilers:Principles,Techniques,andTools,AddisonWesley,2006. 1974 Reason for inclusion: The book is considered by many as one of the primary sources for knowledge on 1975 compiler and its construction. It provides more indepth discussion on compiler, in general, and the 1976 compilationprocess,inparticular. 1977 1978 2. T.H.Cormen,etal.,IntroductiontoAlgorithm,3rded.,TheMITPress,2009. 1979 Reason for inclusion: The book is considered by many as one of the primary sources for knowledge on 1980 algorithms and is used by many universities as either a textbook or a reference book. It is brought in to 1981 provideanalternativevieworperspectiveonalgorithms. 1982 1983 3. ACM SIGPLAN Education Board, "Why Undergraduates Should Learn the Principles of Programming 1984 Languages,"2010. 1985 Reasonforinclusion:ThispaperiscitedintheKAdescription. 1986 1987 4. S.T. Frezza, "Computer Science: Is It Really the Scientific Foundation for Software Engineering?" Computer, 1988 IEEECSPress,vol.43,no.8,Aug.2010,pp.98101. 1989 Reasonforinclusion:ThispaperiscitedintheKAdescription. 1990 1991 5. Graduate Software Engineering 2009, "Curriculum Guidelines for Graduate Degree Programs in Software 1992 Engineering,"Aug.2010,www.gswe2009.org. 1993 Reasonforinclusion:TheseguidelinesarecitedintheKAdescription. 1994 1995 6. A.Silberschatz,P.B.Galvin,andG.Gagne,OperatingSystemConcepts,8thed.,Wiley,2008. 1996 Reasonforinclusion:ThisisacommonlyusedtextbookinmanyAmericanunviersitiesinanoperatingsystems 1997 course. It provides more indepth discussion on operating systems basics and is already included in the 1998 consolidatedreferencelistforKAsotherthantheComputingFoundationsKA. 1999 2000 7. The Joint Task Force on Computing Curricula, IEEE Computer Society, and Association for Computing 2001 Machinery, "Software Engineering 2004: Curriculum Guidelines for Undergraduate Degree Programs in 2002 SoftwareEngineering,"2004. 2003 Reasonforinclusion:TheseguidelinesarecitedintheKAdescription. 2004 2005