Good Programming Practices and Guidelines


"Any fool can write code that computers can understand; good programmers write code that humans can understand." (Fowler)

As we have discussed earlier, in most cases maintainability is the most desirable quality of a software artifact. Code is no exception: good software ought to have code that is easy to maintain. That is, it is not enough to write code that works; it is important to write code that works and is easy to understand so that it can be maintained. The three basic principles that guide maintainability are simplicity, clarity, and generality (or flexibility). Software is easy to maintain if it is easy to understand and easy to enhance. Simplicity and clarity make the code easier to understand, while flexibility facilitates easy enhancement of the software.

Self Documenting Code

From a maintenance perspective, what we need is what is called self-documenting code. Self-documenting code is code that explains itself without the need for comments and extraneous documentation such as flowcharts, UML diagrams, process-flow state diagrams, etc. That is, the meaning of the code should be evident just by reading the code, without having to refer to information outside it.

The question is: how can we write code that is self-documenting? A number of attributes contribute towards making a program self-documenting. These include the size of each function, the choice of variable and other identifier names, the style of writing expressions, the structure of programming statements, comments, modularity, and issues relating to performance and portability. The following discussion elaborates on these points.

1.1 Function Size

The size of individual functions plays a significant role in making a program easy or difficult to understand. In general, as a function becomes longer, it becomes more difficult to understand. Ideally, a function should not be longer than 20 lines of code and in any case should not exceed one page in length. Where do the numbers "20 lines" and "one page" come from? Twenty is approximately the number of lines of code that fit on a computer screen, and one page refers to one printed page. The idea behind these heuristics is that when reading a function, one should not need to go back and forth from one screen to another or from one page to another; the entire context should be present on one page or one screen.

1.2 Modularity

As mentioned earlier, abstraction and encapsulation are two important tools that can help in managing and mastering the complexity of a program. We also discussed that the size of individual functions plays a significant role in making the program easy or difficult to understand. In general, as a function becomes longer, it becomes more difficult to understand. Modularity is a tool that can help us reduce the size of individual functions, making them more readable. As an example, consider the following selection sort function:

    void selectionSort(int a[], int size)
    {
        int i, j;
        int temp;
        int min;
        for (i = 0; i < size - 1; i++) {
            min = i;
            for (j = i + 1; j < size; j++) {
                if (a[j] < a[min])
                    min = j;
            }
            temp = a[i];
            a[i] = a[min];
            a[min] = temp;
        }
    }

Although it is not very long, we can still improve its readability by breaking it into small functions that perform the logical steps. Here is the modified code:

    void swap(int &x, int &y)
    {
        int temp;
        temp = x;
        x = y;
        y = temp;
    }

    // Returns the index of the smallest element in a[from..to].
    int minimum(int a[], int from, int to)
    {
        int i;
        int min;
        min = from;    // min holds an index, not a value
        for (i = from; i <= to; i++) {
            if (a[i] < a[min])
                min = i;
        }
        return min;
    }

    void selectionSort(int a[], int size)
    {
        int min;
        int i;
        for (i = 0; i < size; i++) {
            min = minimum(a, i, size - 1);
            swap(a[i], a[min]);
        }
    }

It is easy to see that the new selectionSort function is much more readable. The logical steps have been abstracted out into two functions, namely minimum and swap. The selectionSort function is not only shorter, but as a by-product we now have two functions (minimum and swap) that can be reused.

Reusability is one of the prime reasons to make functions, but it is not the only reason. Modularity is of equal concern (if not more), and a function should be broken into smaller pieces even if those pieces are not reused. As an example, let us consider the quickSort algorithm below.

    void quickSort(int a[], int left, int right)
    {
        int i, j;
        int pivot;
        int temp;
        int mid = (left + right) / 2;
        if (left < right) {
            i = left - 1;
            j = right + 1;
            pivot = a[mid];
            do {
                do i++; while (a[i] < pivot);
                do j--; while (a[j] > pivot);
                if (i < j) {
                    temp = a[i];
                    a[i] = a[j];
                    a[j] = temp;
                }
            } while (i < j);
            temp = a[left];
            a[left] = a[j];
            a[j] = temp;
            quickSort(a, left, j);
            quickSort(a, j + 1, right);
        }
    }

This is actually a very simple algorithm, but students find it very difficult to remember. If broken into logical steps as shown next, it becomes trivial.

    int partition(int a[], int left, int right);

    void quickSort(int a[], int left, int right)
    {
        int p;
        if (left < right) {
            p = partition(a, left, right);
            quickSort(a, left, p - 1);
            quickSort(a, p + 1, right);
        }
    }

    // Places the pivot a[left] at its final position and returns that index.
    int partition(int a[], int left, int right)
    {
        int i, j;
        int pivot;
        i = left + 1;
        j = right;
        pivot = a[left];
        while (i <= j) {
            while (i <= right && a[i] < pivot) i++;
            while (j > left && a[j] >= pivot) j--;
            if (i < j)
                swap(a[i], a[j]);
        }
        swap(a[left], a[j]);
        return j;
    }

1.3 Identifier Names

Identifier names also play a significant role in enhancing the readability of a program. Names should be chosen so that they are meaningful to the reader. To understand the concept, let us look at the following statement.

    if (x == 0)    // this is the case when we are allocating a new number

In this particular case, the meaning of the condition in the if-statement is not clear and we had to write a comment to explain it. This can be improved if, instead of x, we use a more meaningful name. Our new code becomes

    if (AllocFlag == 0)

The situation has improved a little, but the semantics of the condition are still not very clear, as the meaning of 0 is not clear. Now consider the following statement:

    if (AllocFlag == NEW_NUMBER)

We have improved the quality of the code by replacing the number 0 with the named constant NEW_NUMBER. Now the semantics are clear and need no extra comments; hence this piece of code is self-documenting.

Coding Style Guide

Consistency plays a very important role in making code self-documenting. Consistently written code is easier to understand and follow. A coding style guide aims to improve the coding process and to establish standardized, relatively uniform code throughout the application or project. As a number of programmers participate in developing a large piece of code, it is important that a consistent style is adopted and used by all. Therefore, each organization should develop a style guide to be adopted by its entire team. This coding style guide emphasizes C++ and Java, but the concepts are applicable to other languages as well.
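Before moving on to the conventions themselves, the NEW_NUMBER example from the identifier discussion can be made concrete. The following is only a minimal sketch: the enumeration AllocationMode, the value EXISTING_NUMBER and the helper isNewAllocation are hypothetical names introduced for illustration; only NEW_NUMBER and the flag being tested come from the text.

```cpp
#include <cassert>

// Hypothetical allocation modes; only NEW_NUMBER is named in the text.
enum AllocationMode {
    NEW_NUMBER = 0,        // the value that the bare literal 0 stood for
    EXISTING_NUMBER = 1    // assumed counterpart, added only for illustration
};

// Hypothetical helper: true when a new number is being allocated.
bool isNewAllocation(AllocationMode allocFlag)
{
    return allocFlag == NEW_NUMBER;    // reads without an explanatory comment
}
```

Giving the constant a type of its own also lets the compiler reject an unrelated integer passed by mistake, which a bare 0 cannot do.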
2.1 Naming Conventions

Charles Simonyi of Microsoft first discussed Hungarian Notation. It is a variable naming convention that includes information about the variable in its name (such as data type, whether it is a reference variable or a constant variable, etc.). Every company and programmer seems to have his or her own flavor of Hungarian Notation. The advantage of Hungarian Notation is that just by looking at the variable name, one gets all the information needed about that variable.

Bicapitalization or camel case (frequently written CamelCase) is the practice of writing compound words or phrases where the terms are joined without spaces and every term is capitalized. The name comes from a supposed resemblance between the bumpy outline of the compound word and the humps of a camel. CamelCase is now the official convention for file names and identifiers in the Java programming language.

In our style guide, we will be using a naming convention where Hungarian Notation is mixed with CamelCase.

2.1.1 General Naming Conventions for Java and C++

1. Names representing types must be nouns and written in mixed case starting with upper case.

    Line, FilePrefix

2. Variable names must be in mixed case starting with lower case.

    line, filePrefix

This makes variables easy to distinguish from types, and effectively resolves potential naming collisions, as in the declaration Line line;

3. Names representing constants must be all uppercase, using underscores to separate words.

    MAX_ITERATIONS, COLOR_RED

In general, the use of such constants should be minimized. In many cases implementing the value as a method is a better choice. This form is both easier to read and ensures a uniform interface towards class values.

    int getMaxIterations()    // NOT: MAX_ITERATIONS = 25
    {
        return 25;
    }

4. Names representing methods and functions should be verbs and written in mixed case starting with lower case.

    getName(), computeTotalWidth()

5. Names representing template types in C++ should be a single uppercase letter.

    template<class T> ...
    template<class C, class D> ...

6. Global variables in C++ should always be referred to using the :: operator.

    ::mainWindow.open(), ::applicationContext.getName()

7. Private class variables should have an underscore (_) suffix.

    class SomeClass {
        private int length_;
        ...
    }

Apart from its name and its type, the scope of a variable is its most important feature. Indicating class scope by using _ makes it easy to distinguish class variables from local scratch variables.

8. Abbreviations and acronyms should not be uppercase when used as a name.

    exportHtmlSource();    // NOT: exportHTMLSource();
    openDvdPlayer();       // NOT: openDVDPlayer();

Using all uppercase for the base name conflicts with the naming conventions given above. A variable of this type would have to be named dVD, hTML etc., which obviously is not very readable.

9. Generic variables should have the same name as their type.

    void setTopic (Topic topic)         // NOT: void setTopic (Topic value)
                                        // NOT: void setTopic (Topic aTopic)
                                        // NOT: void setTopic (Topic x)

    void connect (Database database)    // NOT: void connect (Database db)
                                        // NOT: void connect (Database oracleDB)

Non-generic variables have a role. These variables can often be named by combining role and type:

    Point startingPoint, centerPoint;
    Name loginName;

10. All names should be written in English.

    fileName;    // NOT: filNavn

11. Variables with a large scope should have long names; variables with a small scope can have short names. Scratch variables used for temporary storage or indices are best kept short. A programmer reading such variables should be able to assume that their values are not used outside a few lines of code. Common scratch variables for integers are i, j, k, m, n and for characters c and d.

12. The name of the object is implicit, and should be avoided in a method name.

    line.getLength();    // NOT: line.getLineLength();

The latter seems natural in the class declaration, but proves superfluous in use.

2.1.2 Specific Naming Conventions for Java and C++

1. The terms get/set must be used where an attribute is accessed directly.

    employee.getName();
    matrix.getElement (2, 4);
    employee.setName (name);
    matrix.setElement (2, 4, value);

2. The is prefix should be used for boolean variables and methods.

    isSet, isVisible, isFinished, isFound, isOpen

Using the is prefix solves the common problem of choosing bad Boolean names like status or flag. isStatus or isFlag simply doesn't fit, and the programmer is forced to choose more meaningful names. There are a few alternatives to the is prefix that fit better in some situations. These are the has, can and should prefixes:

    boolean hasLicense();
    boolean canEvaluate();
    boolean shouldAbort = false;

3. The term compute can be used in methods where something is computed.

    valueSet.computeAverage();
    matrix.computeInverse();

Using this term gives the reader the immediate clue that this is a potentially time-consuming operation, and if it is used repeatedly, he or she might consider caching the result.

4. The term find can be used in methods where something is looked up.

    vertex.findNearestVertex();
    matrix.findMinElement();

This tells the reader that this is a simple look-up method with a minimum of computation involved.

5. The term initialize can be used where an object or a concept is established.

    printer.initializeFontSet();

6. The List suffix can be used on names representing a list of objects.

    vertex (one vertex), vertexList (a list of vertices)

Simply using the plural form of the base class name for a list (matrixElement (one matrix element), matrixElements (list of matrix elements)) should be avoided, since the two differ in only a single character and are therefore difficult to distinguish. A list in this context is a compound data type that can be traversed backwards, forwards, etc. (typically a Vector). A plain array is simpler. The suffix Array can be used to denote an array of objects.

7. The n prefix should be used for variables representing a number of objects.

    nPoints, nLines

The notation is taken from mathematics where it is an established convention for indicating a number of objects.

8. The No suffix should be used for variables representing an entity number.

    tableNo, employeeNo

The notation is taken from mathematics where it is an established convention for indicating an entity number. An elegant alternative is to prefix such variables with an i: iTable, iEmployee. This effectively makes them named iterators.

9. Iterator variables should be called i, j, k etc.

    for (Iterator i = pointList.iterator(); i.hasNext(); ) {
        :
    }

    for (int i = 0; i < nTables; i++) {
        :
    }

The notation is taken from mathematics where it is an established convention for indicating iterators. Variables named j, k etc. should be used for nested loops only.

10. Complement names must be used for complement entities.

    get/set, add/remove, create/destroy, start/stop, insert/delete,
    increment/decrement, old/new, begin/end, first/last, up/down,
    min/max, next/previous, open/close, show/hide

Reduce complexity by symmetry.

11. Abbreviations in names should be avoided.

    computeAverage();    // NOT: compAvg();

There are two types of words to consider. First are the common words listed in a language dictionary; these must never be abbreviated. Never write:

    cmd     instead of    command
    cp      instead of    copy
    pt      instead of    point
    comp    instead of    compute
    init    instead of    initialize
    etc.

Then there are domain-specific phrases that are more naturally known through their acronyms or abbreviations. These phrases should be kept abbreviated. Never write:

    HypertextMarkupLanguage    instead of    html
    CentralProcessingUnit      instead of    cpu
    PriceEarningRatio          instead of    pe
    etc.

12. Negated Boolean variable names must be avoided.

    boolean isError;    // NOT: isNotError
    boolean isFound;    // NOT: isNotFound

The problem arises when the logical NOT operator is used and a double negative results. It is not immediately apparent what !isNotError means.

13. Functions (methods returning an object) should be named after what they return and procedures (void methods) after what they do. This increases readability and makes it clear what the unit should do and, especially, all the things it is not supposed to do. This in turn makes it easier to keep the code free of side effects.

Naming pointers in C++ specifically should be clear and should represent the pointer type distinctly.

    Line *line;    // NOT: Line *pLine; or Line *linePtr; etc.

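Several of the naming conventions in this section can be seen working together in a small C++ sketch. Everything in it (MAX_POINTS, Point, PointList and their members) is a hypothetical example invented for illustration and is not taken from the guide; it only assumes the rules as stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// All names below are hypothetical and exist only to illustrate the conventions.

const std::size_t MAX_POINTS = 100;      // constant: all uppercase, underscore-separated

class Point {                            // type: noun, mixed case, upper-case first letter
public:
    Point(double x, double y) : x_(x), y_(y) {}

    double getX() const { return x_; }   // get/set for direct attribute access
    void setX(double x) { x_ = x; }
    double getY() const { return y_; }
    void setY(double y) { y_ = y; }

private:
    double x_;                           // private members carry an underscore suffix
    double y_;
};

class PointList {                        // List suffix: a collection of Point objects
public:
    bool isEmpty() const { return points_.empty(); }              // is prefix
    bool isFull() const { return points_.size() >= MAX_POINTS; }  // positive, not isNotFull

    void add(const Point& point) {       // generic parameter named after its type
        if (!isFull()) {
            points_.push_back(point);
        }
    }

    // compute: signals a potentially expensive operation (a full pass here).
    double computeAverageX() const {
        if (isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        std::size_t nPoints = points_.size();        // n prefix: a number of objects
        for (std::size_t i = 0; i < nPoints; i++) {  // short name for a short-lived index
            sum += points_[i].getX();
        }
        return sum / static_cast<double>(nPoints);
    }

private:
    std::vector<Point> points_;
};
```

With these names, a call such as pointList.computeAverageX() or a test such as !pointList.isFull() can be read without a comment, which is the effect the conventions aim for.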
File Handling Tips for Java and C++

1. C++ header files should have the extension .h. Source files can have the extension .c++ (recommended), .C, .cc or .cpp.

    MyClass.c++, MyClass.h

These are all commonly accepted file extensions for C++.
