# Halstead's Software Metrics

## Halstead’s software science metrics

According to Halstead’s software metrics, a program is the implementation of an algorithm. These steps are considered as a number of tokens, and tokens are classified as operators and operands. All software science measures are functions of the counts of these tokens.

**Token count**

**Operator**–Any symbol or keyword in a program that specifies an algorithm is considered an operator. For example, arithmetic symbols like (+. -. /,*), punctuation marks, common names (while, for, print f), special symbols (=, :) and function names.**Operand**– In an algorithm or program, a symbol is used to represent data, constants, variables, labels, etc. is considered as an operand.

**For example: **

```
if (k< 2)
{
k=3;
x=x*k;
}
```

Maurice Howard Halstead brought Halstead's software metrics concept in 1977 to measure the software metrics. Halstead's complexity measure is used to measure program module complexity directly from source code.

**Two necessary measures of Halstead's software science are:**

**Size of vocabulary ղ ( known as Ita)**

The size of the vocabulary of a program is defined as a number of unique tokens in a program used to build a program.

**ղ =ղ _{1 }+ ղ_{2}**

where,

ղ = number of vocabulary in a program

ղ_{1 }= number of unique operators

ղ_{2} = number of unique operands

By using this formula, we can find the total number of vocabulary in a given source code.

**Length of program N**

The total number of tokens in a program is termed as the Length of the program.

**N = N _{1} + N_{2}**

Where,

N = Length of program

N_{1}= Total occurrences of operators

N_{2 }= Total occurrences of operators

Additional metrics are defined using these basic terms- the size of vocabulary and Length of the program. Another measure for the size of the program is known as:

**Program Volume (V)**

** V= N x log _{2}ղ**

Program volume is proportional to program size, represents the size of space necessary for sorting a program.

Unit of volume = Common unit of size “bits” {it is the actual size of the program if a uniform binary encoding for the vocabulary is used}

**Potential Volume**

V* = represents a program having a minimal size.

**V* = (2+n _{2} *) log_{2}(2+n_{2}*)**

**Program Level**

**L = V* / V**

Program level L = Ranges between 0 to 1

L =1 states that the program is written with minimum size.

Else if L=0 states program has a maximum size.

V* = potential volume {different or equivalent programs may implement an algorithm. Therefore program that is minimal in size will have potential volume}

Other different measures are:-

**Program difficulty D**

** D = ղ _{1}/2 x N_{2} / ղ_{2}**

** D = 1/L**

Difficulty measure is the difficulty of the program to read or write the program. If the program level decreases, difficulty to handle that program increases.

**Effort**–The amount of effort needed to translate an algorithm into implementation is specified programming language.

**E = V/L = D x V**

V = Volume of program

L = Level of program

D =Program difficulty

V= Volume of program

**Language Level λ**

** λ = L* V ^{*} { L= V^{*}/ V =>V^{*}= LV}**

** λ = L ^{2 }* V**

**Time required to program**

**T = E/ 18 sec**

T = Time

E = Effort

**Number of delivered bugs**

**B = E ^{2/3}/3000** {actually whole state delivered bugs is an estimated number of errors in the implementation}

Recently **V/3000** is accepted for delivered bugs B.

**Estimated program Length**

The first hypothesis of software science is that the length of well structure software program is a function of number of unique operators and operands.

**N =ղ _{1}log_{2}ղ_{1}+ ղ_{2}log_{2}ղ2**

ղ_{1 }= number of unique operators

ղ_{2} = number of unique operands

**Purity Ratio**

** P= N/N**

### Rules for finding the number of tokens in the C programming language

- Comments, a function declared, identifier, and heading are not counted as tokens.
- All the variables, constants, labels, and variable data are considered as operands.
- The same variable used in the different functions is counted as a unique operator.
- All looping statements, switch statements, case statements are considered as operators.
- All the reserve words (default, sizeof, break, default, continue, return) and bracket, terminator, or commas are counted as a token(operators).
- GOTO statement is the operator.
- “Array name” and index are operands.
- Hash derivatives are ignored.

Find the number of tokens using Halstead's metrics and find n, N, V, E, λ?

**Example: 1**

```
int sort (int x[ ], int n)
{
int i, j, save im1.
/* This function sorts an array*/
if (n<2) return 1;
for (i=2; i<=n, i=1)
{ im 1= i-1;
for (i=1; j<= im1; j + i)
if (x[i]< x[j])
{save =x[j];
x [i] = x[j];
x [j] = save;
}}
return 0;
}
```

Operators | Occurrences | Operands | Occurrences |

int | 4 | SORT | 1 |

( ) | 5 | x | 7 |

, | 4 | n | 3 |

[ ] | 7 | i | 8 |

if | 2 | j | 7 |

< | 2 | save | 8 |

; | 11 | im 1 | 3 |

for | 2 | 2 | 3 |

= | 6 | 1 | 2 |

- | 1 | 0 | 3 |

<= | 2 | 1 | |

++ | 2 | ||

return | 2 | ||

{} | 3 | ||

14 | 53 | 10 | 38 |

** n= n _{1}+n_{2}**

** = 14 + 10 = 24**

** N = 53+38 = 91**

** V = N log _{2} n**

** = 91 log _{2} 24**

**Example: 2**

```
if (x>5)
{
x= x + 2;
if (x < 7)
{
x=0;
}
}
```

N_{1 }( Operators) | N_{2} ( Operands) |

if | x |

( ) | 5 |

> | x |

{ | x |

= | 2 |

; | x |

+ | 7 |

If | x |

( ) | 0 |

< | |

{ | |

= | |

; | |

} | |

] | |

15 | 9 |

N_{1 }= 15 , N_{2} = 9

n_{1 }= 9 , n_{2} = 5

**Total Length (N)**

N = N_{1} + N_{2} = 15 + 9 = 24

**Vocabulary (n)**

n = n_{1} + n_{2} = 9 + 5 = 14

N/ N

**N = n _{1 }log_{2} n_{1 }+ n_{2} **

**log**

_{2}n_{2}_{ }= 9 log_{2} 9 + 5 log_{2} 5

= 9 log_{10} 9 / 0.3010 + 5 log_{10} 5 / 0.3010

= 9 x (3.17) + 5 x (2.32)

= 28.53 + 116

= 40.13

**Estimated Length = N/N** = 40.13/ 24 = 1.67

**Volume = N log _{2} n**

= 24 log_{2} 14

= 24 x log_{10} 14 / 0.3010

= 24 x 3.807 = 91.36

** ****D = N _{1}/ 2 x N_{2} / n_{2}**

= 9/2 x 9/5 = 81 /10 = 8.1

**Effort = V *D**

= 91.36 x 8.1 = 740.016