Halstead's Software Metrics
Halstead’s software science metrics
According to Halstead’s software metrics, a program is the implementation of an algorithm. These steps are considered as a number of tokens, and tokens are classified as operators and operands. All software science measures are functions of the counts of these tokens.
Token count
- Operator –Any symbol or keyword in a program that specifies an algorithm is considered an operator. For example, arithmetic symbols like (+. -. /,*), punctuation marks, common names (while, for, print f), special symbols (=, :) and function names.
- Operand– In an algorithm or program, a symbol is used to represent data, constants, variables, labels, etc. is considered as an operand.
For example:
if (k< 2)
{
k=3;
x=x*k;
}

Maurice Howard Halstead brought Halstead's software metrics concept in 1977 to measure the software metrics. Halstead's complexity measure is used to measure program module complexity directly from source code.
Two necessary measures of Halstead's software science are:
- Size of vocabulary ղ ( known as Ita)
The size of the vocabulary of a program is defined as a number of unique tokens in a program used to build a program.
ղ =ղ1 + ղ2
where,
ղ = number of vocabulary in a program
ղ1 = number of unique operators
ղ2 = number of unique operands
By using this formula, we can find the total number of vocabulary in a given source code.
- Length of program N
The total number of tokens in a program is termed as the Length of the program.
N = N1 + N2
Where,
N = Length of program
N1= Total occurrences of operators
N2 = Total occurrences of operators
Additional metrics are defined using these basic terms- the size of vocabulary and Length of the program. Another measure for the size of the program is known as:
- Program Volume (V)
V= N x log2ղ
Program volume is proportional to program size, represents the size of space necessary for sorting a program.
Unit of volume = Common unit of size “bits” {it is the actual size of the program if a uniform binary encoding for the vocabulary is used}
- Potential Volume
V* = represents a program having a minimal size.
V* = (2+n2 *) log2(2+n2*)
- Program Level
L = V* / V
Program level L = Ranges between 0 to 1
L =1 states that the program is written with minimum size.
Else if L=0 states program has a maximum size.
V* = potential volume {different or equivalent programs may implement an algorithm. Therefore program that is minimal in size will have potential volume}
Other different measures are:-
- Program difficulty D
D = ղ1/2 x N2 / ղ2
D = 1/L
Difficulty measure is the difficulty of the program to read or write the program. If the program level decreases, difficulty to handle that program increases.
- Effort –The amount of effort needed to translate an algorithm into implementation is specified programming language.
E = V/L = D x V
V = Volume of program
L = Level of program
D =Program difficulty
V= Volume of program
- Language Level λ
λ = L* V* { L= V*/ V =>V*= LV}
λ = L2 * V
- Time required to program
T = E/ 18 sec
T = Time
E = Effort
- Number of delivered bugs
B = E2/3/3000 {actually whole state delivered bugs is an estimated number of errors in the implementation}
Recently V/3000 is accepted for delivered bugs B.
- Estimated program Length
The first hypothesis of software science is that the length of well structure software program is a function of number of unique operators and operands.
N =ղ1log2ղ1+ ղ2log2ղ2
ղ1 = number of unique operators
ղ2 = number of unique operands
- Purity Ratio
P= N/N
Rules for finding the number of tokens in the C programming language
- Comments, a function declared, identifier, and heading are not counted as tokens.
- All the variables, constants, labels, and variable data are considered as operands.
- The same variable used in the different functions is counted as a unique operator.
- All looping statements, switch statements, case statements are considered as operators.
- All the reserve words (default, sizeof, break, default, continue, return) and bracket, terminator, or commas are counted as a token(operators).
- GOTO statement is the operator.
- “Array name” and index are operands.
- Hash derivatives are ignored.
Find the number of tokens using Halstead's metrics and find n, N, V, E, λ?
Example: 1
int sort (int x[ ], int n)
{
int i, j, save im1.
/* This function sorts an array*/
if (n<2) return 1;
for (i=2; i<=n, i=1)
{ im 1= i-1;
for (i=1; j<= im1; j + i)
if (x[i]< x[j])
{save =x[j];
x [i] = x[j];
x [j] = save;
}}
return 0;
}
Operators | Occurrences | Operands | Occurrences |
int | 4 | SORT | 1 |
( ) | 5 | x | 7 |
, | 4 | n | 3 |
[ ] | 7 | i | 8 |
if | 2 | j | 7 |
< | 2 | save | 8 |
; | 11 | im 1 | 3 |
for | 2 | 2 | 3 |
= | 6 | 1 | 2 |
- | 1 | 0 | 3 |
<= | 2 | 1 | |
++ | 2 | ||
return | 2 | ||
{} | 3 | ||
14 | 53 | 10 | 38 |
n= n1+n2
= 14 + 10 = 24
N = 53+38 = 91
V = N log2 n
= 91 log2 24
Example: 2
if (x>5)
{
x= x + 2;
if (x < 7)
{
x=0;
}
}
N1 ( Operators) | N2 ( Operands) |
if | x |
( ) | 5 |
> | x |
{ | x |
= | 2 |
; | x |
+ | 7 |
If | x |
( ) | 0 |
< | |
{ | |
= | |
; | |
} | |
] | |
15 | 9 |
N1 = 15 , N2 = 9
n1 = 9 , n2 = 5
Total Length (N)
N = N1 + N2 = 15 + 9 = 24
Vocabulary (n)
n = n1 + n2 = 9 + 5 = 14
N/ N
N = n1 log2 n1 + n2 log2 n2
= 9 log2 9 + 5 log2 5
= 9 log10 9 / 0.3010 + 5 log10 5 / 0.3010
= 9 x (3.17) + 5 x (2.32)
= 28.53 + 116
= 40.13
Estimated Length = N/N = 40.13/ 24 = 1.67
Volume = N log2 n
= 24 log2 14
= 24 x log10 14 / 0.3010
= 24 x 3.807 = 91.36
D = N1/ 2 x N2 / n2
= 9/2 x 9/5 = 81 /10 = 8.1
Effort = V *D
= 91.36 x 8.1 = 740.016