Home > Programming > String Termination in C/C++

String Termination in C/C++

This is a typical assignment given to students who are learning C/C++.
“Write a function that copies the contents of one string into another.”
Note that in this example a string is referred to as a character array.

Some people come up with code similar to this one:

void copy(char [],char []);

int main()
{
   char s1[10],s2[10];
   printf(”Enter String 1:”);
   gets(s1);
   copy(s1,s2);
   printf(”nThe copied string is: “);
   puts(s2);
   return 0;
}

void copy(char x[],char y[])
{
   int i=0;
   while(x[i])
   {
      y[i]=x[i];
      i++;
   }
}

Can you notice that something is missing in the copy() function? Yes, the null termination character was not appended at the end of the new string. I corrected a friend of mine who the same mistake and he asked me why the code works for other strings as well such as “Sanch”.

Luckily, we were near a computer so I could demonstrate my examples straight away to convince him.

Observe the declaration of this character array:

char name[10]=”Sanchit”;

Internally, this String is stored like this:
S,a,n,c,h,i,t,,,

Yes, if the string initial declaration is less than the size provided in the square brackets, the rest of the elements are filled with zeros. This is applicable for arrays of any data type.

Since the ASCII Value of is 0, I can represent my array like this again,

S,a,n,c,h,i,t,0×0,0×0,0×0

Does this ring any bells?

If you create another array of the same size (here: 10) and copy this string to it using the code given above, the rest of the elements after the String are already . So there is no need to the character as C would know where the string terminates.

But what if the String is larger? Assuming “Sanchit Karve” is stored in the same Array, it will look like this:
S,a,n,c,h,i,t, <space> ,K,a,r,v,e, , , …

If you notice, that inspite of the string being larger than the array size, it is stored completely after occupying the space after the array. Since, now the space outside the array has been accessed, the data out there is not zero. Instead they contain garbage values. So, for such situations, we need to append the character at the end of the string, so that C can figure where the string ends. Otherwise, C will output all bytes till a zero is encountered.

So, appending the character is not required if we assume that the string length is lesser the array size. However, we should also ensure that we declare enough space before input larger than the allocated size because of the following reasons:

  • Unlike other Languages C/C++ do not provide Array Bound Checking features.
  • Hackers make the most of such types of errors, known as Buffer Overflow Errors, and launch Buffer Overflow Attacks where the string is inserted with shellcode (yes, in hex), and the return address of the function is overwritten to the address of the string. This results in the processor passing control to the shellcode and executing it after the function returns. But just because we append the character does not mean that Buffer Overflow Errors don’t occur. But it’s done just as a safety measure so that the program can run without faults.
Categories: Programming Tags: ,
  1. No comments yet.
  1. No trackbacks yet.