Home > Code Internals > Microsoft does it differently

Microsoft does it differently

A while back I was studying the Len() function of VB6 and I came across something interesting.

Contrary to our belief that this function counts the number of characters in a string and returns it, it actually does something totally different.

Here’s how it works.

When any string is stored in VB, it is automatically stored in this format (in unicode):

Suppose the String is “ABC”:

06 00 00 00 41 00 42 00 43 00

The 41 to 43 part is a typical unicode style of storing strings, but 2 unicode characters before that, the number of bytes occupied by the string (including zeros) is stored (which is similar to Pascal style strings).
Hence 06 stands for 6 bytes occupied by “ABC” (since it’s stored as A,0,B,0,C,0)

So all that the Len() does is read 4 bytes before the beginning of the string and return that value itself, instead of calculating the length of the string.

So actually, the Len Function does nothing except read the length from the string format and return it.
Want to see how they do it?
Here’s the Disassembled Listing of the Len() Function MSVBVM60.DLL DLL File.

__vbaLenBstr proc near
   string = dword ptr 4

   mov eax, [esp+string] ; eax points to string
   test eax, eax ; ZF,SF,PF = EAX and EAX
   jz short break ; If String is Null then break from loop
   mov eax, [eax-4] ; Gets Unicode Length stored before string.

   ; Here’s how Text from textbox is stored internally:
   ; If text is born2c0de then in memory:
   ; (0×12 0×00) 0×00 0×00 (0×62 0×00 …)
   ; ie. length of string in bytes(unicode) (here 18 bytes)
   ; followed by a Unicode 0 (0×00 0×00)
   ; followed by the actual string in unicode
   ; (’b’ 00 ‘o’ 00 … ‘e’ 00)
   ; So [eax-4] just gets the unicode length of
   ; string which already is stored when a
   ; string is taken as input from keyboard.

   ; The Len() function doesn’t even calculate length!!!
   shr eax, 1 ; Divides Length by 2 to get Actual Length.
   ; Uses eax so it can be used as a return value.

break:
   retn 4
__vbaLenBstr endp
  1. Anujmorex
    March 4th, 2010 at 22:23 | #1

    This style of storing strings has its root in Pascal. I think they are called Pascal strings.

  1. No trackbacks yet.