Sanchit's Blog

Visual Basic 6 Internal Event Handling

January 16th, 2009 1 comment

VB6 calls control events in a very specific way. It’s impressive, but the design of its event handling costs a guaranteed number of bytes for every control used. By “costs”, I mean redundant bytes that are added to the executable code.

I studied the entire structure with disassemblers and debuggers and found that the total number of redundant bytes is governed by the following formula:

Total Redundant Bytes = SUMMATION OF (4 x (Number_Of_Events + 6 – Events_Used) ) FROM 1 TO N

where:
N = Total Number of Controls Used
Number_Of_Events = Total Number of Events Supported by a Control (Different for each control)
Events_Used = Number of Events Used for each control.

So imagine a simple login form with two text boxes and one command button, where only the following events are handled:
Form_Load()
Command1_Click()

The Total number of Redundant Bytes is:
(4*(31+6-1)) + 2*(4*(24+6)) + (4*(17+6-1)) = 472
1 FORM + 2 TEXT BOXES + 1 CMD BTN

That’s 472 unnecessary bytes just for a simple login box using absolutely no user-written code.

Many people refer to VB6 as Visual Bloatware, and you now know why ;)

Categories: Code Internals

Does Minesweeper Cheat?

January 16th, 2009 4 comments

I’ve come across a few people who believe that Minesweeper lays down only half of its bombs at the start and adds the other half to the board as the game progresses.

I debugged the game myself and found this to be completely false. However, the game does ensure that your first click never hits a bomb. If the first cell you click contains a bomb, the bomb is removed from that cell and placed at the first empty location counting from (0,0), i.e. the top-left corner of the board.

Hence, if you clicked on (3,2) which originally had a bomb, it would be removed from (3,2) and placed at (0,0) if that location were empty.

This can also be verified without a debugger. Use the xyzzy cheat to enable the white pixel at the top-left corner of the screen. Then, on a fresh board, find a cell which contains a bomb (i.e. the pixel at the top-left corner of the screen turns black) but don’t click it. Now move your cursor to the top-left cell of the board and check whether it has a bomb.

Now click the cell which had the bomb and you’ll notice that there is no bomb there once you click it. Then move your cursor to the top-left cell of the board and you’ll notice that the bomb has been shifted there (assuming that cell was empty).

Minesweeper cheats FOR you and not against you, ensuring that you can always start the game without clicking a bomb.

Categories: Myths

Another Application of One-way Hash Functions

January 9th, 2009 No comments

We’ve all heard of one-way hash functions at some point.
Most of us have heard about them from books.
An algorithm’s explanation is usually followed by its applications, and most books mention only one major application in security: checking passwords against hashes stored in a database.

I used to wonder: “That’s it? One application in security?”, and searching the internet (and some more books) didn’t help either.

Luckily, I’ve finally found my answer. I appreciate one-way hash functions a lot more now that I’ve seen one in action. If I had read about this application in a book, I’m sure I wouldn’t have realized its importance.

A few days ago, Nokia Corporation issued a notice to all customers that some of its BL-5C batteries had a manufacturing defect which could cause them to explode. It asked customers to check whether their batteries were manufactured between December 2005 and November 2006 and, if so, to get them replaced for free.

Nokia also allowed its customers to type in their 26 Character Battery Code on their website (www.nokia.com/batteryreplacement) to see if their battery was faulty or not.

I decided to check the script which finds out when the battery was manufactured. I thought that by looking at the source code, I could figure out exactly which batteries were faulty.

This is what the script looks like:

function rcrcheck_serial()
{
   var isgood=false;
   var serial=document.enterserial.serial.value;
   var a='';
   var b='';
   var c='';
   if(serial.length<26)
   {
      alert("The identification number is incorrect. Please check that you have entered the full 26 characters of the battery identification number.");
      return false;
   }
   if(serial.length>26)
   {
      alert("The identification number is incorrect. Please check that you have entered the full 26 characters of the battery identification number.");
      return false;
   }
   a=md5(serial.substr(7,6));
   b=md5(serial.substr(13,1));
   c=md5(serial.substr(14,3));

   if(a!="ea4a302b5cbd017871ec94fd6ae189b5"
   && a!="1f098214896cc40cfabc3b2403a65b75"
   && a!="fd06cd296b4bf634d85e26884565aa6c") {
      window.location="rcrb2.html";
      return false;
   }
   if(b=="8d9c307cb7f3c4a32822a51922d1ceaa" || b=="7b8b965ad4bca0e41ab51de7b31363a1") {
      if(c=="84eb13cfed01764d9c401219faa56d53"){return true;}
      if(c=="d2490f048dc3b77a457e3e450ab4eb38"){return true;}
      if(c=="441954d29ad2a375cef8ea524a2c7e73"){return true;}
      if(c=="0e51011a4c4891e5c01c12d85c4dcaa7"){return true;}
      if(c=="af032fbcb07ffc7bd2569d86ae4ce1f5"){return true;}
      if(c=="73f7634ab3f381fb40995f93740b3f8a"){return true;}
      if(c=="738cccd4fda172441f216712a488dca6"){return true;}
      if(c=="f803dfeb3583d5099a58a7478f28bd75"){return true;}
      if(c=="7f5144f962efde75e0f7661e032166db"){return true;}
      if(c=="8fc4c7ab4453d247e011738197b6136c"){return true;}

      /* Some more Comparisons */

      if(c=="defd40204344c9659a0a3eb4ebc125f6"){return true;}
      if(c=="c4de9fe96832a877668d0dced80657b8"){return true;}
      if(c=="2c62105ee18ecd5f0ee37bc8c35718eb"){return true;}
      if(c=="3994f23bfb2b89994bd6e828977b42ae"){return true;}
      if(c=="28fd0fbd334515deb8a8291b71941c9e"){return true;}
      if(c=="9ac05befca7d6499e3abec9bdfef2b68"){return true;}
      if(c=="1732cb437260c60a0744aea8aedfa331"){return true;}
      if(c=="e1eee5e2b42d45443cdc82db1a3bc465"){return true;}
      if(c=="7d06a9cf10f2e9e47e77d6c6cfaa7f54"){return true;}
      if(c=="2618045a3a5fc883e65b6bec2fcac3c8"){return true;}
      if(c=="2421fcb1263b9530df88f7f002e78ea5"){return true;}
      if(c=="fccb60fb512d13df5083790d64c4d5dd"){return true;}
      if(c=="15d4e891d784977cacbfcbb00c48f133"){return true;}
      if(c=="c203d8a151612acf12457e4d67635a95"){return true;}
      if(c=="13f3cf8c531952d72e5847c4183e6910"){return true;}
      if(c=="550a141f12de6341fba65b0ad0433500"){return true;}
      if(c=="67f7fb873eaf29526a11a9b7ac33bfac"){return true;}
      if(c=="1a5b1e4daae265b790965a275b53ae50"){return true;}
      if(c=="9a96876e2f8f3dc4f3cf45f02c61c0c1"){return true;}
      if(c=="941e1aaaba585b952b62c14a3a175a61"){return true;}
      if(c=="9431c87f273e507e6040fcb07dcb4509"){return true;}
      if(c=="49ae49a23f67c759bf4fc791ba842aa2"){return true;}
      if(c=="e44fea3bec53bcea3b7513ccef5857ac"){return true;}
      if(c=="821fa74b50ba3f7cba1e6c53e8fa6845"){return true;}
      if(c=="250cf8b51c773f3f8dc8b4be867a9a02"){return true;}
      if(c=="42998cf32d552343bc8e460416382dca"){return true;}
      if(c=="0353ab4cbed5beae847a7ff6e220b5cf"){return true;}
      if(c=="51d92be1c60d1db1d2e5e7a07da55b26"){return true;}
      if(c=="428fca9bc1921c25c5121f9da7815cde"){return true;}
      if(c=="f1b6f2857fb6d44dd73c7041e0aa0f19"){return true;}
      if(c=="68ce199ec2c5517597ce0a4d89620f55"){return true;}
      if(c=="e836d813fd184325132fca8edcdfb40e"){return true;}
      if(c=="ab817c9349cf9c4f6877e1894a1faa00"){return true;}
      if(c=="8e6b42f1644ecb1327dc03ab345e618b"){return true;}
      if(c=="ef575e8837d065a1683c022d2077d342"){return true;}
      if(c=="2050e03ca119580f74cca14cc6e97462"){return true;}
      if(c=="25ddc0f8c9d3e22e03d3076f98d83cb2"){return true;}
      if(c=="5ef0b4eba35ab2d6180b0bca7e46b6f9"){return true;}
      if(c=="598b3e71ec378bd83e0a727608b5db01"){return true;}
      if(c=="74071a673307ca7459bcf75fbd024e09"){return true;}
   }
   if(b=="69691c7bdcc3ce6d5d8a1361f22d04ac" || b=="6f8f57715090da2632453988d9a1501b")
   {
      if(c=="2bb232c0b13c774965ef8558f0fbd615") {return true;}
      if(c=="ba2fd310dcaa8781a9a652a31baf3c68") {return true;}
      if(c=="69421f032498c97020180038fddb8e24") {return true;}
      if(c=="85422afb467e9456013a2a51d4dff702") {return true;}
      if(c=="13f320e7b5ead1024ac95c3b208610db") {return true;}
   }
   window.location="rcrb2.html";
   return false;
}

This code is awesome because, even after reading the source code, you cannot simply read off which models are faulty. At most, we can tell that the battery information (date of manufacture, location of manufacture) lives between the 8th and 17th characters. The characters in positions 18 to 26 could hold the number of batteries manufactured by the factory before the current unit. We also know that the battery is faulty for some 334 combinations of characters between the 14th and 17th positions. But this knowledge alone does not reveal the codes themselves.

Hence, by using a one-way hash algorithm (MD5 in this case), we can hide such information (the codes of the factories which manufactured the faulty batteries) even in the source code. The values stay hidden from anyone who merely reads the complete source code, and this, according to me, is one of the most brilliant applications of one-way hash functions.

Categories: Programming

String Termination in C/C++

January 9th, 2009 No comments

This is a typical assignment given to students who are learning C/C++.
“Write a function that copies the contents of one string into another.”
Note that in this example, “string” means a character array.

Some people come up with code similar to this one:

#include <stdio.h>

void copy(char [],char []);

int main()
{
   char s1[10],s2[10];
   printf("Enter String 1:");
   gets(s1);   /* unsafe: no bounds check (removed from the language in C11) */
   copy(s1,s2);
   printf("\nThe copied string is: ");
   puts(s2);
   return 0;
}

void copy(char x[],char y[])
{
   int i=0;
   while(x[i])
   {
      y[i]=x[i];
      i++;
   }
}

Did you notice that something is missing in the copy() function? Yes, the null termination character (\0) is never appended to the end of the new string. I corrected a friend of mine who made the same mistake, and he asked me why the code still works for strings such as “Sanch”.

Luckily, we were near a computer so I could demonstrate my examples straight away to convince him.

Observe the declaration of this character array:

char name[10]="Sanchit";

Internally, this string is stored like this:
S,a,n,c,h,i,t,\0,\0,\0

Yes, if the initializer is shorter than the size given in the square brackets, the rest of the elements are filled with zeros. This applies to arrays of any data type.

Since the ASCII value of \0 is 0, I can write my array like this:

S,a,n,c,h,i,t,0x0,0x0,0x0

Does this ring any bells?

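If you create another array of the same size (here: 10) whose elements are also zero and copy this string into it using the code given above, the elements after the copied characters are already \0. So the copy appears to work without appending the \0 character, because C still finds a terminator. Keep in mind that an uninitialized local array such as s2 above is not guaranteed to contain zeros.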

But what if the String is larger? Assuming “Sanchit Karve” is stored in the same Array, it will look like this:
S,a,n,c,h,i,t, <space> ,K,a,r,v,e, , , …

Notice that, in spite of the string being larger than the array, it is stored completely by spilling into the memory after the array. Since memory outside the array has now been accessed, the data out there is not guaranteed to be zero; it may contain garbage values. In such situations we need to append the \0 character at the end of the string so that C can figure out where the string ends. Otherwise, C will output all bytes until a zero is encountered.

So, appending the \0 character only appears unnecessary when the string is shorter than the array and the remaining elements already happen to be zero; appending it every time is the safe choice. We should also ensure that we declare enough space for the input, because accepting input larger than the allocated size is dangerous for the following reasons:

  • Unlike many other languages, C/C++ do not provide array bounds checking.
  • Hackers make the most of such errors, known as buffer overflow errors, and launch buffer overflow attacks in which the string carries shellcode (yes, in hex) and the return address of the function is overwritten with the address of the string. This results in the processor passing control to the shellcode and executing it after the function returns. Appending the \0 character does not mean that buffer overflow errors can’t occur; it is just a safety measure so that the program can run without faults.

Categories: Programming

Passing 2D Arrays to functions

January 9th, 2009 No comments

Passing 2D arrays without bounds is a common error people make.
Some people assume that if this works:

void someFunc(int []);

then this should work too:

void anotherFunc(int [][]);

As normal as it may seem, it is incorrect.

To pass a 2D array to a function, the column size must be specified, like this:

void func1(int [][upperSize]);

But why do we need to include the column size? If C/C++ compilers can handle a 1D array without its size, then why not a 2D one?

That’s because the elements of a 2D array are stored consecutively in memory, one row after another, just like a single long 1D array.

Have a look at this code snippet:

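#include <stdio.h>

int main()
{
   int x[][2]={1,2,3,4,5,6,7,8,9,10};
   int *p,*q;
   int i;

   p=&x[0][0];   /* first element of the first row */
   for(i=0;i<10;i++)
      printf("%d ",*(p+i));

   printf("\n\n Now using pointer q:\n");

   q=&x[1][0];   /* first element of the second row */
   for(i=0;i<10;i++)
      printf("%d ",*(q+i));   /* deliberately runs past the end of x */

   return 0;
}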

Run the Program. You’ll get this as the output.
1 2 3 4 5 6 7 8 9 10

Now using pointer q:
3 4 5 6 7 8 9 10 0 4239532

Now change the number 2 in the array declaration to 5 like this:

int x[][5]={1,2,3,4,5,6,7,8,9,10};

Run the code and observe the output:
1 2 3 4 5 6 7 8 9 10

Now using pointer q:
6 7 8 9 10 0 4239532 0 4235541 1

As you can see, the output using the pointer p remains unchanged because we set it to the first element of the array itself, so it walks over every element after it.

But q is set to the first element of the second row.
Now which one is the first element of the second row? It’s 6. How do we know?
It’s because we’ve placed 5 in the second bracket to specify how many elements each row holds. (The trailing values in the q output are garbage read from beyond the end of the array.)

So if we don’t place any value there, the compiler cannot tell how many numbers make up one row.
Whenever that happens, the compiler generates an error such as: size of the int[] is unknown or zero.

Categories: Programming

Reversing a String recursively

January 9th, 2009 No comments

When you’re asked to reverse a string, you’ll mostly use the strrev() function or write your own boring implementation using loops.
Ever tried it recursively?

Have a look at this:
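#include <iostream>

using namespace std;

void ret_str(const char* s)
{
   if(*s != '\0')
   {
      ret_str(s+1);   // recurse to the end of the string first
      cout << *s;     // then print on the way back, last character first
   }
}

int main()
{
   ret_str("born2c0de");
   return 0;
}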

Isn’t that a neat piece of code? Each call leaves its character waiting on the stack while it recurses on the rest of the string, so when the calls return and the stack unwinds, the characters come out in reverse order.

However, this function isn’t efficient: it makes one function call per character, and large strings can overflow the stack, so you’re better off using functions which use loops internally.

Categories: Programming

Consequences of Data type Range Violations

January 9th, 2009 2 comments

I’ve come across a lot of people who wonder why a number stored in a variable turns into a weird negative number once it exceeds the variable’s range.
To explain this phenomenon they offer a conclusion without showing any evidence, and this conclusion is nowhere close to the truth.
Here’s what happens.

The numbers change because of the two’s complement notation that the computer uses to store negative numbers.

And it’s not the compiler’s fault but a limitation of the processor.
This code will give negative numbers on 32-bit, 16-bit and 8-bit microprocessors, as each register can hold 32/16/8 bits respectively.
A 64-bit processor has 64-bit registers, so it can hold the result of this particular addition without trouble. (You will still encounter the same problem at the top of whatever range the type has: many 64-bit compilers keep int at 32 bits, and a 64-bit type overflows in the same way at its own maximum.)

Adding a number to the maximum value held in one of these registers creates the problem.
Have a look at this printf statement (I’ve cast the values for portability):

printf("%d = %x\n%d = %x",INT_MAX,INT_MAX,(unsigned long)INT_MAX+1,(unsigned long)INT_MAX+1);

The compiler generates the following code:

push    10000000000000000000000000000000b
push    10000000000000000000000000000000b
push    1111111111111111111111111111111b
push    1111111111111111111111111111111b
push    offset aDXDX ; format
call    _printf
add     esp, 14h
…
…
aDXDX  db '%d = %x',0Ah
       db '%d = %x',0

Arguments are pushed from right to left, so the third and fourth push instructions hold the binary representation of INT_MAX (2147483647 on 32-bit systems). Note that it consists only of ones (31 of them).

When 1 is added to INT_MAX, look at what happens in the first two push instructions: a single 1 followed by 31 zeros.
In unsigned notation this number is 2147483648, but in signed notation it’s -2147483648, since the most significant bit decides the sign of a number.

And since %d tells printf to treat its argument as signed, you get the negative value.

Try the same C code after replacing %d with the %u format specifier.
You’ll notice that 2147483648 will be the output.

Remember, there’s no such thing as a positive or negative number at this level; it’s all about interpretation. 111 (on a 3-bit architecture) could mean 7 (without using the MSB for the sign, i.e. unsigned) as well as -1 (using the MSB for the sign, i.e. signed).

You might wonder why I’ve blamed the processor and not the compiler, in spite of the fact that the compiler precalculated the result of the addition and passed it to printf.
The reason is that I compiled the above code with aggressive optimization, so the compiler generated push instructions with constant operands.
Normally the compiler plays safe and lets the processor compute such values by generating the following code:

mov eax, 1111111111111111111111111111111b
lea edx, [eax+1]
push edx

void main() v/s int main()

January 8th, 2009 2 comments

Many people use void main() instead of int main() while writing C/C++ programs, in spite of void not being an accepted return type for main in the C/C++ standards.

I too used to do the same until some programmers on a programming forum told me about the C++ Standard a few years ago.

But what does void do anyway? How is it different from int main() with return 0?

I disassembled a few test programs to check what really happens under the hood.  It is interesting to know what happens when a program is executed.

Almost everybody believes that main()/WinMain() is the entry point of all C/C++ programs, but it isn’t so.

On Windows, a start() function gets called before the main() function. It first calls the GetVersion(), GetCommandLine(), GetEnvironmentStrings(), GetStartupInfo() and GetModuleHandle() API functions from kernel32.dll, in exactly that order. Then it passes the module handle, command-line arguments and environment variables as arguments to main() and calls it.

But what about main()’s return value? Is it read after main() quits?

At the machine level, it doesn’t matter what return type you declare for main(). Whether it’s int or void makes no difference to the caller.
Here’s what actually happens.
In an executable file, the start() function which calls main() expects an integer return value so that it can decide what argument to pass to the exit() function, which is called after main().
If void is chosen as the return type for main, whatever happens to be in the EAX register is passed as the argument to the exit function.
The EAX register almost always carries the return value of a function, so if we say return 0, EAX is set to zero; that’s all.
Generally, whenever void is used, EAX is set to zero at the end of main(), so it is exactly the same as writing return 0.
But since void main() is not part of the standard, compiler developers are given full freedom to design their own implementation of what happens when main() returns. Hence, we cannot always assume that EAX is set to 0 with a void main().

Ever wondered what happens to the return value after main() quits?
Take a look at this disassembled snippet of the start() function from a C/C++ program.

call dword ptr [esi+18h] ; call main()
add esp, 0Ch
; (4 x 3) 12 bytes are cleared from the stack to clear space
; occupied by main()'s three arguments.
push eax ; status for exit. Return value of main() is pushed
call _exit

So no matter what you do, even if you ignore the return type, the startup code will still pass the contents of the EAX register to the exit function. This could lead to unwanted results, and hence it is better to return 0 (i.e. EXIT_SUCCESS) or 1 (EXIT_FAILURE) depending on when and why you have designed the program to end.

So even though the result is almost always the same, there is a reason why the standard requires int:
  • Returning 0 by default is COMPILER DEPENDENT.
  • There is no guarantee about the contents of the EAX register after main() exits.
  • The program will not be portable to all operating systems.
  • It may cause incorrect termination of main().

So even though void main() and int main() with return 0 usually behave alike, we should avoid void main().

Microsoft does it differently

January 8th, 2009 1 comment

A while back I was studying the Len() function of VB6 and came across something interesting.

Contrary to the common belief that this function counts the characters in a string and returns the count, it actually does something quite different.

Here’s how it works.

When any string is stored in VB, it is automatically stored in this format (in Unicode):

Suppose the string is “ABC”:

06 00 00 00 41 00 42 00 43 00

The 41 to 43 part is the typical Unicode style of storing strings, but in the 4 bytes (the space of 2 Unicode characters) before it, the number of bytes occupied by the string (including the zeros) is stored, which is similar to Pascal-style strings.
Hence 06 stands for the 6 bytes occupied by “ABC” (since it’s stored as A,0,B,0,C,0).

So all that Len() does is read the 4 bytes before the beginning of the string and halve that value to get the number of characters, instead of counting the characters.

So actually, the Len function does almost nothing except read the length from the string format and return it.
Want to see how they do it?
Here’s the disassembled listing of the Len() function from the MSVBVM60.DLL file.

__vbaLenBstr proc near
   string = dword ptr 4

   mov eax, [esp+string] ; eax points to string
   test eax, eax ; ZF,SF,PF = EAX and EAX
   jz short break ; if the string pointer is null, skip to the return
   mov eax, [eax-4] ; gets the byte length stored before the string

   ; Here's how text from a textbox is stored internally:
   ; If the text is born2c0de then in memory:
   ; (0x12 0x00 0x00 0x00) (0x62 0x00 …)
   ; i.e. the length of the string in bytes (here 18 bytes),
   ; stored as a 4-byte value,
   ; followed by the actual string in Unicode
   ; ('b' 00 'o' 00 … 'e' 00)
   ; So [eax-4] just gets the byte length of the
   ; string, which is already stored when the
   ; string is taken as input from the keyboard.

   ; The Len() function doesn't even calculate the length!!!
   shr eax, 1 ; divides the byte length by 2 to get the character count
   ; Uses eax so it can be used as the return value.

break:
   retn 4
__vbaLenBstr endp