To get the command line arguments in x86_64 on Mac OS X, I can do the following:
    _main:
        sub rsp, 8            ; 16-byte stack alignment
        mov rax, 0
        mov rdi, format
        mov rsi, [rsp + 32]
        call _printf
Where format is "%s". rsi gets set to argv[0].
So, from this, I drew out what (I think) the stack looks like initially:
    top of stack
                      <- rsp after alignment
    return address    <- rsp at beginning (aligned rsp + 8)
    [something]       <- rsp + 16
    argc              <- rsp + 24
    argv[0]           <- rsp + 32
    argv[1]           <- rsp + 40
    ...               ...
    bottom of stack
And so on. Sorry if that's hard to read. I'm wondering what [something] is. After a few tests, I find that it is usually just 0. However, occasionally, it is some (seemingly) random number.
Also, could you tell me if the rest of my stack drawing is correct?
You have it close.
argv is a pointer to the array, not the array itself. In C it is written char **argv, so you have to do two levels of dereferencing to get to the strings.
    top of stack
                      <- rsp after alignment
    return address    <- rsp at beginning (aligned rsp + 8)
    [something]       <- rsp + 16
    argc              <- rsp + 24
    argv              <- rsp + 32
    envp              <- rsp + 40  (in most Unix-compatible systems, the environment
    ...               ...           string array, char **envp)
    bottom of stack

    ...

    somewhere else:
    argv[0]           <- argv+0:  address of first parameter (program path or name)
    argv[1]           <- argv+8:  address of second parameter (first command line argument)
    argv[2]           <- argv+16: address of third parameter (second command line argument)
    ...
    argv[argc]        <- argv+argc*8: NULL
[...] argv aren't on the stack; x86-64 System V passes the first up to 6 integer/pointer args in registers. The OP loading [rsp+32] just happened to work if the stuff that called main didn't adjust the stack much since process entry (where the argv array itself is on the stack, so _start will do something like mov edi, [rsp] (argc) and lea rsi, [rsp+8] (argv) to get the args for main). (And another LEA with RDI*8 to get envp.) - Peter Cordes 2019-01-31 03:45
According to the AMD64 ABI (3.2.3, Parameter Passing), the parameters for main(int argc, char **argv) are passed (in left-to-right order) in rdi and rsi because they are of INTEGER class. envp, if it were used, would be passed in rdx, and so forth.
gcc places them into the current frame as follows (presumably for convenience? freeing up registers?):
    mov    DWORD PTR [rbp-0x4], edi
    mov    QWORD PTR [rbp-0x10], rsi
When the frame pointer is omitted, the addressing is relative to rsp instead. With the frame pointer, argv sits at [rbp-0x10], one eightbyte below the slot holding argc (argc comes first, although that's not mandated), and therefore:
    # after prologue
    mov    rax, QWORD PTR [rbp-0x10]    # or you could grab it from rsi, etc.
    add    rax, 0x8
    mov    rsi, QWORD PTR [rax]
    mov    edi, 0x40064c                # format
    call   400418 <printf@plt>