Get your program's assembly code and more information

Today I talked with a fellow student about some system internals, and we weren't sure what actually happens. So I needed the assembly code of some example programs.

General Information

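Note that I use AT&T syntax throughout this article. This is AT&T syntax (source operand first, destination last, registers prefixed with %):

movl %esp, %ebp

And this is the same instruction in Intel syntax (destination first, no size suffix, no % prefix):

mov ebp, esp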

Pointers

  • %esp: Stack pointer for top address of the stack.
  • %ebp: Stack base pointer for holding the address of the current stack frame.
  • %eax: Accumulator

The eax register is always 32 bits wide, regardless of the system's register size. On x86-64 it is the lower half of the 64-bit rax register.[6]

$ Dollar and % Percentage signs

$i, where i is an integer (for example $16 or $-16), is a constant (an immediate operand). A percent sign marks a register, for example %esp.[10][11]

Instruction

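  • pushl source: pushes the source operand onto the stack.[1]
  • movl source, destination: moves a long (32 bits) from source to destination.[2]
  • call function: calls a function (which might be printf, putchar, ...).
  • subl $16, %esp: reserves 16 bytes on the stack, for example for local variables and outgoing arguments.[3]
  • ret: transfers control back to the place where the current function was called.[3]
  • leave: sets the stack pointer to the frame base address and restores the saved %ebp, effectively releasing the whole frame.[4]
  • andl $-16, %esp: ANDs the stack pointer with 0xfffffff0, which effectively aligns it on a 16-byte boundary. Access to aligned values on the stack is usually faster than access to unaligned ones.[5]
  • jmp label: jumps unconditionally to the target label.
  • jbe label: jumps if below or equal. It does not compare anything itself; it tests the flags set by the preceding instruction (usually cmp) and treats the operands as unsigned. After cmpl $1, 8(%ebp), jbe jumps if the value at 8(%ebp) is at most 1.
  • leal address, register: load effective address.[9] The LEA instruction never reads memory. It only computes the address that another instruction would read and stores this address in its register operand (the destination, which comes last in AT&T syntax).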
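Two of these instructions can be modeled in plain C. The following is only a sketch: the function names are made up for illustration, and the 32-bit types stand in for the registers and stack slots. It shows that ANDing with -16 rounds an address down to a multiple of 16, and that jbe corresponds to an unsigned comparison while its signed counterpart jle would decide differently for large values.

```c
#include <stdint.h>

/* Model of `andl $-16, %esp`: as a 32-bit value, -16 is 0xfffffff0,
   so the AND clears the low four bits and rounds the address down. */
uint32_t align_down_16(uint32_t address)
{
    return address & (uint32_t)-16;
}

/* Model of `cmpl $1, x` followed by `jbe`: unsigned "x <= 1". */
int jbe_taken(uint32_t x)
{
    return x <= 1u;
}

/* Model of `cmpl $1, x` followed by `jle`: signed "x <= 1". */
int jle_taken(int32_t x)
{
    return x <= 1;
}
```

For the bit pattern 0xffffffff, jbe_taken reports that the jump is not taken (the value is a very large unsigned number), whereas jle_taken would take it, because the same bits read as -1 when signed.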

Suffixes

Many instructions have suffixes. This is what they mean[6]:

  • b: byte (8 bit)
  • s: short (16 bit integer) or single (32-bit floating point)
  • w: word (16 bit)
  • l: long (32 bit integer or 64-bit floating point)
  • q: quad (64 bit)
  • t: ten bytes (80-bit floating point)
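As a rough illustration, the integer suffixes can be mapped to the fixed-width types from <stdint.h>. The helper suffix_size is hypothetical and covers only the integer meanings of b, w, l and q.

```c
#include <stddef.h>
#include <stdint.h>

/* Operand size in bytes for an AT&T integer suffix, or 0 if the suffix
   is not one of b, w, l, q. Hypothetical helper for illustration only. */
size_t suffix_size(char suffix)
{
    switch (suffix) {
    case 'b': return sizeof(uint8_t);   /* byte, as in movb */
    case 'w': return sizeof(uint16_t);  /* word, as in movw */
    case 'l': return sizeof(uint32_t);  /* long, as in movl */
    case 'q': return sizeof(uint64_t);  /* quad, as in movq */
    default:  return 0;
    }
}
```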

Simple example

C-Code

This program simply outputs 56154, the product 1337*42:

#include <stdio.h>

int main(void)
{
    printf("%i", 1337*42);
    return 0;
}

Assembly

Now I compile it and save the assembly code:

gcc -S test.c; gcc test.c -o test

This gives me test.s (the assembly code) and an executable called "test".

    .file   "test.c"
    .section    .rodata
.LC0:
    .string "%i"
    .text
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $16, %esp
    movl    $.LC0, %eax
    movl    $56154, 4(%esp)
    movl    %eax, (%esp)
    call    printf
    movl    $0, %eax
    leave
    ret
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

This is code for the GNU Assembler (GAS), generated by GCC. Other assemblers, such as NASM or MASM, expect Intel syntax, so the same program written for them looks different.

The first and most important thing you might notice is that neither "1337" nor "42" appears in the assembly code, but 56154, which is 1337*42. I didn't use any optimization options! GCC evaluates constant expressions like this one at compile time (constant folding) even without optimization flags. You might also notice that constants begin with a dollar sign and registers (esp, ebp) begin with a percent sign.
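The reason the compiler is able to do this can be sketched in C itself: 1337*42 is an integer constant expression, so it may appear where only compile-time constants are allowed. The names PRODUCT and product below are made up for this example, and _Static_assert assumes a C11 compiler.

```c
/* An enumerator only accepts an integer constant expression, so the
   compiler has to compute 1337*42 itself. */
enum { PRODUCT = 1337 * 42 };

/* Compilation fails if the folded value is not 56154 (requires C11). */
_Static_assert(PRODUCT == 56154, "1337*42 should fold to 56154");

int product(void)
{
    return PRODUCT;
}
```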

Lines that begin with a dot and do not end with a colon are assembler directives. They tell the assembler what to do next. .file, .section, .size and .ident are such directives. .data might be the best-known one; it tells the assembler to place what follows in the data section of the program. .LC0 is a label for the string that immediately follows it. .globl indicates that the symbol named after it (in this case "main") is a global symbol.

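Line 14 (movl $56154, 4(%esp)): 4(%esp) means "the address in %esp plus 4 bytes". On 32-bit x86, arguments are passed on the stack. The first argument of printf, the 4-byte pointer to the format string, is stored at (%esp), and the second argument is stored right behind it at 4(%esp). This is why the 4 was still there when I passed a string: a pointer is also 4 bytes wide. When I printed a single character, the offset disappeared, most likely because GCC replaced the printf call with putchar, which takes only one argument.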
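One way to read the 4 in 4(%esp) is as a byte offset into the area where main stores the arguments for printf. The struct below is a hypothetical model of that area, assuming the 32-bit convention in which every argument slot is 4 bytes wide; uint32_t stands in for the 32-bit pointer so that the sketch gives the same result on a 64-bit machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the outgoing argument area that main fills
   before `call printf` on 32-bit x86. */
struct printf_args {
    uint32_t format;  /* address of "%i" (.LC0), stored at (%esp) */
    int32_t  value;   /* 56154, stored at 4(%esp) */
};

/* Byte offset of the second argument relative to the first. */
size_t value_offset(void)
{
    return offsetof(struct printf_args, value);
}
```

value_offset() evaluates to 4, which matches the displacement in movl $56154, 4(%esp).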

Further information

objdump gives even more information!

Archive header information: objdump -a test

test:     file format elf32-i386
test

File header information: objdump -f test

test:     file format elf32-i386
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x08048330

Object specific file header contents: objdump -p test

test:     file format elf32-i386

Program Header:
    PHDR off    0x00000034 vaddr 0x08048034 paddr 0x08048034 align 2**2
         filesz 0x00000100 memsz 0x00000100 flags r-x
  INTERP off    0x00000134 vaddr 0x08048134 paddr 0x08048134 align 2**0
         filesz 0x00000013 memsz 0x00000013 flags r--
    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x000004d8 memsz 0x000004d8 flags r-x
    LOAD off    0x00000f0c vaddr 0x08049f0c paddr 0x08049f0c align 2**12
         filesz 0x00000108 memsz 0x00000110 flags rw-
 DYNAMIC off    0x00000f20 vaddr 0x08049f20 paddr 0x08049f20 align 2**2
         filesz 0x000000d0 memsz 0x000000d0 flags rw-
    NOTE off    0x00000148 vaddr 0x08048148 paddr 0x08048148 align 2**2
         filesz 0x00000044 memsz 0x00000044 flags r--
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2
         filesz 0x00000000 memsz 0x00000000 flags rw-
   RELRO off    0x00000f0c vaddr 0x08049f0c paddr 0x08049f0c align 2**0
         filesz 0x000000f4 memsz 0x000000f4 flags r--

Dynamic Section:
  NEEDED               libc.so.6
  INIT                 0x080482bc
  FINI                 0x080484ac
  HASH                 0x0804818c
  GNU_HASH             0x080481b4
  STRTAB               0x08048224
  SYMTAB               0x080481d4
  STRSZ                0x0000004c
  SYMENT               0x00000010
  DEBUG                0x00000000
  PLTGOT               0x08049ff4
  PLTRELSZ             0x00000018
  PLTREL               0x00000011
  JMPREL               0x080482a4
  REL                  0x0804829c
  RELSZ                0x00000008
  RELENT               0x00000008
  VERNEED              0x0804827c
  VERNEEDNUM           0x00000001
  VERSYM               0x08048270

Version References:
  required from libc.so.6:
    0x0d696910 0x00 02 GLIBC_2.0

Display the contents of the section headers: objdump -h test

test:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .interp       00000013  08048134  08048134  00000134  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .note.ABI-tag 00000020  08048148  08048148  00000148  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .note.gnu.build-id 00000024  08048168  08048168  00000168  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .hash         00000028  0804818c  0804818c  0000018c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .gnu.hash     00000020  080481b4  080481b4  000001b4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .dynsym       00000050  080481d4  080481d4  000001d4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .dynstr       0000004c  08048224  08048224  00000224  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .gnu.version  0000000a  08048270  08048270  00000270  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  8 .gnu.version_r 00000020  0804827c  0804827c  0000027c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 .rel.dyn      00000008  0804829c  0804829c  0000029c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 10 .rel.plt      00000018  080482a4  080482a4  000002a4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 11 .init         00000030  080482bc  080482bc  000002bc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 12 .plt          00000040  080482ec  080482ec  000002ec  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 13 .text         0000017c  08048330  08048330  00000330  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 14 .fini         0000001c  080484ac  080484ac  000004ac  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 15 .rodata       0000000b  080484c8  080484c8  000004c8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 16 .eh_frame     00000004  080484d4  080484d4  000004d4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 17 .ctors        00000008  08049f0c  08049f0c  00000f0c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 18 .dtors        00000008  08049f14  08049f14  00000f14  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 19 .jcr          00000004  08049f1c  08049f1c  00000f1c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 20 .dynamic      000000d0  08049f20  08049f20  00000f20  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 21 .got          00000004  08049ff0  08049ff0  00000ff0  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 22 .got.plt      00000018  08049ff4  08049ff4  00000ff4  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 23 .data         00000008  0804a00c  0804a00c  0000100c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 24 .bss          00000008  0804a014  0804a014  00001014  2**2
                  ALLOC
 25 .comment      00000023  00000000  00000000  00001014  2**0
                  CONTENTS, READONLY

Display DWARF info in the file: objdump --dwarf test

test:     file format elf32-i386

Contents of the .eh_frame section:

00000000 ZERO terminator

By the way, ELF is an executable file format and DWARF is a debugging file format. I guess they had to think quite long to find this backronym.
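The "elf32" part of objdump's output can be checked by hand from the first bytes of the file. The sketch below uses a hypothetical helper, elf_class, that looks only at the four magic bytes and the class byte at index 4; it does not parse the rest of the header.

```c
#include <stddef.h>

/* Returns 32 or 64 for an ELF header of that class, 0 otherwise.
   Only the first five bytes of the file are inspected. */
int elf_class(const unsigned char *header, size_t length)
{
    if (length < 5)
        return 0;
    /* Every ELF file starts with 0x7f followed by "ELF". */
    if (header[0] != 0x7f || header[1] != 'E' ||
        header[2] != 'L'  || header[3] != 'F')
        return 0;
    if (header[4] == 1)
        return 32;  /* 32-bit class, as in elf32-i386 */
    if (header[4] == 2)
        return 64;  /* 64-bit class */
    return 0;
}
```

To try it on the "test" executable, one would read its first five bytes (for example with fread) and pass them to elf_class.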

Fibonacci

C-Code

This is the simplest version of Fibonacci I could find:[7]

#include <stdio.h>

unsigned int fib(unsigned int n)
{
    return n < 2 ? n : fib(n-1) + fib(n-2);
}

int main(void)
{
    printf("%i", fib(13));
    return 0;
}

Assembly

    .file   "test.c"
    .text
.globl fib
    .type   fib, @function
fib:
    pushl   %ebp
    movl    %esp, %ebp
    pushl   %ebx
    subl    $20, %esp
    cmpl    $1, 8(%ebp)
    jbe .L2
    movl    8(%ebp), %eax
    subl    $1, %eax
    movl    %eax, (%esp)
    call    fib
    movl    %eax, %ebx
    movl    8(%ebp), %eax
    subl    $2, %eax
    movl    %eax, (%esp)
    call    fib
    leal    (%ebx,%eax), %eax
    jmp .L3
.L2:
    movl    8(%ebp), %eax
.L3:
    addl    $20, %esp
    popl    %ebx
    popl    %ebp
    ret
    .size   fib, .-fib
    .section    .rodata
.LC0:
    .string "%i"
    .text
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $16, %esp
    movl    $13, (%esp)
    call    fib
    movl    $.LC0, %edx
    movl    %eax, 4(%esp)
    movl    %edx, (%esp)
    call    printf
    movl    $0, %eax
    leave
    ret
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits
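To see how the instructions of fib map back to the C source, here is a hand translation of the assembly above into C, with one statement per instruction group. It is an illustration, not compiler output: fib_steps and the variable names are made up, and the gotos only mirror the jumps to .L2 and .L3.

```c
/* Hand translation of the fib assembly into C, step by step. */
unsigned int fib_steps(unsigned int n)
{
    unsigned int first;            /* plays the role of %ebx */
    unsigned int result;           /* plays the role of %eax */

    if (n <= 1u)                   /* cmpl $1, 8(%ebp); jbe .L2 */
        goto L2;
    first = fib_steps(n - 1);      /* first call fib; movl %eax, %ebx */
    result = fib_steps(n - 2);     /* second call fib */
    result = first + result;       /* leal (%ebx,%eax), %eax */
    goto L3;                       /* jmp .L3 */
L2:
    result = n;                    /* movl 8(%ebp), %eax */
L3:
    return result;                 /* addl, popl, popl, ret */
}
```

Note that leal is used here purely as an addition: it computes %ebx + %eax without touching memory.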

References

  1. PUSHL Instruction. The University of Auckland, Department of Computer Science.
  2. IA-32 Assembly for Compiler Writers. Douglas Thain, Associate Professor, University of Notre Dame, Department of Computer Science and Engineering.
  3. From C To Assembly Language. Hiran Ramankutty, Linux Gazette, Issue 94.
  4. About leave in x86 assembly. zneak, Stack Overflow.
  5. GCC's assembly output of an empty program on x86, win32. nos, Stack Overflow.
  6. Why would one use "movl $1, %eax" as opposed to, say, "movb $1, %eax". Jason, Stack Overflow.
  7. Fibonacci numbers (C). Literate Programs.
  8. The 68000's Instruction Set, page 27. Literate Programs.
  9. LEAL Assembler instruction. Nils Pipenbrinck, Stack Overflow.
  10. What does this dollar sign mean in __asm?. Zimbabao, Stack Overflow.
  11. What do the dollar ($) and percentage (%) signs represent in assembly intel x86?. Necrolis, Stack Overflow.

Published

Mar 1, 2012
by Martin Thoma