Exploiting Buffer Overflow in a C Program to Bypass Password Prompt

07 Oct 2017

Overview

This exercise takes students through the creation of a simple C program, one which is vulnerable to a buffer overflow attack. GDB is used to illustrate how the attack works and, more generally, how the concept of a stack is integral to the execution of compiled programs. Note: “MPesa” is used here as merely an example system that students are typically interested in; nothing here is reflective of the actual MPesa system, how it operates, or how it could be accessed surreptitiously.

Prerequisites

  • A Kali Linux instance:
    • No extra programs beyond those provided by a clean install are needed to complete this exercise.

Background Info to Provide Students

Students should incrementally build up to the final program involved in this exercise, manually typing code from the provided goals into a file called mpesa.c on their own machines. They should compile and run after completing each stage; the instructor should verify before providing the next goal.

The first goal (available here as PDF) starts with the menu and user selection code:

/**************************************************
 * mpesa.c
 * To compile:    $ gcc mpesa.c -o mpesa
 * To run:        $ ./mpesa
 **************************************************/

#include <stdio.h>
#include <string.h>

int main(void) {

    char buffer[15];
    int is_password_correct = 0;

    printf("\nKaribu Mfumo wa M-PESA");
    printf("\n=====================================");
    printf("\n1) Generate token kwa umeme.");
    printf("\n2) View system uptime.");
    printf("\n3) Exit.");

    printf("\n\nIngiza namba:");
    char choice = getchar();
    getchar();

    if (choice == '1') {
      printf("\nTODO: Write this code.\n");
    } else if (choice == '2') {
      printf("\nTODO: Write this code.\n");
    } else if (choice == '3') {
      printf("\nKwa heri.\n");
    } else {
      printf("\nUnknown choice. Exiting.\n");
    }
}

The next goal (available here as PDF) adds code for uptime retrieval and is as follows:

/**************************************************
 * mpesa.c
 * To compile:    $ gcc mpesa.c -o mpesa
 * To run:        $ ./mpesa
 **************************************************/

#include <stdio.h>
#include <string.h>

#include <errno.h>
#include <linux/unistd.h>
#include <linux/kernel.h>
#include <sys/sysinfo.h>

int main(void) {

    char buffer[15];
    int is_password_correct = 0;

    printf("\nKaribu Mfumo wa M-PESA");
    printf("\n=====================================");
    printf("\n1) Generate token kwa umeme.");
    printf("\n2) View system uptime.");
    printf("\n3) Exit.");

    printf("\n\nIngiza namba:");
    char choice = getchar();
    getchar();

    if (choice == '1') {
      printf("\nTODO: Write this code.\n");
    } else if (choice == '2') {
      struct sysinfo s_info;
      int error = sysinfo(&s_info);
      if (error != 0) {
        printf("\nUnable to get uptime; error code: %d\n", error);
      } else {
        // uptime is a long, so it is printed with %ld rather than %d.
        printf("\nSystem up for %ld seconds.\n", s_info.uptime);
      }
    } else if (choice == '3') {
      printf("\nKwa heri.\n");
    } else {
      printf("\nUnknown choice. Exiting.\n");
    }
}

The final goal (available here as PDF) involves the code for password verification and token generation:

/**************************************************
 * mpesa.c
 * To compile:    $ gcc mpesa.c -o mpesa
 * To run:        $ ./mpesa
 **************************************************/

#include <stdio.h>
#include <string.h>

#include <errno.h>
#include <linux/unistd.h>
#include <linux/kernel.h>
#include <sys/sysinfo.h>

#include <stdlib.h>
#include <time.h>

int main(void) {

    char buffer[15];
    int is_password_correct = 0;

    printf("\nKaribu Mfumo wa M-PESA");
    printf("\n=====================================");
    printf("\n1) Generate token kwa umeme.");
    printf("\n2) View system uptime.");
    printf("\n3) Exit.");

    printf("\n\nIngiza namba:");
    char choice = getchar();
    getchar();

    if (choice == '1') {
      printf("\nEnter the password: ");
      gets(buffer);
      // TODO: Replace "PASSWORD" with a strong password
      // of your choice (14 characters or fewer, which leaves
      // room in buffer for the terminating '\0').
      if(strcmp(buffer, "PASSWORD") != 0) {
        printf("\nIncorrect password; you aren't allowed to create tokens.\n");
      } else {
        printf("\nCorrect password.\n");
        is_password_correct = 1;
      }

      if (is_password_correct) {
        printf("Generating token...\n");
        srand(time(NULL));
        printf("Token: %d %d %d %d",
            rand() % 10000, rand() % 10000, rand() % 10000, rand() % 10000);
        printf("\nKwa heri.\n");
      }
    } else if (choice == '2') {
      struct sysinfo s_info;
      int error = sysinfo(&s_info);
      if (error != 0) {
        printf("\nUnable to get uptime; error code: %d\n", error);
      } else {
        // uptime is a long, so it is printed with %ld rather than %d.
        printf("\nSystem up for %ld seconds.\n", s_info.uptime);
      }
    } else if (choice == '3') {
      printf("\nKwa heri.\n");
    } else {
      printf("\nUnknown choice. Exiting.\n");
    }
}

Demonstration

Once their code matches the final goal above, have students replace the hard-coded “PASSWORD” with a password of their choice (fewer than 15 characters, so that it fits in buffer along with its terminating null byte). Have them compile the modified program and hand over only the binary, not the source.

Run the binary a few times with passwords shorter than 15 characters, some correct and some not, to demonstrate that the password verification code works as expected.

Then, enter the following passwords to overflow the input buffer and force the program to erroneously generate tokens:

  • 123456789012345678901234
  • aaaaaaaaaaaaaaaaaaaaaaaa
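  • and so on, the key point being that these passwords are 24 bytes long: 15 for buffer, 8 for the two words stored after/below the buffer on the stack, and 1 for the first byte of the is_password_correct integer. (These counts reflect the stack layout of this particular build; a different compiler or different options may place the variables elsewhere.)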
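The byte counts above can be modelled without touching a real stack. The following is a minimal sketch, not the program itself: it assumes the layout just described (15 bytes of buffer, 8 bytes of other data, then the flag) and represents it as a plain byte array. The names flag_after_unchecked_copy, FLAG_OFFSET and FRAME_SIZE are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout, taken from the byte counts above: a 15-byte buffer,
 * 8 bytes of other stack data, then the flag. Real offsets depend on
 * the compiler and its options. */
enum {
    BUFFER_SIZE = 15,
    GAP_SIZE    = 8,
    FLAG_OFFSET = BUFFER_SIZE + GAP_SIZE,
    FRAME_SIZE  = 64
};

/* Copies `input` plus its terminating '\0' into a zeroed model frame,
 * ignoring the buffer's size just as gets() does, and returns the
 * resulting value of the flag. Inputs too long for the model frame
 * are truncated so that this sketch itself never overflows. */
int flag_after_unchecked_copy(const char *input)
{
    unsigned char frame[FRAME_SIZE] = {0};
    size_t len = strlen(input);
    int flag;

    if (len > FRAME_SIZE - 1) {
        len = FRAME_SIZE - 1;
    }
    memcpy(frame, input, len);
    frame[len] = '\0';

    /* Read the flag back out of the bytes that follow buffer and gap. */
    memcpy(&flag, frame + FLAG_OFFSET, sizeof flag);
    return flag;
}
```

Under these assumptions, a short input such as hello leaves the flag at 0, a 23-character input only places the terminating '\0' on the flag's first byte, and a 24-character input makes the flag non-zero. On a little-endian machine the 24-character example ending in 4 would yield 52, the ASCII code for '4'.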

Returning to the students’ source code, point to the gets(buffer) line and explain that gets() performs no bounds checking: it copies input up to the next newline regardless of how large the destination is. When the input is longer than buffer, the extra bytes overwrite whatever sits next to buffer on the stack, which here includes is_password_correct.
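For contrast, it may help to show students what a bounded read looks like. The sketch below uses the standard fgets() function, which takes the destination size as an argument; the wrapper name read_line_bounded is hypothetical, and this is one possible approach rather than the only fix.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from `in` into `buf`, storing at most size - 1
 * characters plus a terminating '\0'. Any excess characters on the
 * line are discarded. Returns 1 on success, 0 on end of input or error. */
int read_line_bounded(char *buf, size_t size, FILE *in)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL) {
        return 0;
    }

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* the whole line fit; drop the newline */
    } else {
        int c;                  /* the line was too long; skip the rest */
        while ((c = getc(in)) != EOF && c != '\n') {
            /* discard */
        }
    }
    return 1;
}
```

In mpesa.c, the call gets(buffer) could then become read_line_bounded(buffer, sizeof buffer, stdin), so that a 24-character input is cut to 14 characters and never reaches the flag.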

Explanation

To illustrate how this works, students need plenty of visuals showing how the stack is laid out and how it's used during program execution. Although not as visual, it may be helpful to use gdb as well:

  • $ gcc -ggdb mpesa.c -o mpesa
  • $ gdb mpesa
  • (gdb) break 38 (the strcmp line, the first statement executed after getting the password)
  • (gdb) run
    • Enter 1, then an incorrect but non-overflowing password like hello
  • (gdb) info frame to show memory addresses
  • (gdb) p &buffer to show the address of the buffer variable
  • (gdb) x/50ub &buffer to show “hello” (or whatever password was entered previously) in ASCII decimal form, along with the bytes that follow it on the stack
  • (gdb) x/4ub &is_password_correct to show that the password flag is still 0 and thus considered incorrect
  • (gdb) continue to show that no token is generated, as expected
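Students sometimes struggle to connect the decimal numbers printed by x/50ub with the text they typed. The helper below is a small sketch of the same idea in C: it renders bytes as unsigned decimals, one per byte. The function name format_bytes_decimal is hypothetical and is not part of gdb or of the exercise code.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes the first `count` bytes at `data` into `out` as space-separated
 * unsigned decimals. Returns 0 on success, -1 if `out` is too small. */
int format_bytes_decimal(const void *data, size_t count,
                         char *out, size_t out_size)
{
    const unsigned char *bytes = data;
    size_t used = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        int n = snprintf(out + used, out_size - used, "%s%u",
                         i > 0 ? " " : "", (unsigned)bytes[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;          /* output would not fit */
        }
        used += (size_t)n;
    }
    return 0;
}
```

Formatting the six bytes of "hello" (five letters plus the terminating null byte) gives 104 101 108 108 111 0, which is what the start of buffer should look like in the gdb output for that password.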

Now run the program in gdb with a password that overflows the input buffer:

  • (gdb) run
    • Enter 1, then an incorrect and overflowing password like 123456789012345678901234
  • (gdb) x/50ub &buffer to show the contents of buffer, along with the bytes following it, which have now been overwritten
  • (gdb) x/4ub &is_password_correct to show that the password flag is no longer 0, and thus the password is considered correct
  • (gdb) continue to show that a token is generated despite the password being incorrect
  • (gdb) quit
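The two addresses printed by p &buffer and p &is_password_correct are enough to work out how long an overflowing input must be. The sketch below shows that arithmetic; the name bytes_to_reach is hypothetical, and it assumes the overflow writes toward higher addresses starting at the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns how many input bytes are needed before the last one lands on
 * `target`, when writing toward higher addresses from `buffer_start`.
 * Returns 0 if `target` lies below `buffer_start`, since such an
 * overflow can never reach it. */
size_t bytes_to_reach(const void *buffer_start, const void *target)
{
    uintptr_t start = (uintptr_t)buffer_start;
    uintptr_t end = (uintptr_t)target;

    if (end < start) {
        return 0;
    }
    return (size_t)(end - start) + 1;
}
```

If the flag sits 23 bytes above the start of the buffer, as in the layout described earlier, the result is 24, matching the length of the example passwords.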

Reflections

I had originally hoped to include a section in which students would use gdb to analyze a modified binary of my own and work out for themselves how long the input needed to be to force an overflow. But students were not ready for this; they understood the stated purpose of the stack, but lacked the visuals and experience needed to understand it intuitively and apply that understanding to ad hoc situations.

One thing that I did find useful in hindsight was to install hexer and show students that the hard-coded password can easily be found by inspecting the compiled binary itself. This lent itself to a discussion of why including passwords in binaries, at least as simple unobfuscated strings, is not a good idea.
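The reason a hex editor finds the password is that the compiler stores the string literal in the binary byte for byte. A rough sketch of that search in C follows; find_string_in_bytes is a hypothetical helper, and the caller is assumed to have already read the binary's contents into memory.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of `needle` within the
 * `size` bytes at `data`, or -1 if it is absent. Unlike strstr(), this
 * keeps searching past '\0' bytes, which binaries are full of. */
long find_string_in_bytes(const unsigned char *data, size_t size,
                          const char *needle)
{
    size_t needle_len = strlen(needle);

    if (needle_len == 0 || needle_len > size) {
        return -1;
    }
    for (size_t i = 0; i + needle_len <= size; i++) {
        if (memcmp(data + i, needle, needle_len) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Run against the bytes of the compiled mpesa binary, a search for the chosen password would be expected to return a non-negative offset, which is the point of the discussion about unobfuscated strings.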