
C Strings

C has no string type. A "string" in C is a null-terminated array of characters: a sequence of bytes ending with the value '\0' (zero). Every string function in the standard library relies on this convention, and a large share of the buffer overflows in C string handling come from misapplying it. Understanding C strings means understanding that they are just arrays of bytes with a termination convention, and that getting the termination or the buffer size wrong leads to undefined behavior: crashes, corrupted data, or exploitable bugs.

What Is a C String?

A C string is a char array where the last meaningful character is followed by a null byte ('\0', which has the integer value 0).

#include <stdio.h>

int main(void) {
    // These are equivalent
    char s1[] = "hello";
    char s2[] = {'h', 'e', 'l', 'l', 'o', '\0'};

    printf("s1: %s (size: %zu)\n", s1, sizeof(s1));
    printf("s2: %s (size: %zu)\n", s2, sizeof(s2));

    // The null terminator is there
    for (size_t i = 0; i < sizeof(s1); i++) {
        printf("s1[%zu] = '%c' (%d)\n", i, s1[i] ? s1[i] : '?', s1[i]);
    }

    return 0;
}
s1: hello (size: 6)
s2: hello (size: 6)
s1[0] = 'h' (104)
s1[1] = 'e' (101)
s1[2] = 'l' (108)
s1[3] = 'l' (108)
s1[4] = 'o' (111)
s1[5] = '?' (0)

"hello" is 5 visible characters plus 1 null terminator = 6 bytes. The compiler adds the '\0' automatically for string literals. If you build a string manually, you must add it yourself.

Essential String Functions

All are declared in <string.h>.

strlen — String Length

Returns the number of characters before the null terminator:

#include <stdio.h>
#include <string.h>

int main(void) {
    char msg[] = "hello";
    printf("strlen: %zu\n", strlen(msg));     // 5
    printf("sizeof: %zu\n", sizeof(msg));     // 6 (includes '\0')

    return 0;
}
strlen: 5
sizeof: 6

strlen walks the array until it finds '\0'. If there is no null terminator, it walks off the end of the array — undefined behavior.

strcpy — String Copy

Copies a string including the null terminator:

#include <string.h>

char dest[32];
strcpy(dest, "hello");   // copies 6 bytes (5 chars + '\0')

strcpy does not check whether dest is large enough. If the source string is longer than the destination buffer, you have a buffer overflow.

strncpy — Bounded String Copy

#include <string.h>
#include <stdio.h>

int main(void) {
    char buffer[8];
    strncpy(buffer, "hello, world", sizeof(buffer));
    // WARNING: strncpy does NOT guarantee null termination:
    // if src has n or more characters, no '\0' is written
    buffer[sizeof(buffer) - 1] = '\0';   // manually ensure termination

    printf("buffer: %s\n", buffer);

    return 0;
}
buffer: hello,

strncpy has two surprising behaviors. If the source is shorter than n, it pads the rest of the destination with null bytes. If the source has n or more characters, it copies exactly n bytes and does not null-terminate. Always manually null-terminate after strncpy.

strcmp — String Comparison

#include <string.h>
#include <stdio.h>

int main(void) {
    const char *a = "apple";
    const char *b = "banana";

    int result = strcmp(a, b);
    if (result < 0) {
        printf("\"%s\" comes before \"%s\"\n", a, b);
    } else if (result > 0) {
        printf("\"%s\" comes after \"%s\"\n", a, b);
    } else {
        printf("strings are equal\n");
    }

    return 0;
}
"apple" comes before "banana"

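strcmp returns 0 for equal strings, a negative value if the first string sorts before the second, and a positive value if it sorts after. The comparison works byte by byte on unsigned char values, so it is not dictionary order: in ASCII, every uppercase letter sorts before every lowercase one. Never compare strings with ==, because that compares pointer addresses, not contents.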

strcat — String Concatenation

#include <string.h>
#include <stdio.h>

int main(void) {
    char buffer[64] = "Hello";     // must be large enough for result
    strcat(buffer, ", ");
    strcat(buffer, "world!");
    printf("%s\n", buffer);

    return 0;
}
Hello, world!

Like strcpy, strcat does not check buffer bounds. Each call walks the destination string to find the end, then appends. Repeated concatenation with strcat is O(n^2) in the total string length.

The Buffer Overflow Problem

This is one of the most common sources of security vulnerabilities in C code:

#include <stdio.h>
#include <string.h>

void greet(const char *name) {
    char buffer[16];
    // BUG: "Hello, " (7) + name + "!" (1) + '\0' (1) must fit in 16 bytes, so a name longer than 7 characters overflows
    strcpy(buffer, "Hello, ");
    strcat(buffer, name);
    strcat(buffer, "!");
    printf("%s\n", buffer);
}

int main(void) {
    greet("Al");                           // fine: "Hello, Al!" fits
    greet("Bartholomew Reginald Smith");   // overflow: needs 35 bytes in a 16-byte buffer
    return 0;
}

The strcat(buffer, name) call writes past the end of buffer, overwriting adjacent stack memory. This is undefined behavior. In the worst case, it overwrites the function's return address, allowing an attacker to redirect execution to arbitrary code.

Safe Alternatives

snprintf — The Right Way to Build Strings

#include <stdio.h>

void greet_safe(const char *name) {
    char buffer[64];
    int written = snprintf(buffer, sizeof(buffer), "Hello, %s!", name);

    if (written < 0) {
        fprintf(stderr, "encoding error\n");
        return;
    }
    if ((size_t)written >= sizeof(buffer)) {
        fprintf(stderr, "output truncated (%d chars needed)\n", written);
    }

    printf("%s\n", buffer);
}

int main(void) {
    greet_safe("world");
    greet_safe("a very long name that might exceed our buffer size if we are not careful");
    return 0;
}
Hello, world!
output truncated (80 chars needed)
Hello, a very long name that might exceed our buffer size if we

snprintf never writes more than size bytes (including the null terminator), and it always null-terminates the result when size is greater than zero. It returns the number of characters that would have been written, not counting the null terminator, if the buffer were large enough. If the return value is >= the buffer size, the output was truncated.

snprintf is the single most important string function in C. Use it instead of sprintf, strcpy, and strcat whenever possible.

String Literals Are Read-Only

#include <stdio.h>

int main(void) {
    // String literal: stored in read-only memory
    const char *greeting = "hello";

    // This compiles but crashes at runtime (or is UB):
    // char *mutable = "hello";
    // mutable[0] = 'H';   // undefined behavior

    // If you need a modifiable string, use an array:
    char editable[] = "hello";
    editable[0] = 'H';
    printf("%s\n", editable);   // Hello

    return 0;
}
Hello

String literals like "hello" are typically stored in a read-only section of the executable, and the C standard makes modifying one undefined behavior wherever it is stored. A pointer to a literal should always be const char *. If you need to modify the string, copy it into a char array.

String/Number Conversions

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    // String to integer: use strtol, not atoi
    const char *num_str = "12345";
    char *end;
    long value = strtol(num_str, &end, 10);

    if (end == num_str || *end != '\0') {   // no digits parsed, or trailing characters
        printf("parse error at: %s\n", end);
    } else {
        printf("parsed: %ld\n", value);
    }

    // String to double: use strtod
    const char *pi_str = "3.14159";
    double pi = strtod(pi_str, &end);
    printf("pi: %f\n", pi);

    // Number to string: use snprintf
    char buffer[32];
    snprintf(buffer, sizeof(buffer), "%d", 42);
    printf("as string: \"%s\"\n", buffer);

    return 0;
}
parsed: 12345
pi: 3.141590
as string: "42"

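Never use atoi: it has no error handling. If the input is not a valid number, atoi returns 0 with no way to distinguish that from a valid "0", and an out-of-range value is undefined behavior. strtol reports errors through the end pointer (nothing was parsed, or characters were left over) and through errno (ERANGE when the value does not fit in a long).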

Common Pitfalls

  • Forgetting the null terminator: every string must end with '\0'. If you build a string byte by byte, or use strncpy with a source that fills the whole buffer, you must add the terminator yourself. Without it, strlen, printf("%s"), and every other string function will read past the end of your buffer.
  • Using strcpy/strcat without checking buffer size: these functions do not know the destination size. Use snprintf instead.
  • Comparing strings with ==: == compares pointer addresses. "hello" == "hello" may be true (the compiler may merge identical literals) or false. Always use strcmp.
  • Modifying string literals: char *s = "hello"; s[0] = 'H'; is undefined behavior. Use char s[] = "hello" for modifiable strings.
  • Using strlen in a loop condition: for (int i = 0; i < strlen(s); i++) may call strlen on every iteration, making the loop O(n^2). Compute the length once before the loop.
  • Off-by-one with buffer sizes: a string of length n requires n + 1 bytes. char buf[5] can hold a 4-character string, not a 5-character string.

Key Takeaways

  • C strings are null-terminated char arrays. The '\0' terminator is what makes them strings — without it, they are just arrays of bytes.
  • strcpy and strcat do not check buffer sizes. They are a classic cause of buffer overflow vulnerabilities. Use snprintf instead.
  • strncpy does not guarantee null termination. Always manually terminate after using it.
  • String literals are read-only. Use const char * for pointers to literals and char[] for modifiable copies.
  • Use strtol/strtod for string-to-number conversion, never atoi. Error handling matters.
  • snprintf is the safest and most versatile string function in C. It handles formatting, bounds checking, and null termination in one call.
  • The string handling mistakes that C makes possible account for a large share of the security vulnerabilities in C programs. Overflows from unchecked copies and reads past a missing terminator both trace back to the null-terminated byte array convention, which carries no length information.