C Strings
C has no string type. A "string" in C is a null-terminated array of characters: a sequence of bytes ending with the value '\0' (zero). Every string function in the standard library relies on this convention, and most buffer overflows in C string handling come from getting it wrong. Understanding C strings means understanding that they are just arrays of bytes with a termination convention, and that a missing terminator or a wrong buffer size leads to memory corruption.
What Is a C String?
A C string is a char array where the last meaningful character is followed by a null byte ('\0', which has the integer value 0).
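As a minimal sketch of that layout, the function below fills an array one byte at a time and then writes the terminator by hand. The name make_alphabet is hypothetical, and the code assumes an ASCII-compatible character set in which the lowercase letters are contiguous.

```c
#include <stddef.h>

/* Hypothetical helper: writes the first `count` lowercase letters into `out`
 * and terminates the result. `out` must have room for count + 1 bytes. */
void make_alphabet(char *out, size_t count) {
    if (count > 26) {
        count = 26;                 /* only 26 letters are available */
    }
    for (size_t i = 0; i < count; i++) {
        out[i] = (char)('a' + i);   /* assumes contiguous letters (ASCII) */
    }
    out[count] = '\0';              /* without this byte, out is not a string */
}
```

Calling make_alphabet(buf, 5) on a char buf[8] leaves "abcde" in buf. The sixth byte holds the terminator that the function wrote explicitly.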
#include <stdio.h>
int main(void) {
    // These are equivalent
    char s1[] = "hello";
    char s2[] = {'h', 'e', 'l', 'l', 'o', '\0'};
    printf("s1: %s (size: %zu)\n", s1, sizeof(s1));
    printf("s2: %s (size: %zu)\n", s2, sizeof(s2));
    // The null terminator is there
    for (size_t i = 0; i < sizeof(s1); i++) {
        printf("s1[%zu] = '%c' (%d)\n", i, s1[i] ? s1[i] : '?', s1[i]);
    }
    return 0;
}
s1: hello (size: 6)
s2: hello (size: 6)
s1[0] = 'h' (104)
s1[1] = 'e' (101)
s1[2] = 'l' (108)
s1[3] = 'l' (108)
s1[4] = 'o' (111)
s1[5] = '?' (0)
"hello" is 5 visible characters plus 1 null terminator = 6 bytes. The compiler adds the '\0' automatically for string literals. If you build a string manually, you must add it yourself.
Essential String Functions
All are declared in <string.h>.
strlen — String Length
Returns the number of characters before the null terminator:
#include <stdio.h>
#include <string.h>
int main(void) {
    char msg[] = "hello";
    printf("strlen: %zu\n", strlen(msg)); // 5
    printf("sizeof: %zu\n", sizeof(msg)); // 6 (includes '\0')
    return 0;
}
strlen: 5
sizeof: 6
strlen walks the array until it finds '\0'. If there is no null terminator, it walks off the end of the array — undefined behavior.
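The walk described above can be sketched as a short loop. The name my_strlen is hypothetical and the code is for illustration only; production libraries are typically more heavily optimized.

```c
#include <stddef.h>

/* Illustrative reimplementation of strlen: advance until the '\0' is found. */
size_t my_strlen(const char *s) {
    const char *p = s;
    while (*p != '\0') {    /* reads past the array if no terminator exists */
        p++;
    }
    return (size_t)(p - s); /* number of bytes before the terminator */
}
```

Nothing in the loop knows how large the array is, which is why a missing terminator turns into an out-of-bounds read.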
strcpy — String Copy
Copies a string including the null terminator:
#include <string.h>
char dest[32];
strcpy(dest, "hello"); // copies 6 bytes (5 chars + '\0')
strcpy does not check whether dest is large enough. If the source string is longer than the destination buffer, you have a buffer overflow.
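One way to guard the copy is to compare lengths first. The helper below is a sketch under that assumption. The name copy_if_fits is hypothetical, and it refuses to copy rather than truncate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest only if the whole string,
 * including its '\0', fits in dest_size bytes. Returns false otherwise
 * and leaves dest untouched. */
bool copy_if_fits(char *dest, size_t dest_size, const char *src) {
    size_t len = strlen(src);
    if (len >= dest_size) {
        return false;               /* no room for len chars plus '\0' */
    }
    memcpy(dest, src, len + 1);     /* len characters plus the terminator */
    return true;
}
```

With a char dest[6], copy_if_fits accepts "hello" (6 bytes with the terminator) and rejects "hello!" (7 bytes).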
strncpy — Bounded String Copy
#include <string.h>
#include <stdio.h>
int main(void) {
    char buffer[8];
    strncpy(buffer, "hello, world", sizeof(buffer));
    // WARNING: strncpy does NOT guarantee null termination:
    // if src has n or more characters, no '\0' is written
    buffer[sizeof(buffer) - 1] = '\0'; // manually ensure termination
    printf("buffer: %s\n", buffer);
    return 0;
}
buffer: hello,
strncpy has two surprising behaviors: if the source is shorter than n, it pads the rest of the destination with null bytes; if the source is n characters or longer, it writes no null terminator at all. Always manually null-terminate after strncpy.
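The copy-then-terminate pattern can be wrapped once so that it is not forgotten. This is a sketch, and the name copy_truncating is hypothetical. It assumes dest_size is at least 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: strncpy followed by forced termination.
 * Returns 1 if src had to be truncated, 0 otherwise.
 * Assumes dest_size >= 1. */
int copy_truncating(char *dest, size_t dest_size, const char *src) {
    strncpy(dest, src, dest_size - 1);  /* fills at most dest_size - 1 bytes */
    dest[dest_size - 1] = '\0';         /* the last byte is always the terminator */
    return strlen(src) >= dest_size;
}
```

With an 8-byte buffer, copying "hi" returns 0 and leaves the remaining bytes zeroed by strncpy's padding. Copying "hello, world" returns 1 and leaves the 7 characters "hello, " followed by the terminator.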
strcmp — String Comparison
#include <string.h>
#include <stdio.h>
int main(void) {
    const char *a = "apple";
    const char *b = "banana";
    int result = strcmp(a, b);
    if (result < 0) {
        printf("\"%s\" comes before \"%s\"\n", a, b);
    } else if (result > 0) {
        printf("\"%s\" comes after \"%s\"\n", a, b);
    } else {
        printf("strings are equal\n");
    }
    return 0;
}
"apple" comes before "banana"
strcmp returns 0 for equal strings, a negative value if the first string comes before the second, and a positive value otherwise. Never compare strings with == — that compares pointer addresses, not contents.
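Because only the sign of strcmp's result is specified, code that needs a plain yes/no answer or a fixed -1/0/1 value can wrap it. The two helpers below are hypothetical names used for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical wrapper: true when both strings have the same contents. */
bool str_equal(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}

/* Hypothetical wrapper: normalizes strcmp's result to -1, 0, or 1.
 * The standard only guarantees the sign, not the magnitude. */
int str_order(const char *a, const char *b) {
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

Two separate arrays that both hold "apple" have different addresses, so == on them is false, while str_equal reports true because it compares the bytes.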
strcat — String Concatenation
#include <string.h>
#include <stdio.h>
int main(void) {
    char buffer[64] = "Hello"; // must be large enough for result
    strcat(buffer, ", ");
    strcat(buffer, "world!");
    printf("%s\n", buffer);
    return 0;
}
Hello, world!
Like strcpy, strcat does not check buffer bounds. Each call walks the destination string to find the end, then appends. Repeated concatenation with strcat is O(n^2) in the total string length.
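One way to avoid both the rescanning and the missing bounds check is to remember how many bytes have been written and append at that offset. The sketch below does this with snprintf. The name join_words and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: joins `count` words with `sep` into `out`.
 * Returns 0 on success, -1 if the result did not fit or formatting failed.
 * `used` tracks the end of the string, so nothing is rescanned. */
int join_words(char *out, size_t out_size,
               const char *const *words, size_t count, const char *sep) {
    size_t used = 0;
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        size_t room = out_size - used;
        int n = snprintf(out + used, room, "%s%s", i ? sep : "", words[i]);
        if (n < 0 || (size_t)n >= room) {
            return -1;              /* encoding error or truncated output */
        }
        used += (size_t)n;          /* new end of the string */
    }
    return 0;
}
```

Each word is written exactly once, so the total work grows linearly with the length of the result. On failure the buffer still holds a terminated, partial string.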
The Buffer Overflow Problem
This is one of the most common sources of security vulnerabilities in C programs:
#include <stdio.h>
#include <string.h>
void greet(const char *name) {
    char buffer[16];
    // BUG: "Hello, " (7) + name + "!" (1) + '\0' (1) must fit in 16 bytes,
    // so any name longer than 7 characters overflows
    strcpy(buffer, "Hello, ");
    strcat(buffer, name);
    strcat(buffer, "!");
    printf("%s\n", buffer);
}
int main(void) {
    greet("Al"); // fine: "Hello, Al!" fits
    greet("Bartholomew Reginald Smith"); // overflow: 35 bytes into a 16-byte buffer
    return 0;
}
The strcat writes past the end of buffer, overwriting adjacent stack memory. In the worst case, this overwrites the function's return address, allowing an attacker to redirect execution to arbitrary code.
Safe Alternatives
snprintf — The Right Way to Build Strings
#include <stdio.h>
void greet_safe(const char *name) {
    char buffer[64];
    int written = snprintf(buffer, sizeof(buffer), "Hello, %s!", name);
    if (written < 0) {
        fprintf(stderr, "encoding error\n");
        return;
    }
    if ((size_t)written >= sizeof(buffer)) {
        fprintf(stderr, "output truncated (%d chars needed)\n", written);
    }
    printf("%s\n", buffer);
}
int main(void) {
    greet_safe("world");
    greet_safe("a very long name that might exceed our buffer size if we are not careful");
    return 0;
}
Hello, world!
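output truncated (80 chars needed)
Hello, a very long name that might exceed our buffer size if we
The second greeting needs 80 characters, which does not fit in the 64-byte buffer, so the warning goes to stderr and the output is cut off after 63 characters. snprintf never writes more than size bytes (including the null terminator). It returns the number of characters that would have been written if the buffer were large enough, not counting the null terminator. If the return value is >= the buffer size, the output was truncated.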
snprintf is the single most important string function in C. Use it instead of sprintf, strcpy, and strcat whenever possible.
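Because the return value reports the space needed, snprintf can also be called with a NULL buffer and size 0 to measure first and allocate afterwards. The sketch below uses that two-pass pattern. The name make_greeting is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated "Hello, <name>!" string,
 * or NULL on failure. The caller must free() the result. */
char *make_greeting(const char *name) {
    /* Pass 1: a NULL buffer with size 0 writes nothing and returns the length. */
    int needed = snprintf(NULL, 0, "Hello, %s!", name);
    if (needed < 0) {
        return NULL;                    /* encoding error */
    }
    size_t size = (size_t)needed + 1;   /* +1 for the terminator */
    char *result = malloc(size);
    if (result == NULL) {
        return NULL;                    /* out of memory */
    }
    /* Pass 2: the buffer is exactly large enough, so nothing is truncated. */
    snprintf(result, size, "Hello, %s!", name);
    return result;
}
```

This trades a second formatting pass and a heap allocation for never having to guess a buffer size.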
String Literals Are Read-Only
#include <stdio.h>
int main(void) {
    // String literal: typically stored in read-only memory
    const char *greeting = "hello";
    // This compiles, but writing through the pointer is undefined
    // behavior (often a crash at runtime):
    // char *mutable = "hello";
    // mutable[0] = 'H'; // undefined behavior
    // If you need a modifiable string, use an array:
    char editable[] = "hello";
    editable[0] = 'H';
    printf("%s\n", editable); // Hello
    return 0;
}
Hello
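String literals like "hello" are typically stored in a read-only section of the executable, and modifying one is undefined behavior even on platforms where the write happens to succeed. A pointer to a literal should always be const char *. If you need to modify the string, copy it into a char array.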
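The same rule applies to any function that edits a string in place: its argument must point to writable storage. The sketch below uses a hypothetical helper named to_upper_in_place to show this.

```c
#include <ctype.h>

/* Hypothetical helper: uppercases s in place.
 * s must point to writable storage (an array or heap memory), never a literal. */
void to_upper_in_place(char *s) {
    for (; *s != '\0'; s++) {
        /* The cast to unsigned char avoids undefined behavior in toupper
         * when char is signed and the byte value is negative. */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

Passing char editable[] = "hello" is fine because the array is a private copy of the literal. Passing the literal "hello" itself would compile, since the parameter is a plain char *, and would then be undefined behavior at the first write.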
String/Number Conversions
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    // String to integer: use strtol, not atoi
    const char *num_str = "12345";
    char *end;
    long value = strtol(num_str, &end, 10);
    // end == num_str means no digits were parsed; *end != '\0' means trailing junk
    if (end == num_str || *end != '\0') {
        printf("parse error at: %s\n", end);
    } else {
        printf("parsed: %ld\n", value);
    }
    // String to double: use strtod
    const char *pi_str = "3.14159";
    double pi = strtod(pi_str, &end);
    printf("pi: %f\n", pi);
    // Number to string: use snprintf
    char buffer[32];
    snprintf(buffer, sizeof(buffer), "%d", 42);
    printf("as string: \"%s\"\n", buffer);
    return 0;
}
parsed: 12345
pi: 3.141590
as string: "42"
Never use atoi: it has no error handling. If the input is not a valid number, atoi returns 0 with no way to distinguish that from a valid "0". strtol reports where parsing stopped through the end pointer and signals out-of-range values by setting errno to ERANGE.
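A fuller check combines the end pointer with errno. The sketch below wraps strtol in a hypothetical helper named parse_long. It accepts only input that is a complete base-10 number, which is one possible policy among several.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parses a base-10 long from the whole of `text`.
 * Returns true and stores the value in *out on success. */
bool parse_long(const char *text, long *out) {
    char *end;
    errno = 0;                          /* strtol only sets errno on failure */
    long value = strtol(text, &end, 10);
    if (end == text) {
        return false;                   /* no digits at all */
    }
    if (*end != '\0') {
        return false;                   /* trailing characters after the number */
    }
    if (errno == ERANGE) {
        return false;                   /* value does not fit in a long */
    }
    *out = value;
    return true;
}
```

Unlike atoi, this distinguishes "0" (success, value 0) from "" and "12abc" (both rejected). Note that strtol skips leading whitespace, so " 42" is accepted here.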
Common Pitfalls
- Forgetting the null terminator: every string must end with '\0'. If you build a string byte by byte or with strncpy on a long source, you must add the terminator yourself. Without it, strlen, printf("%s"), and every other string function will read past the end of your buffer.
- Using strcpy/strcat without checking buffer size: these functions do not know the destination size. Use snprintf instead.
- Comparing strings with ==: == compares pointer addresses. "hello" == "hello" may be true (the compiler may merge identical literals) or false. Always use strcmp.
- Modifying string literals: char *s = "hello"; s[0] = 'H'; is undefined behavior. Use char s[] = "hello" for modifiable strings.
- Using strlen in a loop condition: for (int i = 0; i < strlen(s); i++) calls strlen on every iteration, making the loop O(n^2). Compute the length once before the loop.
- Off-by-one with buffer sizes: a string of length n requires n + 1 bytes. char buf[5] can hold a 4-character string, not a 5-character string.
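The strlen pitfall has a simple fix, sketched below with a hypothetical helper named count_char: compute the length once, store it in a size_t, and loop against the stored value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts occurrences of `target` in `s`.
 * strlen runs once, so the loop is O(n) instead of O(n^2). */
size_t count_char(const char *s, char target) {
    size_t count = 0;
    size_t len = strlen(s);             /* hoisted out of the loop condition */
    for (size_t i = 0; i < len; i++) {  /* size_t index matches strlen's type */
        if (s[i] == target) {
            count++;
        }
    }
    return count;
}
```

Using size_t for the index also avoids the signed/unsigned comparison that an int counter would introduce.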
Key Takeaways
- C strings are null-terminated char arrays. The '\0' terminator is what makes them strings; without it, they are just arrays of bytes.
- strcpy and strcat do not check buffer sizes. They are a leading cause of buffer overflow vulnerabilities. Use snprintf instead.
- strncpy does not guarantee null termination. Always manually terminate after using it.
- String literals are read-only. Use const char * for pointers to literals and char[] for modifiable copies.
- Use strtol/strtod for string-to-number conversion, never atoi. Error handling matters.
- snprintf is the safest and most versatile string function in C. It handles formatting, bounds checking, and null termination in one call.
- The string handling mistakes that C makes possible account for a large share of the security vulnerabilities in C programs. Buffer overflows and read-past-end bugs in string code trace back to the null-terminated byte array convention and to functions that do not know the size of their destination.