Deep Dive // Encoding

Base64 Encoding

How binary data survives in a text-only world. The encoding you see in JWTs, PEM certificates, SSH keys, and API tokens.

01 // The Problem

The problem Base64 solves

Cryptographic keys, certificates, and encrypted data are raw bytes. Many systems that need to carry this data (JSON, HTTP headers, emails, URLs) only support text. Base64 is the bridge.

Why not just use hex?

Hexadecimal doubles the size of your data: every byte becomes two characters. A 256-bit key takes 64 hex characters. Base64 is more efficient. It encodes 3 bytes into 4 characters, a 33% size increase instead of 100%. That same 256-bit key takes just 44 Base64 characters.

The tradeoff is simplicity. Hex maps one byte to two characters, easy to read at a glance. Base64 works on 3-byte groups, which makes it more compact but harder to decode in your head.
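
The size comparison is easy to check yourself. The sketch below uses Python's standard library and a randomly generated 32-byte value standing in for a 256-bit key; the helper name base64_length is an illustration, not part of any library.

```python
import base64
import os

# A random 32-byte value stands in for a 256-bit key (illustrative only).
key = os.urandom(32)

hex_text = key.hex()
b64_text = base64.b64encode(key).decode("ascii")

print(len(key), "raw bytes")
print(len(hex_text), "hex characters")      # 2 characters per byte
print(len(b64_text), "Base64 characters")   # 4 characters per 3 bytes, padded


def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input (hypothetical helper)."""
    return 4 * ((n_bytes + 2) // 3)


print(base64_length(32))  # 44
```

The formula rounds the input up to a whole number of 3-byte groups, which is why 32 bytes costs 44 characters rather than 42.67.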

Raw binary (written out as 0s and 1s): 1 byte → 8 characters, size 8x
Hexadecimal: 1 byte → 2 characters, size 2x
Base64: 3 bytes → 4 characters, size 1.33x

02 // The Alphabet

64 characters, 6 bits each

Base64 uses a carefully chosen set of 64 characters that are safe in virtually every text-based system. Each character represents a 6-bit value (2^6 = 64), compared to hex where each character represents 4 bits.

The Base64 Alphabet

64 characters, each representing a 6-bit value (0 to 63).

A-Z: 0-25
a-z: 26-51
0-9: 52-61
+ and /: 62-63

The choice of these 64 characters is not arbitrary. Letters, digits, plus, and slash are all safe in most text protocols. A variant called base64url swaps + and / for - and _ to avoid conflicts in URLs. This is what JWTs use.
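
A short sketch of the alphabet and its URL-safe variant, using Python's standard library. The names ALPHABET and URLSAFE_ALPHABET are just labels for this example, and the two-byte input is chosen only because it happens to produce both + and /.

```python
import base64
import string

# The standard alphabet in index order: A-Z, a-z, 0-9, then + and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
# base64url keeps the first 62 characters and swaps the last two.
URLSAFE_ALPHABET = ALPHABET[:62] + "-_"

# Show the index and 6-bit pattern of a few characters.
for ch in "Aa0+/":
    index = ALPHABET.index(ch)
    print(ch, index, format(index, "06b"))

# The same bytes encoded with each variant.
data = bytes([0xFB, 0xFF])
standard = base64.b64encode(data).decode("ascii")
urlsafe = base64.urlsafe_b64encode(data).decode("ascii")
print(standard)  # +/8=
print(urlsafe)   # -_8=
```

Only the characters at indexes 62 and 63 differ, so converting between the two variants is a plain character substitution.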

03 // The Algorithm

How Base64 encoding works

The algorithm is simple: take 3 bytes (24 bits), split them into four 6-bit groups, and look up each group in the alphabet. If the input length is not a multiple of 3 bytes, the final group is filled out with zero bits and the output is padded with = signs.

The three steps

1. Take 3 input bytes and concatenate their bits into a single 24-bit stream.

2. Split those 24 bits into four 6-bit groups. Each group is a number from 0 to 63.

3. Look up each number in the Base64 alphabet to get the output character. If the last group has fewer than 3 bytes, pad the output with = signs.
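
The three steps above can be sketched in a few lines of Python. This is a teaching version, not a replacement for the standard library; the function name encode_base64 is made up for this example, and the loop at the end checks it against base64.b64encode.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_base64(data: bytes) -> str:
    """Encode bytes as padded standard Base64 (illustrative sketch)."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Step 1: join up to 3 bytes into one 24-bit integer,
        # zero-filling on the right if the chunk is short.
        bits = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        # Step 2: split into four 6-bit groups, most significant first.
        sextets = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        # Step 3: look up each group. A chunk of n bytes yields n + 1
        # real characters; the rest of the 4-character block is '='.
        keep = len(chunk) + 1
        out.append("".join(ALPHABET[s] for s in sextets[:keep]) + "=" * (4 - keep))
    return "".join(out)


# Compare against the standard library on a few inputs.
for sample in (b"", b"H", b"Hi", b"Hi!", b"Hi! there"):
    assert encode_base64(sample) == base64.b64encode(sample).decode("ascii")

print(encode_base64(b"Hi!"))  # SGkh
```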

Base64 Step by Step

Each 3-byte group is split into four 6-bit sextets, and each sextet maps to one Base64 character. Example input: Hi!

Group 1: 3 bytes

Input bytes (character, decimal, binary)
'H'  72   01001000
'i'  105  01101001
'!'  33   00100001

Concatenated bits, split into 6-bit groups
010010 000110 100100 100001

6-bit value → Base64 character
18 → S
6 → G
36 → k
33 → h

Input (3 bytes): Hi!
Base64 output (4 chars): SGkh

3 bytes → 4 characters (133% of the original size). Padded Base64 output is always a multiple of 4 characters long.

Why the = padding?

Base64 always outputs in groups of 4 characters. If the input is not a multiple of 3 bytes, the last group is incomplete. The = signs tell the decoder exactly how many bytes were in the final group, so it can strip the zero-padding during decoding.

1 byte of input → 2 Base64 characters + ==. 2 bytes → 3 characters + =. 3 bytes → 4 characters, no padding needed. The pattern then repeats.
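
You can watch the padding pattern repeat with the standard library. The helper padding_count below is a hypothetical name used only for this sketch.

```python
import base64

# Inputs of 1, 2, 3, and 4 bytes: the padding cycle restarts at 4.
for data in (b"H", b"Hi", b"Hi!", b"Hi!H"):
    encoded = base64.b64encode(data).decode("ascii")
    print(len(data), "byte(s) ->", encoded)


def padding_count(n_bytes: int) -> int:
    """Number of '=' characters in the padded output (hypothetical helper)."""
    return (3 - n_bytes % 3) % 3


print([padding_count(n) for n in range(1, 7)])  # [2, 1, 0, 2, 1, 0]
```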

04 // Real World

Base64 in the wild

You encounter Base64 constantly without necessarily realizing it. Here are the most common places it shows up in cryptography and web development.

PEM certificates

-----BEGIN CERTIFICATE-----
MIIBIjANBgkqhkiG9w0BAQE...
-----END CERTIFICATE-----

TLS certificates and private keys are DER-encoded binary wrapped in Base64 between header/footer lines. Every HTTPS server presents a certificate during the TLS handshake, and PEM is a common format for storing it on disk.
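
The wrapping itself is mechanical. The sketch below assumes the usual PEM layout of 64 Base64 characters per line, and it uses placeholder bytes rather than a real certificate; the function names der_to_pem and pem_to_der are invented for this example.

```python
import base64

HEADER = "-----BEGIN CERTIFICATE-----"
FOOTER = "-----END CERTIFICATE-----"


def der_to_pem(der: bytes) -> str:
    """Wrap binary data in Base64 between certificate header/footer lines."""
    body = base64.b64encode(der).decode("ascii")
    # Break the Base64 text into lines of at most 64 characters.
    lines = [body[i:i + 64] for i in range(0, len(body), 64)]
    return "\n".join([HEADER, *lines, FOOTER]) + "\n"


def pem_to_der(pem: str) -> bytes:
    """Strip the header/footer lines and decode the Base64 body."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    if len(lines) < 2 or lines[0] != HEADER or lines[-1] != FOOTER:
        raise ValueError("not a PEM certificate block")
    return base64.b64decode("".join(lines[1:-1]))


# Placeholder bytes only: this is NOT a valid DER certificate.
fake_der = bytes(range(100))
pem = der_to_pem(fake_der)
print(pem)
assert pem_to_der(pem) == fake_der
```

Nothing here parses the certificate. Reading the fields inside the DER bytes needs an ASN.1 or X.509 library, which is outside the scope of this sketch.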

JSON Web Tokens (JWT)

eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.signature

Each JWT segment is base64url-encoded with the trailing = padding removed. The header and payload are JSON; the signature is raw bytes. The header and payload are not encrypted, just encoded. Anyone can decode them.
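
Here is a minimal sketch of decoding the header and payload of the example token above with Python's standard library. The helper name b64url_decode is made up for this example. It restores the = padding that JWTs leave off, and it does not verify the signature.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment whose '=' padding has been stripped."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


token = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.signature"
header_b64, payload_b64, _signature_b64 = token.split(".")

header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))

print(header)   # {'alg': 'HS256'}
print(payload)  # {'sub': '1234567890'}
```

No key was needed for any of this, which is the point: reading a JWT and trusting a JWT are separate steps.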

SSH public keys

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAB... user@host

The long string after ssh-rsa is Base64-encoded binary containing the key type and the actual public key parameters.
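
You can see this structure in the visible part of the example key. The sketch below decodes only the characters shown above before the ellipsis and assumes the usual SSH wire layout of fields prefixed by a 4-byte big-endian length; read_field is a hypothetical helper name.

```python
import base64
import struct
from typing import Tuple

# Only the characters visible in the example above (28 chars = 21 bytes).
visible_prefix = "AAAAB3NzaC1yc2EAAAADAQABAAAB"
blob = base64.b64decode(visible_prefix)


def read_field(buf: bytes, offset: int) -> Tuple[bytes, int]:
    """Read one length-prefixed field and return (value, next offset)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    return buf[start:start + length], start + length


key_type, offset = read_field(blob, 0)
exponent_bytes, offset = read_field(blob, offset)
exponent = int.from_bytes(exponent_bytes, "big")

print(key_type)  # b'ssh-rsa'
print(exponent)  # 65537
```

The key type appears twice in an SSH public key line: once as plain text and once inside the Base64 blob. The remaining bytes of the prefix are the start of the next field, which is cut off in the example.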

API keys and tokens

sk-ant-api03-Wk7x2F4...

Many API providers encode their keys with Base64 or a Base64 variant. It makes binary identifiers safe to copy, paste, and transmit in HTTP headers.
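
As a rough illustration of the idea, the sketch below turns 32 random bytes into a URL-safe token. It does not reproduce any particular provider's key format, which may include prefixes, checksums, or a different alphabet.

```python
import base64
import secrets

# 32 random bytes from a cryptographically secure source.
raw = secrets.token_bytes(32)

# base64url without padding: safe in URLs, headers, and JSON.
token = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

print(token)
print(len(token), "characters for", len(raw), "bytes")
```

Python's secrets.token_urlsafe(32) produces the same kind of string in one call.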

05 // Try It

Decode it yourself

Decode any Base64 string you find in the wild and see what is actually inside.

Base64 Decoder

Input: SGVsbG8gV29ybGQ=
Decoded text: Hello World
Raw bytes (11 bytes), hex: 48 65 6c 6c 6f 20 57 6f 72 6c 64
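
A decoder along these lines takes only a few lines of Python. This is a sketch using the standard library; the function name decode_and_describe is invented for this example, and it handles only padded, standard-alphabet input.

```python
import base64
from typing import Optional, Tuple


def decode_and_describe(text: str) -> Tuple[bytes, Optional[str], str]:
    """Return (raw bytes, UTF-8 text or None, space-separated hex)."""
    # validate=True rejects characters outside the standard alphabet.
    raw = base64.b64decode(text, validate=True)
    try:
        as_text = raw.decode("utf-8")
    except UnicodeDecodeError:
        as_text = None  # binary data: only the hex view is meaningful
    return raw, as_text, raw.hex(" ")


raw, as_text, hex_view = decode_and_describe("SGVsbG8gV29ybGQ=")
print("Decoded text:", as_text)                     # Hello World
print(f"Raw bytes ({len(raw)} bytes):", hex_view)   # 48 65 6c 6c 6f 20 57 6f 72 6c 64
```

For base64url input such as a JWT segment, swap in base64.urlsafe_b64decode and restore any missing = padding first.
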
06 // Common Mistake

Base64 is not encryption

This is worth emphasizing because it is one of the most common mistakes in software development.

Encoding is not encrypting

Base64 is a reversible encoding. There is no key, no secret, no security. Anyone who sees a Base64 string can decode it instantly. The JWT payload eyJzdWIiOiIxMjM0NTY3ODkwIn0 is not hidden. It is just {"sub":"1234567890"} written in a URL-safe format.
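
To make the point concrete, here is a round trip with a made-up secret. The string is hypothetical; notice that decoding takes no key, password, or other input beyond the encoded text itself.

```python
import base64

secret = "correct horse battery staple"  # hypothetical secret for illustration

# "Protecting" the secret with Base64...
encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")

# ...is undone by anyone with one standard-library call and no key.
recovered = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(recovered)
assert recovered == secret
```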

If you need data to be secret, encrypt it. If you need data to survive a text-only channel, encode it. Base64 does the second thing. Cryptographic algorithms from the encryption chapter do the first.