What Is Base64? How It Works, Use Cases, UTF-8 and URL-Safe Encoding

Base64 is an encoding scheme that represents binary data, such as images or compressed files, using only 64 printable characters made up mostly of letters and digits. It is widely used to carry binary through places where only text can safely travel, such as email, Data URLs and JSON. In this article we lay out how it works, what it is for, the UTF-8 caveat and URL-safe Base64, together with a real conversion example.

Important: Base64 is not encryption. It uses no key, and anyone can easily convert it back. It merely turns data into a form that can be "carried safely as text"; it cannot protect secrets.
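
To make the point concrete, here is a minimal sketch using the standard btoa() and atob() functions (available in browsers and recent Node.js). The sample value "password" is made up for illustration.

```javascript
// Illustrative only: "password" is a made-up secret, not from the article.
const encoded = btoa('password'); // no key is involved in encoding
const decoded = atob(encoded);    // and none is needed to reverse it

console.log(encoded); // cGFzc3dvcmQ=
console.log(decoded); // password
```

Anyone who sees the encoded string can run the same atob() call, which is why Base64 offers no secrecy.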

1. What is Base64?

Base64 is a conversion that expresses arbitrary binary (a sequence of bytes taking values 0 to 255) using only the 64 characters A–Z, a–z, 0–9, + and /. These are "printable ASCII characters" that almost every system handles safely, so binary can travel intact even over text-only paths.

The "64" in the name comes from the fact that 64 distinct characters are used for the representation. As noted above, this is a reversible encoding, not encryption and not compression. In fact, as we will see, the size actually grows.

2. How it works — 3 bytes into 4 characters

The basic unit of Base64 is 3 bytes (= 24 bits). These 24 bits are split into four groups of 6 bits, and each 6-bit value (0 to 63) is mapped through the 64-character table to produce 4 characters.
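
The 3-bytes-to-4-characters step can be sketched with bit shifts. This is a minimal illustration, not a full encoder: the helper name encodeGroup is hypothetical, and it assumes exactly 3 input bytes (no padding handling).

```javascript
// The 64-character table: A-Z, a-z, 0-9, + and /.
const TABLE =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: encodes exactly 3 bytes into 4 Base64 characters.
function encodeGroup(b0, b1, b2) {
  // Concatenate the three bytes into one 24-bit number.
  const bits = (b0 << 16) | (b1 << 8) | b2;
  // Take four 6-bit slices (mask 0x3f = 63), most significant first.
  return (
    TABLE[(bits >> 18) & 0x3f] +
    TABLE[(bits >> 12) & 0x3f] +
    TABLE[(bits >> 6) & 0x3f] +
    TABLE[bits & 0x3f]
  );
}

// 77, 97, 110 are the ASCII codes of "M", "a", "n".
console.log(encodeGroup(77, 97, 110)); // TWFu
```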

The role of padding (=)

When the source data is not a multiple of 3 bytes, the last group falls short of 24 bits. The encoder fills the final 6-bit unit with zero bits and then pads the output: if 1 byte remains, it produces 2 characters followed by ==; if 2 bytes remain, it produces 3 characters followed by =. This padding keeps the output in groups of 4 characters. The = is not data but "filler for length alignment".
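
The padding rule is easy to observe with the built-in btoa(). The inputs below are plain ASCII strings chosen for illustration, so each character is exactly one byte.

```javascript
// btoa() treats each character of these ASCII strings as one byte.
const oneByte = btoa('M');      // 1 byte left over: two '=' appended
const twoBytes = btoa('Ma');    // 2 bytes left over: one '=' appended
const threeBytes = btoa('Man'); // exact multiple of 3: no padding

console.log(oneByte);    // TQ==
console.log(twoBytes);   // TWE=
console.log(threeBytes); // TWFu
```

In all three cases the output length is a multiple of 4.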

3. A simple conversion example (Man → TWFu)

As a classic example, let us convert the string Man. Man is exactly 3 bytes, so no padding is needed.

  1. Take each character's ASCII code: M=77, a=97, n=110.
  2. Write them as 8-bit binary: 01001101 01100001 01101110 (concatenated, 24 bits).
  3. Split into 6-bit groups: 010011 010110 000101 101110.
  4. Convert to decimal: 19 22 5 46.
  5. Map through the table: 19=T, 22=W, 5=F, 46=u.

The result is that Man becomes TWFu. Notice how 3 bytes turned into 4 characters.
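
The five steps above can be reproduced literally in code. This sketch deliberately works with binary strings rather than bit operations so that each intermediate value matches the list; the variable names are chosen for illustration.

```javascript
const TABLE =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

const text = 'Man';

// Step 1: ASCII codes.
const codes = [...text].map((ch) => ch.charCodeAt(0)); // [77, 97, 110]

// Step 2: 8-bit binary, concatenated into one 24-bit string.
const bits = codes.map((c) => c.toString(2).padStart(8, '0')).join('');

// Step 3: split into 6-bit groups.
const groups = bits.match(/.{6}/g); // ['010011', '010110', '000101', '101110']

// Step 4: convert each group to decimal.
const values = groups.map((g) => parseInt(g, 2)); // [19, 22, 5, 46]

// Step 5: look each value up in the table.
const result = values.map((v) => TABLE[v]).join('');

console.log(result); // TWFu
```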

4. Why it is used (use cases)

Base64 is used to "carry binary over paths that can only safely pass text". Typical situations include email attachments (MIME), Data URLs that embed small images, and binary values carried inside JSON.
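
As a rough sketch of two such text-only paths, the snippet below places the same bytes into a JSON document and into a Data URL. The four sample bytes, the field names and the generic media type are assumptions made for illustration.

```javascript
// Hypothetical payload: four arbitrary bytes standing in for binary data.
const bytes = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);

// Build a "binary string" (one character per byte) so btoa() can encode it.
const base64 = btoa(String.fromCharCode(...bytes));

// JSON has no native binary type, so the bytes travel as a Base64 string.
const json = JSON.stringify({ name: 'sample.bin', data: base64 });

// The same bytes embedded in a Data URL with a generic binary media type.
const dataUrl = `data:application/octet-stream;base64,${base64}`;

console.log(base64);  // iVBORw==
console.log(json);    // {"name":"sample.bin","data":"iVBORw=="}
console.log(dataUrl); // data:application/octet-stream;base64,iVBORw==
```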

5. The UTF-8 caveat (Japanese and emoji)

The browser's btoa() only accepts strings in which every character is in the Latin-1 range (code points 0 to 255), treating each character as one byte. So passing a string containing Japanese or emoji directly, as in btoa('こんにちは'), throws an exception (InvalidCharacterError).
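
The failure can be observed directly. This is a small sketch; the exact error message varies by environment, so only the error name is recorded.

```javascript
// A character inside the Latin-1 range is accepted (U+00E9 is one byte, 0xE9).
const latin1 = btoa('\u00e9');
console.log(latin1); // 6Q==

// Characters above code point 255 make btoa() throw.
let errorName = null;
try {
  btoa('こんにちは');
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // "InvalidCharacterError" in browsers
```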

The correct approach is to first convert the string into UTF-8 bytes and then apply Base64. The long-standing way is btoa(unescape(encodeURIComponent(s))).
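
A sketch of that long-standing idiom, btoa(unescape(encodeURIComponent(s))), together with its inverse. The function names are hypothetical wrappers added here for readability. Note that unescape() and escape() are deprecated, although they remain widely available.

```javascript
// encodeURIComponent() percent-encodes the UTF-8 bytes of the string, and
// unescape() turns each %XX back into a single character in the 0-255 range,
// which is exactly the form btoa() accepts.
function utf8ToBase64Legacy(s) {
  return btoa(unescape(encodeURIComponent(s)));
}

// The inverse: atob() yields one character per byte, escape() percent-encodes
// them, and decodeURIComponent() interprets the result as UTF-8.
function base64ToUtf8Legacy(b64) {
  return decodeURIComponent(escape(atob(b64)));
}

const encoded = utf8ToBase64Legacy('こんにちは');
console.log(encoded);                     // 44GT44KT44Gr44Gh44Gv
console.log(base64ToUtf8Legacy(encoded)); // こんにちは
```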

Today it is recommended to use TextEncoder / TextDecoder to turn the string into a Uint8Array of UTF-8 bytes before encoding. Either way, the key point is "to UTF-8 bytes first". The tool on this site performs this conversion internally, so it handles Japanese and emoji as-is.
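
A minimal sketch of the TextEncoder / TextDecoder approach might look like the following. The function names are hypothetical, and this is not necessarily how the tool on this site implements it.

```javascript
function utf8ToBase64(s) {
  const bytes = new TextEncoder().encode(s); // Uint8Array of UTF-8 bytes
  // Build a string with one character per byte so btoa() accepts it.
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

function base64ToUtf8(b64) {
  const binary = atob(b64); // one character per byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes); // interpret the bytes as UTF-8
}

console.log(utf8ToBase64('こんにちは'));           // 44GT44KT44Gr44Gh44Gv
console.log(base64ToUtf8('44GT44KT44Gr44Gh44Gv')); // こんにちは
```

Building the intermediate string in a loop, rather than spreading the whole array into String.fromCharCode(), avoids argument-count limits on large inputs.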

6. URL-safe Base64 and the size question

The + and / used in standard Base64 have special meaning in URLs and file names (/ is a path separator, and + may be interpreted as a space), so they can be awkward as-is. For this reason, RFC 4648 section 5 defines a URL-safe variant that replaces them.

Item | Standard Base64 | URL-safe Base64
Character for value 62 | + | - (hyphen)
Character for value 63 | / | _ (underscore)
Padding | = appended | Often omitted
Main use | MIME, Data URLs, etc. | URLs, file names, JWT, etc.
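
Converting between the two variants is a matter of character substitution. The sketch below assumes the padding-free form on the URL-safe side; the function names are hypothetical.

```javascript
// Standard Base64 to URL-safe Base64 (padding dropped).
function toBase64Url(b64) {
  return b64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// URL-safe Base64 back to standard Base64.
function fromBase64Url(b64url) {
  const b64 = b64url.replace(/-/g, '+').replace(/_/g, '/');
  // Restore padding so the length is a multiple of 4 again.
  return b64 + '='.repeat((4 - (b64.length % 4)) % 4);
}

// The bytes 0xFB 0xFF happen to produce both special characters.
const standard = btoa(String.fromCharCode(0xfb, 0xff));
const urlSafe = toBase64Url(standard);

console.log(standard);               // +/8=
console.log(urlSafe);                // -_8
console.log(fromBase64Url(urlSafe)); // +/8=
```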

One more practical point is size. Because every 3 bytes become 4 characters, the output is 4/3 the size of the source data, or about 33% larger (and even more once line breaks are included). It is convenient for embedding small images, but for large files keep the increase in mind.
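
The growth can be computed ahead of time. This sketch gives the length of padded Base64 output without line breaks; the function name is hypothetical.

```javascript
// Padded Base64 length for n input bytes:
// 4 characters for every complete or partial 3-byte group.
function base64Length(n) {
  return 4 * Math.ceil(n / 3);
}

console.log(base64Length(3));       // 4
console.log(base64Length(1000));    // 1336
console.log(base64Length(1000000)); // 1333336
```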

Frequently Asked Questions (FAQ)

Is Base64 encryption?

No. Base64 is not encryption; it is an encoding that converts binary data into 64 printable characters. It uses no key and anyone can convert it back to the original data, so it cannot protect confidential information. If secrecy is required, apply encryption separately.

Why does Japanese text come out garbled?

The browser's btoa() only handles the Latin-1 range (0 to 255), so passing characters outside that range, such as Japanese or emoji, causes an error or garbled output. You must first convert the string into UTF-8 bytes before applying Base64. The classic way is btoa(unescape(encodeURIComponent(s))), and using TextEncoder is recommended today.

What is URL-safe Base64?

The "+" and "/" used in standard Base64 have special meaning in URLs and file names, so RFC 4648 section 5 defines a variant that replaces them with "-" and "_". This is URL-safe Base64, widely used in JWTs and similar. The "=" padding is often omitted.
