Base64 is an encoding scheme that represents binary data, such as images or compressed files, using only 64 printable characters made up mostly of letters and digits. It is widely used to carry binary through places where only text can safely travel, such as email, Data URLs and JSON. In this article we lay out how it works, what it is for, the UTF-8 caveat and URL-safe Base64, together with a real conversion example.
1. What is Base64?
Base64 is a conversion that expresses arbitrary binary (a sequence of bytes taking values 0 to 255) using only the 64 characters A–Z, a–z, 0–9, + and /. These are "printable ASCII characters" that almost every system handles safely, so binary can travel intact even over text-only paths.
The "64" in the name comes from the fact that 64 distinct characters are used for the representation. Base64 is a reversible encoding, not encryption and not compression. In fact, as we will see, the size actually grows.
2. How it works — 3 bytes into 4 characters
The basic unit of Base64 is 3 bytes (= 24 bits). These 24 bits are split into four groups of 6 bits, and each 6-bit value (0 to 63) is mapped through the 64-character table to produce 4 characters.
- One byte is 8 bits, so 8 × 3 = 24 bits are handled together.
- The 24 bits are divided into 6 bits × 4 (6 bits give 64 possibilities, 0 to 63).
- Each 6-bit value is turned into a character using the table A=0, B=1, …, Z=25, a=26, …, z=51, 0=52, …, 9=61, +=62, /=63.
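The grouping described above can be sketched in JavaScript as follows. This is a minimal illustration rather than a complete encoder: the names `ALPHABET` and `encodeGroup` are hypothetical, and the function assumes exactly three input bytes, so padding is not handled.

```javascript
// The 64-character table: A-Z (0-25), a-z (26-51), 0-9 (52-61), + (62), / (63).
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
  "abcdefghijklmnopqrstuvwxyz" +
  "0123456789+/";

// Encode exactly three bytes (each 0 to 255) into four Base64 characters.
function encodeGroup(b0, b1, b2) {
  // Pack the three bytes into one 24-bit number.
  const bits = (b0 << 16) | (b1 << 8) | b2;
  // Slice it into four 6-bit values (0 to 63), most significant first.
  const indices = [
    (bits >> 18) & 0x3f,
    (bits >> 12) & 0x3f,
    (bits >> 6) & 0x3f,
    bits & 0x3f,
  ];
  // Look each value up in the table.
  return indices.map((i) => ALPHABET[i]).join("");
}

console.log(encodeGroup(0, 0, 0));       // "AAAA"
console.log(encodeGroup(255, 255, 255)); // "////"
```

A full encoder would apply this to every 3-byte group in turn and treat a shorter final group separately.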
The role of padding (=)
When the source data is not a multiple of 3 bytes, it does not divide cleanly into 6-bit units, so the leftover bits are filled with zeros to complete the last 6-bit group. If 1 byte remains, == is appended to the end of the output; if 2 bytes remain, = is appended. This padding keeps the output in groups of 4 characters. The = is not data but "filler for length alignment".
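To see the padding rule in practice, the sketch below encodes inputs of 1, 2 and 3 bytes with the built-in btoa(), which is available in browsers and in recent Node.js versions. The helper name `paddingFor` is hypothetical and exists only for this illustration.

```javascript
// Number of "=" characters appended for an input of byteLength bytes.
function paddingFor(byteLength) {
  const remainder = byteLength % 3;
  // 1 byte left over -> 2 padding characters, 2 bytes left over -> 1.
  return remainder === 0 ? 0 : 3 - remainder;
}

for (const input of ["M", "Ma", "Man"]) {
  const encoded = btoa(input); // ASCII only, so btoa() is safe here
  console.log(input, "->", encoded, `(padding: ${paddingFor(input.length)})`);
}
// M -> TQ== (padding: 2)
// Ma -> TWE= (padding: 1)
// Man -> TWFu (padding: 0)
```

In every case the output length is a multiple of 4.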
3. A simple conversion example (Man → TWFu)
As a classic example, let us convert the string Man. Man is exactly 3 bytes, so no padding is needed.
- Take each character's ASCII code: M=77, a=97, n=110.
- Write them as 8-bit binary: 01001101 01100001 01101110 (concatenated, 24 bits).
- Split into 6-bit groups: 010011 010110 000101 101110.
- Convert to decimal: 19 22 5 46.
- Map through the table: 19=T, 22=W, 5=F, 46=u.
The result is that Man becomes TWFu. Notice how 3 bytes turned into 4 characters.
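The same walk-through can be reproduced step by step in code. The following is a sketch for illustration only: `traceGroup` and `TABLE` are hypothetical names, and the function assumes a 3-character ASCII input so that no padding is involved.

```javascript
const TABLE =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Return every intermediate value of the conversion for a 3-character ASCII string.
function traceGroup(text) {
  // Step 1: ASCII code of each character.
  const codes = [...text].map((ch) => ch.charCodeAt(0));
  // Step 2: 8-bit binary for each code, concatenated into 24 bits.
  const bits = codes.map((c) => c.toString(2).padStart(8, "0")).join("");
  // Step 3: split into 6-bit groups.
  const groups = bits.match(/.{6}/g);
  // Step 4: convert each group to decimal.
  const indices = groups.map((g) => parseInt(g, 2));
  // Step 5: map through the table.
  const result = indices.map((i) => TABLE[i]).join("");
  return { codes, bits, groups, indices, result };
}

const trace = traceGroup("Man");
console.log(trace.codes.join(" "));   // 77 97 110
console.log(trace.groups.join(" "));  // 010011 010110 000101 101110
console.log(trace.indices.join(" ")); // 19 22 5 46
console.log(trace.result);            // TWFu
```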
4. Why it is used (use cases)
Base64 is used to "carry binary over paths that can only safely pass text". Typical situations include:
- Email (MIME): the standard encoding for putting attachments and images into a text-based email body.
- Data URLs: embedding images and similar directly into HTML/CSS, as in data:image/png;base64,iVBORw0K..., avoiding a separate file load.
- JSON / HTTP headers: safely storing binary or non-UTF-8 data in fields that assume text.
- Basic authentication / JWT: passing credentials or tokens as text (note: this is readable, so not for secrecy).
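As a rough illustration of two of these uses, the sketch below builds a Data URL and a Basic authentication header value. The helper names `toDataUrl` and `basicAuthValue` and the credentials `user:pass` are made up for this example, and both helpers assume ASCII input so that btoa() can be applied directly.

```javascript
// Build a Data URL from a MIME type and an ASCII string.
function toDataUrl(mimeType, asciiText) {
  return `data:${mimeType};base64,${btoa(asciiText)}`;
}

// Build the value of an "Authorization: Basic ..." header.
function basicAuthValue(user, password) {
  return "Basic " + btoa(`${user}:${password}`);
}

console.log(toDataUrl("text/plain", "Hello")); // data:text/plain;base64,SGVsbG8=

const header = basicAuthValue("user", "pass");
console.log(header); // Basic dXNlcjpwYXNz

// Anyone can reverse it, which is why Base64 gives no secrecy.
console.log(atob(header.slice("Basic ".length))); // user:pass
```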
5. The UTF-8 caveat (Japanese and emoji)
The browser's btoa() only accepts strings whose characters all fall in the Latin-1 range (code points 0 to 255), treating each character as one byte. So passing a string containing Japanese or emoji directly, as in btoa('こんにちは'), throws an InvalidCharacterError exception.
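The behaviour is easy to check. The sketch below wraps btoa() in a hypothetical helper, `tryBtoa`, that reports the error name instead of throwing; it assumes an environment where btoa() is a global, such as a browser or a recent Node.js version.

```javascript
// Return the Base64 text, or the error name if btoa() rejects the input.
function tryBtoa(input) {
  try {
    return btoa(input);
  } catch (err) {
    return `error: ${err.name}`;
  }
}

console.log(tryBtoa("abc"));        // YWJj
console.log(tryBtoa("é"));          // 6Q== (U+00E9 is inside Latin-1, so one byte: 0xE9)
console.log(tryBtoa("こんにちは")); // error: InvalidCharacterError
```

Note that the second result encodes the single Latin-1 byte 0xE9, not the two-byte UTF-8 form of the character.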
The correct approach is to first convert the string into UTF-8 bytes and then apply Base64. The long-standing way is as follows.
- Encode: btoa(unescape(encodeURIComponent(s))). Here encodeURIComponent turns the string into UTF-8 percent-escapes, and unescape brings each byte back to a Latin-1 character before passing it to btoa.
- Decode: decodeURIComponent(escape(atob(b))). The reverse procedure restores the original string.
Today, the recommended approach is to use TextEncoder / TextDecoder to turn the string into a Uint8Array of UTF-8 bytes before encoding. Either way, the key point is "to UTF-8 bytes first". The tool on this site performs this conversion internally, so it handles Japanese and emoji as-is.
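A minimal sketch of the TextEncoder / TextDecoder route might look like this. The function names `utf8ToBase64` and `base64ToUtf8` are hypothetical, and this is not necessarily how any particular tool implements the conversion.

```javascript
// Encode any JavaScript string as Base64 of its UTF-8 bytes.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  // btoa() wants a string with one character per byte, so build one.
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Decode Base64 that holds UTF-8 bytes back into a JavaScript string.
function base64ToUtf8(b64) {
  const binary = atob(b64); // one character per byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

const encoded = utf8ToBase64("こんにちは");
console.log(encoded);               // 44GT44KT44Gr44Gh44Gv
console.log(base64ToUtf8(encoded)); // こんにちは
```

Building the intermediate string one character at a time is simple but can be slow for very large inputs, so treat this as a starting point rather than a tuned implementation.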
6. URL-safe Base64 and the size question
The + and / used in standard Base64 have special meaning in URLs and file names (/ is a path separator, and + may be interpreted as a space), so they can be awkward as-is. For this reason, RFC 4648 section 5 defines a URL-safe variant that replaces them.
| Item | Standard Base64 | URL-safe Base64 |
|---|---|---|
| 62nd character | + | - (hyphen) |
| 63rd character | / | _ (underscore) |
| Padding | = appended | Often omitted |
| Main use | MIME, Data URLs, etc. | URLs, file names, JWT, etc. |
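The differences in the table amount to a pair of character substitutions plus optional padding. The following sketch converts between the two forms; `toUrlSafe` and `fromUrlSafe` are hypothetical helper names, and the code assumes that padding is dropped in the URL-safe form, which is common but not universal.

```javascript
// Standard Base64 -> URL-safe Base64 without padding.
function toUrlSafe(b64) {
  return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// URL-safe Base64 (padded or not) -> standard Base64 with padding.
function fromUrlSafe(urlSafe) {
  const restored = urlSafe.replace(/-/g, "+").replace(/_/g, "/");
  // Add "=" until the length is a multiple of 4 again.
  const padLength = (4 - (restored.length % 4)) % 4;
  return restored + "=".repeat(padLength);
}

// The bytes 0xFB 0xFF happen to produce both special characters.
const standardForm = btoa(String.fromCharCode(0xfb, 0xff));
console.log(standardForm);            // +/8=
console.log(toUrlSafe(standardForm)); // -_8
console.log(fromUrlSafe("-_8"));      // +/8=
```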
One more practical point is size. Because every 3 bytes become 4 characters, the output is 4/3 the size of the source data, that is, about 33% larger (and even more once line breaks are included). It is convenient for embedding small images, but for large files keep the increase in mind.
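The growth can be estimated before encoding anything. The sketch below computes the output length from the input length; `base64Length` is a hypothetical helper, and it deliberately ignores any line breaks that a format such as MIME may add.

```javascript
// Number of Base64 characters needed for byteLength bytes of input.
function base64Length(byteLength, padded = true) {
  return padded
    ? 4 * Math.ceil(byteLength / 3)     // always whole 4-character groups
    : Math.ceil((byteLength * 4) / 3);  // padding omitted
}

for (const n of [1, 3, 100, 1000000]) {
  const size = base64Length(n);
  console.log(`${n} bytes -> ${size} characters (x${(size / n).toFixed(3)})`);
}
// 1 bytes -> 4 characters (x4.000)
// 3 bytes -> 4 characters (x1.333)
// 100 bytes -> 136 characters (x1.360)
// 1000000 bytes -> 1333336 characters (x1.333)
```

For very small inputs the padding dominates, while for large inputs the ratio settles at roughly 4/3.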
Frequently Asked Questions (FAQ)
Is Base64 encryption?
No. Base64 is not encryption; it is an encoding that converts binary data into 64 printable characters. It uses no key and anyone can convert it back to the original data, so it cannot protect confidential information. If secrecy is required, apply encryption separately.
Why does Japanese text come out garbled?
The browser's btoa() only handles the Latin-1 range (0 to 255), so passing characters outside that range, such as Japanese or emoji, causes an error or garbled output. You must first convert the string into UTF-8 bytes before applying Base64. The classic way is btoa(unescape(encodeURIComponent(s))), and using TextEncoder is recommended today.
What is URL-safe Base64?
The "+" and "/" used in standard Base64 have special meaning in URLs and file names, so RFC 4648 section 5 defines a variant that replaces them with "-" and "_". This is URL-safe Base64, widely used in JWTs and similar. The "=" padding is often omitted.