Unicode to bytes converter

World's simplest unicode tool

This browser-based utility converts Unicode data to bytes. Anything that you paste or enter in the text area on the left is automatically converted to bytes on the right. It supports the most popular Unicode encodings (UTF-8, UTF-16, UCS-2, UTF-32, and UCS-4), and it works with emoji characters. For the multi-byte encodings, you can select the byte order and add a Byte Order Mark (BOM). For the output bytes, you can choose between binary, octal, decimal, hexadecimal, or any other radix from 2 to 36. You can also add a prefix and zero-padding to the bytes, set the output separator, and skip whitespace characters. Created by encoding gurus from team Browserling.
Unicode Encoding
Select the input encoding and the byte order format.
Insert Byte Order Mark (BOM) at the beginning of bytes for the UTF16, UTF32, UCS2, and UCS4 encodings.
Bytes Radix
Radix of the output bytes.
Custom radix from 2 to 36.
Add padding to bytes.
Add base indicator to bytes.
Print digits in uppercase.
Print base indicator in uppercase.
Separator and Whitespaces
Delimit bytes with this symbol.
Do not convert newline characters to bytes.
Do not convert tab characters to bytes.
Do not convert space characters to bytes.
What is a Unicode to bytes converter?
This utility converts Unicode characters to bytes in the given encoding and base. You can use any of the five most popular Unicode encodings (UTF-8, UTF-16, UCS-2, UTF-32, and UCS-4) and print the bytes in any base from binary (base 2) to base 36.

The encodings differ in how many bytes they need to represent each of the 1,114,112 Unicode code points. In UTF-8, a character takes 1 to 4 bytes (8, 16, 24, or 32 bits). In UTF-16, a character takes one or two 16-bit units (2 or 4 bytes). UCS-2, strictly defined, is a fixed-width 2-byte encoding that covers only code points up to U+FFFF. In UTF-32 and UCS-4, the representation is fixed-length and always uses 4 bytes (exactly 32 bits).

A sequence of two bytes is called a word and a sequence of four bytes is called a double word. There are two ways to order the bytes within words and double words: Big Endian and Little Endian. In the Big Endian format, the most significant byte is stored first, and in the Little Endian format, the least significant byte is stored first. To make the byte order easier to detect, you can add a special marker, called a Byte Order Mark (BOM), in front of the bytes.

By default, the bytes are printed in radix 16 (hex), but you can quickly switch between binary, octal, decimal, and hex output formats. For the binary output format, you can add the 0b prefix and enable padding to 8 digits. For the octal output format, you can add the o prefix and enable padding to 3 digits. For the hex output format, you can add the 0x prefix and enable padding to two digits. You can also change the case of the base prefix and of the output digits. For example, with an uppercase prefix and uppercase digits, the hex value "a0" is printed as "0XA0".

To make the output more readable, you can customize the separator that goes between the bytes and use three quick options to skip converting spaces, tabs, and newlines.
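The byte counts, byte orders, and BOM described above can be checked with a minimal Python sketch that relies only on the built-in codecs. The sample string and the helper name `hex_bytes` are illustrative assumptions and are not part of the tool.

```python
import codecs


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated, zero-padded lowercase hex values."""
    return " ".join(f"{b:02x}" for b in data)


# Hypothetical sample: U+0041 (A), U+20AC (euro sign), U+1F600 (emoji).
sample = "A\u20ac\U0001F600"

# UTF-8: 1 + 3 + 4 bytes.
print(hex_bytes(sample.encode("utf-8")))
# 41 e2 82 ac f0 9f 98 80

# UTF-16 Big Endian: 2 + 2 + 4 bytes (the emoji needs a surrogate pair).
print(hex_bytes(sample.encode("utf-16-be")))
# 00 41 20 ac d8 3d de 00

# UTF-16 Little Endian: same units, bytes swapped inside each word.
print(hex_bytes(sample.encode("utf-16-le")))
# 41 00 ac 20 3d d8 00 de

# UTF-32 Little Endian with an explicit BOM in front: 4 bytes per code point.
print(hex_bytes(codecs.BOM_UTF32_LE + sample.encode("utf-32-le")))
# ff fe 00 00 41 00 00 00 ac 20 00 00 00 f6 01 00
```

Using the explicit `-be` and `-le` codecs keeps the BOM under your control; Python's plain "utf-16" and "utf-32" codecs add a BOM on their own and pick the platform's byte order.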
Unicode to bytes converter examples
Rumi Quote
In this example, we convert a Rumi quote written in fullwidth Unicode letters to octal bytes in the UTF-8 encoding. UTF-8 uses 1 to 4 bytes for each character; each fullwidth letter here takes three bytes (for example, "Ｙ" becomes o357 o274 o271), while the ASCII quotes, spaces, and periods take one byte each. The octal bytes are padded to three digits, prefixed with "o", and separated by spaces.
"Ｙｏｕ ａｒｅ ｎｏｔ ａ ｄｒｏｐ ｉｎ ｔｈｅ ｏｃｅａｎ. Ｙｏｕ ａｒｅ ｔｈｅ ｅｎｔｉｒｅ ｏｃｅａｎ ｉｎ ａ ｄｒｏｐ." ― Ｒｕｍｉ
o042 o357 o274 o271 o357 o275 o217 o357 o275 o225 o040 o357 o275 o201 o357 o275 o222 o357 o275 o205 o040 o357 o275 o216 o357 o275 o217 o357 o275 o224 o040 o357 o275 o201 o040 o357 o275 o204 o357 o275 o222 o357 o275 o217 o357 o275 o220 o040 o357 o275 o211 o357 o275 o216 o040 o357 o275 o224 o357 o275 o210 o357 o275 o205 o040 o357 o275 o217 o357 o275 o203 o357 o275 o205 o357 o275 o201 o357 o275 o216 o056 o040 o357 o274 o271 o357 o275 o217 o357 o275 o225 o040 o357 o275 o201 o357 o275 o222 o357 o275 o205 o040 o357 o275 o224 o357 o275 o210 o357 o275 o205 o040 o357 o275 o205 o357 o275 o216 o357 o275 o224 o357 o275 o211 o357 o275 o222 o357 o275 o205 o040 o357 o275 o217 o357 o275 o203 o357 o275 o205 o357 o275 o201 o357 o275 o216 o040 o357 o275 o211 o357 o275 o216 o040 o357 o275 o201 o040 o357 o275 o204 o357 o275 o222 o357 o275 o217 o357 o275 o220 o056 o042 o040 o342 o200 o225 o040 o357 o274 o262 o357 o275 o225 o357 o275 o215 o357 o275 o211
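The same formatting can be approximated outside the browser. The following is a minimal Python sketch, assuming the "o" prefix, three-digit padding, and space separator used in this example; the function name `to_octal_bytes` is hypothetical.

```python
def to_octal_bytes(text: str) -> str:
    """Encode text as UTF-8 and print each byte as o-prefixed, 3-digit octal."""
    return " ".join(f"o{b:03o}" for b in text.encode("utf-8"))


# An ASCII quote followed by the fullwidth letters Y, o, u
# (U+FF39, U+FF4F, U+FF55), i.e. the start of the quote above.
print(to_octal_bytes('"\uff39\uff4f\uff55'))
# o042 o357 o274 o271 o357 o275 o217 o357 o275 o225
```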
Unicode Phrase
This example converts a decorative Unicode phrase to hexadecimal bytes in the UTF-32 encoding with the Little Endian byte order. UTF-32 uses 4 bytes for each code point. To make the encoding and byte order easy to detect, the BOM (0xFF, 0xFE, 0x00, 0x00) is added at the front of the output. The output bytes are comma-separated, padded to two digits, and printed with uppercase digits and the uppercase hexadecimal prefix "0X".
♥⊱╮ღ꧁𝓫𝓮 𝔂𝓸𝓾𝓻𝓼𝓮𝓵𝓯꧂ღ╭⊱♥
0XFF, 0XFE, 0X00, 0X00, 0X65, 0X26, 0X00, 0X00, 0XB1, 0X22, 0X00, 0X00, 0X6E, 0X25, 0X00, 0X00, 0XE6, 0X10, 0X00, 0X00, 0XC1, 0XA9, 0X00, 0X00, 0XEB, 0XD4, 0X01, 0X00, 0XEE, 0XD4, 0X01, 0X00, 0X20, 0X00, 0X00, 0X00, 0X02, 0XD5, 0X01, 0X00, 0XF8, 0XD4, 0X01, 0X00, 0XFE, 0XD4, 0X01, 0X00, 0XFB, 0XD4, 0X01, 0X00, 0XFC, 0XD4, 0X01, 0X00, 0XEE, 0XD4, 0X01, 0X00, 0XF5, 0XD4, 0X01, 0X00, 0XEF, 0XD4, 0X01, 0X00, 0XC2, 0XA9, 0X00, 0X00, 0XE6, 0X10, 0X00, 0X00, 0X6D, 0X25, 0X00, 0X00, 0XB1, 0X22, 0X00, 0X00, 0X65, 0X26, 0X00, 0X00, 0X0A, 0X00, 0X00, 0X00, 0X0A, 0X00, 0X00, 0X00
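As a rough cross-check of this output, here is a minimal Python sketch that prepends the UTF-32 Little Endian BOM and prints uppercase, 0X-prefixed hex bytes. The function name `to_utf32le_hex` is a hypothetical helper, and only two characters of the phrase are used to keep the output short.

```python
import codecs


def to_utf32le_hex(text: str) -> str:
    """Return BOM + UTF-32 LE bytes as comma-separated 0X-prefixed uppercase hex."""
    data = codecs.BOM_UTF32_LE + text.encode("utf-32-le")
    return ", ".join(f"0X{b:02X}" for b in data)


# U+2665 (black heart suit) and U+1D4EB (mathematical bold script small b).
print(to_utf32le_hex("\u2665\U0001d4eb"))
# 0XFF, 0XFE, 0X00, 0X00, 0X65, 0X26, 0X00, 0X00, 0XEB, 0XD4, 0X01, 0X00
```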
Vegetable Emoji
In this example, we convert a list of Unicode vegetable emojis to bytes in a custom radix, base 20. The input is first encoded as UTF-8, the result is split into individual bytes, and each byte is printed in base 20 with a hyphen between bytes. We have also enabled the options to skip newlines, tabs, and spaces, so whitespace passes through unchanged and the structure of the input is preserved.
1) broccoli: 🥦 2) tomato: 🍅 3) carrot: 🥕 4) leafy green: 🥬 5) aubergine: 🍆 6) cucumber: 🥒 7) avocado: 🥑 8) potato: 🥔
29-21 4i-5e-5b-4j-4j-5b-58-55-2i c0-7j-85-86 2a-21 5g-5b-59-4h-5g-5b-2i c0-7j-71-6d 2b-21 4j-4h-5e-5e-5b-5g-2i c0-7j-85-79 2c-21 58-51-4h-52-61 53-5e-51-51-5a-2i c0-7j-85-8c 2d-21 4h-5h-4i-51-5e-53-55-5a-51-2i c0-7j-71-6e 2e-21 4j-5h-4j-5h-59-4i-51-5e-2i c0-7j-85-76 2f-21 4h-5i-5b-4j-4h-50-5b-2i c0-7j-85-75 2g-21 5c-5b-5g-4h-5g-5b-2i c0-7j-85-78
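The steps in this example (encode to UTF-8, split into bytes, print each byte in base 20, leave whitespace alone) can be sketched in Python as follows. This is an approximation under stated assumptions, not the tool's actual implementation: the names `to_radix` and `convert` are hypothetical, and digits above 9 are assumed to be lowercase letters, as in the output above.

```python
import re
import string

# Digits 0-9 followed by a-z, which is enough for any radix up to 36.
DIGITS = string.digits + string.ascii_lowercase


def to_radix(value: int, radix: int) -> str:
    """Convert a non-negative integer to a string in the given radix (2 to 36)."""
    if not 2 <= radix <= 36:
        raise ValueError("radix must be between 2 and 36")
    if value == 0:
        return "0"
    out = []
    while value:
        value, remainder = divmod(value, radix)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def convert(text: str, radix: int = 20, sep: str = "-") -> str:
    """Convert non-whitespace runs to bytes; copy spaces, tabs, newlines as-is."""
    pieces = []
    # The capturing group keeps the whitespace runs in the split result.
    for part in re.split(r"([ \t\n]+)", text):
        if not part:
            continue
        if part[0] in " \t\n":
            pieces.append(part)
        else:
            encoded = part.encode("utf-8")
            pieces.append(sep.join(to_radix(b, radix) for b in encoded))
    return "".join(pieces)


# First list item from the example; U+1F966 is the broccoli emoji.
print(convert("1) broccoli: \U0001F966"))
# 29-21 4i-5e-5b-4j-4j-5b-58-55-2i c0-7j-85-86
```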
Pro tips
You can pass input to this tool via the ?input query argument, and it will automatically compute the output. Here's how to type it in your browser's address bar.
https://onlineunicodetools.com/convert-unicode-to-bytes?input=%22%EF%BC%B9%EF%BD%8F%EF%BD%95%20%EF%BD%81%EF%BD%92%EF%BD%85%20%EF%BD%8E%EF%BD%8F%EF%BD%94%20%EF%BD%81%20%EF%BD%84%EF%BD%92%EF%BD%8F%EF%BD%90%20%EF%BD%89%EF%BD%8E%20%EF%BD%94%EF%BD%88%EF%BD%85%20%EF%BD%8F%EF%BD%83%EF%BD%85%EF%BD%81%EF%BD%8E.%20%EF%BC%B9%EF%BD%8F%EF%BD%95%20%EF%BD%81%EF%BD%92%EF%BD%85%20%EF%BD%94%EF%BD%88%EF%BD%85%20%EF%BD%85%EF%BD%8E%EF%BD%94%EF%BD%89%EF%BD%92%EF%BD%85%20%EF%BD%8F%EF%BD%83%EF%BD%85%EF%BD%81%EF%BD%8E%20%EF%BD%89%EF%BD%8E%20%EF%BD%81%20%EF%BD%84%EF%BD%92%EF%BD%8F%EF%BD%90.%22%20%E2%80%95%20%EF%BC%B2%EF%BD%95%EF%BD%8D%EF%BD%89&encoding=utf8&bom=false&base=octal&padding=true&prefix=true&uppercase-base=false&uppercase-prefix=false&separator=%20&skip-newlines=false&skip-tabs=false&skip-spaces=false
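A link like the one above can also be assembled programmatically. The sketch below uses Python's standard `urllib.parse` module and copies the parameter names and values from the example URL; whether the site accepts other values for these parameters is an assumption, and the function name `build_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

BASE = "https://onlineunicodetools.com/convert-unicode-to-bytes"


def build_url(text: str) -> str:
    """Build a tool URL with the same options as the example link above."""
    options = {
        "input": text,
        "encoding": "utf8",
        "bom": "false",
        "base": "octal",
        "padding": "true",
        "prefix": "true",
        "uppercase-base": "false",
        "uppercase-prefix": "false",
        "separator": " ",
        "skip-newlines": "false",
        "skip-tabs": "false",
        "skip-spaces": "false",
    }
    # quote_via=quote percent-encodes spaces as %20 instead of "+".
    return BASE + "?" + urlencode(options, quote_via=quote)


# An ASCII quote followed by the fullwidth letter Y (U+FF39).
print(build_url('"\uff39'))
```

Non-ASCII input is percent-encoded as UTF-8 bytes, which is why each fullwidth letter in the example link appears as a three-part sequence such as %EF%BC%B9.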