What is UTF-8 Encoding? A Guide for Non-Programmers - Deepstash

What Is UTF-8

UTF-8 stands for “Unicode Transformation Format - 8 bits.” That’s not helpful to us yet, so let’s rewind to the basics.

What Is Encoding?

Encoding is the process of converting characters in human languages into binary sequences that computers can process.
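
As a minimal sketch of that conversion, the Python snippet below turns a short piece of text into the bits a computer would store. The helper name `to_bits` is hypothetical, and using UTF-8 as the default encoding is an assumption made for illustration.

```python
def to_bits(text: str, encoding: str = "utf-8") -> list[str]:
    """Return each encoded byte of `text` as an 8-character bit string."""
    # str.encode() produces the raw bytes; format(..., "08b") shows each
    # byte as eight binary digits, padded with leading zeros.
    return [format(byte, "08b") for byte in text.encode(encoding)]


print(to_bits("Hi"))  # ['01001000', '01101001']
```

Each string in the result is one byte: the first stands for "H" and the second for "i".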

ASCII: Converting Symbols to Binary

The American Standard Code for Information Interchange (ASCII) was an early standardized encoding system for text.

ASCII’s library includes every upper-case and lower-case letter of the English alphabet, every digit from 0 to 9, and some symbols (like /, !, and ?). It assigns each of these characters a unique numeric code between 0 and 127, which fits in a single byte.

But ASCII is limited. A byte can take 256 different values, and ASCII uses only 128 of them, so it has just 128 ways to represent a character. When ASCII was first published in 1963, this was okay, since developers needed only 128 codes to represent all the English characters and symbols they needed. It leaves no room, however, for the characters of most other languages.
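
To make the limit concrete, here is a small Python sketch that looks up a few character codes and then tries to encode a word containing a non-English letter. The helper `is_ascii_char` is a hypothetical name introduced only for this example.

```python
# ord() gives a character's numeric code; chr() goes the other way.
print(ord("A"))  # 65
print(chr(97))   # a


def is_ascii_char(ch: str) -> bool:
    """True if the single character `ch` has a code in ASCII's range, 0 to 127."""
    return ord(ch) < 128


print(is_ascii_char("?"))  # True
print(is_ascii_char("é"))  # False

# Encoding text that ASCII cannot represent raises an error.
try:
    "café".encode("ascii")
except UnicodeEncodeError as err:
    # err.object is the text being encoded, err.start the failing position.
    print("not ASCII:", err.object[err.start])  # not ASCII: é
```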

Unicode: A Way to Store Every Symbol, Ever

Enter Unicode, a character standard that solves the space issue of ASCII. Like ASCII, Unicode assigns a unique code, called a code point, to each character. However, Unicode has room for over a million code points (1,114,112 in total), more than enough to account for every character in any language.
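
Code points are usually written as "U+" followed by a hexadecimal number. The following Python sketch prints the code points of a few characters from different scripts; `code_point` is a hypothetical helper name, not part of any library.

```python
import sys


def code_point(ch: str) -> str:
    """Format a character's Unicode code point in the usual U+XXXX notation."""
    return f"U+{ord(ch):04X}"


for ch in ["A", "é", "中", "😀"]:
    print(ch, code_point(ch))
# A U+0041
# é U+00E9
# 中 U+4E2D
# 😀 U+1F600

# Python exposes the highest code point it supports; adding 1 (to count
# code point 0) gives the size of Unicode's code space.
print(sys.maxunicode + 1)  # 1114112
```

Note that "A" keeps the same code it has in ASCII (65, or 41 in hexadecimal).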

UTF-8: Unicode Transformation Format

UTF-8 is an encoding system for Unicode. It can translate any Unicode character to a matching unique binary string, and can also translate the binary string back to a Unicode character.

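UTF-8 is currently the most popular encoding method on the internet because it can efficiently store text containing any character: it uses between one and four bytes per character, and the characters that ASCII covers take just one byte each.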
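
The two-way translation can be sketched in a few lines of Python. The helper `utf8_bytes` is a hypothetical name used for illustration; it shows the bytes UTF-8 produces for a character, written in hexadecimal.

```python
def utf8_bytes(ch: str) -> str:
    """Return the UTF-8 bytes of `ch` as space-separated hexadecimal."""
    return ch.encode("utf-8").hex(" ")


for ch in ["A", "é", "€", "😀"]:
    print(ch, utf8_bytes(ch))
# A 41
# é c3 a9
# € e2 82 ac
# 😀 f0 9f 98 80

# Decoding the bytes gives back the original text.
original = "naïve café ☕"
round_trip = original.encode("utf-8").decode("utf-8")
print(round_trip == original)  # True
```

The English letter takes one byte, while the other characters take two, three, and four bytes respectively.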
