ASCII Table
This ASCII-Table is mostly derived from the Unicode 15.0 Documentation.
Mostly Printable Characters
Codes 0x20 to 0x7E are referred to as the printable characters; 0x7F (DEL) is a control character. Wikipedia has more background information.
Char | Hex | Dec |
---|---|---|
SPACE | 20 | 32 |
! | 21 | 33 |
" | 22 | 34 |
# | 23 | 35 |
$ | 24 | 36 |
% | 25 | 37 |
& | 26 | 38 |
' | 27 | 39 |
( | 28 | 40 |
) | 29 | 41 |
* | 2a | 42 |
+ | 2b | 43 |
, | 2c | 44 |
- | 2d | 45 |
. | 2e | 46 |
/ | 2f | 47 |
0 | 30 | 48 |
1 | 31 | 49 |
2 | 32 | 50 |
3 | 33 | 51 |
4 | 34 | 52 |
5 | 35 | 53 |
6 | 36 | 54 |
7 | 37 | 55 |
8 | 38 | 56 |
9 | 39 | 57 |
: | 3a | 58 |
; | 3b | 59 |
< | 3c | 60 |
= | 3d | 61 |
> | 3e | 62 |
? | 3f | 63 |
Char | Hex | Dec |
---|---|---|
@ | 40 | 64 |
A | 41 | 65 |
B | 42 | 66 |
C | 43 | 67 |
D | 44 | 68 |
E | 45 | 69 |
F | 46 | 70 |
G | 47 | 71 |
H | 48 | 72 |
I | 49 | 73 |
J | 4a | 74 |
K | 4b | 75 |
L | 4c | 76 |
M | 4d | 77 |
N | 4e | 78 |
O | 4f | 79 |
P | 50 | 80 |
Q | 51 | 81 |
R | 52 | 82 |
S | 53 | 83 |
T | 54 | 84 |
U | 55 | 85 |
V | 56 | 86 |
W | 57 | 87 |
X | 58 | 88 |
Y | 59 | 89 |
Z | 5a | 90 |
[ | 5b | 91 |
\ | 5c | 92 |
] | 5d | 93 |
^ | 5e | 94 |
_ | 5f | 95 |
Char | Hex | Dec |
---|---|---|
` | 60 | 96 |
a | 61 | 97 |
b | 62 | 98 |
c | 63 | 99 |
d | 64 | 100 |
e | 65 | 101 |
f | 66 | 102 |
g | 67 | 103 |
h | 68 | 104 |
i | 69 | 105 |
j | 6a | 106 |
k | 6b | 107 |
l | 6c | 108 |
m | 6d | 109 |
n | 6e | 110 |
o | 6f | 111 |
p | 70 | 112 |
q | 71 | 113 |
r | 72 | 114 |
s | 73 | 115 |
t | 74 | 116 |
u | 75 | 117 |
v | 76 | 118 |
w | 77 | 119 |
x | 78 | 120 |
y | 79 | 121 |
z | 7a | 122 |
{ | 7b | 123 |
\| | 7c | 124 |
} | 7d | 125 |
~ | 7e | 126 |
DEL | 7f | 127 |
Note on the `DEL` code: DEL is 0x7F, meaning it has all 7 bits set. This probably originates from how one "deleted" something from paper tape: by punching all of the holes, one could "erase" the original character. Its meaning is ambiguous, but it usually refers to a backspace.
Fun Fact: The lowercase characters in ASCII were an "afterthought" and only added later in 1965, two years after the initial release.
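The ranges above can be checked with a few lines of Python. This is only a sketch: the names `is_printable_ascii` and `table_row` are made up for illustration, and the row layout simply mirrors the Char | Hex | Dec tables above.

```python
# Hypothetical helper names; the ranges come from the tables above.
PRINTABLE_FIRST = 0x20  # SPACE
PRINTABLE_LAST = 0x7E   # ~
DEL = 0x7F              # control character, all 7 bits set


def is_printable_ascii(code: int) -> bool:
    """Return True if code lies in the printable range 0x20..0x7E."""
    return PRINTABLE_FIRST <= code <= PRINTABLE_LAST


def table_row(code: int) -> str:
    """Format one row in the same Char | Hex | Dec layout as the tables."""
    if code == 0x20:
        char = "SPACE"
    elif code == DEL:
        char = "DEL"
    else:
        char = chr(code)
    return f"{char} | {code:02x} | {code} |"


print(table_row(0x41))          # A | 41 | 65 |
print(is_printable_ascii(DEL))  # False
print(f"{DEL:07b}")             # 1111111
```

Calling `table_row` for every code from 0x20 to 0x7F would reproduce the three tables, apart from the `|` character, which needs escaping inside a pipe table.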
Control characters
Char | Hex | Oct | Dec | Name | C Esc |
---|---|---|---|---|---|
NUL | 00 | 000 | 0 | Null | |
SOH | 01 | 001 | 1 | Start Of Heading | |
STX | 02 | 002 | 2 | Start Of Text | |
ETX | 03 | 003 | 3 | End Of Text | |
EOT | 04 | 004 | 4 | End Of Transmission | |
ENQ | 05 | 005 | 5 | Enquiry | |
ACK | 06 | 006 | 6 | Acknowledge | |
BEL | 07 | 007 | 7 | Bell / Alert | \a |
BS | 08 | 010 | 8 | Backspace | \b |
HT | 09 | 011 | 9 | Horizontal Tab | \t |
LF | 0a | 012 | 10 | Line Feed | \n |
VT | 0b | 013 | 11 | Vertical Tab | \v |
FF | 0c | 014 | 12 | Form Feed | \f |
CR | 0d | 015 | 13 | Carriage Return | \r |
SO | 0e | 016 | 14 | Shift Out | |
SI | 0f | 017 | 15 | Shift In | |
DLE | 10 | 020 | 16 | Data Link Escape | |
DC1 | 11 | 021 | 17 | Device Control 1 | |
DC2 | 12 | 022 | 18 | Device Control 2 | |
DC3 | 13 | 023 | 19 | Device Control 3 | |
DC4 | 14 | 024 | 20 | Device Control 4 | |
NAK | 15 | 025 | 21 | Negative Acknowledge | |
SYN | 16 | 026 | 22 | Synchronous Idle | |
ETB | 17 | 027 | 23 | End Of Transmission Block | |
CAN | 18 | 030 | 24 | Cancel | |
EM | 19 | 031 | 25 | End Of Medium | |
SUB | 1a | 032 | 26 | Substitute | |
ESC | 1b | 033 | 27 | Escape | \e |
FS | 1c | 034 | 28 | File Separator | |
GS | 1d | 035 | 29 | Group Separator | |
RS | 1e | 036 | 30 | Record Separator | |
US | 1f | 037 | 31 | Unit Separator | |
Note: The names have been title-cased to make them easier to read.
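As a quick cross-check of the table, Python string literals accept most of the C-style escapes, so their codes can be compared directly. This is a sketch only: the `C_ESCAPES` list is an illustrative name, and `\e` is left out because Python's string syntax does not define it.

```python
# (name, escaped character, expected code) for the C-style escapes
# that Python string literals also understand.
C_ESCAPES = [
    ("BEL", "\a", 0x07),
    ("BS", "\b", 0x08),
    ("HT", "\t", 0x09),
    ("LF", "\n", 0x0A),
    ("VT", "\v", 0x0B),
    ("FF", "\f", 0x0C),
    ("CR", "\r", 0x0D),
]

for name, char, code in C_ESCAPES:
    # ord() gives the numeric code of a one-character string.
    assert ord(char) == code, name
    # Same column order as the table: Char | Hex | Oct | Dec
    print(f"{name} | {code:02x} | {code:03o} | {code} |")
```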
Notation
Escaping in Scripting and Programming
Sometimes one can't use all the available characters directly in a source file, especially control characters. Encoding these as other characters is called escaping. This usually happens with text between double quotes `"`.
There are differences between languages and implementations, but the general rules are:
- Special characters preceded by a backslash `\` are taken literally (e.g. `\"` or `\\`). The exact behaviour varies a lot by language.
- Latin characters preceded by a backslash generate control characters (e.g. `\n`). These are listed in the C Esc column.
- The hexadecimal value of the character prefixed by `\x` (e.g. `\x1b`). This should work almost everywhere.
- Octal encoding prefixed by just a backslash (e.g. `\033`). Use this when using `printf` on the command line.
Note: `C Esc` here is short for C-Escape, as the C language apparently started this way of escaping characters in strings. It is now used by almost all modern languages and in other contexts, e.g. `printf`, `sed`, etc. Exact support may vary.
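The escape forms described above can be tried out in Python, which follows the C conventions closely. This is a minimal sketch with arbitrary variable names; other languages may differ in the details.

```python
# Three spellings of the same ESC character (code 27).
esc_hex = "\x1b"    # hexadecimal escape
esc_octal = "\033"  # octal escape
esc_chr = chr(27)   # built from the decimal code, no escape involved

# Escaped special characters are taken literally.
quote = "\""        # a single double-quote character
backslash = "\\"    # a single backslash character

print(esc_hex == esc_octal == esc_chr)  # True
print(len(quote), len(backslash))       # 1 1

# In a raw string the backslash is not an escape: two characters, not one.
print(len(r"\n"), len("\n"))            # 2 1
```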
Escaping in XML and HTML
XML and HTML have their own way of escaping characters: a sequence sandwiched between an ampersand `&` and a semicolon `;`.
In general, characters can be escaped using `&#<dec>;`, where `<dec>` is replaced by the decimal value associated with the character (e.g. `&#38;` to encode an ampersand `&`).
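The decimal form can be generated mechanically from the character code. The sketch below uses two hypothetical helpers, `numeric_escape` and `escape_all`, and decodes the result again with `html.unescape` from Python's standard library.

```python
import html


def numeric_escape(char: str) -> str:
    """Hypothetical helper: build the decimal form &#<dec>; for one character."""
    return f"&#{ord(char)};"


def escape_all(text: str) -> str:
    """Escape every character numerically (valid, though verbose)."""
    return "".join(numeric_escape(ch) for ch in text)


print(numeric_escape("&"))               # &#38;
print(escape_all("a<b"))                 # &#97;&#60;&#98;
print(html.unescape(escape_all("a<b")))  # a<b
```

In practice only the few characters with special meaning in markup need escaping; escaping everything is done here just to show that the scheme is uniform.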
There are also named escapes to make remembering them easier:
Character | XML-escape |
---|---|
& | `&amp;` |
< | `&lt;` |
> | `&gt;` |
" | `&quot;` |
Control and Shift
With the ASCII table, the control and shift keys can be implemented using simple addition and subtraction.
The lowercase character can be obtained by adding 32 (0x20) to the code of the corresponding uppercase character, and the control key goes 64 (0x40) in the opposite direction.
With that, ctrl+c maps to code 3, "End Of Text", and ctrl+d maps to code 4, "End Of Transmission". You may know those shortcuts from the terminal; they hopefully make a bit more sense now.
Control characters are sometimes written or printed as the character one gets when adding 64 to the control character's value, prefixed by a `^`.
This maps escape to `^[` and the null byte to `^@`. (You have probably seen those when opening a binary file in a text editor.)
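The arithmetic above fits in a few lines of Python. The function names `to_lower`, `ctrl` and `caret` are made up for this sketch, and `caret` only handles codes 0 to 31, the range covered by the control character table.

```python
def to_lower(upper: str) -> str:
    """Add 32 (0x20) to an uppercase letter to get its lowercase form."""
    return chr(ord(upper) + 0x20)


def ctrl(key: str) -> int:
    """Subtract 64 (0x40) from a character in 0x40..0x5F to get its control code."""
    return ord(key) - 0x40


def caret(code: int) -> str:
    """Caret notation for control codes 0..31: '^' plus the character 64 above."""
    if not 0 <= code < 0x20:
        raise ValueError("only control codes 0x00..0x1F are handled here")
    return "^" + chr(code + 0x40)


print(to_lower("A"))  # a
print(ctrl("C"))      # 3, End Of Text
print(ctrl("D"))      # 4, End Of Transmission
print(caret(0x03))    # ^C
print(caret(0x00))    # ^@
```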
Unicode
Creating a Unicode table and keeping it updated is out of scope here. Besides that, Wikipedia has a List of Unicode Characters and there are the official Unicode Character Code Charts. (Plus a whole lot of other Unicode tables out there.)
Notation Hint: Unicode code points (that is, the number assigned in the Unicode standard, not the binary character representation) are written `U+<hex>`, where `<hex>` is the code point number in hexadecimal. Example: `U+1F603` for 😃.
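Converting between a character and this notation is straightforward in Python. The helpers `codepoint_notation` and `from_notation` below are hypothetical names for this sketch. The output uses uppercase hex digits padded to at least four places, which is the usual convention, while the parser accepts either letter case.

```python
def codepoint_notation(char: str) -> str:
    """Hypothetical helper: format a character's code point as U+<hex>."""
    return f"U+{ord(char):04X}"


def from_notation(text: str) -> str:
    """Parse U+<hex> (either letter case) back into the character."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"not in U+<hex> form: {text!r}")
    return chr(int(text[2:], 16))


print(codepoint_notation("A"))   # U+0041
print(codepoint_notation("😃"))  # U+1F603
print(from_notation("U+1f603"))  # 😃

# The code point is a number; the encoded bytes are something else.
print(len("😃".encode("utf-8")))  # 4
```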
There is also a little command-line tool called `uni` that is pretty good at providing a searchable Unicode table.