A Charset is a mapping between sequences of 16-bit Unicode code units and sequences of bytes. Charsets are used when encoding characters into bytes and decoding bytes into characters.
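For example, a Charset instance can perform both directions of that mapping; a minimal sketch using the JDK's StandardCharsets.UTF_8 constant:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
  public static void main(String[] args) {
    Charset utf8 = StandardCharsets.UTF_8;
    // Encode characters into bytes...
    ByteBuffer bytes = utf8.encode("héllo");
    // ...and decode those bytes back into characters.
    CharBuffer chars = utf8.decode(bytes);
    System.out.println(chars); // héllo
  }
}
```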
Using APIs that rely on the JVM’s default Charset under the hood is dangerous. The default charset can vary from machine to machine or from JVM to JVM, which can lead to unstable character encoding/decoding between runs of your program, even for ASCII characters (e.g. A is 0100 0001 in UTF-8, but 0000 0000 0100 0001 in UTF-16).
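A small sketch of that difference (UTF-16BE is used here so the output matches the big-endian bytes above; results shown as comments):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetBytes {
  public static void main(String[] args) {
    // "A" is one byte in UTF-8 but two bytes in UTF-16BE.
    System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_8)));    // [65]
    System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16BE))); // [0, 65]
  }
}
```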
If you need stable encoding/decoding, you must specify an explicit charset. The StandardCharsets class provides these constants for you. When in doubt, use UTF-8.
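For instance, a call that implicitly uses the default charset can be rewritten to pass one of those constants explicitly (illustrative method names only):

```java
import java.nio.charset.StandardCharsets;

class Example {
  byte[] flagged(String s) {
    return s.getBytes(); // uses the JVM's default charset
  }

  byte[] fixed(String s) {
    return s.getBytes(StandardCharsets.UTF_8); // stable across machines and JVMs
  }
}
```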
Suppress false positives by adding the suppression annotation @SuppressWarnings("DefaultCharset") to the enclosing element.
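For example (hypothetical method, shown only to illustrate where the annotation goes):

```java
@SuppressWarnings("DefaultCharset") // intentional: this code must honor the platform default
byte[] platformDefaultBytes(String s) {
  return s.getBytes();
}
```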