String.split(String) and Pattern.split(CharSequence) have surprising
behaviour. For example, consider the following puzzler from
https://konigsberg.blogspot.com/2009/11/final-thoughts-java-puzzler-splitting.html:
String[] nothing = "".split(":");
String[] bunchOfNothing = ":".split(":");
The results are [""] and [], respectively!
More examples:
| input | input.split(":") | Pattern.compile(":").split(input) | Splitter.on(':').split(input) |
|---|---|---|---|
| "" | [""] | [""] | [""] |
| ":" | [] | [] | ["", ""] |
| ":::" | [] | [] | ["", "", "", ""] |
| "a:::" | ["a"] | ["a"] | ["a", "", "", ""] |
| ":::b" | ["", "", "", "b"] | ["", "", "", "b"] | ["", "", "", "b"] |

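
The two JDK columns above can be reproduced without any extra dependency. The following is a minimal sketch, not part of the original check; the class name TrailingEmptyDemo and its helper methods are hypothetical. It relies on the documented fact that the one-argument split behaves like a split with a limit of 0, which discards trailing empty strings but keeps leading ones. Array lengths are printed because Arrays.toString renders both [""] and [] as "[]".

```java
import java.util.regex.Pattern;

// Hypothetical demo class: reproduces the String.split and Pattern.split columns.
final class TrailingEmptyDemo {
  private static final Pattern COLON = Pattern.compile(":");

  // One-argument String.split: trailing empty strings are removed.
  static String[] viaString(String input) {
    return input.split(":");
  }

  // One-argument Pattern.split: same treatment of trailing empty strings.
  static String[] viaPattern(String input) {
    return COLON.split(input);
  }

  public static void main(String[] args) {
    for (String input : new String[] {"", ":", ":::", "a:::", ":::b"}) {
      System.out.println(
          "\"" + input + "\": String.split length = " + viaString(input).length
              + ", Pattern.split length = " + viaPattern(input).length);
    }
  }
}
```
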
Prefer either:

* Guava’s Splitter, which has less surprising behaviour and provides explicit control over the handling of empty strings and the trimming of whitespace with trimResults and omitEmptyStrings.
* String.split(String, int) or Pattern.split(CharSequence, int) with an explicit ‘limit’ of -1, which keeps trailing empty strings and so matches the behaviour of Splitter.
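
As an illustration of the second option, here is a small sketch using only the JDK. The class name KeepTrailingDemo and its method names are hypothetical. A negative limit tells split to apply the pattern as many times as possible and to keep trailing empty strings.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: splitting with an explicit limit of -1.
final class KeepTrailingDemo {
  private static final Pattern COLON = Pattern.compile(":");

  // String.split(String, int) with limit -1 keeps trailing empty strings.
  static String[] splitKeepingAll(String input) {
    return input.split(":", -1);
  }

  // Pattern.split(CharSequence, int) with limit -1 behaves the same way.
  static String[] splitKeepingAllWithPattern(String input) {
    return COLON.split(input, -1);
  }

  public static void main(String[] args) {
    for (String input : new String[] {"", ":", ":::", "a:::", ":::b"}) {
      System.out.println(
          "\"" + input + "\": length with limit -1 = " + splitKeepingAll(input).length);
    }
  }
}
```

With this limit, ":" yields two empty strings and "a:::" yields four elements, as in the Splitter column of the table.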
TIP: if you use Splitter, consider extracting the instance to a static
final field.
Suppress false positives by adding the suppression annotation @SuppressWarnings("StringSplitter") to the enclosing element.
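
A sketch of such a suppression, assuming a caller that deliberately relies on trailing empty fields being dropped. The class LegacyParser and the method fields are hypothetical names used only for illustration.

```java
// Hypothetical class: the one-argument split is intentional here.
final class LegacyParser {
  // Suppressed on the enclosing method: dropping trailing empty fields is the
  // behaviour this (assumed) legacy format expects.
  @SuppressWarnings("StringSplitter")
  static String[] fields(String line) {
    return line.split(":");
  }
}
```

Placing the annotation on the smallest enclosing element, here a single method, keeps the check active for the rest of the class.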