The latest version of the specification is v2.

String

A string is a sequence of one or more Unicode characters. It MUST be encoded as UTF-8.

The empty string SHOULD NOT be accepted as a valid string. Less expressive formats such as CSV cannot reliably distinguish an empty string from a missing value, so accepting it makes serialisation and deserialisation ambiguous.
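
The CSV ambiguity can be illustrated with a minimal sketch. It uses Python's standard csv module as one example of a less expressive format; the helper name to_csv_line and the sample value "id-1" are hypothetical and not part of this specification.

```python
import csv
import io


def to_csv_line(row):
    """Serialise one row to a CSV line using the default csv dialect."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return buf.getvalue()


# One row carries an empty string, the other a missing value (None).
with_empty = to_csv_line(["id-1", ""])
with_missing = to_csv_line(["id-1", None])

# Both rows serialise to the same text, so the distinction is lost.
same_output = with_empty == with_missing

# Reading the line back yields an empty string in both cases.
round_trip = next(csv.reader(io.StringIO(with_missing)))
```

Here same_output is true and round_trip holds an empty string in the second field, so a reader of the CSV cannot tell whether the original value was empty or absent.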

Implementations are also advised to reject any string that contains no printable characters.

Finally, a string MUST NOT include a byte order mark (BOM).
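
The rules above could be checked along the following lines. This is a sketch under stated assumptions, not a reference implementation: the function name validate_string is hypothetical, the byte order mark is looked for only at the start of the value, "printable" is interpreted as Python's str.isprintable() excluding whitespace, and the SHOULD NOT and advisory rules are treated as hard rejections.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark code point


def validate_string(raw: bytes) -> str:
    """Decode raw bytes and apply the string rules.

    Returns the decoded text, or raises ValueError if a rule is violated.
    """
    try:
        # The strict "utf-8" codec rejects invalid byte sequences and,
        # unlike "utf-8-sig", keeps a leading BOM so it can be detected.
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("string is not valid UTF-8") from exc

    if text == "":
        raise ValueError("empty string is not allowed")

    # Assumption: only a leading U+FEFF is treated as a byte order mark.
    if text.startswith(BOM):
        raise ValueError("string must not have a byte order mark")

    # Assumption: whitespace does not count as a printable character.
    if not any(ch.isprintable() and not ch.isspace() for ch in text):
        raise ValueError("string has no printable characters")

    return text
```

For example, validate_string("héllo".encode("utf-8")) returns the decoded text, while an empty value, a value starting with the bytes EF BB BF, or a value made only of tabs and newlines raises ValueError. An implementation that wants to honour the softer SHOULD NOT wording could downgrade the empty-string and printable-character checks to warnings.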

© Crown copyright released under the Open Government Licence.