The string datatype is a sequence of one or more Unicode characters. It MUST be encoded as UTF-8.

The empty string SHOULD NOT be accepted as a valid string, to avoid ambiguity when serialising to or deserialising from less expressive formats such as CSV.
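
As an illustration of that ambiguity, the following minimal sketch uses Python's standard `csv` module to round-trip a row containing an empty string and a row containing a missing value. The helper name `roundtrip` and the sample rows are hypothetical and only serve the example.

```python
import csv
import io


def roundtrip(row):
    """Write one row as CSV text, then read it back (hypothetical helper)."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    buf.seek(0)
    return next(csv.reader(buf))


# One row carries an empty string, the other a missing value (None).
with_empty = roundtrip(["id-1", ""])
with_missing = roundtrip(["id-1", None])

# After the round trip the two rows can no longer be told apart.
print(with_empty == with_missing)
```

Both rows come back identical, so a reader of the CSV cannot tell whether the original value was an empty string or was absent.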

Implementations are also advised to reject any string that contains no printable characters.

Finally, a string MUST NOT begin with a Byte Order Mark (BOM, U+FEFF).
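
The rules above could be checked along the following lines. This is a minimal sketch rather than a reference implementation: the function name `validate_string` is hypothetical, and "printable" is assumed here to mean a character for which Python's `str.isprintable()` is true and which is not whitespace. The text above does not define the term, so other definitions are equally defensible.

```python
BOM = "\ufeff"


def is_printable(ch: str) -> bool:
    # Assumption: whitespace does not count as printable for this rule.
    return ch.isprintable() and not ch.isspace()


def validate_string(raw: bytes) -> str:
    """Decode raw bytes and apply the string rules; raise ValueError on failure."""
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("string is not valid UTF-8") from exc

    # The plain "utf-8" codec keeps a leading BOM, so it can be detected here.
    if text.startswith(BOM):
        raise ValueError("string must not begin with a Byte Order Mark")

    if text == "":
        raise ValueError("empty string is not allowed")

    if not any(is_printable(ch) for ch in text):
        raise ValueError("string has no printable characters")

    return text
```

For example, `validate_string("café".encode("utf-8"))` returns the decoded text, while `validate_string(b"\xef\xbb\xbfabc")` raises `ValueError` because the input starts with a UTF-8 encoded BOM.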

© Crown copyright released under the Open Government Licence.