[Obscure Rule] applies for this entire section.

In the last chapter, we observed that a computer character set has two parts: a character repertoire and an agreement on how the repertoire's characters will be encoded as numbers. SQL has a similar rule: an SQL Character set is a combination of two things:

  1. A character repertoire: the set of characters that belong to the Character set.
  2. A Form-of-use: the repertoire's encoding scheme -- the one-to-one mapping scheme between each character in the repertoire and a set of internal codes (usually 8-bit values) that define how the repertoire's characters are encoded as numbers. (These codes are also used to specify the order of the characters within the repertoire.)

All SQL character strings belong to some Character set. Whenever you're working with an SQL character string, you may either specify the Character set it belongs to, or allow it to belong to a default Character set chosen by your DBMS. (To simplify matters, we recommend that you always follow the latter course. This will ensure that you get standard results across SQL-sessions.)

To explicitly specify a Character set for a character string, add a CHARACTER SET clause to a <data type> specification and/or "_<Character set name>" to a <literal>, as shown in the appropriate syntax diagrams in this chapter. Your current <AuthorizationID> must have the USAGE Privilege for the Character set named.
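
As a minimal sketch of both forms, the statements below attach a Character set to a Column and to a <literal>. The names Table_1, column_1, column_2 and some_user are hypothetical, and we assume your DBMS makes a Character set named LATIN1 available; many DBMSs support only part of this syntax.

```sql
-- Assumption: a Character set named LATIN1 exists and the
-- hypothetical <AuthorizationID> some_user may be granted USAGE on it.
GRANT USAGE ON CHARACTER SET LATIN1 TO some_user;

-- CHARACTER SET clause added to a <data type> specification.
CREATE TABLE Table_1 (
   column_1 CHARACTER(10) CHARACTER SET LATIN1,
   column_2 CHARACTER VARYING(20)   -- no clause: a default Character set applies
);

-- "_<Character set name>" added in front of a <literal>.
INSERT INTO Table_1 (column_1) VALUES (_LATIN1 'hello');
```

Here column_2 has no CHARACTER SET clause, so its Character set is chosen by the default rules rather than by the statement itself.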

If you choose not to specify a Character set for a character string, the current default Character set is implicit. Your DBMS will choose the current default Character set using these rules:

  1. A character string <data type> specification (in CREATE SCHEMA, CREATE TABLE, CREATE DOMAIN, ALTER TABLE and ALTER DOMAIN) that doesn't include an explicit CHARACTER SET clause is treated as if the default Character set of the Schema it's defined in were explicitly named.
    [NON-PORTABLE] In any operation other than defining a Domain, defining a Column or defining a Field (e.g.: in a CAST operation), a character string <data type> specification that doesn't include a CHARACTER SET clause will be treated as if it belongs to a Character set that is non-standard because the SQL Standard requires implementors to define what the operation's default Character set is. [OCELOT Implementation] The OCELOT DBMS that comes with this book uses ISO8BIT -- the DBMS's initial default Character set -- as the default Character set for such operations.
  2. Any other character string value that doesn't include an explicit Character set specification must either consist only of <SQL language character>s, or have its Character set default to one of the following: (a) the default Character set of the Schema, if the value is found in a CREATE SCHEMA statement, (b) the default Character set of the SQL-session, if it's found in a dynamic SQL statement, or (c) the default Character set of the Module you're running, if it's found in any other SQL statement in a Module.
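
The rules above can be illustrated with a short sketch. Schema_1, Table_1 and the Column names are hypothetical, and we assume the Character sets LATIN1 and ISO8BIT are both available to you; treat this as an illustration of the defaults, not as a statement of what any particular DBMS does.

```sql
-- Rule 1: the Schema's default Character set fills in for a
-- missing CHARACTER SET clause in a Column definition.
CREATE SCHEMA Schema_1 DEFAULT CHARACTER SET LATIN1
   CREATE TABLE Table_1 (
      column_1 CHAR(10),                        -- treated as CHARACTER SET LATIN1
      column_2 CHAR(10) CHARACTER SET ISO8BIT   -- explicit clause overrides the default
   );

-- [NON-PORTABLE] case: the <data type> in a CAST has no CHARACTER SET
-- clause, so the Character set of the result is implementor-defined.
SELECT CAST(column_1 AS CHAR(20)) FROM Schema_1.Table_1;

-- Rule 2: a <literal> with no "_<Character set name>" in front of it.
-- It consists only of <SQL language character>s here; otherwise its
-- Character set would come from the Schema, SQL-session or Module default.
SELECT column_1 FROM Schema_1.Table_1 WHERE column_1 = 'hello';
```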

Every Character set has at least one Collation: its default Collation. You may define additional Collations for any Character set.
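
As a hedged sketch of defining an additional Collation, the statements below use the Standard's CREATE COLLATION form. The names Collation_1, Table_2 and column_1 are hypothetical, LATIN1 is assumed to be available, and support for this statement varies widely between DBMSs.

```sql
-- Define a second Collation for LATIN1, based on the Character set's
-- default ordering but without blank-padding during comparisons.
CREATE COLLATION Collation_1 FOR LATIN1 FROM DEFAULT NO PAD;

-- Use the new Collation for one Column; other LATIN1 strings
-- keep the Character set's default Collation.
CREATE TABLE Table_2 (
   column_1 CHAR(10) CHARACTER SET LATIN1 COLLATE Collation_1
);
```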

If you want to restrict your code to Core SQL, don't explicitly define the Character set that any character string belongs to -- always allow it to belong to the default Character set.
