Try PostgreSQL's citext

postgresql
Posted on: 2024-11-01

The PostgreSQL citext type (docs here) is a good choice for lots of user-facing text.

For example, if you want to sort names alphabetically, you probably want Banana to come after apple, even though upper case Latin characters have smaller codepoint values than lower case ones (see the decimal set in man ascii).

And if you have a unique index, you probably don't want to allow both foo@example.com and Foo@example.com.

citext does the right thing in both cases: it store the text with the case given, but for comparisons it ignores case.

(Note that citext is an extension; you have to CREATE EXTENSION IF NOT EXISTS citext in order to use it.)

Ordering

WITH test_data(name) AS (
    VALUES
        ('Banana'),
        ('APE'),
        ('apple')
)
SELECT name FROM test_data ORDER BY name ASC;

  name
--------
 APE
 Banana
 apple
(3 rows)

vs

WITH test_data(name) AS (
    VALUES
        ('Banana'::citext),
        ('APE'::citext),
        ('apple'::citext)
)
SELECT name FROM test_data ORDER BY name ASC;

  name
--------
 APE
 apple
 Banana
(3 rows)

Uniqueness

WITH test_data(name) AS (
    VALUES
        ('Apple'),
        ('apple'),
        ('APPLE')
)
SELECT DISTINCT name FROM test_data;

 name
-------
 APPLE
 apple
 Apple
(3 rows)

vs

WITH test_data(name) AS (
    VALUES
        ('Apple'::citext),
        ('apple'::citext),
        ('APPLE'::citext)
)
SELECT DISTINCT name FROM test_data;

 name
-------
 Apple
(1 row)