PostgreSQL case and accent insentitive SQL queries
A few days ago we were implementing a Spring Boot application which stored its data into a PostgreSQL database. As a requirement we needed to search into the database with LIKE operator in our SQL
queries and at the same time perform case and accent insensitive searches. Since we are Greek, this is pretty typical for us because our letters have diacritics (“tonos”). So a query like the following
select *
from blablatable
where name like '%χρη%' ;
should return results for:
- Χρήστος
- ΧΡΗΣΤΟΣ
- χΡήστος
The aforementioned SQL query works perfectly normal in a MySQL or MariaDB database with collation utf8_general_ci
. However in PostgreSQL this is not the case. So after some searching and reading StackOverflow #11005036, we figured out how to make case and accent insensitive search SQL queries.
Fire up a PostgreSQL Docker container
We chose to use PostgreSQL inside a Docker container, so we fire up one:
docker run --name postgres \
-p 5432:5432 \
-e POSTGRES_PASSWORD=strongpassword \
-e POSTGRES_USER=stronguser \
-e POSTGRES_DB=airplanes \
-d postgres:13.0
Connect to PostgreSQL using a Docker container
We used a temporary container to connect to our newly created database. You can skip this step and use another client such as pgAdmin or DBeaver.
docker run -it --rm postgres:13.0 psql -h host.docker.internal -d airplanes --user stronguser
Password for user stronguser:
<enter thy password>
Mind the host.docker.internal
. This host name is used only in Docker for Windows to mark the host computer where Docker runs.
Create an extension in PostgreSQL
In order to use case and accent insensitive search SQL queries, we need to use a PostgreSQL extension. So, execute in your favourite PostgreSQL client:
CREATE EXTENSION unaccent;
Create a table and insert some data
Then we need to create a simple table in the default schema public
, which will have some text data with accents:
create sequence public.BOBOS_SEQUENCE start 1 increment 1;
create table public.bobos (
ID int8 not null DEFAULT nextval('BOBOS_SEQUENCE'),
name varchar(255),
primary key (ID)
);
After that we populate it with some test data:
INSERT INTO public.bobos (name) VALUES('Τάκης, the best handmade gyros of Triandria');
INSERT INTO public.bobos (name) VALUES('Χρήστος');
INSERT INTO public.bobos (name) VALUES('ΧΡΗΣΤΟΣ');
INSERT INTO public.bobos (name) VALUES('χΡήστος');
SQL case and accent insensitive queries
Finally in order to perform case and accent insensitive SQL queries , execute the following statement:
select *
from public.bobos
where unaccent(LOWER(NAME)) like unaccent(LOWER('%χρη%')) ;
The results will be like the following:
id|name |
--|-------|
2|Χρήστος|
3|ΧΡΗΣΤΟΣ|
4|χΡήστος|
As you might have noticed, we need to use both unaccent()
extension and LOWER()
function in order to meet the desired result.
I hope you found this article useful and you do not waste any more time searching in the web. Have a nice day and love thy neighbour!
Comments