The key phrase:
The nature of some locale categories is that their value has to be fixed for the lifetime of a database cluster. That is, once initdb has run, you cannot change them anymore. LC_COLLATE and LC_CTYPE are those categories. They affect the sort order of indexes, so they must be kept fixed, or indexes on text columns will become corrupt. PostgreSQL enforces this by recording the values of LC_COLLATE and LC_CTYPE that are seen by initdb. The server automatically adopts those two values when it is started.
Test bed:
Linux, Slackware 8.1, libc 2.2.5, postgresql 7.3.3, perl 5.6.1
Steps to success:
localedef -i ru_RU -f UTF-8 ru_RU.UTF8
export LC_CTYPE=ru_RU.utf8
export LC_COLLATE=ru_RU.utf8
perl testlocale-utf8.pl|iconv -f utf8 -t koi8-r
Output should be sorted Cyrillic letters.
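The testlocale-utf8.pl script itself is not reproduced here. As a rough stand-in, a similar check can be sketched with sort(1), which also uses LC_COLLATE. The function name sort_in_locale and the sample letters are made up for illustration, and this is not the author's script:

```shell
# Hypothetical helper, not the original testlocale-utf8.pl.
# Sorts lines from stdin using the collation rules of the locale given in $1.
sort_in_locale() {
    LC_ALL="$1" sort
}

# Example use (assumes the ru_RU.utf8 locale was built with localedef as above):
#   printf '%s\n' я б а ж | sort_in_locale ru_RU.utf8 | iconv -f utf8 -t koi8-r
# Running the same input through "sort_in_locale C" gives plain byte order,
# which makes it easy to see whether the Russian locale is really being used.
```

If the named locale is not installed, sort will most likely fall back to byte order without complaint, so comparing against the C locale is a useful sanity check.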
export LC_CTYPE=ru_RU.utf8
export LC_COLLATE=ru_RU.utf8
initdb -E UTF8 --pgdata=/db1/pgdata.utf8
pg_ctl -D /db1/pgdata.utf8 start
createdb -E UTF8 utf8
psql -l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 template0 | postgres | UNICODE
 template1 | postgres | UNICODE
 utf8      | megera   | UNICODE
(3 rows)
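The listing above shows the encoding but not the locale. To see which LC_COLLATE and LC_CTYPE values initdb recorded for the cluster, something like the following sketch may help. It assumes that pg_controldata is on the PATH and that, in this PostgreSQL version, its output contains lines starting with "LC_COLLATE:" and "LC_CTYPE:". The function name cluster_locale_lines is hypothetical:

```shell
# Hypothetical helper: keep only the locale lines from pg_controldata output.
# Assumption: the control data is printed one "Name: value" pair per line and
# includes LC_COLLATE and LC_CTYPE entries.
cluster_locale_lines() {
    grep -E '^LC_(COLLATE|CTYPE):'
}

# Example use with the cluster created above:
#   pg_controldata /db1/pgdata.utf8 | cluster_locale_lines
```

Both values would be expected to read ru_RU.utf8 if the exports were in effect when initdb ran.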
psql utf8
utf8=# create table tt (a text);
CREATE TABLE
utf8=# \copy tt from './cyrillic.utf8'
\.
utf8=# \o out.utf8
utf8=# select * from tt order by a asc;
utf8=# \q
zen:~/app/locale$ iconv -f utf8 -t koi8-r out.utf8
Output should be sorted Cyrillic letters!
PostgreSQL works well with Cyrillic and UTF8.
Bad news:
I discovered that the upper() and lower() functions don't work in my setup. Read http://fts.postgresql.org/db/msg.html?mid=1070198 for details.