Gevel contrib module provides several functions useful for analyzing
   GIN and GiST indexes.

   Online version of this document is available
   [23]http://www.sai.msu.su/~megera/wiki/Gevel

   Caution: This module was designed for developers of GiST/GIN based
   indices !

Authors

     * Oleg Bartunov <oleg@sai.msu.su>, Moscow, Moscow University, Russia
     * Teodor Sigaev <teodor@sigaev.ru>, Moscow, Moscow University,Russia

License

   Stable version, included into PostgreSQL distribution, released under
   BSD license. Development version, available from this site, released
   under the GNU General Public License, version 2 (June 1991)

Downloads

   Old version of gevel available from
   [24]http://www.sai.msu.su/~megera/postgres/gist/gevel/

   Development version is available from CVS:
cvs -d :pserver:anoncvs@sigaev.ru:/root login
(just enter an empty password)
cvs -d :pserver:anoncvs@sigaev.ru:/root co gevel

Installation

% cd PGSQLSRC/contrib
% tar xzvf gevel.tar.gz
% make
% make install
% make installcheck
% psql regression < gevel.sql

Examples

   All examples below require installing contrib/rtree_gist module and
   already installed gevel module.
cd contrib/rtree_gist
make
make install
make installcheck
psql regression < gevel.sql

   Now you have regression database with two tables and indices:
regression=# \pset tuples_only 1
Showing only tuples.
regression=#  \d
          List of relations
 Schema |  Name   | Type  |  Owner
--------+---------+-------+----------
 public | boxtmp  | table | postgres
 public | polytmp | table | postgres
(2 rows)
regression=# \di
             List of relations
 Schema | Name | Type  |  Owner   |  Table
--------+------+-------+----------+---------
 public | bix  | index | postgres | boxtmp
 public | pix  | index | postgres | polytmp
(2 rows)


     * gist_stat(INDEXNAME) - show some statistics about GiST tree

regression=# select gist_stat('pix');
 Number of levels:          2
Number of pages:           30
Number of leaf pages:      29
Number of tuples:          3129
Number of leaf tuples:     3100
Total size of tuples:      137676 bytes
Total size of leaf tuples: 136400 bytes
Total size of index:       245760 bytes

     * gist_tree(INDEXNAME,MAXLEVEL) - show GiST tree up to MAXLEVEL

regression=# select gist_tree('pix',0);
 0(l:0) blk: 0 numTuple: 29 free: 6888b(15.63%)

   Designation (from left to right):

     * 0 - page number
     * (l:0) - tree level
     * blk: 0 - block number
     * numTuple: 29 - the number of tuples
     * free: 6888b - free space in bytes
     * (15.63%) - occupied space in percents

     gist_tree(INDEXNAME) - show full GiST tree
regression=# select gist_tree('pix');
 0(l:0) blk: 0 numTuple: 29 free: 6888b(15.63%)
    1(l:1) blk: 13 numTuple: 180 free: 244b(97.01%)
    2(l:1) blk: 11 numTuple: 175 free: 464b(94.32%)
    3(l:1) blk: 2 numTuple: 101 free: 3720b(54.43%)
    4(l:1) blk: 17 numTuple: 111 free: 3280b(59.82%)
    5(l:1) blk: 18 numTuple: 101 free: 3720b(54.43%)
    6(l:1) blk: 10 numTuple: 98 free: 3852b(52.82%)
    7(l:1) blk: 19 numTuple: 111 free: 3280b(59.82%)
    8(l:1) blk: 9 numTuple: 97 free: 3896b(52.28%)
    9(l:1) blk: 20 numTuple: 104 free: 3588b(56.05%)
    10(l:1) blk: 14 numTuple: 96 free: 3940b(51.74%)
    11(l:1) blk: 21 numTuple: 106 free: 3500b(57.13%)
    12(l:1) blk: 7 numTuple: 103 free: 3632b(55.51%)
    13(l:1) blk: 1 numTuple: 101 free: 3720b(54.43%)
    14(l:1) blk: 16 numTuple: 97 free: 3896b(52.28%)
    15(l:1) blk: 24 numTuple: 103 free: 3632b(55.51%)
    16(l:1) blk: 4 numTuple: 98 free: 3852b(52.82%)
    17(l:1) blk: 25 numTuple: 98 free: 3852b(52.82%)
    18(l:1) blk: 3 numTuple: 97 free: 3896b(52.28%)
    19(l:1) blk: 26 numTuple: 96 free: 3940b(51.74%)
    20(l:1) blk: 6 numTuple: 103 free: 3632b(55.51%)
    21(l:1) blk: 8 numTuple: 162 free: 1036b(87.31%)
    22(l:1) blk: 23 numTuple: 94 free: 4028b(50.66%)
    23(l:1) blk: 12 numTuple: 82 free: 4556b(44.19%)
    24(l:1) blk: 27 numTuple: 105 free: 3544b(56.59%)
    25(l:1) blk: 5 numTuple: 90 free: 4204b(48.51%)
    26(l:1) blk: 28 numTuple: 100 free: 3764b(53.90%)
    27(l:1) blk: 22 numTuple: 101 free: 3720b(54.43%)
    28(l:1) blk: 15 numTuple: 95 free: 3984b(51.20%)
    29(l:1) blk: 29 numTuple: 95 free: 3984b(51.20%)

     * gist_print(INDEXNAME) - prints objects stored in GiST tree, works
       only if objects in index have textual representation (type_out
       functions should be implemented for given object type). It's known
       to work with R-tree GiST based index (contrib/rtree_gist). Note, in
       example below, objects are of type box.

# \di
             List of relations
 Schema | Name | Type  |  Owner   |  Table
--------+------+-------+----------+---------
 public | bix  | index | postgres | boxtmp
 public | pix  | index | postgres | polytmp
(2 rows)
# \d pix
Index "public.pix"
 Column | Type
--------+------
 p      | box
gist, for table "public.polytmp"

# select * from gist_print('pix') as t(level int, a box) where level =1;

   for 8.1 and later this query should looks like
# select * from gist_print('pix') as t(level int, valid bool, a box) where level =1;

 level | valid |              a
-------+-------+-----------------------------
     1 | t     | (37357,50073),(34242,357)
     1 | t     | (43499,49770),(40358,43)
     1 | t     | (31193,24679),(25047,12410)
     1 | t     | (31018,12142),(25083,6)
     1 | t     | (49944,25174),(43471,12802)
     1 | t     | (12577,49757),(6302,37534)
     1 | t     | (12528,37333),(6171,24861)
     1 | t     | (50027,49751),(46817,25462)
     1 | t     | (46870,49912),(43664,25722)
     1 | t     | (24855,25574),(12447,19263)
     1 | t     | (25054,19126),(12403,12796)
     1 | t     | (32737,49923),(31178,1038)
     1 | t     | (3184,24465),(15,81)
     1 | t     | (24951,49983),(12740,44000)
     1 | t     | (24919,43956),(12617,37901)
     1 | t     | (40387,49852),(37338,25217)
     1 | t     | (40325,24963),(37375,491)
     1 | t     | (24919,12698),(12654,6518)
     1 | t     | (25002,6338),(12350,51)
     1 | t     | (49985,12554),(43447,222)
     1 | t     | (25003,37769),(12552,25573)
     1 | t     | (34270,49382),(32763,594)
     1 | t     | (6205,50012),(3,37527)
     1 | t     | (6163,37358),(120,25034)
     1 | t     | (12343,24542),(9295,294)
     1 | t     | (9308,24151),(6234,620)
     1 | t     | (6230,24629),(3169,108)
     1 | t     | (31179,50040),(28113,25556)
     1 | t     | (28048,49694),(25000,25000)
(29 rows)


     * Using Gevel module it's possible to visualize index tree, see for
       example [25]Rtree Index.
     * gin_stat(INDEXNAME) prints statistics about each index entry in GIN
       index (PostgreSQL 8.2+)

   For each index entry 'key' it returns set of records (key, nrows),
   where 'nrows' is the estimated number of rows from tables, returned by
   select. For example, in text search 'nrows' is the number of documents,
   which contains word 'key'. This statistics may be used ONLY as
   estimation of true counts, since it doesn't takes into account the
   visibility information (from heap) and uses approximate counts in case
   of long posting lists.

   Vacuum table after table modification to obtain more accurate
   statistics.

   Top-5 most frequent words (from 463873) :
=# SELECT * FROM gin_stat('gin_idx') as t(word text, ndoc int)  order by ndoc desc limit 5;
  word  |  ndoc
--------+--------
 page   | 340858
 figur  | 240366
 use    | 148022
 model  | 134442
 result | 129010
(5 rows)

Time: 520.714 ms

   Compare exact stats, obtained using ts_stat() function, with estimated
   counts:
=# select a.word, b.ndoc as exact, a.estimation as estimation,
round ( (a.estimation-b.ndoc)*100.0/a.estimation,2)||'%' as error from
(SELECT * FROM gin_stat('gin_x_idx') as t(word text, estimation int)
 order by estimation desc limit 5 ) as a, stat b where a.word = b.word;
  word  | exact  | estimation | error
--------+--------+------------+-------
 page   | 340430 |     340858 | 0.13%
 figur  | 240104 |     240366 | 0.11%
 use    | 147132 |     148022 | 0.60%
 model  | 133444 |     134442 | 0.74%
 result | 128977 |     129010 | 0.03%
(5 rows)

Time: 550.562 ms

   Example for integer arrays:
# SELECT * FROM gin_stat('gin_idx') as t(value int, nrow int) where nrow > 250;
  value | nrow
 -------+------
     31 |  254
     47 |  251
     52 |  257
     59 |  259
 (4 rows)