GiST internals
GiST internal interface
GiST interface to database is conformed to PostgreSQL index interface agreements. Source files are located under src/backend/access/gist and header files - under src/include/access in PostgreSQL source tree. Basic algorithms implemented in GiST are described here.
- gistbuild
- Creates index.
Parameters: table and index descriptors - gistinsert
- Insert new value to the index.
Parameters: index descriptor, value, pointer to table record - gistbulkdelete
- Dead tuples cleanup.
Parameters: index descriptor. Tree-walking and calling callback function to recognize does tuple is dead or live. - gistvacuumcleanup
- remove empty pages and restoring broken (after crash) tuples. Also rebuilds internal keys if needed (lock index).
Parameters: index descriptor. - gistbeginscan, gistendscan, gistrescan
- Base methods for search in index - create scan, finish scan, repeat scan.
Parameters:
- gistbeginscan - index descriptor, search keys (may be ommitted)
- gistendscan, gistrescan - scan
- gistmarkpos
- Save scan state.
Parameters: scan - gistrestrpos
- Restore scan state.
Parameters: scan - gistgettuple, gistgetmulti
- Returns one or several pointers to records in table, according search key, specified in gistbeginscan.
Parameters: scan - gist_redo
- Restoring of one WAL record.
Parameters: LSN (LogSequenceNumber), record of WAL - gist_desc
- Describes WAL record (debug only)
- gist_xlog_startup
- Prepare to recover using WAL
- gist_xlog_cleanup
- Finalize recovering.
NOTES:
- Functions gistmarkpos, gistrestrpos and gistrescan are required by PostgreSQL optimizer and executor and may be not needed for other database engine.
- GiST has no direct access to tables, only via pointers ( the number of page and the number of record on the page) to records. So, callback function in gistbulkdelete uses pointer to record and this function should recognize if pointer is still valid (points to the record) or not.
GiST requirements to memory manager
Memory manager should have:
- malloc/realloc/free functions
- create,switch, clean and destroy memory context, used by alloc. functions. At the same time there are could ve several memory contexts.
GiST requirements to lock manager
- share/exclusive locks at the page level. Deadlock detector NOT needed.
- share/exclusive locks at the index level. Deadlock detector NOT needed.
GiST requirements to page manager
- Ability to easy find page by its number/index
- When creating new pages it's very desirable to use pages marked as free by gistvacuumcleanup
GiST requirements to WAL
- LSN (Log Sequence Number) - monotonically increased (on the whole period of database life) sequence number (PostgreSQL uses 64-bit integer), that identifies log record.
- Function which writes WAL record and returns its LSN
NOTE: It's possible to implement GiST concurrency without WAL. In that case we need a sequencer which generates monotonically increased integers.
What we need to know
- Agreements on passing keys (by value, by pointers, length, memory usage,…)
- It'd be nice to know specification (names, descriptions) of WAL, lock and page manager interfaces.
- Specifications of functions (if any ) for converting value → tuple, tuple → page.