Posts

Showing posts with the label full-text search

Expression Index

Image
Introduction Expression Index is an extremely powerful feature of PostgreSQL, which can be used for any data type (such as text, numbers, time, arrays and more) as long as the result of the expression returns a Deterministic value You can use Expression Index for all index types ( B-Tree , Hash , GIN , GiST and more) as long as the expression you write in the CREATE INDEX command returns a data type that the Index type supports for processing The nature of Expression Index is that instead of indexing on the original column value, Postgres precomputes the result of the function or expression as soon as you INSERT/UPDATE and saves the result directly into the Index file Characteristics After creating an Expression Index, when using a query, you must write it exactly like the expression in the Index, otherwise the index will not work Every time a row of data is modified, Postgres must run that function or expression to get the result to create the Index. Therefore, avoid using function...

GiST Index

Image
Introduction GiST (Generalized Search Tree) is also a high-order balanced tree, which has a hierarchical structure similar to the traditional B-Tree structure. However, the core difference lies in the storage content inside nodes including Root Node and Internal Nodes: each entry in a branch node contains two pieces of information: Predicate: This is the data area created by Postgres from a general level (at the root node) to a detailed level (at the internal node) Nodes at higher levels will contain a general description for all child nodes below them With such a structure, it helps to eliminate extremely quickly data areas that definitely do not satisfy the condition, instead of having to check each row one by one. TID points to lower-level child nodes. Leaf Nodes contain information Specific actual data of the field value TID points to the row data in the Heap Data type GiST supports well for handling data types without linear order (cannot be sorted from smallest to largest like re...

GIN Index with TEXT

Image
Introduction In this article, we will continue to explore how to use GIN Index with the text data type First, please note that GIN Index only works with multi-valued data types, meaning fields containing multiple values, so you can easily create an index with the array type However, if you create a similar index with the text type, an error will be reported because text is only a single-valued type, containing only one content string, so you must pass an additional function that defines how to split the text value into items in an array before the index can be created If you remember, when using GIN Index with a text array, you will encounter limitations regarding partial search and must enter the exact word to search for it to work, now you can solve that problem by using text combined with two functions, to_tsvector and pg_trgm Full-Text Search This method uses the to_tsvector function and its index creation process will be as follows Tokenization & Normalization Split the text...