Indexing — Data Structures

Sandali Tharuki
Nerd For Tech
Published in
4 min readJun 2, 2021

--

Courtesy: Beautiful Free Images & Pictures | Unsplash

Indexing is a data structure technique that helps to speed up data retrieval. As we can quickly locate and access the data in the database, it is a must-know data structure that will be needed for database optimizing. Indexing minimizes the number of disk accesses required when a query is processed. Indexes are created as a combination of the two columns.

  • First column is the Search key. It contains a copy of the primary key or candidate key of the table. The values of this column may be sorted or not. But if the values are sorted, the corresponding data can be accessed easily.
  • Second column is the Data reference or Pointer. It contains the address of the disk block where we can find the corresponding key value.

Types of indexing

Primary Indexing

Primary indexing only has two columns. First column has the primary key values which are the search keys. The second column has the pointers which contain the address to the corresponding data block of the search key value. The table should be ordered and there is a one-to-one relationship between the records in the index file and the data blocks. This is a more traditional yet a fast mechanism.

  • Dense Index

There is an index record that contains a search key and pointer for every search key value in the data file. Though the Dense index is a fast method it requires more memory to store index records for each key value.

  • Sparse Index

There are only a few index records that point to the search key value. First, the index record starts searching sequentially by pointing to a location of a value in the data file until it finds the actual location of the search key value. Though sparse indexing is time-consuming, it requires less memory to store index records as it has less of them.

Secondary Indexing (Non clustered Indexing)

The columns in the Secondary indexing hold the values of the candidate key along with the respective pointer which has the address to the location of the values. Index and data are communicated with each other through an intermediate node.

Clustered Indexing

The table is ordered in clustered indexing. At the times when the indexes are created using the non-primary key, we combine two or more columns together to get the unique values to identify data uniquely and use it to create the index.

The pointers are pointed to the cluster as a whole.

Multilevel Indexing

If the primary index does not fit in the memory, multilevel indexing is used. When the database increases its size the indices also get increased. A single-level index can be too big to store in the main memory. In multilevel indexing, the main data block breaks down into smaller blocks that can be stored in the main memory.

  • B+ Tree Indexing
  • B- Tree Indexing

Conclusion

Indexing is the one of important data structure techniques that can be used to access and retrieve data in a more efficient way. An Index is created with the Search key and the Data reference. Different types of Indexing are available. The Primary Index is an ordered file and has a Primary key field as the search key and the Pointer field. Primary Index is categorized into two Indexing methods: Dense Indexing, Sparse Indexing. The Secondary Index (Non-Clustered Index) is an unordered file with a non-primary key filed as the search key and the pointer field. The Clustered Index is an ordered file with a non-primary key field and a pointer field. Multilevel Indexing is created when the primary key does not fit in the main memory. The disadvantage of Indexing is we need to have a primary key on the table we perform Indexing.

Hope this article would be helpful.
Thank you!!!

--

--

Sandali Tharuki
Nerd For Tech

Undergraduate-Faculty of Information Technology | University of Moratuwa | Sri Lanka