It takes time for a database to search through row upon row of information and find specific values. It’s a task made more difficult as rows of data are removed and new ones are added. Over time, data can be stored more randomly than you would think or like. Indexes are a way to overcome this and to help make it quicker to find values in database tables.
A few weeks ago I started talking about databases. I began with some thoughts about MySQL and its MyISAM and InnoDB schemas and then I talked about MySQL configuration and some of the variables you might want to change from their default values.
I want to continue and talk about the use of indexes as a way to speed up database searches. Today I’ll talk about indexes in general and their pros and cons. In the coming weeks I’ll talk about the different types of indexes, how to add them to a table, and some thoughts on when it’s best to use them.
General Thoughts About Indexes
Imagine trying to find information about a specific topic in a book. If the book is a few pages long, more of a thin pamphlet than a book, it’s easy enough to flip through all the pages and scan for words related to your topic. If the book is a few hundred pages long and your topic of interest is spread throughout all its pages, you aren’t going to scan every one of them to find your topic. You’re going to flip to the back of the book and hope for an index so you can lookup where your topic of interest is discussed and then check the pages where it’s mentioned.
It’s similar when searching a database. It isn’t too hard for MySQL to look through a few rows in a single table and search row by row for the desired information, but when the rows have grown into the hundreds or thousands it’s not so quick to search row by row for a particular value or several values. Indexes can help speed up this searching process.
Without an index, MySQL starts at the first row, searches it for the desired information, and then moves on to the next row to search it. It continues searching one row at a time like that until it reaches the last row in the table. For a small table with a few rows this isn’t an issue, but for larger tables with many, many rows, it’s not very efficient.
If the column or columns in question have indexes, MySQL can more quickly determine the general part of the table where the desired rows will be, saving itself from having to look through every row in sequence. A sequential read can be faster, as in the cases of a few rows or a query that needs to access most of the rows in the table, but otherwise an index is going to lead to faster read times.
An index is essentially a new table that the database creates and uses to organize the rows of data in a way that makes it easier to search. Instead of the rows being organized in the order they were entered, string data can be organized alphabetically, numerical data can be organized numerically, and where the data is a date it can still be organized by date.
Here’s a good article about indexes by Francisco Claria in which he offered some test data and showed the difference in times required to find values in the unindexed and indexed versions of the same data. The tables each hold 1.7 million records so these aren’t small tables.
The end result was nearly a two second difference. Admittedly that’s probably larger than most of the tables you and I will work with, but still, two seconds is a large block of time when we’re considering website performance.
Pros and Cons of Indexes
Indexes are not always appropriate. Again, for tables with only a few rows, it’s actually quicker for MySQL to run through a sequential search than to make use of indexes. Similarly, when you want most or all of the rows in a table returned, a sequential search will be faster. However, there are plenty of common cases where an index will speed up a database search.
- To find rows matching a WHERE clause in a query.
- To eliminate rows from consideration.
- To retrieve rows from other tables when JOINS are part of a query and where the columns being searched are of similar type and size.
- To find min and max values in a column.
- To sort or group results.
- To return results from the index itself as opposed to searching the table directly.
While indexes are good at speeding up the time it takes to search and find data in SELECT queries, they can actually slow down queries to UPDATE, INSERT, or DELETE because MySQL needs to reindex the data when the table changes. If you have tables that you modify more often than you read the data, you may not want to use indexes for the purpose of quicker searching.
You should also consider that indexes take up additional space in the database as they are additional tables of data that need to be stored. This probably won’t be a big consideration, but it could be depending on how much space your hosting company allows for your database.
You want to create indexes where they help, but not just for the sake of adding an index so keep in mind where an index helps, where it doesn’t, and where it could possible hurt.
I’ll leave you with a few links in case you want to read more before I get to the next couple of topics about indexes.
- The Basics of Database Indexes For Relational Databases
- Indexing Database Tables
- MySQL Indexing Explained
If your database is small and contains tables with not too many rows, then you probably won’t see significant performance improvements from using indexes.
However, over time as your database becomes less ordered, as rows are added and removed, and as the amount of data increases, indexes become very useful in speeding up database reads.
Not all indexes are the same, though. There are several types of indexes, each with different strengths and appropriate uses. That’s where I want to pick things up next week when I talk about different types of indexes.
Download a free sample from my book, Design Fundamentals.