< BACKMake Note | BookmarkCONTINUE >
152015024128143245168232148039199167010047123209178152124239215162147045098086162037036201

Using SQL

SQL stands for Structured Query Language. It was developed in the mid-1970s by IBM Research to serve as an English interface query language to the System R relational database prototype.

SQL consists of a list of powerful and flexible commands that are used to manipulate information collected in tables, by operating and controlling sets of records at a time.

  • SQL is an interactive query language for ad hoc database queries.

  • SQL is a database programming language.

  • SQL is a data definition and data administration language.

  • SQL is the language of networked database servers.

  • SQL helps protect the data in a multi-user networked environment.

Nowadays, SQL servers are the dominant model for creating client/server applications. The most important tendency among database servers of any size is the revelation of SQL as the choice for the manipulation, definition, and control of data.

SQL has been an ISO standard for a long time. It is a powerful language for databases that adhered to the relational model.

The relational model clearly separates the physical aspects of data from their logical implementation. It frees you from being concerned with the details of how data is stored and makes the access to data purely logical.

By using SQL statements, you just need to specify the tables, columns, and row qualifiers to get to any data item.

SQL Mini-Tutorial

The idea behind this mini-tutorial is to teach you how to change and query the database. Of course, this book does not cover everything. It should give you a brief understanding of the concepts and basic usage of SQL statements. If it becomes necessary to delve deeper in this topic, the last heading of this section contains a list of Web sites that have some beneficial and complete SQL tutorials.

Selecting the Information

In a relational database, data is stored in tables. In our example, we have the USERS Table. ID, NAME, EMAIL, and AGE are the columns of this table.

Table 8.2. USERS
ID NAME EMAIL AGE
1 Andre alessa@bebemania.com.br 25
2 Renata rtaveira@bebemania.com.br 30
3 Cleber clessa@bebemania.com.br 45
4 Beth beth@alugueaqui.com.br 40

Now, say that you want to know the EMAIL and the AGE of each user. You have to use the SELECT statement as follows:

							
SELECT EMAIL, AGE
FROM USERS

						

The following list is the result of your query:

							
EMAIL                       AGE
alessa@bebemania.com.br    25
rtaveira@bebemania.com.br  30
clessa@bebemania.com.br    45
beth@alugueaqui.com.br     40

						

Let me explain to you what you have done: you asked to see all the rows from the USERS table, filtering only the EMAIL and AGE columns. Note that column names and table names do not have spaces—they must be entered as just one word. The general syntax for a SELECT statement (when selecting all the rows from a table) is

							
SELECT Column1Name, Column2Name, …
FROM TableName
					
						

Note

This basic syntax doesn't filter which rows are selected or do anything else interesting.



You can use the asterisk symbol in order to retrieve all the columns from a table without typing every column name:

							
SELECT * FROM TableName;
					
						
Relational Operators

Six important relational operators exist in SQL, and after introducing them, we'll see how they're used:

= Equal
<> Not Equal
< Less Than
> Greater Than
<= Less Than or Equal To
>= Greater Than or Equal To

The WHERE clause of a SELECT statement specifies which rows of a table must be selected. For example, let's determine which users are 25 years old.

							
SELECT NAME
FROM USERS
WHERE AGE = 25;

						

The resultset is as follows:

							
NAME
Andre
					
						
Joins

Good database design suggests that each table in a database must contain data of only one single entity. Detailed information can be acquired by joining tables according to their primary and foreign keys. For example, we will create Table 8.3.

Table 8.3. NATIONALITY
ID ORIGIN
1 Greek
2 Spain
6 USA
8 Brazil

Let's discuss the concept of keys. A primary key is a column or set of columns that uniquely identifies the rest of the data in any given row. For example, in the USERS table, the ID column uniquely identifies each row.

A foreign key is a column in a table that is a primary key of another table. It means that any data in a foreign key column must exist in the other table where that column is the primary key. For example, in the NATIONALITY table, the column ID is a foreign key to the primary key of the USERS table, which is the ID column.

The purpose of these keys is to associate data across tables, eliminating data redundancy in the tables—this is the power of relational databases.

To find the names of the user whose name comes from Spain, use the following query:

							
SELECT USERS.NAME
FROM USERS, NATIONALITY
WHERE USERS.ID = NATIONALITY.ID
AND NATIONALITY.ORIGIN = "Spain"

						

The resultset is as follows:

							
NAME
Renata
					
						
Using Aggregate Functions

I will present five important aggregation functions: SUM, AVG, MAX, MIN, and COUNT. They are called aggregation functions because they summarize the results of a query, rather than listing all the rows.

  • SUM()—   Returns the total value of a given column, based on the selected rows.

  • AVG()—   Returns the average value of the given column.

  • MAX()—   Returns the highest value in the given column.

  • MIN()—   Returns the lowest value in the given column.

  • COUNT(*)—   Returns the number of rows that satisfy the WHERE clause.

Let's look at some examples:

							
SELECT     SUM(AGE), AVG(AGE)
FROM       USERS

						

The resultset is as follows:

							
SUM      AVG
135      33.75

SELECT     COUNT(*)
FROM       USERS
WHERE      AGE > 30

						

The resultset is as follows:

							
COUNT(*)
2

						

Sometimes, when you are working with aggregation functions, the group by clause might be required. For instance, let's say that you need to list the average age by username from your USERS table. The following SELECT statement can be used to group the resultset of your query.

							
SELECT    NAME, AVG(AGE)
FROM      USERS
GROUP BY  NAME
					
						
Adding Data

To insert rows in a table, use the following syntax:

							
INSERT INTO <TABLE NAME> [(<COLUMN1 NAME>, <COLUMN2 NAME>, …)]
 VALUES (<VALUE1>, <VALUE2>, …);

						

Note

In order to not use the column name part of your statement (because it's optional), in most cases, you need to provide values for all the columns of your table.



For example

							
INSERT INTO USERS (ID, NAME, EMAIL, AGE) VALUES (5, "Bruno",
 "bruno@alugueaqui.com.br", 17)
					
						
Deleting Data

Let's delete a row from a table.

							
DELETE FROM USERS
WHERE NAME = "Cleber"

						

If more than one row exists in which NAME = "Cleber", the other row will be deleted too. Using the primary key is a good way to uniquely identify a row for deletion.

To delete all the rows from the table, type the following:

							
DELETE FROM USERS

						
Updating Data

Let's update the age of one user.

							
UPDATE USERS
SET AGE = 18
WHERE NAME = "Bruno"

						

This statement sets Bruno's age to 18. If we had more than one Bruno in our database, we would have to include more conditions in the WHERE clause. It is also possible to update multiple columns at the same time—you just need to separate the attribution statements with commas.

							
UPDATE USERS
SET AGE = 18, EMAIL = "bruno@bebemania.com.br"
WHERE NAME = "Bruno" AND ID = 5
					
						
Cool SQL Language Web Pages

The Introduction to Structured Query Language site can be found at http://w3.one.net/~jhoffman/sqltut.htm.

Several links to SQL material can be found at http://www.lessaworld.com/links_basics_sql.html.

PostgreSQL Databases

PostgreSQL is a free (open-source) SQL database. It is a sophisticated Object-Relational database system derived from Postgres4.2. It conforms to (most of) ANSI SQL and offers many interesting capabilities, including subselects, transactions, and user-defined types and functions. It is the most advanced open-source database available anywhere.

Commercial Support is also available. For details, check out its Web site at http://www.postgresql.org.

pg Module

The pg module was written by D'Arcy J.M. Cain in order to provide an interface to the PostgreSQL database system. It embeds the PostgreSQL query library allowing easy use of its powerful features from a Python script. This module is available for download at http://www.druid.net/pygresql.

The pg module exposes its own DB API interface specification, as you can see next.

							
>>> import pg
>>> for rs in pg.DB('dbname').query('SELECT * FROM USERS').dictresult():
…    print rs
…
					
						

Note

At the time of this chapter was written, it was announced that the latest version of pygresql began supporting the Python DB API 2.0.



MySQL Modules

MySQL is a true multiuser, multithreaded SQL database server. It is a client/server implementation that consists of a server daemon mysqld and many different client programs and libraries. MySQL is very fast for performing queries, but can slow down if lots of updates are being performed. Also, it doesn't have transaction support. For more information, check out http://www.mysql.com.

MySQLdb Module

You need to get and build the MySQLdb module before using it. Check out http://dustman.net/andy/python/MySQLdb.

							
>>> import MySQLdb
>>> connection = MySQLdb.connect(host="spam", db="client", port=3316, \
…                                 user="alessa", passwd="1020erw")
…
>>> con = connection.cursor()
>>> sql_statement = "SELECT * FROM USERS WHERE AGE > 21"
>>> con.execute(sql_statement)
>>> result_set = con.fetchall()
>>> connection.close()
					
						
Python Interface for MySQL

This interface was designed by Joseph Skinner and modified by Joerg Senekowitsch. For more information, check out http://www.mysql.com/Contrib/MySQLmodule-1.4.tar.gz.

The GadFly SQL Module

The GadFly SQL module is a SQL database engine written entirely in Python by Aaron Watters in compliance with the Python Database API. It uses fewer system resources than PostgreSQL, and its speed is comparable to Microsoft Access. However, it doesn't have the performance of commercial software (such as Oracle). This module is easily used by client/server applications because it includes TCP/IP support.

This module entirely fits in a small file, so it doesn't leave huge footprints.

Because it only supports a small subset of the SQL language, it offers excellent code for those who want to learn more about SQL parsing engines and client/server communications. For more information, check out http://www.chordate.com/gadfly.html.

MetaKit Database Engine

MetaKit is a C++ library for storage, transport, and manipulation of structured objects and collections. The next examples show how the MetaKit database engine does on-the-fly restructuring:

Example 1

						
>>> import Mk4py
>>> dbhandle = Mk4py.Storage('datafile.mk',1)
>>> workspace = dbhandle.getas('users[name:S,email:S]')
>>> workspace.append(name='Andre',email='alessa@bebemania.com.br')
>>> workspace.append(name='Renata',email='rtaveira@bebemania.com.br')
>>> dbhandle.commit()

					

Example 2

						
>>> import Mk4py
>>> dbhandle = Mk4py.Storage('datafile.mk',1)
>>> workspace = dbhandle.getas('users[name:S,email:S,age:I]')
>>> for user in workspace:
…print user.name
…user.age = input('age: ')
…
>>> dbhandle.commit()
>>> for user in workspace.sort():
>>>   print user.name, user.email, user.age

					

If you run these two examples in order, you'll have restructured on-the-fly. It will be instant, regardless of the number of rows. If for any reason the transaction is not completed, neither will the restructure be. For more information, check out their Web site at http://www.equi4.com/metakit/python.html.


Last updated on 1/30/2002
Python Developer's Handbook, © 2002 Sams Publishing

< BACKMake Note | BookmarkCONTINUE >

Index terms contained in this section

aggregate functions
      relational databases
Cain, DŐ
      Arcy J.M.
clauses
      group by
      WHERE
database engines
      MetaKit
databases
      PostgreSQL
     relationalÓ
            Ň
      Structured Query Language (SQLs) 2nd 3rd 4th 5th 6th 7th
engines
     database
            MetaKit
foreign key
functions
     aggregate
            relational databases
GadFly SQL module
group by clause
IBM Research
      Structured Query Language (SQL)
joins
     tables
            relational databases
keys
      foreign
      primary
MetaKit database engine
modules
      GadFly SQL
      MySQL
      pg 2nd
MySQL module
pg module 2nd
PostgreSQL databases
primary key
programming languages
      Structured Query (SQL) 2nd 3rd 4th 5th 6th 7th
redundancy
      tables
relational databasesÓ
      Ň
SELECT statement
      WHERE clause
statements
     SELECT
            WHERE clause
Structured Query Language (SQLs) 2nd 3rd 4th 5th 6th 7th
tables
     joins
            relational databases
      redundancy
WHERE clause

© 2002, O'Reilly & Associates, Inc.