10.4.2 Working with an Existing MySQL Table
10.4.2 Working with an Existing MySQL Table
In Section 10.4.1, we discuss a program that creates a table in MySQL from within a Perl program. Usually, a table is created using an interface—using either command-line or GUI-based—without the need of writing a program. A table can also be created using pure SQL commands stored in a file, without using Perl as an intermediary. The program in Section 10.4.1 is an exercise illustrating that a program can create a table if necessary.
Once a table has been created, we can populate it. A program can populate a table from data using various sources, such as from a terminal, a graphical user interface, a text file, another database table—either of the same kind of database or another. For example, it is possible to read from a table in an Oracle database and write to a table in a MySQL database. The program that follows populates a database by inserting data hard-coded within the program as well by reading from a text file. The program also runs a few SELECT queries. SELECT statements in SQL ask the database to select certain rows from within a table within it that satisfy specified constraints. SELECT is a very commonly used SQL
query or statement.
Program 10.11
#!/usr/bin/perl
#insertFetch.pl
use strict;
use DBI;
my ($driver, $database, $user, $password);
my ($dsn, $dbh, @databases, @tables, $table, $sth, $query);
my ($numRows, $numFields, $fieldNames);
$driver = "mysql";
$database = "test";
$dsn = "DBI:$driver:$database";
$user = "Project";
$password = "Project";
##need to start the connection
$dbh = DBI->connect ($dsn, $user, $password) or
die "Cannot connect to $dsn: $DBI::errstr";
#List all the tables; private driver functions
@tables = $dbh->func ('_ListTables');
print "Tables in database $database are ...\n",
join "\n", @tables, "\n\n";
#prepare a query;
$table = "friends";
$query = "select * from $table";
$sth = $dbh -> prepare ($query);
$sth -> execute;
$numFields = $sth ->{'NUM_OF_FIELDS'};
print "Number of fields in table $table = $numFields\n";
$numRows = $sth -> rows;
print "Found $numRows rows in $dsn\n";
$fieldNames = $sth -> {'NAME'};
print "The fields in table $table = \n ",
join (" ", @$fieldNames), "\n\n";
#Create a query with
$query = qq{
INSERT INTO $table (SSNUM, NAME, AGE, HOMETOWN, SCHOOL)
VALUES (?, ?, ?, ?, ?)};
$sth = $dbh -> prepare ($query);
$sth -> execute (198640756,"Joe White", 18, "Colorado Springs", "PPCC")
or warn $dbh -> errstr;
open IN, "friends.txt";
while (){
chomp;
my ($ssnum, $name, $age, $hometown, $school) = split /::/, $_;
print "$ssnum $name $age $hometown $school\n";
$sth -> execute ($ssnum, $name, $age, $hometown, $school)
or warn $dbh -> errstr;
}
$sth -> finish();
#all entries
print "\n\nAll entries in TABLE $table\n\n";
$query = qq{select STUDENT_ID, SSNUM, NAME,
AGE, HOMETOWN, SCHOOL from $table};
fetchPrintRows ($dbh, $sth, $query);
#Find entries from Springs
print "\n\nAll Colorado Springs entries in TABLE $table sorted by AGE and SSNUM\n\n";
$query = qq{select STUDENT_ID, SSNUM, NAME,
AGE, HOMETOWN, SCHOOL from $table
WHERE HOMETOWN like "%Springs%"
ORDER BY AGE, STUDENT_ID, SSNUM};
fetchPrintRows ($dbh, $sth, $query);
#perform an update
$query = qq{update friends set SSNUM = 222334444 where
name="Christopher Paul"};
$sth = $dbh -> prepare ($query);
$sth -> execute () or warn $dbh -> errstr;
print "\n\nAll entries in TABLE with age>20 $table";
print " sorted by AGE, STUDENT_ID and SSNUM\n\n";
$query = qq{select STUDENT_ID, SSNUM, NAME,
AGE, HOMETOWN, SCHOOL from $table
WHERE AGE > 20
ORDER BY AGE, STUDENT_ID, SSNUM};
fetchPrintRows ($dbh, $sth, $query);
#disconnect from the database
$dbh -> disconnect ();
#############################
#subroutine to fetch the rows
sub fetchPrintRows{
my ($dbhandle, $sthandle, $aQuery) = @_;
$sthandle = $dbhandle -> prepare ($aQuery);
$sthandle -> execute () or warn $dbhandle -> errstr;
while (my $row = $sthandle -> fetchrow_hashref()){
printf "%5d %9d %-20s %3d %-20s %-5s\n",
$row -> {'STUDENT_ID'},
$row -> {'SSNUM'},
$row -> {'NAME'},
$row -> {'AGE'},
$row -> {'HOMETOWN'},
$row -> {'SCHOOL'};
}
} #sub ends
The program starts by declaring to use the DBI.pm module and hence, imports all variables, functions and methods in the DBI.pm module. As discussed earlier, depending on the underlying database, the appropriate driver needs to be loaded on the system. Here, the database is MySQL. Therefore, the driver used is mysql. The DBI.pm is the high, database-independent level interface from the side of the Perl program; the mysql.pm module provides the low-level MySQL-specific interface. The program connects to a database using the login name Project
and the password Project. If it cannot connect, the program dies.
Using the DBI.pm method func, one can call private methods that are non-standard and non-portable and that are implemented by a specific driver, here the MySQL driver. func takes a function name as argument, here _ListTables.
@tables = $dbh->func ('_ListTables');
A function argument sent to func takes arguments, the arguments come first, and the function name comes last. One must consult driver-specific documentation to determine what is appropriate. In this specific case, the MySQL server returns the number of tables in the test database. The list is printed by the program. We can look at the output given later in the section to confirm.
Next, the program takes an SQL query and prepares it.
$query = "select * from $table";
$sth = $dbh -> prepare ($query);
prepare takes as its argument an SQL statement for later execution. Some database driver modules put the statement into an internal compiled form so that it runs faster with execute. At this time, MySQL does not really do anything to prepare a statement and merely stores the statement. prepare returns a handle that can be used to get attributes of the statement as well as to execute the statement. Some drivers can give useful information only after the statement has been executed. In general, DBI drivers do not parse; the contents of the statement are passed directly to the database driver and subsequently to the database engine. The query prepared and executed is given below.
$query = "select * from $table";
The select SQL statement asks the database to obtain one or more rows of data from a specified table to satisfy specified conditions, if any. This select statement does not have any associated conditions. As a result, every row of the table is returned by the database. A select statement can request the database to return a set of explicitly listed fields for each row returned. In this select statement, no explicit list of fields has been given. Instead, * has been specified as the field list. * indicates that all fields of the chosen rows be returned.
Once the select statement is executed, we can query to find the number of fields in the table.
$numFields = $sth ->{'NUM_OF_FIELDS'};
We can find the number of rows returned by a query. Here, assuming that the table friend has been constructed recently with no rows of data, the number of rows returned by the select statement is given above is zero. The rows method returns the number of rows affected by the last row-affecting command or -1 if the number of rows is not known or not available. The reader is advised to consult the output of the program given later in the section.
The names of the fields in the table are found by using the NAME attribute of the statement handle $sth.
$fieldNames = $sth -> {'NAME'};
The attribute returns a reference to a list of field names. The field names are printed by the program.
Next, the program creates a string that represents a query for inserting rows of data into the table. The SQL query for insertion is given below.
INSERT INTO $table (SSNUM, NAME, AGE, HOMETOWN, SCHOOL)
VALUES (?, ?, ?, ?, ?)
The query or statement asks the database to insert five fields when it inserts a row into the table friends. The fields are SSNUM, NAME, AGE. HOMETOWN, and SCHOOL. Note that the table can have up to seven columns in each row out of which six can be specified in a query. Here, the column NOTES remains unspecified. The value of STUDENT_ID is autoincremented. The fields to be inserted are specified within parentheses. The SQL keyword VALUES is used to specify the values to be inserted into the table. The values are given in a list. The values can be given directly, or one or more values can be left empty with placeholders. A placeholder is represented by ?. In the current SQL statement, five field values need to be specified. Each of the five fields is left unspecified at this time. The values are specified later in the program. The query is prepared and the statement executed.
$sth = $dbh -> prepare ($query);
$sth -> execute (198640756,"Joe White", 18, "Colorado Springs", "PPCC")
or warn $dbh -> errstr;
We have seen the execute method of the statement handle earlier. It has been called so far with no arguments. However, in the current invocation of execute, we provide five argument values. These five values correspond to the five unspecified fields in the SQL query. Thus, an SQL query may have one or more unspecified fields in the query string that is prepared. Any unspecified field values must be specified in the execute method call. Thus, the SQL query must be fully specified by the time it is executed. Also, a few fields may be specified in the prepared query and the rest specified in the execute method call. The values unspecified in the prepared query must be specified in the same sequence in the execute method call. In the current INSERT SQL statement, the values of all the fields are specified directly in terms of the program code.
The sequence of statements that follow are used to read a text file line by line. Each line represents a row of data. The fields in a row are separated from each other by two colons. This is a made-up separator for our data file. The program separates the fields using the split statement. The values obtained by splitting are given the names $ssnum, $name, $age, $hometown, and $school, respectively. The call to the execute method that follows are passed these five variables
in the same order as argument.
$sth -> execute ($ssnum, $name, $age, $hometown, $school)
or warn $dbh -> errstr;
The point to note here is that there is time expended in prepareing a query. Thus, if many similar queries are prepared individually, one at a time, time is wasted. Some such waste of time can be obviated by prepareing the query with one or more missing values, and specifying the missing values during execution. In the current program, we could have prepared the query inside the while statement. However, in such a case, the query would have been prepared for every iteration of the loop, i.e., for every line of data in the text file friends.txt. We save some time by prepareing the query once, before the while loop is entered. It results in saving of time which may not be noticeable only if a few lines of data are inserted inside the while loop, but would be substantial if tens or hundreds of thousands of line are inserted. The contents of the file data.txt are printed to the screen as the data are entered into the table.
Next, the program prints every row of data that has been entered into the table onto the standard output. This requires the program to prepare the following query.
select STUDENT_ID, SSNUM, NAME,
AGE, HOMETOWN, SCHOOL from $table
Note that this select query has no conditions associated with it. As a result, execution of this query results in selection of all rows in the database. Only the specified fields are selected from each row. The prepareing of the query, its execution and the printing of every row’s data takes place in the subroutine fetchPrintRows. This subroutine, seen at the bottom of the program, takes three arguments: a database handle, a statement handle, and a query. The query is prepared and executed. There is a while loop in the subroutine. The conditional of the while loop is given below.
my $row = $sthandle -> fetchrow_hashref()
The right hand side of the assignment has a call to the fetchrow_hashref method of the statement handle object $sth. fetchrow_hashref returns a reference for the next row of data returned by the database. The row of data is represented within Perl as a hash. That is, as soon as row of data is returned by the database driver, Perl converts it into a hash automatically before making it available to the program. The program sees the row of data as a hash table where each field name is used as a key and the corresponding content of the field as value. The program prints the field values nicely using printf and an associated format specification. The while
loop in the subroutine is executed as many times as there are rows in the result returned by the database driver.
Next, the main program executes another SQL query. The query is given below.
select STUDENT_ID, SSNUM, NAME,
AGE, HOMETOWN, SCHOOL from $table
WHERE HOMETOWN like "%Springs%"
ORDER BY AGE, STUDENT_ID, SSNUM
This query has one condition. The condition requires that the value of the HOMETOWN field is
like "%Springs%"
The like keyword allows pattern matching. The % sign inside the string to be matched stands for any sequence of characters. Thus, the HOMETOWN field’s value needs to contain the word Springs. It can be preceded or followed by any number of other characters. The query also has an ORDER clause. The ORDER clause asks the database to sort the returned rows by using the fields specified. Here, sorting is performed using the fields AGE, STUDENT_ID and SSNUM, in this specific order. The default sort order is ascending. By looking at the output that follows the program, we an see the results of sorting.
The next SQL query executed is an update statement. When giving an SQL keyword, it can be specified in any case. Thus, keywords such as update, select, where, order by can be specified in upper case or lower case. However, the names of tables and fields are case-sensitive in Unix machines including Mac OS X. On Windows machines, case does not matter for names. However, it is advisable that case be observed so that the program is portable. The update query is given below.
update friends set SSNUM = 222334444 where
name="Christopher Paul"
It changes Christopher Paul’s social security number.
This is followed by the execution of an SQL query that prints the names of all rows in the updated table for which the value of the AGE field is more than 20. The rows are printed sorted by AGE, STUDENT_ID and SSNUM, in this specific order.
This program has quite a bit of output. The reader is advised to look at the output carefully and compare it with the text of the program. It will aid in understanding the code. Once all the SQL statements have been executed, the program disconnects from the database source. An output run is shown below.
Tables in database test are ...
SITES
friends
testac
testad
testae
trainingSet
Number of fields in table friends = 7
Found 0 rows in DBI:mysql:test
The fields in table friends =
STUDENT_ID SSNUM NAME AGE HOMETOWN SCHOOL NOTES
198640757 Shane Jahnke 19 Colorado Springs UCCS
198640758 Seth Gross 20 Colorado Springs UCCS
198640759 Matt Gustafson 21 Colorado Springs UCCS
198640760 Christopher Paul 23 Colorado Springs CTU
198640761 Justin O'Malley 19 Colorado Springs NULL
198640762 Nick Freeman 19 Colorado Springs NULL
198640763 Clint Trebesh 18 Glenwood Springs UCCS
198640764 Griffin Heath 20 Evergreeen UCCS
198640765 Qin Jiang 22 Trinidad UCCS
198640766 Brooke Peterson 22 Denver UCD
All entries in TABLE friends
1 198640756 Joe White 18 Colorado Springs PPCC
2 198640757 Shane Jahnke 19 Colorado Springs UCCS
3 198640758 Seth Gross 20 Colorado Springs UCCS
4 198640759 Matt Gustafson 21 Colorado Springs UCCS
5 198640760 Christopher Paul 23 Colorado Springs CTU
6 198640761 Justin O'Malley 19 Colorado Springs NULL
7 198640762 Nick Freeman 19 Colorado Springs NULL
8 198640763 Clint Trebesh 18 Glenwood Springs UCCS
9 198640764 Griffin Heath 20 Evergreeen UCCS
10 198640765 Qin Jiang 22 Trinidad UCCS
11 198640766 Brooke Peterson 22 Denver UCD
All Colorado Springs entries in TABLE friends sorted by AGE and SSNUM
1 198640756 Joe White 18 Colorado Springs PPCC
8 198640763 Clint Trebesh 18 Glenwood Springs UCCS
2 198640757 Shane Jahnke 19 Colorado Springs UCCS
6 198640761 Justin O'Malley 19 Colorado Springs NULL
7 198640762 Nick Freeman 19 Colorado Springs NULL
3 198640758 Seth Gross 20 Colorado Springs UCCS
4 198640759 Matt Gustafson 21 Colorado Springs UCCS
5 198640760 Christopher Paul 23 Colorado Springs CTU
All entries in TABLE friends with age>20 sorted by AGE,
STUDENT_ID and SSNUM
4 198640759 Matt Gustafson 21 Colorado Springs UCCS
10 198640765 Qin Jiang 22 Trinidad UCCS
11 198640766 Brooke Peterson 22 Denver UCD
5 222334444 Christopher Paul 23 Colorado Springs CTU
