Web Review - Searching a Data File

A Songline PACE Production

Script of the Week

Searching a Data File

by Brent Michalski
Sept. 4, 1998

I know what you're saying: "Searching? We just did that a few weeks ago!" Yes, I know, we covered searching recently -- but this is a different program and you are going to learn a lot from it, I promise!

Searching

View the demo.

This week we'll be covering the search component of our simple database application. If we are going to create a database, we had better be able to search it! Think about it: if we want to delete or modify a record, we first have to search the database to find the correct record.

I am going to cover some new ground this week as well. The search script shown today, while fully functional, is not exactly how we are going to implement it in our final database application. I've done this so that we can cover some new material and because without a complete database program, it will be hard to decide exactly what to do yet. We will work that out as we get closer to finishing the complete application.

Diving in

Our search example this week builds upon our add record script from two weeks ago. You can even add records and then search the database and they will be there!

I want to dive right into the code this week. I am still pumped from The Perl Conference last week and want to talk Perl. Or should I say speak Perl-ian?

I have numbered the lines of code to make it easier for me to describe the program; the line numbers are not part of the program. You can also see the program without the line numbers.

1: #!/usr/bin/perl
2: use CGI qw(:standard);
3: use CGI::Carp qw(fatalsToBrowser);
4: $q = new CGI;
5: print $q->header();
6: $database = "../../21/perl/datafile.txt";
7: $search_for   = $q->param('search_for');
8: $search_field = $q->param('search_field');
9: $search_for = "\." if $search_for eq "";
10: &search_database($search_for);
11: $count = @results;
12: &no_match if($count < 1);
13: &multiple_match if($count > 1);
14: &single_match if($count == 1);
15: exit;
16: sub search_database{
17:   my $search_for = $_[0];
18:   open(DB, $database) or die "Error opening file: $!\n";
19:     while(<DB>){
20:       if($search_field eq "all"){
21:         if(/$search_for/oi){push @results, $_};
22:       } else {
23:         ($key,$name,$email,$phone,$notes)=split(/\|/);
24:         if(${$search_field} =~ /$search_for/oi){push @results, $_};
25:       } # End of else.
26:     } # End of while.
27:   close (DB);
28: } # End of subroutine.

29: sub multiple_match{
30:   print $q->start_html(-TITLE=>'Multiple Matches',-BGCOLOR=>'white');
31:   print "<PRE>";
32:   print "You searched for: <B>$search_for</B>\n";
33:   print "I found $count matches.\n\n";
34:   print "Please note that these results are being dumped ";
35:   print "from the data file\n";
36:   print "without any formatting.\n\nThe fields are:\n";
37:   print "<B>key | name | email | phone | notes</B>\n\n";
38:   foreach(@results){ print; }
39:   print<<HTML;
40: ... End of matches.
41:   </PRE></BODY></HTML>
42: HTML
43: } # End of subroutine.
44: sub single_match{
45:   print $q->start_html(-TITLE=>'Single Match',-BGCOLOR=>'white');
46:   print "<PRE>";
47:   print "You searched for: <B>$search_for</B>\n";
48:   print "There was only one match, here it is:\n\n";
  
49:   ($key,$name,$email,$phone,$notes) = split(/\|/,$results[0]);
  
50:   print "<B>Name:</B>   $name\n";
51:   print "<B>Email:</B> $email\n";
52:   print "<B>Phone:</B>  $phone\n";
53:   print "<B>Notes:</B>  $notes\n";
54:   print $q->end_html;
55: } # End of subroutine.
56: sub no_match{
57:   print $q->start_html(-TITLE=>'No Match',-BGCOLOR=>'white');
58:   print "<H2><CENTER>There were no matches for <I>$search_for</I>, ";
59:   print "please try again.</CENTER></H2>";
60:   print $q->end_html;
61: } # End of subroutine.

Line-by-line explanation

Line 1: Tells the program where to find Perl on the Web server. This line will vary depending on where Perl is installed on your server so you need to make any necessary changes. On a UNIX server, this line is required. If you are running this program on an NT server, this line is not required but won't hurt anything if included.

Line 2: Loads the CGI.pm module into the program. The qw(:standard) argument imports the module's standard set of functions into the script.

Line 3: Loads the CGI::Carp package. CGI::Carp is part of the standard CGI.pm distribution and gives you more graceful error messages. By importing fatalsToBrowser, we get most of our fatal error messages in the browser rather than the nasty 500 Internal Server Error message. CGI::Carp can be a very valuable debugging tool, and I will be using it in most of my future scripts.

Line 4: Creates a new CGI object and calls it $q.

Line 5: Prints the standard header for CGI scripts. The header tells the browser what kind of data the script is sending. This line is equivalent to the following line:

print "Content-type: text/html\n\n";

Line 6: This variable stores the location of the database file. The database file is simply a pipe delimited text file.

Lines 7-8: Get the search information from the calling Web page and store it in the appropriate variables.

Line 9: Stores a period (.) in the $search_for variable if nothing was passed from the calling Web page. (Inside double quotes, "\." is simply a period.) In a regular expression, a period matches any single character except a newline, so it matches every record that is not empty. We use a regular expression in the search subroutine to check for matches, so if the user enters nothing to search for, we assume they want everything.
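To see why a lone period works as a "match everything" pattern, here is a minimal sketch. The sample records are made up for illustration and are not taken from the real data file.

```perl
use strict;
use warnings;

# Inside double quotes, "\." is just the one-character string "."
my $pattern = "\.";

# Hypothetical sample data: one normal record and one empty line.
my @records = ("1|Brent|brent\@example.com|555-1234|none\n", "\n");

# grep in scalar context returns the number of records that matched.
my $matched = grep { /$pattern/ } @records;

print "pattern is '$pattern', matched $matched of ", scalar(@records), " records\n";
```

The empty line is the only record that does not match, because a period never matches the newline itself.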

Line 10: Calls the search_database subroutine and passes it the $search_for variable. The $search_for variable stores the information that the user wanted to search for.

Note that we didn't have to pass the $search_for variable to the subroutine because the $search_for variable is a global variable. I chose to pass the variable to show you how to pass variables into subroutines.

Line 11: The search_database subroutine stores the results of any matches in the @results array. This line sets the $count variable equal to the number of elements in the @results array. This retrieves an accurate count of the number of matches we had as a result of the search.

Line 12: Calls the no_match subroutine because $count is less than 1. This would mean that we had no matches so we need to tell the user.

Line 13: Calls the multiple_match subroutine because $count was greater than 1. We had more than one match so we need to handle the output accordingly.

Line 14: Calls the single_match subroutine because $count was equal to 1. We had a match, but there was only a single record found so we show it to the user.
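Lines 11 through 14 can be pictured as one small decision. The sketch below uses a hypothetical helper, pick_handler, that only returns the name of the subroutine the script would call; it is not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: reports which handler a given result list would trigger.
sub pick_handler {
    my @results = @_;
    my $count   = @results;    # an array in scalar context gives its element count
    if    ($count < 1)  { return "no_match" }
    elsif ($count == 1) { return "single_match" }
    else                { return "multiple_match" }
}

print pick_handler(),         "\n";
print pick_handler("a"),      "\n";
print pick_handler("a", "b"), "\n";
```

Because the three conditions cannot overlap, the three separate if modifiers in the script and this if/elsif/else chain pick the same handler.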

Line 15: Exits the program. We are done at this point.

Line 16: Begins the search_database subroutine.

Line 17: Creates a "private" variable called $search_for. This variable is actually a different variable than the global variable by the same name. A my variable is only valid inside the innermost enclosing block.

We could modify the value stored in this variable, and it would have no effect on the global variable called $search_for.

I could have called this variable anything I wanted, but I wanted to show you an example of a my variable and give you some information on them.
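A small sketch of the same idea, assuming a made-up subroutine called shout: the my variable inside the subroutine is a private copy, so changing it leaves the global variable of the same name alone.

```perl
use strict;
use warnings;

our $search_for = "Brent";       # global (package) variable

# Hypothetical subroutine used only to illustrate scoping.
sub shout {
    my $search_for = $_[0];      # private copy, visible only inside this block
    $search_for = uc $search_for;    # changes the copy, not the global
    return $search_for;
}

my $inside = shout($search_for);
print "inside: $inside, global: $search_for\n";
```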

Line 18: Opens the database and creates a file handle called DB. File handles are simply names (references) that we use to reference the file we opened. Sort of like your name is a reference to you...

Line 19: A while loop that continues looping until it reaches the end of the file. Each time through the loop, the current record is stored in Perl's special variable called $_.

Line 20: Begins an if ... else statement. This is the if portion and checks to see if the value of $search_field is equal to all. If it is (TRUE), then we do whatever is in its block. Otherwise, we jump down to the else statement and execute what is in its block. We use eq here because we are comparing strings.
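Here is a self-contained sketch of the open, read, and close pattern. It writes a throwaway file with made-up records first, using the core File::Temp module, so that there is something to read. It also uses a lexical file handle and the three-argument form of open, which is a slightly different style than the DB handle in the script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a small throwaway data file; these records are invented.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "1|Ann|ann\@example.com|555-0001|first\n";
print {$tmp} "2|Bob|bob\@example.com|555-0002|second\n";
close($tmp) or die "Error closing file: $!\n";

# Read it back one record at a time.
my @lines;
open(my $db, '<', $path) or die "Error opening file: $!\n";
while (<$db>) {             # each pass puts the next record in $_
    push @lines, $_;
}
close($db);

print "read ", scalar(@lines), " records\n";
```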

Remember that in Perl you use eq for string comparisons and == for numeric comparisons.
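A quick sketch of the difference, using values I made up for the purpose: the same pair of values can be equal as numbers and different as strings.

```perl
use strict;
use warnings;

my $search_field = "all";

# eq compares the characters of the two strings.
my $string_equal = ($search_field eq "all") ? 1 : 0;

# == converts both sides to numbers first, so "1.0" and 1 are equal...
my $numeric_equal = ("1.0" == 1) ? 1 : 0;

# ...but as strings, "1.0" and "1" are different.
my $string_differs = ("1.0" eq "1") ? 0 : 1;

print "eq: $string_equal, ==: $numeric_equal, differ as strings: $string_differs\n";
```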

Line 21: A regular expression that checks to see if whatever is in the variable $search_for is in the variable $_. In Perl, $_ is the default value for many things. This line performs a search nearly identical to the one on line 24, but notice there we are looking for something more specific, rather than the default value.

This line is doing quite a bit so I am going to break it down further.

The if(...) portion checks to see if we get a TRUE return value from the expression inside of the parentheses. If so, it executes the code in its block.

The /$search_for/oi is a regular expression that checks to see if $_ contains the text that is stored in the variable $search_for. The i tells the regular expression to ignore case. The o tells the regular expression to only compile itself once, after that it remembers what the value of $search_for was. If we didn't put this on the end, each time through the while loop the regular expression would recompile itself. For large files, this causes a lot of extra unnecessary overhead, slowing down your program.

You should only use the o option if the value inside of the regular expression does not change. Even if you change the value, if you have the o on the end, Perl will not recompile the regular expression, which can lead to some real confusion if you have to troubleshoot it!

The push @results, $_ inside the if statement's block tells Perl to add the current record to the @results array when the match succeeds. push adds the element to the end of the array, creating the array first if it does not exist yet.
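The match-and-push loop can be sketched on its own like this. Note that this sketch swaps in a different technique than the script uses: instead of the o option, it compiles the pattern once up front with qr//, which is available in newer versions of Perl. The sample lines are invented.

```perl
use strict;
use warnings;

my $search_for = "brent";
my $pattern    = qr/$search_for/i;    # compiled once here, ignoring case

# Hypothetical records standing in for lines read from the data file.
my @lines = (
    "1|Brent Michalski|555-0001\n",
    "2|Ann Other|555-0002\n",
    "3|BRENT TWO|555-0003\n",
);

my @results;
for my $line (@lines) {
    push @results, $line if $line =~ $pattern;    # keep only the matching records
}

print "found ", scalar(@results), " matches\n";
```

If $search_for changed later, you would simply build a new pattern with qr// instead of wondering why the old one was still being used.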

Line 22: The else condition. If the statement on line 20 was false, $search_field was not equal to all, then we execute the code inside of this block.

Line 23: Takes the current record and splits it at the pipe symbols into the respective fields. The split function works on the $_ variable by default so we don't have to specify it.
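A minimal sketch of the split, using an invented record. One detail worth seeing for yourself: the last field still carries the newline from the end of the record unless you chomp it.

```perl
use strict;
use warnings;

# A made-up record in the same pipe-delimited layout.
$_ = "7|Brent|brent\@example.com|555-1234|Likes Perl\n";

# With no second argument, split works on $_.
my ($key, $name, $email, $phone, $notes) = split(/\|/);

my $had_newline = ($notes =~ /\n\z/) ? 1 : 0;    # the record's newline ends up in $notes
chomp $notes;

print "name=$name phone=$phone notes=$notes\n";
```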

Line 24: This is the line that searches a single field. I will break it down further for you because this one is tricky.

Inside the if statement, ${$search_field} is the variable that the binding operator (=~) binds to the regular expression on the right. The ${$search_field} is kind of neat: instead of writing a separate if statement for each of the fields we can search on (name, email, phone, and notes), I use the value stored in $search_field as the name of the variable to search. Perl calls this a symbolic reference. This is kind of difficult to understand, so let me elaborate further:

Let's pretend that I chose to search on name. The value of $search_field is now name, because line 8 told Perl to get that value from the calling Web page. By placing $search_field inside of the ${ }, I get the value of the variable whose name is stored in $search_field. Perl first "translates" the $search_field variable into its value, and then treats that value as a variable name and looks up what that variable holds.

Ok, we said that we set the value of $search_field to name. This means that after Perl's first translation the expression looks like this: ${name}, which is the same as writing $name.

After Perl's second pass, the value becomes whatever is stored in the $name variable. Remember that the $name variable was split out in the line above from the current record, so it has the value of the name field that was stored in the database for this record. For example, if the current record's name field contained Brent, then the value on the left side of the binding operator would now be Brent.

Whew! I hope you followed that!

Note that the curly braces are not required. I put them in for clarity. I could have written $$search_field instead and achieved the same results. Also note that this trick only works with global variables like the ones created on line 23; it cannot see my variables, and Perl refuses to do it under use strict.
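Here is a sketch that shows the two translations side by side with a hash lookup that reaches the same value. The hash version is my own alternative for comparison, not what the script does, and the record is invented.

```perl
use strict;
use warnings;

my $record       = '7|Brent|brent@example.com|555-1234|Likes Perl';
my $search_field = "name";

# The script's approach: the field variables must be global (our),
# and strict's reference check has to be switched off for the lookup.
our ($key, $name, $email, $phone, $notes) = split /\|/, $record;
my $via_symref;
{
    no strict 'refs';
    $via_symref = ${$search_field};    # becomes ${name}, which is $name
}

# Alternative sketch: keep the fields in a hash keyed by field name.
my %field;
@field{qw(key name email phone notes)} = split /\|/, $record;
my $via_hash = exists $field{$search_field} ? $field{$search_field} : undef;

print "symbolic reference: $via_symref, hash: $via_hash\n";
```

One reason you might prefer the hash is that $search_field comes from the calling Web page. With the hash, a field name that is not one of the five keys simply finds nothing, while a symbolic reference would look up whatever global variable happens to have that name.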

On to the right side of the binding operator...

Now that we have the left side straight, the right side contains /$search_for/oi. The $search_for is the text that we are looking for, which was set by the calling Web page.

The forward slashes (/ /) are Perl's matching operator. They try to match whatever is inside of them against whatever they are bound to by the binding operator (=~). If there is no binding operator, they match against $_ by default.
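A short sketch of the two forms, with made-up values: the same pattern gives different answers depending on whether it is bound to a variable or left to work on $_.

```perl
use strict;
use warnings;

$_ = "Brent Michalski";
my $name = "Someone Else";

# No binding operator: the match is tried against $_.
my $default_hit = /brent/i ? 1 : 0;

# With the binding operator: the match is tried against $name instead.
my $bound_hit = ($name =~ /brent/i) ? 1 : 0;

print "against \$_: $default_hit, against \$name: $bound_hit\n";
```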

The i on the end tells Perl to ignore case and the o tells Perl not to recompile the expression inside the forward slashes each time through the loop. This can increase the speed of searches if you are searching a lot of records.

If we had a successful match, then we execute the code inside of the if statement's block. We tell it to: push @results, $_. This means push the value that is currently in $_ onto the end of the @results array. The @results array is where we store the successful matches.

WOW. That was A LOT of explaining for one line of code!

Line 25: Closes the else block.

Line 26: Closes the while loop.

Line 27: We are done with the database now, so we close it too.

Line 28: Ends the search_database subroutine.

Line 29: Begins the multiple_match subroutine.

Line 30: Uses the start_html function from CGI.pm to begin an HTML document and give it a title and background color.

Lines 31-37: Print out some HTML and text to explain to the user what they are seeing.

Line 38: A foreach loop that goes through each item in the @results array and prints it. Notice that I used a print statement with no arguments. print is another function that operates on $_ by default when you pass it no arguments. Each time through the loop, foreach sets $_ to the current element of the array.
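The following sketch makes the hidden $_ visible. It prints each record with a bare print, and also collects the same text in a variable named $output (my own addition) so you can see exactly what was sent.

```perl
use strict;
use warnings;

# Hypothetical matches, standing in for the real @results array.
my @results = ("1|Ann\n", "2|Bob\n");

my $output = "";
foreach (@results) {
    $output .= $_;    # $_ holds the current element of @results
    print;            # same as: print $_;
}
```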

Line 39: Begins a here document block. Here documents print until they encounter the ending tag that we provided. The ending tag in this example is: HTML.

Lines 40-41: Print out some text. Since they are inside the here document, I don't need a print statement.

Line 42: Closes the here document. The tag must match the tag we opened the here document with, and it must sit on a line by itself with no white space before or after it.
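As a small sketch, a here document can also be stored in a variable instead of being printed directly. Variables inside it are filled in just as they would be in a double-quoted string. The $count value here is invented.

```perl
use strict;
use warnings;

my $count = 3;

# Everything up to the line containing only HTML becomes the string.
my $text = <<HTML;
... End of $count matches.
</PRE></BODY></HTML>
HTML

print $text;
```

The closing HTML tag starts in the first column. If it were indented to line up with the code around it, Perl would not recognize it as the end of the here document.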

Line 43: Closes the multiple_match subroutine.

Line 44: Begins the single_match subroutine.

Line 45: Uses the start_html function from CGI.pm to begin an HTML document and give it a title and background color.

Lines 46-48: Print some text to the browser to let the user know what they are looking at.

Line 49: Splits the current record into the appropriate fields. Since we had only one match, we know that it must be stored in the first element of the @results array. That is why I used element 0: I knew that the data was stored there.

Lines 50-53: Print the values of the record to the browser in a nice format for the user to read.

Line 54: Uses the end_html function from CGI.pm to print the ending HTML tags.

Line 55: Ends the single_match subroutine.

Line 56: Begins the no_match subroutine.

Line 57: Uses the start_html function from CGI.pm to begin an HTML document and give it a title and background color.

Lines 58-59: Print some simple HTML to tell the user that we didn't find any matches.

Line 60: Uses the end_html function from CGI.pm to print the ending HTML tags.

Line 61: Closes the no_match subroutine.

Wrapping it up

Well, the program was pretty simple - except for line 24! Perl is an amazing language and it is easy to do something very powerful in just a single line of code.

In the next article, we will be covering how to delete records. Then, we will move on to modifying the data and finally we will combine it all into a fully-functional "database" application.

I put database in quotes because it is not really a true database in the purest sense. But, since we do have some of the simple functions of a database (adding, deleting, modifying, and searching records), I am going to call it a database anyway.

The goal of the finished database is to provide you with a simple, yet powerful application that you can easily add to your Web site.

See you next week!

Source Code for Searching a Data File
View and download this week's script.

Next: Deleting Records from a Data File
Prev: Working with Data Files


Web Review copyright © 1995-99 Songline Studios, Inc. Web Techniques and Web Design and Development copyright © 1995-99 Miller Freeman, Inc. ALL RIGHTS RESERVED
