Web Review - Quick and Simple Data Searching

	Search for:
	Jump to:

A Songline PACE Production

Script of the Week

Quick and Simple Data Searching

by Brent Michalski
July 17, 1998

Static Web pages are great, but wouldn't it be nice to give your visitors the ability to search a database or data-file and return a dynamic page with the search results? Our article this week will show you how to write a simple CGI script that will search a text file and return the results.

View the demo.

Database Searching

This week we will cover how to let users search a database or data-file, and return the results to them in a table. I am going to use the term "database" throughout this article, but to be exact though, I am referring to a text file that stores the data to be searched.

Why would you want to do this? There are many types of applications that you can create using this approach. For example, you could create a searchable phone list, a product database, or a searchable reference listing...or whatever you want.

It doesn't have to be small databases either. I am currently working on an application for an online jewelry store, and we are using text files to store all of the jewelry information. Some of the database files contain over 2,000 items! The actual jewelry information is all contained in a commercial database application, but when it is moved to the Web, we export the data to a text file.

The database we will be using here is simply a pipe (|) delimited text file. I like to use the pipe symbol because it is hardly ever used for other things. I started out using the tilde (~), but since it is a common character in URLs, I stopped using it and switched to the pipe. By using an uncommon character, or one that is never used in the data, we can be sure that the script will not get confused and present erroneous results.

Remember though, you can use anything that you want to be a delimiter for your database fields, as long as your script can identify it as a delimeter. I wouldn't recommend using something like the letter "e" though.

Here is a quick example to show you what a pipe-delimited database file looks like:

  23523423|software|Windows 98|Microsoft
  3429807s|hardware|8.4 Gb Hard Drive|Maxtor
  342314jd|hardware|32Mb EDO RAM chip|Kingston
  4-138439|software|Office 98|Microsoft

The first field of each row is the "key," I always try to have a field that is a unique identifier. It makes things much easier when you write more advanced programs and use larger databases. The second field is the category, (hardware or software). The third field is the item name, and the fourth field is the manufacturer.

I chose a text file format because most commercial databases can export to them. They are easy to maintain and everybody has the ability to use them. Plus, text files can be used on any platform.

Our program presents the most basic form of searching and presenting the results to the user. This is definitely not a powerful database application, but it should give you enough information to get your creative juices flowing. Remember, when using Perl there is always another way to do it, so if you have a better way, or even a different one, don't be afraid to use it.

Diving In

Our last article and script were very long and my head is still spinning from them so I figured that this week I'd try to keep the script short and simple. Here is the form. Let's jump right into the code.

I have numbered the lines of code. As usual, the line numbers are not part of the program. If you want to see the program without the line numbers, you can view the source code. The line numbers simply make it easier for me to talk about the program.

 1: #!/usr/local/bin/perl
 2: use CGI qw /:standard :html3/;
 3: $query = new CGI;
 4: $datafile = "database.txt";
 5: $search = $query->param('search');
 6: if($search eq ""){$search = ".";}
 7: &Do_Search;
 8: $count=@matches;
 9: &No_Matches if($count == 0);
10: &Print_Results;
11: exit;
12: sub Do_Search{
13:     open(DATAFILE,"$datafile");
14:     while(<DATAFILE>){
15:       if(/$search/i){
16:         push @matches, $_;
17:       } # End of if.
18:     } # End of while.
19:     close(DATAFILE);
20:     return;
21: } # End of Do_Search subroutine.
22: sub Print_Results{
23:     print header,
24:      start_html('Your Search Results');
25:         
26:   print<<"END_OF_TEXT";
27:     <FONT FACE=ARIAL SIZE=6>
        <CENTER>Search Results</CENTER></FONT>
28:     <HR WIDTH=80%>
29:     <CENTER>
30:     <TABLE BORDER=1 CELLSPACING=0>
31:      <TR>
32:       <TD><FONT FACE=ARIAL SIZE=2>
        <B><CENTER>Category</CENTER></B></TD>
33:       <TD><FONT FACE=ARIAL SIZE=2>
        <B><CENTER>Item Name</CENTER></B></TD>
34:       <TD><FONT FACE=ARIAL SIZE=2>
        <B><CENTER>Manufacturer</CENTER></B></TD>
35:      </TR>
36: END_OF_TEXT
37:   foreach $match (@matches){
38:     ($key,$category,$item,$mfg) = split(/\|/, $match);
39:     print Tr(td[$category,$item,$mfg]);
40:   } # End of foreach.
41:     
42:   print<<"END_OF_TEXT";
43:     </TABLE>
44:     </CENTER>
45: END_OF_TEXT
46:     print end_html;
47:     
48:     return;
49: } # End of Print_Results subroutine.
50: sub No_Matches{
51:     print header,
52:         start_html('Your Search Results'),
53:         h1("Sorry, there were no matches for $search."),
54:         "Please use your BACK button and try again.",
55:         end_html;
56:     exit;
57: } # End of No_Matches subroutine.

Line-by-line Explanation

Line 1: Tells the program where to find Perl on the Web server. This line will vary depending on where Perl is installed on your server so you need to make any necessary changes. On a UNIX server, this line is required. If you are running this program on an NT server, this line is not required but won't hurt anything if included.

Line 2: Loads the CGI.pm module into the program. The arguments in the qw/:standard :html3/ bring in more functions from CGI.pm for us to use in our program. These functions are all part of the CGI.pm module.

Line 3: Creates a new instance of the CGI object and calls it $query.

Line 4: Sets a variable called $datafile to the value database.txt. This makes it easier to modify the code to suit your needs. Instead of hard coding all of the filenames into the main body of the code, it is nice to keep whatever variables you can at the top of the program. That way, if you need to change the name of a file, you simply change the value in the variable and it is changed throughout your code.

Or even better yet, create a configuration file and reference it somewhere in your code. Using configuration files makes your programs much more portable. You don't have to change the code, just change the configuration file and you can re-use the program with different variables.

Line 5: Reads the input sent from the "search" variable on the calling Web page and stores it in a variable called $search.

Line 6: Sets the value of $search to a period (.) if $search was left blank on the calling Web page. We do this because the period will match everything in our search subroutine.

Line 7: Calls our Do_Search subroutine. This subroutine performs the search on our database file.

Line 8: We set the variable, $count to the value of the array @matches. The result stored in the variable is not the contents of the array though, it is the number of elements in the array. So, by doing this, I set the variable $count to however many matches I ended up with in @array. @array was filled with the matches resulting from our call to the Do_Search subroutine.

Line 9: Calls the No_Matches subroutine if $count was equal to zero. This then lets our user know that there were no matches to their search. It is much nicer than returning a blank page.

Line 10: Calls the Print_Results subroutine. We end up here if we had matches.

Line 11: Exits the program. We are actually done at this point! We did all of our work in the subroutines below.

Line 12: Starts the Do_Search subroutine.
Line 13: Opens the file whose name is stored in $datafile. This is our database file.

Line 14: A "while loop" that loops through the entire file, line-by-line, that we opened in line 13.

Line 15: This is the line that does our matching. The // is shorthand for the match operator in Perl. So what we are doing is searching the current line from the database for matches. The i on the end means ignore case. We don't care if it is upper- or lower-case, we just want to find the records that match. If we do find a match, we get a TRUE return value which causes our if statement to continue into it's block.

Line 16: If we did have a match, then we push the current record onto the end of our array called @matches. The push function, pushes an item onto the end of an array. The $_ is a special variable in Perl that holds the current item. There are many special variables in Perl, I have found that the more of them you understand, the easier things get. I guess Larry Wall put them in there for a good reason!

Lines 17-18: Close our if and while blocks of code.

Line 19: Closes the data file that we opened in line 13.

Line 20: Brings us back to the line after we called the subroutine.

Line 21: The end of the Do_Search subroutine.

Line 22: Begins the Print_Results subroutine. This is the subroutine that prints out the results of our search to our user.

Line 23: Prints out the standard HTML header information using the print_header subroutine contained in the CGI.pm module. For those unfamiliar with the standard HTML header, it is: "Content-type: text/html" followed by two newline characters (\n\n in Perl). This tells the Web server what kind of data it is sending. It also tells the browser what kind of data is coming. The HTML header is required. If you forget it, you will get a "Server Error."

Line 24: Prints out our beginning HTML code. The start_html function is also part of the CGI.pm module. The text in the function call will get put into the <TITLE> element of our page.

Line 26: Begins a "here" block which we use to print out multiple lines of text. This saves us from having to explicitly type print at the beginning of every line, and from having to put everything in quotes.

Lines 27-35: HTML code to help us create our table.

Line 36: The end of our "here" block. Stops printing out the text. The ending "tag" for the "here" block must be all the way to the left, otherwise you will get an error. It also should not have a semicolon at the end.

Line 37: Begins a loop that goes through each record stored in @matches. The $match in there tells Perl to store the current value in $match so that we can manipulate it.

Line 38: Splits our "record" into its "fields." In this case, we had fields called key, category, item, and mfg. The split function splits the line into the different variables that we specified on the left side of the =. We also have to tell it where to split the fields, in this case, our separator is the pipe symbol (|).

Line 39: Prints out our current table row using CGI.pm's Tr function. This simply makes printing each row easier. If I didn't use this, I would have to type each <TR> and <TD> manually.

Line 40: Ends our foreach block.

Line 42: Starts another "here" block for printing out text.

Lines 43-44: Prints out our closing table HTML code.

Line 45: Closes our "here" block, we are done printing for now.

Line 46: Prints out our closing HTML code using CGI.pm's end_html function. This is simply a shortcut in CGI.pm that prints out <BODY><HTML>.

Line 48: Returns us to the line after we called the subroutine.

Line 49: Marks the end of our Print_Results subroutine.

Line 50: Starts our No_Matches subroutine. We call this subroutine if there were no matches resulting from our search. It is nicer to tell the user that there were no matches to their query than presenting them with a blank table.

Lines 51-55: By using CGI.pm's built-in functions, I was able to print our standard header, print the beginning HTML code, print a message to the user that there were no matches and print the ending HTML code. CGI.pm is a very powerful module. For a complete reference to it, please visit the CGI.pm home page.

Look at this code, and then run the form with something like osdnhfensf. Once you get the screen up that tells you that there were no matches, take a look at the HTML that was generated by these few lines. CGI.pm can really save you some typing.

Line 56: Since we had no matches, and have told the user this, we are done with the program and can exit.

Line 57: Closes out our No_Matches subroutine.

Wrapping It Up

As I have hopefully shown you, creating a dynamic Web page by searching a database is very simple. Our example here only took around 55 lines of actual code!

You can use dynamic pages for product searching, guest books, pages personalization, grocery lists, etc. Let me know how you use them!

Source Code for Quick -n- Simple Data Searching
View and Download the source code for this week's article.

Next: Web Page Updates Through CGI


Web Review copyright © 1995-99 Songline Studios, Inc. Web Techniques and Web Design and Development copyright © 1995-99 Miller Freeman, Inc. ALL RIGHTS RESERVED