Making the Windows Server 2003 Indexing Service Useful
Written by David Noel-Davies   
Tuesday, 21 August 2007

The Indexing Service makes searching for information on your network a whole lot faster than using the built-in Windows Search tool. The only problem is that the query interface is built into the server’s Computer Management console and is therefore not readily available to users. I got around this problem on my own network by designing a Web application that acts as a query tool for my users. In this article, I will show you how to build such an application for your own network.

Before I Begin

Before I get started, I feel obligated to mention that creating a Web based query interface isn’t exactly my idea. Microsoft published a Knowledge Base article (http://support.microsoft.com/kb/q238791/) that explains how you can use the ASP ixsso.Query object to query an index. Many developers have designed Web pages similar to the one that I am about to show you. Although I wrote the code found in this article myself, some of the techniques used were borrowed from Microsoft and from various other programming Web sites.

First the Basics

  

What is the Indexing Service?

The Indexing Service tends to be poorly documented. I’m not sure that there is a “Microsoft approved” definition for the Indexing Service. What I can tell you is that I tend to think of the Indexing Service as a poor man’s alternative to SharePoint. Like SharePoint, the Indexing Service can be used to make all of the content on your server searchable. Unlike SharePoint though, the Indexing Service lacks such features as a document library, version control, or a cool user interface.

So where would you use the Indexing Service? My own network is a classic example of a network that can benefit from the Indexing Service. If you are reading this article, then you have probably already figured out that I am a technical writer. The problem is that I have been writing for ten years and in that time I have published over 3,000 articles. As you might expect, it can be really tough to remember whether or not I have already written about a topic. This is where the Indexing Service comes into play. I can have Windows compile a dynamic index of everything that I have ever written. That way, before I propose a topic to my editors, I can do a quick search to see if I have written about the topic in the past.

Configuring the Indexing Service

The Indexing Service is a default service, and its startup type is set to Automatic. Therefore, you don’t have to worry about installing anything. All you have to do if you want to use the Indexing Service is configure it. The exception to that rule is that if you have installed Windows Server 2003 Service Pack 1, then the Indexing Service is disabled by default, so you will have to set the startup type to Automatic and start the service.
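
If you would rather script that last step than click through the Services console, the following is a minimal sketch of one way to do it with Windows Script Host and WMI. It assumes that the Indexing Service's short service name is cisvc and that you run the script as an administrator on the server itself; the file name enable-indexing.vbs is only a placeholder of mine.

```vbscript
' enable-indexing.vbs (hypothetical file name)
' Sets the Indexing Service to start automatically and starts it if needed.
' Assumption: the service's short name is "cisvc".
Option Explicit

Dim wmi, svc, rc

' Connect to the local WMI provider and fetch the service object.
Set wmi = GetObject("winmgmts:\\.\root\cimv2")
Set svc = wmi.Get("Win32_Service.Name='cisvc'")

' Win32_Service methods return 0 on success.
rc = svc.ChangeStartMode("Automatic")
If rc <> 0 Then
    WScript.Echo "ChangeStartMode failed with code " & rc
    WScript.Quit 1
End If

' Start the service only if it is not already running.
If svc.State <> "Running" Then
    rc = svc.StartService()
    If rc <> 0 Then
        WScript.Echo "StartService failed with code " & rc
        WScript.Quit 1
    End If
End If

WScript.Echo "The Indexing Service is set to Automatic and has been started."
```

You would run the script from a command prompt with cscript enable-indexing.vbs. The Services console achieves exactly the same result, so treat this as a convenience rather than a requirement.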

The easiest way to access the Indexing Service’s configuration options is through the Computer Management console. If you are unfamiliar with the Computer Management console, it’s located on the server’s Administrative Tools menu. Once the Computer Management console opens, navigate through the console tree to Computer Management (Local) | Services and Applications | Indexing Service. When you arrive at this point, the first thing that you will probably notice is that there are two pre-existing catalogs by default. In case you are wondering, catalogs are Microsoft speak for indexes. One of the catalogs indexes the local system and the other catalog indexes the server’s default IIS Web site. I recommend deleting these default catalogs so that you can have a clean start. To get rid of them, simply right click on each catalog and select the Delete command from the resulting shortcut menu.

Creating an Index

Now that you have deleted the default catalogs, it’s time to create a catalog of your own. As you have probably already figured out, the Indexing Service gives you a choice of either indexing the server’s file system or a Web site that is being hosted by the server. For the purposes of this article, let’s focus on indexing the local file system.

To create a new catalog, right click on the Indexing Service container in the Computer Management console and select the New | Catalog commands from the resulting shortcut menus. When you do, you will see the Add a Catalog dialog box, shown in Figure A.


Figure A: The Add a Catalog dialog box prompts you for the catalog name and location

The Add a Catalog dialog box prompts you for the name of the new catalog and for the catalog’s location. The name can be anything that you want, so long as you don’t use any spaces. I recommend using a descriptive name. The location refers to the location that you are storing the catalog file in, not the location of the files that you are indexing.

Before you select a location, there are a couple of things that you need to know. First, you should never place a catalog file into a folder that’s being indexed. The reason is that Windows monitors indexed folders for changes. If you were to add a file to the folder, Windows would update the catalog. Windows would then see that the catalog has been updated, and reindex the folder again. It’s a pretty vicious cycle, so it’s best to just choose a non-indexed location.

The other thing that you need to know is that if you are indexing Web content, you shouldn’t place the index in or beneath the wwwroot folder. Doing so slows down both the Indexing Service and IIS.

Now that you have created a catalog, you must tell the Indexing Service what content you want indexed. To do so, expand the Indexing Service container, and you will see a container for the catalog that you have created. For example, in Figure B, I have created a catalog named Articles. Beneath the catalog’s container is a container named Directories. Right click on the Directories container and select the New | Directory commands from the shortcut menu.


Figure B: The Indexing Service will create a container bearing the name of your catalog

At this point, you will see the Add Directory dialog box that’s shown in Figure C. This dialog box needs a little explaining though. The first thing that the dialog box asks you for is the path to the directory that you want to index. This would be something like C:\mystuff. Beneath that is a field for the path’s alias. You would use this field if the path that you entered points to a network drive.


Figure C: The Add Directory dialog box allows you to specify what content gets indexed

Imagine for example that you indexed the content on the network drive X:. Later, for some reason, you changed X: to map to a different location. Doing so would really mess up your index. However, if you enter a UNC path to the content that you are indexing, then the Indexing Service can still index the content even if the drive mapping changes.

Beneath the Path and Alias fields is a section where you can enter account information. The idea is that you must enter credentials for an account that has rights to read the content that’s being indexed, and to write to the location where you have decided to create the index.

The last thing on this dialog box is a set of radio buttons that ask you if you want to include the content in an index. At first, this seems like a ridiculous question since the dialog box’s whole purpose is to add a directory to the index. However, this option does have a legitimate purpose.

Imagine for a moment that you had a folder that you want to index, but beneath that folder was a sub folder that you didn’t want to index. You could add the primary folder to the index, and then specify the sub folder and instruct the Indexing Service not to index it.

You have now created your first index. If at any time, you need to rebuild the index, you can simply right click on the listing for the directory that is being indexed, and select the All Tasks | Rescan commands from the resulting shortcut menus.

Before I show you how to use the index, there is one last dialog box that I want to talk about. If you right click on the Indexing Service container and select the Properties command from the resulting shortcut menu, you will see the Indexing Service Properties sheet, shown in Figure D. If you look at the figure, you will see that the Generation tab contains two check boxes. The first check box asks you if you want to index files with unknown extensions. Unless you’ve got a good reason for doing so, you should never select this box. By default, the Indexing Service will index things like text files, Microsoft Office documents, and HTML files. If you select this check box, it will attempt to index everything in the directories that you have chosen to index. This includes things like EXE files and DLL files.


Figure D: I don’t recommend selecting either of these check boxes

The other check box is the Generate Abstract check box. If you select this check box, then the Indexing Service will attempt to create a summary of every document that it indexes. I say that it will attempt to do this because it doesn’t do a very good job.

Using the Index

Microsoft assumes that you will build your own interface to the index, but there is a way to use the index through the Computer Management console. If you select the Query the Catalog link, you will see a mini search engine that you can use to search the index, as shown in Figure E.


Figure E: The indexing service includes a mini search engine, but Microsoft assumes that you will build your own

Conclusion

 I have explained how the Indexing Service can be used to index vast amounts of content on your network. I then went on to show you how to configure the Indexing Service.

Preparing the Web Server

Even if you aren’t planning on indexing Web content, this is a Web based application, and it will have to run on an IIS server. Hopefully, your IIS server is already up and running and you can add the Web pages that we will be creating to the default Web site. If not, though, you may have to install and configure IIS.

To install IIS, open the Server’s Control Panel and select the Add / Remove Programs option. When the Add or Remove Programs dialog box appears, click the Add / Remove Windows Components button. After a brief delay, Windows will open the Windows Component Wizard.

Select the Application Server option and click the Details button. At this point, select the ASP.NET check box, as shown in Figure A. Now, highlight the Internet Information Services (IIS) option and click the Details button. As you can see in Figure B, IIS has a lot of components. At a minimum, you will need to select the Common Files, Internet Information Services Manager, and the World Wide Web Service.


Figure A: ASP.NET must be installed in order for this application to run


Figure B: At a minimum, you will need to select the Common Files, Internet Information Services Manager, and the World Wide Web Service

Now, just click OK twice, click Next, and follow the prompts. You may be prompted to insert your Windows Server 2003 installation CD to complete the installation process.

After the installation process completes, you will have to prepare IIS to run the application that we are building. Setting up Web sites is an art in and of itself. Since I’ve got a lot of material to cover, we are just going to make the new application a part of the default Web site and use a minimal configuration to get the application up and running rather than using an elaborate, high security / high performance configuration.

To configure IIS, select the Internet Information Services (IIS) Manager command from the server’s Administrative Tools menu. When you do, the IIS Manager console will open. Navigate through the console tree to your server | Web Sites | Default Web Site, as shown in Figure C.


Figure C: Navigate through the console tree to the default Web site

The first thing that we have to do is to make sure that the default Web site is running. The easiest way to do this is to right click on it and make sure that the Start option on the shortcut menu is grayed out. If the Start option is displayed in black then click Start to start the site.

Next, we have to assign the default Web site an IP address (technically, we don’t have to, but doing so makes life easier on everyone). To do so, right click on the default Web Site and select the Properties command from the resulting shortcut menu. When you do, you will see the Web site’s properties sheet. Select the server’s private IP address from the IP Address drop down list found on the properties sheet’s General tab, as shown in Figure D. It is important to use the private IP address because you don’t want to accidentally allow outsiders to index your server. Now just click OK, close the IIS Manager, and you are ready to start setting up the Web application.


Figure D: Assign the server’s private IP address to the Default Web site

The Query Form

The Web application that we are creating consists of two separate files. Both files should be saved to the server’s C:\Inetpub\wwwroot folder. By default, the IUSR_servername account has read access to this folder. When a user connects to the site anonymously, Windows Server uses the permissions associated with the IUSR_servername account to determine what the user can and can’t access.

The first file that you will need to create is a simple HTML file. This file allows the user to input a query string and then passes that query string on to an ASP file that I will discuss in a moment for processing. As you can see in the source code below, there is really nothing fancy about this file. It simply allows the user to input a text string. The text string is assigned the variable name searchstring. The Form Action command then passes the contents of the searchstring variable to the results.asp file.

QUERY.HTM

<html>
<head>
<title>Index Service Query Tool</title>
</head>
<body>
<form action="results.asp" method="post">
<p>Enter the text that you want to search for<br>
<input type="text" name="searchstring" size="50" maxlength="100" value=" "><br>
<button type="submit">Submit</button>
<button type="reset">Clear Form</button>
</p>
</form>
</body>
</html>

The Results Page

The Results page is where all of the magic happens. The Results page is coded in ASP (Active Server Pages). I don’t want to turn this article into a crash course in ASP, but I will tell you that ASP pages are processed on the server and the output is sent to the user in HTML format. If you look at the code below, you will see that it contains a mixture of HTML code and ASP code. I used as much HTML as I could in an effort to simplify the page for those who may not be familiar with ASP. Blocks of ASP code are separated from HTML code by the <% and %> markers. ASP files should be saved with the .ASP extension rather than the .HTM extension so that the server knows to process them as ASP. The following file should be named RESULTS.ASP.

RESULTS.ASP

<html>
<head>
<title>
Search Results
</title>
</head>
<body>

<%
' This section sets the various configuration variables

formscope="/"
pagesize = 5000
maxrecords=5000
searchstring=request.form("searchstring")
catalogtosearch="Articles"
searchrankorder="rank[d]"
origsearch=searchstring
%>

<%
'This section performs the query

dim q
dim util
set q=server.createobject("ixsso.query")
set util=server.createobject("ixsso.util")
q.query=searchstring
q.catalog=catalogtosearch
q.sortby=searchrankorder
q.columns="doctitle, filename, size, write, rank, directory, path"
q.maxrecords=maxrecords
%>

<%
'This section displays the results

set rs=q.createrecordset("nonsequential")
rs.pagesize=pagesize
response.write"<p>Your search for <b>" & origsearch & "</b> produced "

if rs.recordcount=0 then response.write "no results"
if rs.recordcount=1 then response.write "1 result: "
if rs.recordcount>1 then response.write(rs.recordcount) & " results: "

%>

<table border=1><tr><td><b>Title</b></td><td><b>Filename</b></td><td><b>Date / Time</b></td><td><b>Size</b></td><td><b>Relevance</b></td><td><b>Directory</b></td></tr>

<%
do while not rs.EOF

response.write "<tr><td>" & rs("doctitle") & "</td><td>" & "<a href=" & "'" & rs("path") & "'" & ">" & rs("filename") & "</a>" & "</td><td>" & rs("write") & "</td><td>" & rs("size") & "</td><td>" & rs("rank") & "</td><td>" & rs("directory") & "</td></tr>"

 

rs.movenext
loop

response.write "</table>"
set rs=nothing
set q=nothing
set util=nothing
%>

</body>
</html>

The file above is broken into three sections: initialization, query, and results. The initialization section looks like this:

<%
' This section sets the various configuration variables

formscope="/"
pagesize = 5000
maxrecords=5000
searchstring=request.form("searchstring")
catalogtosearch="Articles"
searchrankorder="rank[d]"
origsearch=searchstring
%>

In this section, we are defining the variables that will be used in the query. The formscope variable is set to /, which is meant to tell the query to start at the top of the index (although the code as written never actually passes formscope to the query object). The pagesize and maxrecords variables tell the application the maximum number of search results to return. The searchstring variable receives the text that the user entered on the QUERY.HTM page that I talked about earlier. The catalogtosearch variable specifies the name of the catalog that you want to search (you chose this name when you created the catalog; mine is named Articles). The searchrankorder variable is set to rank[d], which means that the results will be presented in descending order of rank, with the most relevant results being displayed first. Finally, the origsearch variable keeps track of the user’s original search string.
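
One thing that the initialization section does not do is check what the user actually typed. QUERY.HTM pre-fills the text box with a single space, so a user who simply clicks Submit sends a blank search. As a minimal sketch, assuming that you want to stop early in that case, the lines below could take the place of the searchstring and origsearch assignments. The wording of the message is my own placeholder, not something the Indexing Service produces.

```vbscript
<%
' Trim the submitted text so that a lone space counts as an empty search.
searchstring = Trim(Request.Form("searchstring"))

If searchstring = "" Then
    ' Placeholder message; close the page and stop before any query runs.
    Response.Write "<p>Please enter some text to search for, then try again.</p>"
    Response.Write "</body></html>"
    Response.End
End If

origsearch = searchstring
%>
```

Because Response.End stops the script, none of the query or results code further down the page runs for a blank search.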

Now, let’s talk about the query section, shown below:

<%
'This section performs the query

dim q
dim util
set q=server.createobject("ixsso.query")
set util=server.createobject("ixsso.util")
q.query=searchstring
q.catalog=catalogtosearch
q.sortby=searchrankorder
q.columns="doctitle, filename, size, write, rank, directory, path"
q.maxrecords=maxrecords
%>

In this section, there are two calls to server.createobject. The first creates the ixsso Query object, which is the indexing object that does all of the work. The second creates the ixsso Util object, which this page creates but does not go on to use. Notice that underneath these calls, a number of q. properties are being set (q.query, q.catalog, q.sortby, etc.). These are the property names that the ixsso Query object expects. Technically, we could assign literal values to these properties directly. Instead, I chose to assign the values to the variables that I talked about earlier in an effort to make the code easier to read, and this section then copies those variables into the q. properties.
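
The listing never hands a scope to the query object, so every search covers the whole catalog. If you wanted to restrict a search to one folder, the following is a hedged sketch of how that might look using the Util object's AddScopeToQuery method. It assumes that the catalog indexes C:\mystuff (the example path used earlier in this article) and that the lines are placed after q and util have been created but before the record set is built. The scopepath variable is a name of my own.

```vbscript
<%
' Assumption: C:\mystuff is one of the directories in the catalog.
Dim scopepath
scopepath = "C:\mystuff"

' "deep" includes subfolders; "shallow" limits the search to the folder itself.
util.AddScopeToQuery q, scopepath, "deep"
%>
```

I have not needed this on my own network, so test it against your own catalog before relying on it.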

One line that’s especially worth paying attention to is this one:

q.columns="doctitle, filename, size, write, rank, directory, path"

This line tells the query what information you want the Indexing Service to return. The doctitle option tells the Indexing Service that you want to know the document’s title. The filename and size options pull the document’s filename and byte count respectively. The write option pulls the document’s date and time stamp. The rank option pulls the document’s relevance to the search query based on a score ranging from 1 to 1,000. The directory option pulls the directory that the file is found in, while the path option tells the query to pull the directory and filename as a whole. I will talk more later on about how I am using this information.

The last section, shown below, displays the query results:

<%
'This section displays the results

set rs=q.createrecordset("nonsequential")
rs.pagesize=pagesize
response.write"<p>Your search for <b>" & origsearch & "</b> produced "

if rs.recordcount=0 then response.write "no results"
if rs.recordcount=1 then response.write "1 result: "
if rs.recordcount>1 then response.write(rs.recordcount) & " results: "
%>

<table border=1><tr><td><b>Title</b></td><td><b>Filename</b></td><td><b>Date / Time</b></td><td><b>Size</b></td><td><b>Relevance</b></td><td><b>Directory</b></td></tr>

<%
do while not rs.EOF

response.write "<tr><td>" & rs("doctitle") & "</td><td>" & "<a href=" & "'" & rs("path") & "'" & ">" & rs("filename") & "</a>" & "</td><td>" & rs("write") & "</td><td>" & rs("size") & "</td><td>" & rs("rank") & "</td><td>" & rs("directory") & "</td></tr>"
rs.movenext
loop

response.write "</table>"
set rs=nothing
set q=nothing
set util=nothing
%>

This section creates a record set based on the query results. It then reads the rs.recordcount property, which holds the number of records in the record set, and displays it as the number of results. Next, there is some simple HTML code that creates a table header. After that there is a loop that writes the contents of each record in the record set as a table row. When the loop completes, the table is closed, the object variables are set to Nothing, and the script ends.
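
The loop writes each value into the page exactly as the Indexing Service returns it, which means that a title or filename containing characters such as < or & could garble the table. As a minimal sketch, assuming that you want to guard against that, the loop could be written with ASP's Server.HTMLEncode method. The Enc function is a hypothetical helper of mine and is not part of the original listing.

```vbscript
<%
' Hypothetical helper: converts a field or variable to text and HTML-encodes it.
' Appending "" turns a Null value into an empty string first.
Function Enc(value)
    Enc = Server.HTMLEncode(value & "")
End Function

Do While Not rs.EOF
    ' Same columns as the original loop, with every value encoded.
    Response.Write "<tr><td>" & Enc(rs("doctitle")) & "</td>" & _
        "<td><a href=""" & Enc(rs("path")) & """>" & Enc(rs("filename")) & "</a></td>" & _
        "<td>" & Enc(rs("write")) & "</td>" & _
        "<td>" & Enc(rs("size")) & "</td>" & _
        "<td>" & Enc(rs("rank")) & "</td>" & _
        "<td>" & Enc(rs("directory")) & "</td></tr>"
    rs.MoveNext
Loop
%>
```

The same helper could wrap origsearch where the page echoes the user's search string. The sketch also puts the link target in double quotes so that a path containing an apostrophe does not end the attribute early.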

So what does all of this code look like in action? Keep in mind that I designed this code to be simple, not to produce pretty output. You are of course free to customize the code in any way that you see fit. The query page looks like what you see in Figure E. The results screen is shown in Figure F.


Figure E: This is where the user enters the query


Figure F: These are the query results

When you run the query yourself, you may find that not all documents have titles. The reason is that some older versions of Microsoft Office did not automatically create meaningful titles.
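
If the empty Title cells bother you, one option is to fall back on the filename. The following is a minimal sketch of that idea; DisplayTitle is a hypothetical helper name of my own, and the function would need to be defined in RESULTS.ASP before the results loop uses it.

```vbscript
<%
' Hypothetical helper: returns the document title, or the filename when the
' title is missing or blank. Appending "" converts Null values to text.
Function DisplayTitle(title, filename)
    If Trim(title & "") = "" Then
        DisplayTitle = filename & ""
    Else
        DisplayTitle = title & ""
    End If
End Function
%>
```

Inside the loop you would then write DisplayTitle(rs("doctitle"), rs("filename")) in place of rs("doctitle").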

You might also notice in the output that we are displaying the contents of the doctitle (Title), filename, write (date & time), size, rank (relevance), and directory columns. However, as you might recall, the code also had the Indexing Service return a column called path. The path column contains the full path and file name of the file being returned. I had the code pull this column because I used it to hyperlink the filename in the output.

Normally, when you create an HTML hyperlink, the syntax looks something like this:

<a href="path and filename">text that is hyperlinked goes here</a>

I simply used the path variable to provide the path and filename for the hyperlink, and used the filename variable to display the filename. The code that accomplishes this is this line (this is an excerpt from a much longer line):

"<a href=" & "'" & rs("path") & "'" & ">" & rs("filename") & "</a>"

Conclusion

In this article, I have explained that although the Indexing Service can speed content queries, it is not easily accessible to users. I then went on to show you how you can create a Web based tool that allows users to run queries against the indexes that you create. Users can access the tool by opening a Web browser on a computer that’s connected to your corporate network and entering http://server’s IP address/query.htm
