Google Parser

Introduction
Google is definitely one of the most useful websites on the net. I use it every day and, like other things I use frequently, I wanted to customize it to my changing needs. As a developer, the first thing that came to mind for automating my activities was a class representing this search engine. When I looked at their website, I saw that they provide a very nice API which is accessible through Web Services.

For some strange reason, I couldn’t access the web service from behind the company firewall… So I had to look into other options. Looking around CodeProject, I found out that I am not the only one interested in this subject. There are already some nice articles written about it. Stuart Konen provides some nice code in C++, but I was looking for a .NET assembly. Kisilevich Slava also provides code with a much better interface and an even more powerful engine which gets its results from different search engines. But (s)he uses another HTML parser, which makes the code dependent on it, and I wanted to have full control over the parsing of the page.
So, I came up with this parser…
Challenges
Providing a Google interface involves two challenges. First, you have to write code that establishes a connection, sends your HTTP request and gets the response back. This is very easy when you work at home, where the permissions are granted automatically. At the office there is a proxy which blocks unattended internet access, so the issue is to provide the DefaultCredentials as Internet Explorer would do. Fortunately Mr. Gates has provided an easy solution for this:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
// Note: GetDefaultProxy() is marked obsolete in later .NET Framework versions;
// WebRequest.DefaultWebProxy is its replacement there.
request.Proxy = WebProxy.GetDefaultProxy();
request.Proxy.Credentials = System.Net.CredentialCache.DefaultCredentials;
Once this is done, you can request anything Internet Explorer can. For more information on DefaultCredentials see the MSDN.
The second challenge was to parse the response. This is the tricky part. Some do this by converting the HTML to XML and then using an XML parser. Others use COM and HtmlDocument to access the tags as DOM objects. Others, like me, prefer parsing it with regular expressions. The last approach surely gives the least readable code, but it is fast and elegant.

Describing Google’s Search Object
Item
Each result item has the following attributes:

  • link to the website
  • description of the found item
  • some fragment of text having one or more words from the query
  • a link to Cache
  • and a link to Similar pages
The last two are not supported in my demo.
This brings us to our first class Item which gets constructed as follows:
public Item(string link, string description, string text)
{
    this._Link = link;
    this._Description = description;
    this._Text = text;
}
Searcher
The Searcher class provides the following properties:
  • Count
  • From
  • To
  • ItemsPerPage
  • Url and
  • Results
Of these properties, only ItemsPerPage is read/write; the rest are read-only.
It has three public methods, all of which initiate a search. The first one, which must also be called first, is the Search method that takes a query parameter:
public void Search(string query)
Once you call this method, the class gets the response and parses the page to fill Count and Results. Then you can call the subsequent search methods exactly as you would on the Google page.
public void Search(long from)
public void SearchNext()

Parsing
As I was working on this project, I realized that Google’s developers change the HTML layout more often than they change the logo on the front page, so I had to keep checking that my code still worked. They never promised anyone a stable HTML layout, which makes this project very fragile: nothing guarantees that it will keep working. That is why I decided to keep full control over the whole code and not use any third-party library. That way, I can easily modify the code to accommodate any change in Google’s HTML format.
The parsing process is split into three steps. First, I get the counts in the GetCounts method. Then I extract the block of results and parse the items out of it in a loop implemented in GetResults. For each item, I parse the HTML to get its properties, which happens in ParseItem.
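To make the three steps more concrete, here is a minimal sketch of regex-based parsing. It is not the code from the download: the markup, the summary line, the patterns and the names ParsingSketch, GetCount and GetItems are all assumptions made up for illustration, since the real Google HTML is different and keeps changing.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParsingSketch
{
    // Assumed summary line: "Results 1 - 10 of about 1,230,000"
    private static readonly Regex CountPattern = new Regex(
        @"Results\s+(?<from>[\d,]+)\s*-\s*(?<to>[\d,]+)\s+of\s+about\s+(?<count>[\d,]+)",
        RegexOptions.IgnoreCase);

    // Assumed item markup: <p class="g"><a href="URL">TITLE</a><br>SNIPPET</p>
    private static readonly Regex ItemPattern = new Regex(
        "<p class=\"g\"><a href=\"(?<link>[^\"]+)\">(?<title>.*?)</a><br>(?<text>.*?)</p>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Step 1: read the total number of hits (0 if the summary line is missing)
    public static long GetCount(string html)
    {
        Match m = CountPattern.Match(html);
        if (!m.Success) return 0;
        return long.Parse(m.Groups["count"].Value.Replace(",", ""));
    }

    // Steps 2 and 3: loop over the items and pull out link, title and text
    public static List<string[]> GetItems(string html)
    {
        List<string[]> items = new List<string[]>();
        foreach (Match m in ItemPattern.Matches(html))
        {
            items.Add(new string[]
            {
                m.Groups["link"].Value,
                StripTags(m.Groups["title"].Value),
                StripTags(m.Groups["text"].Value)
            });
        }
        return items;
    }

    // Removes inline tags such as <b> that highlight the query words
    private static string StripTags(string fragment)
    {
        return Regex.Replace(fragment, "<[^>]+>", "");
    }
}
```

When the layout changes, only the two patterns should need to be touched, which is the kind of control this approach is meant to give.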
Using the Code
In my demo I have a form to query Google and show the results in a list box. I have also provided a WebBrowser control to see how the page looks within Internet Explorer. There is also a link on the title to launch Internet Explorer outside of the application. The code using this class is pretty simple.
private Searcher google = new Searcher();

try
{
    google.ItemsPerPage = nudItemsPerPage.Value;
    google.Search(txtSearch.Text);
    if (google.Results == null) return;
    btnNext.Enabled = true;
}
catch (Exception ex)
{
    MessageBox.Show(ex.Message);
}
lstLinks.DataSource = google.Results;
lstLinks.DisplayMember = "Description";

Last Words
There are plenty of ways to parse your HTML code. Depending on the problem you will need to choose one. Working with regular expressions is the most fascinating one and there is undoubtedly a lot to explore in that area.

Static Events

Introduction

I just had a conversation with one of my colleagues in which he mentioned static events. The subject was new to me, and I want to investigate it in this article.

Background (Why Static?)

Basically, the idea is to have something shared among all the loaded instances of a class, and to ensure that changing the static property causes all instances to update their content right away, without the changer having to iterate through the existing objects and figure out which ones need to be updated. It is a way of building the intelligence into the class, so that all instances know what to do when the static property has changed.

This reminds me of the exchange rate example, which is well known to those implementing banking systems: all transactions respect the current exchange rate. But I don’t recall using static events for that. We saw this property as a separate object and made sure that there was only one instance of it at a time, and all transaction instances knew where to find it when needed. There is a fine difference though. The transactions do not need to know about changes to the exchange rate; they simply request the current value at the time they use it. This is not enough when, for example, we want to implement an application whose user interface reacts immediately to changes in UI characteristics like the font, as if in real time. It would be very easy if we could have a static property in the Font class called currentFont, a static method to change that value, and a static event that lets all instances know when they need to update their appearance.

Using the code

It is clear that we need a static field that all instances respect. Let’s say we need a static field called Font that all the labels will use to refresh when this base field has been changed.

public class MyLabel : System.Windows.Forms.Label
{
    // The static field all class instances will respect
    private static Font font = new Font("Verdana", 11);

This field requires a static property setter to allow us to make changes to the base field.

    public static Font Font
    {
        set
        {
            font = value;
            OnMyFontChange(new FontChangedEventArgs(font));
        }
    }

As you can see, this is where we set the static variable, but we also call the notification method that raises the event.

    private static void OnMyFontChange(FontChangedEventArgs e)
    {
        if (MyFontChanged != null)
            MyFontChanged(null, e);
    }

Now almost everything is set. All we need to do is to make sure that every instance subscribes to this event. And that is what we do in the constructor of the class.

    public MyLabel()
    {
        // Every instance subscribes to this event
        MyLabel.MyFontChanged += new FontChangedEventHandler(this.ChangeBaseFont);
    }

The handler method is where we use the changed value to refresh the UI.

    private void ChangeBaseFont(object sender, FontChangedEventArgs e)
    {
        base.Font = e.Font;
        base.Invalidate();
    }

Note:

Of course, the handlers could read the static field directly, with no need to pass it along in a new EventArgs class. It just happens to be implemented this way, and the choice has no bearing on the subject.

In the demo, I have provided a test application using this label control and it demonstrates how it will update multiple screens by changing the base Font property.
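The same pattern can be tried without Windows Forms. The following is only a sketch under my own assumptions: the ThemedItem class, its Theme property and the ThemeChanged event are hypothetical stand-ins for MyLabel, Font and MyFontChanged. It also shows one point the label example leaves out: a static event keeps a reference to every subscriber for the lifetime of the application, so an instance should unsubscribe when it is no longer needed.

```csharp
using System;

public class ThemedItem : IDisposable
{
    // The shared value every instance respects (hypothetical stand-in for the font)
    private static string theme = "Light";

    // Static event: one subscriber list shared by the whole class
    public static event EventHandler ThemeChanged;

    public static string Theme
    {
        get { return theme; }
        set
        {
            theme = value;
            // Copy the delegate first so a concurrent unsubscribe cannot null it
            EventHandler handler = ThemeChanged;
            if (handler != null) handler(null, EventArgs.Empty);
        }
    }

    // What this particular instance currently displays
    public string AppliedTheme { get; private set; }

    public ThemedItem()
    {
        AppliedTheme = theme;
        ThemeChanged += OnThemeChanged;   // every instance subscribes
    }

    private void OnThemeChanged(object sender, EventArgs e)
    {
        AppliedTheme = theme;             // react to the shared change
    }

    public void Dispose()
    {
        // Without this, the static event would keep the instance alive
        ThemeChanged -= OnThemeChanged;
    }
}
```

Setting ThemedItem.Theme once updates every live instance, while a disposed instance keeps the last value it saw. In a control such as MyLabel, the matching place to unsubscribe would be the Dispose(bool) override.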

Process Information And Notifications using WMI

I have often needed information about a running process, or needed to check whether it is running at all. Fortunately the .NET Framework comes with the System.Diagnostics namespace, which provides very useful classes such as Process for accessing all kinds of information about running processes. But what if the process is not running yet and you need to wait until it has started? In that case you need an event raised as soon as the process starts. That is where WMI comes in very handy. You can do everything that System.Diagnostics offers, and a little bit more, by using WMI.

Background (System.Management Namespace)

WMI is the Microsoft implementation of Web-Based Enterprise Management (WBEM), an industry initiative to develop a standard technology for accessing management information in an enterprise environment. This initiative helps companies lower their total cost of ownership by enabling powerful management of systems, applications and devices.

This namespace provides several classes. Some of them are used to query information about a system resource such as a hard disk, network adapter, Windows service or process. I will use some of these classes to get the list of running processes. Querying the system can take up unnecessary time on your running thread, and that is where the other set of classes comes in handy: you can subscribe to system resources and get a notification when the action you asked about takes place. I will use these classes to subscribe to process creation and termination.

Using the code

This sample code is just a starting point to put you in the right direction, and it opens up a whole powerful technology. You can use the same techniques to query almost anything about a Windows system. If something is present on your machine, such as a memory card, you can query it.

WQL = WMI Query Language

The WMI Query Language (WQL) is a subset of standard American National Standards Institute Structured Query Language (ANSI SQL) with minor semantic changes to support WMI.

An example of a WQL query that returns our process list looks like this:

string queryString = "SELECT Name, ProcessId, Caption, ExecutablePath" +
                     " FROM Win32_Process";

A SelectQuery can be instantiated from that string, or like this:

SelectQuery query = new SelectQuery(className, condition, selectedProperties); 

Scope Object

The scope is like the database you are sending the query to. It names the machine, then the schema, and then the path to the resource. On Microsoft platforms the local machine is usually referred to by a dot. The scope for listing the processes on the local machine looks like this:

ManagementScope scope = new System.Management.ManagementScope(@"\\.\root\CIMV2");

Searcher Object

Now that we have our two base objects, we can pass the scope and the query to the searcher class and execute the query by calling its Get() method, which returns a collection of management objects.

ManagementObjectSearcher searcher =
    new ManagementObjectSearcher(scope, query);
ManagementObjectCollection processes = searcher.Get();

Retrieving the detail
From here we can just iterate through the processes’ properties and get our information.

foreach (ManagementObject mo in processes)
{
    DataRow row = result.NewRow();
    row["Name"] = mo["Name"].ToString();
    row["ProcessId"] = Convert.ToInt32(mo["ProcessId"]);
    if (mo["Caption"] != null)
        row["Caption"] = mo["Caption"].ToString();
    if (mo["ExecutablePath"] != null)
        row["Path"] = mo["ExecutablePath"].ToString();
    result.Rows.Add(row);
}

Subscribing to an Event

So far we have only sent a query to the system repository. The second part is even more interesting. Assume you depend on a service running on a machine, or you want to take action when a service goes down, or, as in this example, you want to find out when an application has been started (added to the process list).

All you need is a ManagementEventWatcher, which exposes an event you can subscribe to. Its Start() and Stop() methods begin and end the subscription, and the events are delivered asynchronously on a separate thread. Like the searcher object, it works with a scope and a query.

string pol = "2";
string queryString =
    "SELECT *" +
    " FROM __InstanceOperationEvent " +
    "WITHIN " + pol +
    " WHERE TargetInstance ISA 'Win32_Process' " +
    "   AND TargetInstance.Name = '" + appName + "'";

// You could replace the dot with a machine name to watch that machine
string scope = @"//./root/CIMV2";

// Create the watcher and start listening
watcher = new ManagementEventWatcher(scope, queryString);
watcher.EventArrived += new EventArrivedEventHandler(this.OnEventArrived);
watcher.Start();
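The handler itself is not shown above, so here is a rough sketch of what an OnEventArrived method could look like. It is my own illustration rather than the code from the download: the ProcessEventSketch class and its Describe helper are hypothetical names, and the block assumes Windows with a reference to System.Management. Because the query selects __InstanceOperationEvent, the handler receives creation, deletion and modification events alike, and the class name of the arrived event tells them apart.

```csharp
using System;
using System.Management; // Windows only; requires a reference to System.Management

public static class ProcessEventSketch
{
    // Turns the WMI event class name into a readable message (hypothetical helper)
    public static string Describe(string eventClassName, string processName)
    {
        switch (eventClassName)
        {
            case "__InstanceCreationEvent":
                return processName + " started";
            case "__InstanceDeletionEvent":
                return processName + " terminated";
            case "__InstanceModificationEvent":
                return processName + " changed";
            default:
                return processName + ": " + eventClassName;
        }
    }

    // Signature matches EventArrivedEventHandler
    public static void OnEventArrived(object sender, EventArrivedEventArgs e)
    {
        // The concrete subclass of __InstanceOperationEvent that fired
        string eventClass = e.NewEvent.ClassPath.ClassName;

        // TargetInstance is the Win32_Process instance the event is about
        ManagementBaseObject target =
            (ManagementBaseObject)e.NewEvent["TargetInstance"];

        Console.WriteLine(Describe(eventClass, (string)target["Name"]));
    }
}
```

A wrapper class can raise its own Started and Terminated events from the first two cases, which is roughly the shape of the usage shown next.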

All of this makes it easy to use this class to find out what happens to a process like notepad.exe:

notePad = new ProcessInfo("notepad.exe");
notePad.Started +=
    new Win32Process.ProcessInfo.StartedEventHandler(this.NotepadStarted);
notePad.Terminated +=
    new Win32Process.ProcessInfo.TerminatedEventHandler(this.NotepadTerminated);

Note

It is also possible to set the property values on the query and submit it. That is slightly more work but still very powerful.

Points of Interest

After installing the WMI Server Explorer extension on my machine, the Server Explorer in Visual Studio was extended with a very nice tool that tells me the correct names of the different resources. I was actually surprised by how many resources I can access now.

See Also MSDN

Resource File Generator (Resgen.exe)

The Resource File Generator converts .txt files and .resx (XML-based resource format) files to common language runtime binary .resources files that can be embedded in a runtime binary executable or compiled into satellite assemblies. For information about deploying and retrieving .resources files, see Resources in Applications.

Resgen.exe performs the following conversions:

  • Converts .txt files to .resources or .resx files.
  • Converts .resources files to text or .resx files.
  • Converts .resx files to text or .resources files.
resgen [/compile] filename.extension [outputFilename.extension][…]

Option Description

/compile

Allows you to specify multiple .resx or .txt files to convert to .resources files in a single bulk operation. If you do not specify this option, you can specify only one input file argument.


/r:assembly

Specifies that types are to be loaded from assembly. If you specify this option, a .resx file with a previous version of a type will use the type in assembly.


/str:language[,namespace[,classname[,filename]]]

Creates a strongly-typed resource class file in the programming language (C# or Visual Basic) specified in the language option. You can use the namespace option to specify the project’s default namespace, the classname option to specify the name of the generated class, and the filename option to specify the name of the class file.

/usesourcepath

Specifies that the input file’s current directory is to be used for resolving relative file paths.

resgen myResources.resx myResources.resources /str:C#,Namespace1,MyClass,MyFile.cs
 

Binding Data To The User Interface

This is not the first time I have dug into this subject, but it is the first time I feel like I know nothing. I am going to start from scratch and explore different techniques on this subject… and again, I will use Amit Kalani’s book. This book will help me explore the following concepts:

  • Simple Data Binding
  • Complex Data Binding
  • One-way and two-way Data Binding
  • The BindingContext object
  • The Data Form Wizard

Simple Data Binding

Simple data binding means connecting a single value from the data model to a single property of a control. For example, you might bind the Vendor object name from a list of vendors to the Text property of a TextBox control.

private void SimpleBindingForm_Load(object sender, System.EventArgs e)
{
    // Create an array of vendor names
    String[] astrVendorNames = {"Microsoft", "Rational", "Premia"};

    // Bind the array to the text box
    txtVendorName.DataBindings.Add("Text", astrVendorNames, "");
}

The Binding class can accept many other types of data sources, including the following:

  • An instance of any class that implements the IBindingList or ITypedList interface, including the DataSet, DataTable, DataView, and DataViewManager classes.
  • An instance of any class that implements the IList interface on an indexed collection of objects. In particular, this applies to classes that inherit from System.Array, including C# arrays.
  • An instance of any class that implements the IList interface on an indexed collection of strongly typed objects. For example, you can bind to an array of Vendor objects.

Binding to a collection of strongly typed objects is a convenient way to handle data from an object-oriented data model.

private void SimpleBindingForm_Load(object sender, System.EventArgs e)
{
    // Initialize the vendors array
    aVendors[0] = new Vendor("Microsoft");
    aVendors[1] = new Vendor("Rational");
    aVendors[2] = new Vendor("Premia");

    // Bind the array to the text box
    txtVendorName.DataBindings.Add("Text", aVendors, "VendorName");
}

Navigating through data using BindingContext:

private void btnPrevious_Click(object sender, System.EventArgs e)
{
    // Move to the previous item in the data source
    this.BindingContext[aVendors].Position -= 1;
}

private void btnNext_Click(object sender, System.EventArgs e)
{
    // Move to the next item in the data source
    this.BindingContext[aVendors].Position += 1;
}

Complex Data Binding

In complex data binding, you bind a user interface control to an entire collection of data, rather than to a single data item. A good example of complex data binding involves the DataGrid control. Complex data binding is a powerful tool for transferring large amounts of data from a data model to a user interface.

private void ComplexBindingForm_Load(object sender, System.EventArgs e)
{
    // Create an array of exams
    Exam[] aExams =
    {
        new Exam("315", "Web Applications With Visual C# .NET"),
        new Exam("316", "Windows Applications With Visual C# .NET"),
        new Exam("320", "XML With Visual C# .NET"),
        new Exam("305", "Web Applications With VB.NET"),
        new Exam("306", "Windows Applications With VB.NET"),
        new Exam("310", "XML With VB.NET")
    };

    // Bind the array to the list box
    lbExams.DataSource = aExams;
    lbExams.DisplayMember = "ExamName";
    lbExams.ValueMember = "ExamNumber";

    // Create an array of candidates
    Candidate[] aCandidates =
    {
        new Candidate("Bill Gates", "305"),
        new Candidate("Steve Ballmer", "320")
    };

    // Bind the candidates to the text boxes
    txtCandidateName.DataBindings.Add("Text", aCandidates, "CandidateName");
    txtExamNumber.DataBindings.Add("Text", aCandidates, "ExamNumber");

    // And bind the exam number to the list box value
    lbExams.DataBindings.Add("SelectedValue", aCandidates, "ExamNumber");
}

In this example, Exam could be any user-defined type you can think of, as long as it exposes the string properties ExamNumber and ExamName, like this one:

public class Exam
{
    private String examNumber;
    private String examName;

    public String ExamNumber
    {
        get { return examNumber; }
    }

    public String ExamName
    {
        get { return examName; }
    }

    public Exam(String strExamNumber, String strExamName)
    {
        examNumber = strExamNumber;
        examName = strExamName;
    }
}
And Candidate is also something similar to the following:

public class Candidate
{
    private string examNumber;
    private string candidateName;

    public string ExamNumber
    {
        get { return examNumber; }
        set { examNumber = value; }
    }

    public string CandidateName
    {
        get { return candidateName; }
        set { candidateName = value; }
    }

    public Candidate(String strCandidateName, String strExamNumber)
    {
        this.CandidateName = strCandidateName;
        this.ExamNumber = strExamNumber;
    }
}

Filtering With DataView Objects
A DataSet object contains two collections. The Tables collection is made up of DataTable objects, each of which represents a single table in the data source. The Relations collection is made up of DataRelation objects, each of which represents the relationship between two DataTable objects.
A DataView represents a bindable, customized view of a DataTable object. You can sort or filter the records of a DataTable object to build a DataView object.
private void FilterGridView_Load(object sender, System.EventArgs e)
{
    // Move the data from the database to the DataSet
    sqlDataAdapter1.Fill(dsCustomers1, "Customers");

    // Create a DataView to filter the Customers table
    System.Data.DataView dvCustomers =
        new System.Data.DataView(dsCustomers1.Tables["Customers"]);

    // Apply a sort to the DataView
    dvCustomers.Sort = "ContactName";

    // Apply a filter to the DataView
    dvCustomers.RowFilter = "Country = 'France'";

    // And bind the results to the grid
    dgCustomers.DataSource = dvCustomers;
}

Retrieving data this way consumes a lot of network resources: bringing all the records of a table to the client before filtering them is not an efficient use of the available resources. It is therefore wiser to create the view on the database server and filter the data before sending it to the client.
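One way to move the filter and the sort to the server is a parameterized query, so that only the matching rows cross the network. This is a sketch under assumptions, not code from the book: the ServerSideFilterSketch class, the connection string argument and the CustomerID and CompanyName columns are hypothetical (only ContactName and Country appear in the example above), and it targets the System.Data.SqlClient provider that the sqlDataAdapter1 component suggests.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class ServerSideFilterSketch
{
    // Builds a command whose WHERE and ORDER BY run on the database server
    public static SqlCommand BuildCustomersByCountry(string country)
    {
        // CustomerID and CompanyName are assumed column names
        SqlCommand command = new SqlCommand(
            "SELECT CustomerID, CompanyName, ContactName, Country " +
            "FROM Customers WHERE Country = @country ORDER BY ContactName");

        // A parameter avoids building the filter by string concatenation
        command.Parameters.Add("@country", SqlDbType.NVarChar, 15).Value = country;
        return command;
    }

    // Fills a DataTable with only the rows the server returned
    public static DataTable LoadCustomers(string connectionString, string country)
    {
        DataTable customers = new DataTable("Customers");
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = BuildCustomersByCountry(country))
        using (SqlDataAdapter adapter = new SqlDataAdapter(command))
        {
            command.Connection = connection;
            adapter.Fill(customers); // Fill opens and closes the connection itself
        }
        return customers;
    }
}
```

The resulting table can be bound to the grid directly, for example dgCustomers.DataSource = ServerSideFilterSketch.LoadCustomers(connectionString, "France"). A DataView on top of it remains useful for further client-side sorting of the smaller result.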
A very nice example of binding is to bind a TextBox to the selection within a ComboBox. This dead-easy sample binds the Text property of the TextBox to the ComboBox through its SelectedValue property:
txtRAM.DataBindings.Add("Text", cmbComputers, "SelectedValue");

Currency Manager

One new challenge I was faced with recently is the ability to dynamically load a grid based on the selection in another grid. This is useful when, for example, you have two tables loaded and you want to navigate the first table so that, as you move the cursor from one record to the next, the second grid, which is related to the first one by a foreign key, gets filtered. This can be achieved in several ways. I just found one very easy way using the BindingContext and CurrencyManager.
Let’s say you have a data view holding the data of the parent relation, and you assign it to a grid like this:
DataView dvParent = base.DataSet.Tables[0].DefaultView ;
...
dgCodeGroup.DataSource = dvParent ;

What you need to do here is to get a reference to the CurrencyManager, which is available through the BindingContext:

myCurrencyManager = (CurrencyManager)this.BindingContext[dvParent];
Then you subscribe to its PositionChanged event to get a notification when the user moves from one record to another on the grid:

myCurrencyManager.PositionChanged += new EventHandler(OnCurrencyManager_PositionChanged);
Now all you need to do is to filter your second grid based on the selection. And the selection is easy to find:

DataRowView drv = (DataRowView) myCurrencyManager.Current ;

The DataRowView gives access to the fields of that row, so you can use it to read the key and build your second data view.
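The filtering step can be sketched without any forms code. Everything named here is an assumption for illustration: the ChildFilterSketch class, the GroupId and Code columns and the Demo method are hypothetical, and the key is assumed to be an integer (a string key would need quoting in the filter expression).

```csharp
using System;
using System.Data;
using System.Globalization;

public static class ChildFilterSketch
{
    // Restricts the child view to rows whose key matches the current parent row
    public static void FilterChildren(DataRowView currentParent,
                                      DataView dvChild, string keyColumn)
    {
        dvChild.RowFilter = string.Format(
            CultureInfo.InvariantCulture,
            "{0} = {1}", keyColumn, currentParent[keyColumn]);
    }

    // Small in-memory demonstration with a parent and a child table
    public static int Demo()
    {
        DataTable groups = new DataTable("Groups");
        groups.Columns.Add("GroupId", typeof(int));
        groups.Rows.Add(1);
        groups.Rows.Add(2);

        DataTable codes = new DataTable("Codes");
        codes.Columns.Add("GroupId", typeof(int));
        codes.Columns.Add("Code", typeof(string));
        codes.Rows.Add(1, "A");
        codes.Rows.Add(2, "B");
        codes.Rows.Add(2, "C");

        DataView dvParent = groups.DefaultView;
        DataView dvChild = codes.DefaultView;

        // Pretend the user moved to the second parent row (GroupId = 2)
        FilterChildren(dvParent[1], dvChild, "GroupId");
        return dvChild.Count; // number of child rows left after filtering
    }
}
```

In the form, the PositionChanged handler would make the same call with (DataRowView)myCurrencyManager.Current as the first argument, and the second grid, bound to dvChild, would refresh on its own.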
Navigating the Current Position
This is where the CurrencyManager comes in very handy. Its Position property tells you the current index, and you can assign a value to it to reposition. The Count property is also very handy for avoiding moves out of bounds.

private void MoveNext(CurrencyManager myCurrencyManager)
{
    if (myCurrencyManager.Count == 0)
    {
        Console.WriteLine("No records to move to.");
        return;
    }
    if (myCurrencyManager.Position == myCurrencyManager.Count - 1)
    {
        Console.WriteLine("You're at end of the records");
    }
    else
    {
        myCurrencyManager.Position += 1;
    }
}

private void MoveFirst(CurrencyManager myCurrencyManager)
{
    if (myCurrencyManager.Count == 0)
    {
        Console.WriteLine("No records to move to.");
        return;
    }
    myCurrencyManager.Position = 0;
}

private void MovePrevious(CurrencyManager myCurrencyManager)
{
    if (myCurrencyManager.Count == 0)
    {
        Console.WriteLine("No records to move to.");
        return;
    }
    if (myCurrencyManager.Position == 0)
    {
        Console.WriteLine("You're at the beginning of the records.");
    }
    else
    {
        myCurrencyManager.Position -= 1;
    }
}

private void MoveLast(CurrencyManager myCurrencyManager)
{
    if (myCurrencyManager.Count == 0)
    {
        Console.WriteLine("No records to move to.");
        return;
    }
    myCurrencyManager.Position = myCurrencyManager.Count - 1;
}