Sunday, 26 October 2014

CodedUI playback on HtmlTable control


This post shares some insights into playback performance on HTML tables in your application.
A problem we regularly run into is the extremely long wait when reading data from a cell in an HTML table. The main point I want to make is that not fully understanding how CodedUI gets the actual control from the browser has performance implications. Using a property of, for example, the HtmlTable control can significantly slow down playback, because the property may re-evaluate its value every time you call its getter.
To show the differences, I use a simple test page with a table of 10 columns and 100 rows, where each cell has a different value. My UI test needs to pick out one cell, capture its value and assert on it. The red arrow in the screenshot points to an arbitrary cell I want to read the value from.
[Screenshot: the test page with the table; a red arrow marks the cell to read]
The basic approach to getting this value is: find the index of the column we are interested in, then look up the row we are interested in, and finally retrieve the value from the cell at that row and column.
Let's look at three ways to do this, each with different performance characteristics, starting with the code I generally find in real projects that have a performance problem.

The naive approach

Let's take the first approach, as found in our codebase. Here we loop through the header columns, match the FriendlyName of each header cell against the name of the column we are interested in, and keep the index once it is found. Next we do the same for the rows, matching against the cell with column index 0. When we have found the row, we capture the value. This approach uses the call HtmlTable.GetCell(rowIndex, columnIndex), which returns the cell so that we can read its value. The code is shown below.
int nIndex = 0;
for (nIndex = 0; nIndex < table.ColumnCount; nIndex++)
{
    var headerCell = table.GetCell(0, nIndex);
    if (headerCell.FriendlyName.Contains("Lookupcolumn9"))
    {
        break; // nIndex is the column index we are looking for
    }
}
string lookupValue = "";
int nRowIndex = 0;
// Note: table.Rows.Count is evaluated on every pass through this loop
for (nRowIndex = 0; nRowIndex < table.Rows.Count; nRowIndex++)
{
    var controlCell = table.GetCell(nRowIndex, 0);
    if (controlCell.FriendlyName.Contains("rowLookup85"))
    {
        var cell = table.GetCell(nRowIndex, nIndex);
        lookupValue = cell.FriendlyName;
        break;
    }
}
If you run this code, you will notice that playback is pretty slow. One statement incurs a performance cost you would probably never notice just by looking at the code: the condition of the second for loop, which calls table.Rows.Count. Calling the getter of this property invokes a call to the browser DOM, so we do this each time we iterate through the loop. The reason for the re-evaluation is that some JavaScript might have modified the table in the meantime, so the value is re-evaluated every time we call the getter. One simple performance improvement is therefore to cache the property in a local variable and use that in the loop condition. The next thing that is incredibly slow is the call to GetCell.
Running the above code against the test page takes 160443 milliseconds to complete, so roughly 2.7 minutes to find the single cell we are looking for.
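
To see the effect of reading a property in the loop condition in isolation, here is a minimal sketch that needs no browser. FakeRows and CachingDemo are hypothetical stand-ins I made up for this illustration; they are not part of CodedUI. The class only counts how often its Count getter is called, where the real control would make a round trip to the DOM.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a UI row collection; NOT a CodedUI type.
// It counts how often the Count getter is read.
public class FakeRows
{
    private readonly List<string> rows = new List<string>();

    // Number of times the Count getter has been called.
    public int CountReads { get; private set; }

    public FakeRows(int size)
    {
        for (int i = 0; i < size; i++) rows.Add("rowLookup" + i);
    }

    // Each read stands for one (expensive) call to the browser DOM.
    public int Count
    {
        get { CountReads++; return rows.Count; }
    }

    public string this[int index] { get { return rows[index]; } }
}

public static class CachingDemo
{
    // Reads Count in the loop condition, so the getter runs on every check.
    public static int ReadsWithoutCaching(int size)
    {
        var rows = new FakeRows(size);
        for (int i = 0; i < rows.Count; i++) { var unused = rows[i]; }
        return rows.CountReads;
    }

    // Reads Count once and reuses the local copy in the loop condition.
    public static int ReadsWithCaching(int size)
    {
        var rows = new FakeRows(size);
        int rowCount = rows.Count;
        for (int i = 0; i < rowCount; i++) { var unused = rows[i]; }
        return rows.CountReads;
    }

    public static void Main()
    {
        Console.WriteLine(ReadsWithoutCaching(100)); // prints 101
        Console.WriteLine(ReadsWithCaching(100));    // prints 1
    }
}
```

For 100 rows the uncached loop calls the getter 101 times (100 passing checks plus the final failing one), while the cached version calls it once. In the real test the equivalent change is to store table.Rows.Count in a local variable before the loop.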

Using cached control values

The second approach is to use a foreach loop over the rows and the cells instead of using an index into the rows and the columns. Why is that a better approach, you might ask? When you use foreach, you are using an iterator over a collection. The basic principle of an iterator is that the collection it iterates over is fixed during the iteration, which means the values are all cached. The code looks like this:
string lookupValue = "";
// find the index of the column we want to lookup in a row
int columnIndex = -1;
foreach (HtmlCell header in ((HtmlRow)(table.Rows[0])).Cells)
{
    if (header.FriendlyName.Contains("Lookupcolumn9"))
    {
        columnIndex = header.ColumnIndex;
        break;
    }
}

foreach (HtmlRow row in table.Rows)
{

    if (row.Cells[0].FriendlyName.Contains("rowLookup85"))
    {
        // get the value and return
        lookupValue = row.Cells[columnIndex].FriendlyName;
        break;
    }
}

When I execute this test, it resolves the right cell in about 28 seconds. So just using foreach instead of the for loop saves a lot of time. The actual culprit is more or less that we simply can't tell which properties return cached values and which do not. Using the iterator at least ensures that you iterate over a fixed collection.

Leveraging CodedUI control search

This algorithm still has a big problem in my opinion. If I want to get the first column of the first row, the lookup returns pretty fast, since all loops execute only once. But when I want the last column of the last row, performance degrades tremendously!
So the third approach is to leverage the search capabilities of CodedUI yourself. What I mean by that is that we create an HtmlCell control and set its search properties to find the right header cell right away, and from that cell we determine its index in the row. Next we create an HtmlRow control and use its search properties to find the row that contains the value we are looking for. The last step is to take the FriendlyName of the cell in that row at the index we found in the first search. The code below shows how this is done:
HtmlCell directcell = new HtmlCell(browser);
directcell.SearchProperties.Add(new PropertyExpression(HtmlCell.PropertyNames.InnerText,
 "Lookupcolumn9", PropertyExpressionOperator.Contains));
int cellIndex = directcell.ColumnIndex;

HtmlRow directRow = new HtmlRow(browser);
directRow.SearchProperties.Add(new PropertyExpression(HtmlRow.PropertyNames.InnerText, 
 "rowLookup85", PropertyExpressionOperator.Contains));
var lookupValue = directRow.Cells[cellIndex].FriendlyName;


The big advantage of this approach is that we leverage the capabilities of CodedUI to find what we are looking for instead of employing our own algorithm to search the table. The next advantage is that this algorithm has the same performance regardless of which cell we are looking for. And finally, this approach is incredibly faster than the previous two.
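
If you need this lookup in more than one test, the same steps could be wrapped in a small helper. This is only a sketch: the class TableLookup, the method GetCellValue and its parameter names are my own hypothetical names, and it uses nothing beyond the CodedUI calls already shown above. It needs the CodedUI test assemblies and a running browser, so it is not runnable on its own.

```csharp
using Microsoft.VisualStudio.TestTools.UITesting;
using Microsoft.VisualStudio.TestTools.UITesting.HtmlControls;

public static class TableLookup
{
    // Hypothetical helper. 'parent' is the control to search under:
    // the browser window as in the code above, or a narrower container.
    public static string GetCellValue(UITestControl parent, string columnText, string rowText)
    {
        // Let CodedUI find the header cell whose text contains columnText.
        var headerCell = new HtmlCell(parent);
        headerCell.SearchProperties.Add(new PropertyExpression(
            HtmlCell.PropertyNames.InnerText, columnText, PropertyExpressionOperator.Contains));
        int columnIndex = headerCell.ColumnIndex;

        // Let CodedUI find the row whose text contains rowText.
        var row = new HtmlRow(parent);
        row.SearchProperties.Add(new PropertyExpression(
            HtmlRow.PropertyNames.InnerText, rowText, PropertyExpressionOperator.Contains));

        // Read the cell in that row at the column index found above.
        return row.Cells[columnIndex].FriendlyName;
    }
}
```

A call would look like TableLookup.GetCellValue(browser, "Lookupcolumn9", "rowLookup85"). Keep in mind that Contains is a substring match, so a search text such as "Lookupcolumn1" would presumably also match "Lookupcolumn10"; pick search texts that are unique on the page.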