
Thursday, October 4, 2007

Processing Interchanges in BizTalk Server 2006

In BizTalk terms, an interchange refers to a message being processed by BizTalk Server. When an interchange contains two or more documents, which is typical when receiving a batch, BizTalk parses the interchange, which results in multiple messages. How these messages are handled depends on the type of interchange processing selected on the pipeline. With BizTalk Server 2006, a developer can choose standard interchange processing or recoverable interchange processing when developing a custom pipeline. After a pipeline has been compiled and deployed into an application, these settings can be overridden at run time using the BizTalk Administration Console. The administration console can also be used to modify the setting of the default XML pipeline that comes with BizTalk Server 2006.
Whereas BizTalk Server 2004 supported standard interchange processing, recoverable interchange processing is a new feature introduced in BizTalk Server 2006. To fully explain the enhancements in 2006, a description of both processing options is provided as follows.

Standard Interchange Processing
Standard interchange processing was the only option in BizTalk Server 2004 and is how, by default, the flat-file and XML receive pipelines are configured in BizTalk Server 2006. When an interchange arrives at a receive location, the configured pipeline decomposes the interchange into one or more messages. Messages are individually validated by the pipeline and then collected within the EPM (End-point Manager) inside BizTalk. If any message within the collection fails, the entire interchange is suspended. The suspended message appears as the complete interchange, not as the separate parts. With BizTalk Server 2004, these messages could not be resumed and had to be terminated using HAT (Health & Activity Tracking Tool) or programmatically. With BizTalk Server 2006, inbound suspended messages may be resumed; the message resumption process is reviewed in a later section.
Standard Interchange Example
The easiest way to understand the process is to visualize it. The following example shows how a flat-file interchange containing five valid messages and one erroneous message is handled by way of standard interchange processing. Figures 1 through 3 walk you through the stages.




Figure 1. Standard Interchange Processing

In Figure 1, the receive location has a flat-file interchange that contains six individual flat file messages. The receive location has been configured to use a custom flat-file pipeline set for standard interchange processing. When the pipeline receives the interchange, it is broken down into six individual flat files. Each will be disassembled and validated through the pipeline and be collected by the end-point manager.

Figure 2. Standard Interchange Processing

In Figure 2, messages #1 through #4 have already been successfully parsed and validated by the receive pipeline and are being collected by the EPM. Upon validating message #5, a parse error is found. Because message #5 failed validation in the receive pipeline, all message bodies, 1 through 6, will be discarded. Instead, the entire flat-file interchange containing all six messages will be suspended in the message box as depicted in Figure 3. If failed message routing, discussed later on, was not selected for the receive port, the entire interchange will remain suspended until manually resumed or terminated by way of the administration console.

Figure 3. Standard Interchange Processing
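
The all-or-nothing behavior walked through in Figures 1 through 3 can be sketched as a small simulation. This is only an illustrative model in Python, not BizTalk code: the `parse` function, the `Outcome` class, and the queue names are hypothetical stand-ins for the pipeline, the EPM, and the message box.

```python
from dataclasses import dataclass, field


@dataclass
class Outcome:
    work_queue: list = field(default_factory=list)       # messages ready for processing
    suspended_queue: list = field(default_factory=list)  # items waiting for resume/terminate


def parse(document: str) -> str:
    """Hypothetical stand-in for pipeline disassembly and validation."""
    if "BAD" in document:
        raise ValueError(f"parse error in {document!r}")
    return f"<xml>{document}</xml>"


def standard_interchange(documents: list[str]) -> Outcome:
    outcome = Outcome()
    collected = []  # plays the role of the EPM collection
    for doc in documents:
        try:
            collected.append(parse(doc))
        except ValueError:
            # A single failure discards every collected message body and
            # suspends the original interchange as one item.
            outcome.suspended_queue.append(list(documents))
            return outcome
    outcome.work_queue.extend(collected)
    return outcome


interchange = ["msg1", "msg2", "msg3", "msg4", "BAD msg5", "msg6"]
result = standard_interchange(interchange)
print(len(result.work_queue), len(result.suspended_queue))
```

In this model the five valid documents never reach the work queue; the suspended queue holds a single entry containing all six original documents.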

Recoverable Interchange Processing
Recoverable interchange processing is a new feature of BizTalk Server 2006 and introduces additional flexibility in multimessage interchanges. When a new interchange arrives, and this option is selected on the active pipeline, it is broken down into individual messages and passed through the pipeline for disassembly and validation. When an interchange with bad documents is processed in recoverable mode, the bad documents will not cause a pipeline failure. Instead, they are marked as “messages to be suspended”. After the individual messages have been processed by the pipeline, the end point manager submits all the messages to the Message Box in a single transaction: good messages go to the work queue and bad messages go to the suspended queue. Failed messages will show up as individual suspended messages within the BizTalk administration console, not HAT. Just as with standard interchange processing, inbound suspended messages can be resumed if the suspending error is corrected.
Recoverable Interchange Example
Just as we did before, let’s walk through an example of how recoverable interchange processing works. In the following example we’ll look at how a flat-file interchange containing 4 valid messages and 2 erroneous messages is handled by way of recoverable interchange processing. Figures 4 through 8 walk you through these stages.

Figure 4. Recoverable Interchange Processing

In Figure 4, the receive location has a flat-file interchange that contains six individual flat file messages. The receive location has been configured to use a custom flat-file pipeline set for recoverable interchange processing. When the pipeline receives the interchange, it is broken down into six individual flat files. Each will be disassembled and validated through the pipeline and eventually routed by the end-point manager.

Figure 5. Recoverable Interchange Processing

In Figure 5, messages #1 and #2 have already successfully passed through the receive pipeline and are being collected by the EPM. Messages #1 and #2 are waiting for the rest of the interchange to be processed before being routed to the work queue within the message box. Meanwhile, message #3, which contains a parse error, is being disassembled in the pipeline.

Figure 6. Recoverable Interchange Processing

In Figure 6, messages #1, #2, and #4 have been successfully parsed and validated by the receive pipeline while message #3 failed because it contains a parse error. All four messages are being collected by the EPM and waiting for the rest of the interchange to be processed before being routed to the appropriate queues. Message #5 is currently being processed by the pipeline, but a validation error was found.

Figure 7. Recoverable Interchange Processing

In Figure 7, all six message parts have been processed. In total, four messages successfully passed and two failed for parsing and validation errors. Now that the complete interchange has been processed the EPM can route all six messages to the appropriate queues within the message box.

Figure 8. Recoverable Interchange Processing

Because messages #3 and #5 contained errors, they will be sent to the message box in their native flat-file format and placed in the suspended queue. Because messages #1, #2, #4, and #6 were successful, they will be sent to the message box as XML-formatted documents and placed in the work queue as seen in Figure 8.
As you can see from this example, the results from our recoverable interchange processing scenario vary greatly from that of the previous standard interchange example. Instead of just having a single suspended interchange, we have four valid messages waiting to be processed in the message box and two suspended messages.
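
The recoverable flow from Figures 4 through 8 can be sketched the same way. Again this is a conceptual Python model rather than BizTalk code; `parse` and `recoverable_interchange` are hypothetical names, and returning both lists together merely stands in for the single message box transaction.

```python
def parse(document: str) -> str:
    """Hypothetical stand-in for pipeline disassembly and validation."""
    if "BAD" in document:
        raise ValueError(f"parse error in {document!r}")
    return f"<xml>{document}</xml>"


def recoverable_interchange(documents):
    good, bad = [], []
    for doc in documents:
        try:
            good.append(parse(doc))  # successful messages are published as XML
        except ValueError:
            bad.append(doc)          # failed messages keep their native format
    # Both lists are handed over together once the whole interchange is processed.
    return good, bad


interchange = ["msg1", "msg2", "BAD msg3", "msg4", "BAD msg5", "msg6"]
work_queue, suspended_queue = recoverable_interchange(interchange)
print(len(work_queue), len(suspended_queue))
```

Here four parsed messages land in the work queue and the two failed documents are suspended individually, which mirrors the outcome shown in Figure 8.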
How to Configure Interchange Processing
You have two ways to select the type of interchange processing for a pipeline. The first option is at design time. When a developer creates a custom pipeline, they can set the default configuration properties of the disassembler component using the Pipeline Designer within Visual Studio .NET 2005. When compiled and deployed, these settings become the default settings when the pipeline is selected within a receive location. To see the properties that can be set in Visual Studio, see Figure 1.

Figure 1. Setting Pipeline Components from Visual Studio

The second option is at run time using the BizTalk Administration Console. The settings for any receive pipeline (except the default PassThruReceive pipeline, which doesn’t require configuration) can be overridden using the console. By default, the out-of-the-box XML receive pipeline is configured for standard interchange processing. Changing these settings is purely a configuration change and does not require recompiling the pipeline. To see the properties that can be set from the administration console, see Figure 2.


Figure 2. Setting Pipeline Properties from the BizTalk Administration Console

When recoverable interchange processing is combined with failed message routing, discussed next, you can create orchestrations that handle the failed messages, for example by routing them to a user for manual review.
Failed Message Routing
By default, when a message fails (validation, transformation, routing failure, and so on) within a receive pipeline, the message is automatically placed into the message box as suspended. Suspended messages can be viewed using HAT, and notification of the offending message can be sent by using Microsoft Operations Manager (MOM). By default, failed messages cannot be subscribed to by endpoints such as an orchestration or send port. This was the default operation of failed messages in BizTalk Server 2004.

BizTalk Server 2006 introduces new functionality that provides additional flexibility in dealing with failed messages. When a new receive port is created, a property can be set called “Generate error report for failed message” (see Figure 3).


Figure 3. Receive Port Settings

When this property is checked, failed messages will not be suspended. Instead, they will be sent to the message box, and the following additional properties will be set:
ErrorType
FailureCode
Description
MessageType
ReceivePortName
SendPortName
InboundTransportLocation
OutboundTransportLocation
RoutingFailureReportID

All of these will be promoted properties with the exception of Description and RoutingFailureReportID. By taking advantage of these additional context properties, you can now create end-point filters, on an orchestration or send port, that subscribe to these failed messages. When used appropriately, failed message routing can be used for notifying users of failed messages or for building rich error handling or message repair capabilities. You can see a sample orchestration filtering messages based on ErrorType, FailureCode, and ReceivePortName in Figure 4.

Figure 4. Sample Error Handling Orchestration

Please note that this should not have an effect on existing applications that subscribe only to the BTS.MessageType property, because that property is promoted only when a message successfully passes through a pipeline. When a message fails, the ErrorReport.MessageType property is promoted instead.
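
To make the subscription behavior concrete, here is a minimal Python sketch of filter matching against promoted context properties. It is a conceptual model only, not the BizTalk subscription engine: the `matches` helper, the port name, the schema URI, and the sample property values (including the failure code) are hypothetical, and the `ErrorReport.` prefix on each property follows the ErrorReport.MessageType naming used above.

```python
def matches(context: dict, filter_terms: dict) -> bool:
    """True when every filter term equals the corresponding promoted property."""
    return all(context.get(name) == value for name, value in filter_terms.items())


# Hypothetical filter for an error-handling orchestration or send port.
error_handler_filter = {
    "ErrorReport.ErrorType": "FailedMessage",
    "ErrorReport.ReceivePortName": "OrdersReceivePort",
}

# Filter used by an existing application that only knows BTS.MessageType.
existing_app_filter = {"BTS.MessageType": "http://example.org/orders#Order"}

# Context of a message that failed in the receive pipeline (sample values).
failed_message = {
    "ErrorReport.ErrorType": "FailedMessage",
    "ErrorReport.FailureCode": "0x0001",
    "ErrorReport.ReceivePortName": "OrdersReceivePort",
    "ErrorReport.MessageType": "http://example.org/orders#Order",
}

# Context of a message that passed the pipeline successfully.
good_message = {"BTS.MessageType": "http://example.org/orders#Order"}

print(matches(failed_message, error_handler_filter))  # error handler receives it
print(matches(failed_message, existing_app_filter))   # existing app does not
print(matches(good_message, existing_app_filter))     # existing app still gets good messages
```

Because the failed message carries ErrorReport.MessageType rather than BTS.MessageType, the existing subscriber never sees it, while the error handler does.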



Message Resume
In BizTalk Server 2004 resume capabilities were simple. For the send side, you could select individual messages for resume using HAT. On the receive side, messages simply could not be resumed. Things have changed with BizTalk Server 2006. Nearly all receive-side messages can be resumed. In the case of messages that require in-order delivery (that is, MSMQ, MQSeries, and so on) message resume may not be possible, as the order integrity needs to be preserved. In-order message delivery is discussed in the next section.
Additionally the tools for resuming messages have changed. Message monitoring is no longer a function of HAT, though message tracking remains in HAT. Instead, the new BizTalk Administration Console contains new “hub” pages that give an administrator visibility into messaging active within BizTalk.

Figure 5. Resuming receive-side messages using the administration console

As you can see from Figure 5, you can either do individual message resume or termination or do bulk resume, or terminate, based on the error code.
In-Order Message Delivery
In BizTalk Server 2004, end-to-end, in-order processing is accomplished by using MSMQT as the transport. MSMQT preserves the order of the messages as they come into BizTalk, and the order in which they are added to the outbound work queue. As long as the processing in between those two points preserves this order, end-to-end order will be preserved. When using orchestration, DeliveryNotification helps to simulate ordered delivery by making sure that the message is completely delivered before the orchestration continues.
In BizTalk Server 2006, this capability was expanded so that any send port can use the same ordering semantics that outbound MSMQT provides in BizTalk Server 2004. On the receive side, you still need an adapter that supports in-order processing; the MSMQ and MQSeries adapters support this option (along with MSMQT). Note that the file adapter cannot take advantage of this because of the number of ways that people could interpret order in the files that appear on the system. For example: time stamp order, files that start with Cxxx before all files that start with Oxxx, the digits in the file name determining the order, or any other scheme people may invent.
As you can see, BizTalk Server 2006 includes all the capabilities necessary to support end-to-end, in-order delivery, but care should be taken to design logic and business processes that also take order into account.
Large Message Transformation
In previous releases of BizTalk Server, mapping of documents always occurred in-memory. While in-memory mapping provides the best performance, it can quickly eat up resources when large documents are mapped. For this reason, BizTalk Server 2006 introduced support for large message transformations. A different transformation engine is used when transforming large messages so that memory is utilized in an efficient manner. When dealing with large messages, the message data is buffered to the file system instead of being loaded into memory using the DOM (Document Object Model). This way the memory consumption remains flat, as memory is used only to store the cached data and indexes for the buffer. However, because the file system is used, some performance degradation is expected compared with in-memory transformation. Because of the potential performance impact, the two transformation engines coexist in BizTalk Server 2006.
When message size is smaller than a specified threshold, the in-memory transformation is used. If message size exceeds the threshold, the large message transformation engine is used. The threshold is configurable using the registry.
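
The selection rule can be sketched as a simple size check. This is a conceptual Python illustration, not the BizTalk implementation: the threshold value, the function name, and the engine labels are hypothetical, and treating a message exactly at the threshold as in-memory is an assumption the text above does not settle.

```python
# Hypothetical threshold for illustration; the real value comes from the registry.
TRANSFORM_THRESHOLD_BYTES = 1_000_000


def choose_engine(message_size_bytes: int,
                  threshold: int = TRANSFORM_THRESHOLD_BYTES) -> str:
    """Pick a transformation engine based on message size."""
    if message_size_bytes > threshold:
        return "large-message"  # buffers to the file system, flat memory use
    return "in-memory"          # DOM-based, fastest for small documents


print(choose_engine(50_000))
print(choose_engine(25_000_000))
```

A higher threshold favors speed at the cost of memory; a lower one does the reverse.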

Technical Ref: MSDN

Monday, October 1, 2007

Performance Considerations While Using BRE

Introduction
This topic discusses how the rule engine performs in various scenarios and with different values for the configuration/tuning parameters.

Fact Types
The rule engine takes less time to access .NET facts than it takes to access XML and database facts. If you have a choice of using a .NET, XML, or database fact in a policy, you should consider using .NET facts for higher performance.

Data Table vs. Data Connection
When the size of the data set is small (< ...
Fact Retrievers
You can write a fact retriever, an object that implements standard methods and typically uses them to supply long-term and slowly changing facts to the rule engine before the policy is executed. The engine caches these facts and uses them over multiple execution cycles. Instead of submitting a static or fairly static fact each time you invoke the rule engine, you should create a fact retriever that submits the fact the first time, and then updates the fact in memory only when needed.
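
The idea of loading a slow-changing fact once and refreshing it only when it changes can be sketched as follows. This is a conceptual Python model, not the rule engine's fact retriever interface: `CachedFactRetriever`, `update_facts`, the version-based handle, and the sample exchange-rate data are all hypothetical.

```python
class CachedFactRetriever:
    """Supplies a slow-changing fact, reloading it only when its source changes."""

    def __init__(self, load_fact, current_version):
        self._load_fact = load_fact              # expensive fetch (e.g. a database query)
        self._current_version = current_version  # cheap check for changes
        self.loads = 0                           # counts expensive fetches

    def update_facts(self, working_memory: dict, handle):
        """Called before each policy execution; returns the handle for the next cycle."""
        version = self._current_version()
        if handle != version:  # first call, or the source has changed
            working_memory["rates"] = self._load_fact()
            self.loads += 1
        return version


source = {"version": 1, "rates": {"USD": 1.0}}
retriever = CachedFactRetriever(lambda: dict(source["rates"]),
                                lambda: source["version"])

memory, handle = {}, None
for _ in range(3):  # three execution cycles, one load
    handle = retriever.update_facts(memory, handle)
print(retriever.loads)

source["version"] = 2  # the underlying data changes
source["rates"]["EUR"] = 0.9
handle = retriever.update_facts(memory, handle)
print(retriever.loads, memory["rates"])
```

The handle passed back and forth is what lets the retriever decide cheaply whether the cached fact is still current.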

Rule Priority
The priority setting for a rule can range on either side of 0, with larger numbers having higher priority. Actions are executed in order from the highest priority to lowest priority. When the policy implements forward-chaining behavior by using Assert/Update calls, the chaining can be optimized by using the priority setting. For example, assume that Rule2 has a dependency on a value set by Rule1. Giving Rule1 a higher priority means that Rule2 will only execute after Rule1 fires and updates the value. Conversely, if Rule2 were given a higher priority, it could fire once, and then fire again after Rule1 fires and updates the fact that Rule2 is using in a condition. This may or may not result in the correct results, but clearly would have a performance impact versus only firing once.
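
The Rule1/Rule2 scenario can be reproduced with a toy forward-chaining loop. This is a deliberately simplified Python model, not the BizTalk rule engine's algorithm: the `Rule` class, the agenda handling, and the sample facts are hypothetical and exist only to show how priority changes the number of firings.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(eq=False)
class Rule:
    name: str
    priority: int
    reads: frozenset                     # facts used in the condition
    condition: Callable[[dict], bool]
    action: Callable[[dict], frozenset]  # returns the facts it updated


def run(rules, facts):
    """Fire rules from highest to lowest priority, re-activating on updates."""
    fired = []
    agenda = [r for r in rules if r.condition(facts)]
    while agenda:
        agenda.sort(key=lambda r: r.priority, reverse=True)
        rule = agenda.pop(0)
        fired.append(rule.name)
        updated = rule.action(facts)
        for other in rules:
            if (other is not rule and updated & other.reads
                    and other.condition(facts) and other not in agenda):
                agenda.append(other)  # an Update puts dependent rules back
    return fired


def set_rate(facts):       # Rule1 action: sets a value and "updates" it
    facts["rate_percent"] = 10
    return frozenset({"rate_percent"})


def compute_total(facts):  # Rule2 action: depends on the value set by Rule1
    facts["total"] = facts["amount"] * (100 - facts["rate_percent"]) // 100
    return frozenset()


def build(rule1_priority, rule2_priority):
    return [
        Rule("Rule1", rule1_priority, frozenset({"amount"}),
             lambda f: f["amount"] > 100, set_rate),
        Rule("Rule2", rule2_priority, frozenset({"rate_percent"}),
             lambda f: f["rate_percent"] >= 0, compute_total),
    ]


def new_facts():
    return {"amount": 200, "rate_percent": 0, "total": None}


print(run(build(10, 0), new_facts()))  # Rule1 has higher priority
print(run(build(0, 10), new_facts()))  # Rule2 has higher priority
```

With Rule1 prioritized, each rule fires once; with Rule2 prioritized, Rule2 fires, Rule1 updates the fact, and Rule2 fires a second time.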

Update Calls
The Update function updates the fact that exists in the working memory of the rule engine and causes all the rules that use the updated fact in their conditions to be reevaluated. Update calls can be expensive, especially if a large set of rules needs to be reevaluated as a result. There are situations where they can be avoided. For example, consider the following rules:

Rule1:
IF PurchaseOrder.Amount > 5
THEN StatusObj.Flag = true; Update(StatusObj)

Rule2:
IF PurchaseOrder.Amount <= 5
THEN StatusObj.Flag = false; Update(StatusObj)


All remaining rules of the policy use StatusObj.Flag in their conditions. Therefore, when Update is called on the StatusObj object, all of those rules are reevaluated. Whatever the value of the Amount field is, all the rules except Rule1 and Rule2 are evaluated twice, once before the Update call and once after the Update call.

Instead, you could set the value of the flag field to false prior to invoking the policy and then use only Rule1 in the policy to set the flag. In this case, Update is called only if the value of the Amount field is greater than 5, and is not called if the amount is less than or equal to 5. Therefore, the remaining rules are evaluated twice only if the value of the Amount field is greater than 5.
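
The saving can be expressed as a simple count of condition evaluations. The following Python sketch is only a counting model under the assumption stated above (each Update on StatusObj reevaluates every remaining rule once); the function name and the design labels are hypothetical.

```python
def evaluations(amount: int, remaining_rules: int, design: str) -> int:
    """Count condition evaluations of the rules that test StatusObj.Flag."""
    evals = remaining_rules  # every remaining rule is evaluated once up front
    if design == "two_rules":
        update_called = True         # Rule1 or Rule2 always fires and calls Update
    elif design == "preset_flag":
        update_called = amount > 5   # only Rule1 exists; flag was preset to false
    else:
        raise ValueError(f"unknown design: {design}")
    if update_called:
        evals += remaining_rules     # Update forces a second evaluation pass
    return evals


for amount in (3, 8):
    print(amount,
          evaluations(amount, 10, "two_rules"),
          evaluations(amount, 10, "preset_flag"))
```

With ten remaining rules, the two-rule design always costs 20 evaluations, while the preset-flag design costs 10 when the amount is 5 or less and 20 otherwise.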

Usage of Logical OR Operators
Using an increasing number of logical OR operators in conditions creates additional permutations that expand the analysis network of the rule engine. From a performance standpoint, you are better off splitting the conditions into atomic rules that do not contain logical OR operators.

Caching Settings
The rule engine uses two caches. The first one is in the update service and the second one is in each BizTalk process. The first time a policy is used, the BizTalk process requests the policy information from the update service. The update service retrieves the policy information from the rule engine database, caches it, and returns the information to the BizTalk process. The BizTalk process creates a policy object based on that information and stores the policy object in a cache when the associated rule engine instance finishes executing the policy. When the same policy is invoked again, the BizTalk process reuses the policy object from the cache if one is available. Similarly, if a BizTalk process requests information about a policy from the update service, the update service first looks for the policy information in its own cache. The update service also checks every 60 seconds (1 minute) whether there have been any updates to the policy in the database. If there are any updates, the update service retrieves the information and caches the updated information.

There are three tuning parameters for the rule engine related to these caches: CacheEntries, CacheTimeout, and PollingInterval. You can specify the values for these parameters either in the registry or in a configuration file. The value of CacheEntries is the maximum number of entries in the cache. The default value of the CacheEntries parameter is 32. You may want to increase the value of the CacheEntries parameter to improve performance in some cases. For example, if you are using 40 policies repeatedly, you may want to increase the value of the CacheEntries parameter to 40. This would allow the update service to cache details of up to 40 policies in memory, and it would allow the BizTalk service to cache up to 40 policy instances in memory. There may be more than one instance of a policy in the cache of the BizTalk service.

The value of CacheTimeout is the time (in seconds) for entries to age out of the update service cache. In other words, the CacheTimeout value refers to how long a cache entry for a policy is kept in the cache without being referenced. The default value of the CacheTimeout parameter is 3600 seconds (1 hour). This means that if the cache entry is not referenced within an hour, it is deleted. In some cases, you may want to increase the value to improve performance. For example, if the policy is invoked every 2 hours, you could improve the performance of the policy execution by increasing the value of the CacheTimeout parameter to more than 2 hours.

The PollingInterval parameter to the rule engine defines the time in seconds for the update service to check the rule engine database for updates. The default value for the PollingInterval parameter is 60 seconds (1 minute). If you know that the policies do not get updated at all or they are updated rarely, you could change this value to a higher value to improve the performance.
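
The effect of CacheEntries and CacheTimeout can be illustrated with a small cache model. This is a Python sketch, not the update service's implementation: `PolicyCache`, `FakeClock`, and the policy name are hypothetical, and evicting the least recently referenced entry when the cache is full is an assumption, since the text above does not describe the eviction policy.

```python
import time
from collections import OrderedDict


class PolicyCache:
    """Bounded cache whose entries expire after an idle period."""

    def __init__(self, cache_entries=32, cache_timeout=3600, clock=time.monotonic):
        self._max = cache_entries
        self._timeout = cache_timeout
        self._clock = clock
        self._entries = OrderedDict()  # name -> (value, last_referenced)
        self.misses = 0                # counts loads from the backing store

    def get(self, name, loader):
        now = self._clock()
        # Age out entries that have not been referenced within the timeout.
        expired = [k for k, (_, last) in self._entries.items()
                   if now - last > self._timeout]
        for key in expired:
            del self._entries[key]
        if name in self._entries:
            value, _ = self._entries.pop(name)
        else:
            self.misses += 1
            value = loader(name)
            if len(self._entries) >= self._max:
                self._entries.popitem(last=False)  # assumed: drop oldest reference
        self._entries[name] = (value, now)         # record the new reference time
        return value


class FakeClock:
    """Manually advanced clock so the example runs instantly."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def misses_for(timeout_seconds):
    """Invoke one policy every 2 hours, four times, and count database loads."""
    clock = FakeClock()
    cache = PolicyCache(cache_timeout=timeout_seconds, clock=clock)
    for _ in range(4):
        cache.get("Discounts", lambda name: f"policy:{name}")
        clock.now += 2 * 3600
    return cache.misses


print(misses_for(3600))      # default timeout: the entry ages out between calls
print(misses_for(3 * 3600))  # timeout above 2 hours: only the first call loads
```

In this model, a policy invoked every 2 hours is reloaded on every call with the default timeout but only once when the timeout exceeds the invocation interval.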

Side Effects
The ClassMemberBinding, DatabaseColumnBinding, and XmlDocumentFieldBinding classes have a property named SideEffects. This property determines whether the value of the bound field, member, or column is cached. The default value of the SideEffects property in the DatabaseColumnBinding and XmlDocumentFieldBinding classes is false, whereas the default value in the ClassMemberBinding class is true. Therefore, when a field of an XML document or a column of a database table is accessed for the second time or later within the policy, the value is retrieved from the cache, whereas when a member of a .NET object is accessed for the second time onwards, the value is retrieved from the .NET object, not from the cache. Setting the SideEffects flag of a .NET ClassMemberBinding to false will improve performance because the value of the field is retrieved from the cache from the second access onwards. You can only do this programmatically. The Business Rule Composer tool does not expose the SideEffects flag.
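
The caching behavior controlled by the flag can be modeled in a few lines. This Python sketch is a conceptual illustration, not the rule engine's binding classes: `MemberBinding`, its `side_effects` argument, and the `Order` class are hypothetical stand-ins.

```python
class MemberBinding:
    """Reads a member through a getter, optionally caching the first result."""

    def __init__(self, getter, side_effects=True):
        self._getter = getter
        self.side_effects = side_effects  # True: always call the underlying member
        self._cached = False
        self._value = None

    def value(self):
        if self.side_effects:
            return self._getter()         # no caching: the call may have side effects
        if not self._cached:
            self._value = self._getter()  # first access populates the cache
            self._cached = True
        return self._value


class Order:
    def __init__(self):
        self.reads = 0

    def get_amount(self):
        self.reads += 1                   # stands in for an expensive member call
        return 42


uncached_order, cached_order = Order(), Order()
uncached = MemberBinding(uncached_order.get_amount, side_effects=True)
cached = MemberBinding(cached_order.get_amount, side_effects=False)

for _ in range(3):
    uncached.value()
    cached.value()

print(uncached_order.reads, cached_order.reads)
```

With the flag left at true, the member is invoked on every access; with it set to false, only the first access reaches the object. Caching is only safe when the member returns the same value for the duration of the policy execution.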

Instances and Selectivity
The XmlDocumentBinding, ClassBinding, and DatabaseBinding classes have two properties, Instances and Selectivity. The value of the Instances property is the expected number of instances of the class in working memory. The value of the Selectivity property is the percentage of the class instances that will successfully pass the rule conditions. The rule engine uses these values to optimize condition evaluation so that the lowest possible number of instances is used in condition evaluations first, followed by the remaining instances. If you have prior knowledge of the number of instances of the object, setting the Instances property to that value would improve performance. Similarly, if you have prior knowledge of the percentage of these objects passing the conditions, setting Selectivity to that value would improve performance. You can only set values for these parameters programmatically. The Business Rule Composer tool does not expose them.
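
One way to picture how such hints could be used is to estimate how many instances of each class are expected to pass and to evaluate the smallest set first. The Python sketch below is an assumption-laden illustration of that idea, not the rule engine's actual optimizer: the `BindingHint` class, the ordering rule, and the sample class names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BindingHint:
    class_name: str
    instances: int            # expected number of instances in working memory
    selectivity_percent: int  # expected percentage that pass the conditions

    def expected_matches(self) -> float:
        return self.instances * self.selectivity_percent / 100


def evaluation_order(hints):
    """Assumed heuristic: start with the class expected to yield the fewest matches."""
    return sorted(hints, key=lambda h: h.expected_matches())


hints = [
    BindingHint("OrderLine", instances=1000, selectivity_percent=5),
    BindingHint("Customer", instances=10, selectivity_percent=90),
    BindingHint("Invoice", instances=200, selectivity_percent=50),
]

for hint in evaluation_order(hints):
    print(hint.class_name, hint.expected_matches())
```

Starting with the smallest expected set keeps later comparisons against the other classes as cheap as possible, which is why accurate hints can matter when working memory is large.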

Technical Ref: MSDN