Friday, November 20, 2009

Avoiding TCP/IP Port Exhaustion

When a client initiates a TCP/IP socket connection to a server, the client typically connects to a specific port on the server and requests that the server respond to the client over an ephemeral, or short-lived, TCP or UDP port. On Windows Server 2003 and Windows XP the default range of ephemeral ports used by client applications is from 1025 through 5000. Under certain conditions it is possible that the available ports in the default range will be exhausted.

The symptoms of TCP/IP port exhaustion may vary from one client application to another but are typically manifested as an error indicating a failed network connection. To determine if failed network connections are being caused by TCP/IP port exhaustion, follow these steps on the client computer:

  1. On a computer running Windows XP or Windows Server 2003, click Start, click Run, type cmd, and then click OK to open a command prompt.

  2. Do one of the following:

    • Enter the following command in the command prompt on a Windows XP or Windows Server 2003 computer to display the active connections being used by the TCP/IP protocol on this computer:

      netstat -n
      This will list the TCP/IP addresses bound to the client computer and the ports on which the TCP/IP addresses are communicating with a remote server. If the listed ports consume all of the available ports then TCP/IP port exhaustion occurs.

    • Enter the following command in the command prompt on a Windows Server 2003-based client computer to display the active connections being used by the TCP/IP protocol:

      netstat -b
      This will list the TCP/IP addresses bound to the client computer, the ports on which those addresses are communicating with a remote server, and the application that is using each port. This information can help determine which client applications are consuming excessive TCP/IP ports.
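
Counting ports by eye in a long netstat listing is tedious, so here is a minimal Python sketch that tallies how many ports in the default ephemeral range appear as local ports. It assumes the usual four-column layout of netstat -n output (Proto, Local Address, Foreign Address, State); the function name and the sample text are made up for illustration and are not part of the original guidance.

```python
from collections import Counter

# Default ephemeral range on Windows XP / Windows Server 2003, as stated above.
EPHEMERAL_LOW, EPHEMERAL_HIGH = 1025, 5000

# Hypothetical sample of `netstat -n` output, used only for illustration.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    192.168.1.10:445       192.168.1.30:2950      ESTABLISHED
  TCP    192.168.1.10:1034      192.168.1.20:1433      ESTABLISHED
  TCP    192.168.1.10:1035      192.168.1.20:1433      TIME_WAIT
"""


def summarize_netstat(output, low=EPHEMERAL_LOW, high=EPHEMERAL_HIGH):
    """Count TCP rows whose local port falls inside the ephemeral range."""
    ports = set()
    states = Counter()
    for line in output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection row.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        try:
            # The local port is whatever follows the last colon.
            port = int(fields[1].rsplit(":", 1)[1])
        except (IndexError, ValueError):
            continue
        if low <= port <= high:
            ports.add(port)
            states[fields[3]] += 1
    return {
        "used": len(ports),              # distinct ephemeral ports seen
        "available": high - low + 1,     # size of the ephemeral range
        "states": dict(states),          # connection count per TCP state
    }


summary = summarize_netstat(SAMPLE)
```

On a real machine the text could come from subprocess.run(["netstat", "-n"], capture_output=True, text=True).stdout. If "used" approaches "available", or the TIME_WAIT count is very large, port exhaustion is a likely suspect.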

Problems Associated with TCP/IP Port Exhaustion

Errors similar to the following have been observed when a client application attempts to connect to a BizTalk server using TCP/IP sockets or when a BizTalk application attempts to connect to a server using TCP/IP sockets:

System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send.

- or -

Unable to connect to the remote server
System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted.

When these errors occur, the following problems may also occur:

  • Client applications may fail to connect to the BizTalk server.

  • The BizTalk Application service may fail to connect to a remote SQL server.

  • BizTalk Server adapters may fail to connect to a remote server.

  • Each port reservation that is made by a client application consumes kernel memory. If an unusually high number of client port reservations are made then Windows kernel memory use will increase accordingly.

Cause

TCP/IP port exhaustion can occur on a client computer if the client computer is engaging in an unusually high number of TCP/IP socket connections. This can occur if many client applications are initiating connections.

If all of the available ephemeral ports are allocated to client applications then the client experiences a condition known as TCP/IP port exhaustion. When TCP/IP port exhaustion occurs, client port reservations cannot be made and errors will occur in client applications that attempt to connect to a server via TCP/IP sockets.

TCP/IP port exhaustion is more likely to occur under high load conditions than under normal load conditions.
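
To get a feel for what "high load" means here, a rough back-of-the-envelope calculation helps. The sketch below assumes that every connection the client closes holds its ephemeral port for the full TIME_WAIT delay (240 seconds by default, as noted in the Resolution section) and ignores the time connections spend open, so it is only an optimistic upper bound. The function name is hypothetical.

```python
def max_sustained_rate(low_port, high_port, time_wait_seconds):
    """Rough upper bound on new outbound connections per second.

    Assumes each closed connection keeps its port unavailable for the
    whole TIME_WAIT delay and that ports are otherwise free.
    """
    port_count = high_port - low_port + 1
    return port_count / time_wait_seconds


# Default range 1025-5000 with the default 240 second TIME_WAIT delay.
default_rate = max_sustained_rate(1025, 5000, 240)

# Same range with the minimum TIME_WAIT delay of 30 seconds.
shorter_wait_rate = max_sustained_rate(1025, 5000, 30)

print(f"default:      {default_rate:.1f} connections/second")
print(f"shorter wait: {shorter_wait_rate:.1f} connections/second")
```

Under these assumptions the default settings sustain only about 16 or 17 new connections per second, which is why a busy client that opens and closes connections rapidly can run out of ports.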

Resolution

Follow these steps to avoid TCP/IP port exhaustion and its associated problems:

  1. Verify that one or more client applications are not generating excessive TCP/IP socket connections. This can be checked by running netstat -n on Windows Server 2003 and Windows XP or by running netstat -b on Windows Server 2003 as described above.

    If a particular client application is engaging in an unusually high number of TCP/IP socket connections then consider redesigning the client application to make more judicious use of TCP/IP socket connections.

    Note
    If an unusually high number of client port reservations are allocated to an instance of the BizTalk Application service (BTSNTSvc.exe) then verify that any custom code configured to run in the BizTalk Application service is not making excessive TCP/IP socket connections.

  2. If a large number of client applications are initiating the expected number of TCP/IP socket connections but there are not enough available ephemeral ports to satisfy the connection requests then implement one or more of the following registry modifications.

    Warning
    If you use Registry Editor incorrectly, you may cause serious problems that may require you to reinstall your operating system. Microsoft cannot guarantee that you can solve problems that result from using Registry Editor incorrectly. Use Registry Editor at your own risk. Before you modify the Registry, always back up the registry, and verify that you know how to restore the backup if a problem occurs. For more information about how to back up, restore, and modify the registry, see the Microsoft Knowledge Base article "Description of the Microsoft Windows registry" at http://go.microsoft.com/fwlink/?LinkId=62729.

    Increase the upper range of ephemeral ports that are dynamically allocated to client TCP/IP socket connections.

    1. Start Registry Editor.

    2. Browse to, and then click the following key in the registry:

      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

    3. On the Edit menu, click New, DWORD Value, and then add the following registry value to increase the number of ephemeral ports that can be dynamically allocated to clients:

      Value name

      MaxUserPort

      Value data

    4. Close Registry Editor.

      Note
      You must restart your computer for this change to take effect.

      Note
      Increasing the range of ephemeral ports used for client TCP/IP connections consumes Windows kernel memory. Do not increase the upper limit for this setting to a value higher than is required to accommodate client application socket connections so as to minimize unnecessary consumption of Windows kernel memory.

    Reduce the client TCP/IP socket connection timeout value from the default value of 240 seconds.

    1. Start Registry Editor.

    2. Browse to, and then click the following key in the registry:

      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

    3. On the Edit menu, click New, DWORD Value, and then add the following registry value to reduce the length of time that a connection stays in the TIME_WAIT state when the connection is being closed. While a connection is in the TIME_WAIT state, the socket pair cannot be reused:

      Value name

      TcpTimedWaitDelay

      Value data

    4. Close Registry Editor.

      Note
      You must restart your computer for this change to take effect.

      Note
      The valid range of this value is 30 through 300 (decimal). The default value is 240.

  3. Follow the recommendations in Avoiding DBNETLIB Exceptions for disabling the denial of service attack security feature that is implemented with Windows Server 2003 SP1 and later.

    Important
    This should only be done in an intranet environment where the BizTalk Server computer is not directly exposed to the Internet.

  4. Follow the recommendations in Avoiding DBNETLIB Exceptions for alleviating conditions that can cause the MessageBox database(s) to become I/O bound.
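
If the two registry values from step 2 need to be applied to several machines, it can be easier to generate a .reg file than to click through Registry Editor each time. The following Python sketch builds the text of such a file. The 30 to 300 second range for TcpTimedWaitDelay comes from the note above; the 5000 to 65534 range for MaxUserPort is an assumption that does not appear in this post and should be checked against Microsoft's documentation. The function name and the example values (10000 and 60) are purely illustrative, not recommendations.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def build_reg_file(max_user_port=None, tcp_timed_wait_delay=None):
    """Return .reg file text that sets the requested DWORD values."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{KEY}]"]
    if max_user_port is not None:
        # Assumption: accepted range is 5000-65534 (not stated in this post).
        if not 5000 <= max_user_port <= 65534:
            raise ValueError("MaxUserPort outside assumed range 5000-65534")
        lines.append(f'"MaxUserPort"=dword:{max_user_port:08x}')
    if tcp_timed_wait_delay is not None:
        # Range of 30-300 seconds is taken from the note above.
        if not 30 <= tcp_timed_wait_delay <= 300:
            raise ValueError("TcpTimedWaitDelay outside range 30-300")
        lines.append(f'"TcpTimedWaitDelay"=dword:{tcp_timed_wait_delay:08x}')
    return "\n".join(lines) + "\n"


# Illustrative values only; pick numbers that suit your own load.
reg_text = build_reg_file(max_user_port=10000, tcp_timed_wait_delay=60)
print(reg_text)
```

The same warning about editing the registry applies to importing a generated file: back up the registry first, and remember that a restart is needed before the values take effect.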

Data Connections in Browser Forms

This post will answer the question "how do I get my data connection to work on the server". Cart, meet horse.
When we started defining the server experience for InfoPath forms, one of the guiding principles we espoused was "design-once." Meaning, provided you stick within the subset of functionality supported on the server, you could design a form once and it would run in InfoPath and in Forms Services. That is basically true for data connections, but there are two special considerations that differ significantly when moving to the server, namely
  • Authorization to perform cross-domain connections
  • Multi-tier authentication
The cross-domain issue, server edition
Short version:
You can’t make a cross-domain connection from a domain security browser form unless your data connection uses a UDC file. Period.
Long version:
InfoPath has three security modes, which we refer to as restricted (AKA "super-sandbox"), domain, and full-trust. Restricted form templates don’t run on the server. Full-trust form templates are allowed to do whatever and consequently need to be administrator-approved in order to run server-side. Domain trust form templates constitute the vast majority of form templates in the enterprise, and roughly follow the Internet Explorer security model which dictates that by default, the user must approve any attempt to access resources that come from a different domain or Zone. So, if you have a form template running on http://myserver/, and it tries to connect to a web service on http://yourserver/, InfoPath will prompt you before trying the connection.
Works great in InfoPath, but when you run a form in the browser, you may be running server-side business logic. That business logic may want to execute a data query. Because HTTP is a stateless protocol, Forms Services can’t halt execution of a server-side process and return to the browser in order to ask you for permission to continue. Additionally, the user in this case may not be the right person to own the decision about whether the cross-domain connection can take place. So, this decision is placed instead in the hands of the server administrator who owns security around the form template. Depending on whether the form is administrator-approved or just published to a document library by Joe User, this ownership falls on the farm administrator (I call her Julie) or the site collection administrator (henceforth known as Bob).
The technology which allows Julie and Bob to determine whether forms can make cross-domain connections involves a system of checks and balances. Central to this system is a data connection settings file in the Universal Data Connection V2 format, called a UDC file. Bob’s UDC files live in a special type of document library called a Data Connection Library, or DCL. Julie’s UDC files live in the Microsoft Office SharePoint Server configuration database, and are managed via the "Manage Data Connection Files" page in the Application Management section of SharePoint Central Administration (I tend to refer to this as "the store").
The basic premise behind UDC files is that in InfoPath 2007 your data connection settings can live outside of the form template in one of these files, and both InfoPath and Forms Services will retrieve the current connection settings from this file at runtime before making the connection. The UDC file itself is retrieved from a URL relative to the root of the site collection where the form was opened. This enables lots of cool functionality - for example, you can now share settings across multiple forms and change them once when you move your data source.
You can also pre-deploy test and production versions of your data connection settings to your staging and production environments so that you don’t need to update the form template with new data connection settings when you go live.
For the purposes of this discussion the key takeaway is that a domain security form running in the browser will never make a cross-domain data connection unless those connection settings come from a UDC file.
In other words, you can’t make a cross-domain connection from a domain security browser form unless your data connection uses a UDC file. Period.
Why is this more secure? Well, if Bob is a good administrator, he controls access to Data Connection Libraries on his site collection. A DCL requires content approval by default, and while members of the site with Contributor access can write files to the library, nobody but the owner of the file can use the file from InfoPath unless a content approver has approved the file. By default, all members with Designer permissions have the Content Approval right. In short, Bob can set up the site collection such that only people that he designates can approve UDC files. Therefore, forms on Bob’s site collection cannot make connections outside the farm unless he approves the connection first.
Julie’s central store is more secure - only users with access to SharePoint Central Administration can modify those files. Furthermore, by default only the server object model can access files in the store. In order to make these files accessible to web clients such as InfoPath, the "web accessible" flag must be set. Otherwise, the files can be used from browser forms only.
Finally, if Julie wants to have complete control over cross-domain connections for the farm, she can turn off the ability for user form templates to make cross-domain connections at all (in fact, these are disabled in a default install). When cross-domain connections are disabled for the farm, even connections that use settings from a UDC file can’t make cross-domain connections.
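The rules above for domain security browser forms can be summarized as a small decision function. This is only a sketch of my reading of the text, not how Forms Services is implemented: the cross-domain test compares scheme, host and port and ignores Internet Explorer zones, and the function and parameter names are invented for illustration.

```python
from urllib.parse import urlsplit


def is_cross_domain(form_url, target_url):
    """Simplified check: compare scheme, host and port only (ignores zones)."""
    a, b = urlsplit(form_url), urlsplit(target_url)
    return (a.scheme, a.hostname, a.port) != (b.scheme, b.hostname, b.port)


def connection_allowed(form_url, target_url, uses_udc_file,
                       admin_approved, farm_allows_user_cross_domain):
    """Model the cross-domain rules for a domain security browser form."""
    if not is_cross_domain(form_url, target_url):
        return True   # same-domain connections need no special approval
    if not uses_udc_file:
        return False  # no UDC file, no cross-domain connection. Period.
    if admin_approved:
        return True   # the farm-wide switch only targets user form templates
    # User form templates depend on the farm-wide setting (off by default).
    return farm_allows_user_cross_domain


# A user form on myserver calling a web service on yourserver via a UDC file,
# on a farm where cross-domain connections are still disabled (the default).
print(connection_allowed("http://myserver/", "http://yourserver/",
                         uses_udc_file=True, admin_approved=False,
                         farm_allows_user_cross_domain=False))
```

The point of the ordering is that the UDC requirement is checked before anything else for cross-domain calls, and the farm setting can still veto a user form template even when a UDC file is present.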
Solving multi-tier authentication issues
Once your form is allowed to make cross-domain connections, you’ll need to deal with getting access to the data. If your data lives on a different computer than your Forms Services site, this means figuring out an alternative set of credentials for data access. I’ve covered why this is true, and the various options for accomplishing this task, in a series of posts called "Advanced server-side authentication for data connections," but I left out one useful prototyping trick. Consider the following UDC authentication block:


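<udc:Authentication>
  <udc:UseExplicit CredentialType="NTLM">
    <udc:UserId>myusername</udc:UserId>
    <udc:Password>mypassword</udc:Password>
  </udc:UseExplicit>
</udc:Authentication>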

The UseExplicit element allows you to specify a username and password in plaintext. When this is present, Forms Services will impersonate the specified user using the supplied credentials. This is a great tool for prototyping - you can see immediately whether your connection can be made given valid credentials before you go to the trouble of setting up Office Single Sign-on. However, I cannot stress enough that this is for prototyping only. Because the UDC file is in clear text and is accessible to anyone with read permission on the library, this puts a windows username and password in clear text on the network, which is bad-bad-bad. You’ve been warned.
Whatever Julie wants, Julie gets
The "Manage data connection files" page (AKA "the store") gives Julie a great deal of power over the use of data connections in administrator-approved form templates. Furthermore, Julie also has a great deal of control over what Bob’s users can do. The following defaults can only be modified on the Configure forms services page:
  • User form templates cannot make cross-domain connections (not even with a UDC file)
  • User form templates cannot use authentication embedded in a database connection string.
  • User form templates cannot use the authentication section of the UDC file
  • User form templates can use credentialType BASIC or DIGEST only if the connection is made using SSL.
The following default setting can only be modified on the Configure Web Service Proxy page:
  • User form templates cannot use the Web service proxy

Thursday, November 19, 2009

DisableLoopback Check

DisableLoopbackCheck & SharePoint: What every admin and developer should know.


What is the issue?
Windows Server 2003 SP1 introduced a loopback security check. This feature is obviously also present in Windows Server 2008. The feature prevents access to a web application using a fully qualified domain name (FQDN) if an attempt to access it takes place from a machine that hosts that application. The end result is a 401.1 Access Denied from the web server and a logon failure in the event log.

Unfortunately 401.1 is not really helpful as this error code means there is a problem with the user credentials. Of course, the HTTP spec doesn’t know about security features in a vendor’s implementation so there can’t be an HTTP error code for such a feature. This can lead to much banging of the head on the desk. It’s one of numerous causes of the 401.1 which have nothing to do with invalid credentials (e.g. attempting to use Kernel Mode Authentication with a domain account in IIS7).

What this means is that when you browse a SharePoint Web Application which uses a fully qualified domain name from a WFE in the farm you will get a 401.1. This is very annoying on a development box, or when testing locally, or in other SharePoint specific scenarios (more on those later).

OK, so what’s the big deal here?
Microsoft call this a security feature. Spence calls this a security fix! There are many exploits which attempt to attack via reflection – i.e. pretending to be local so as to bypass constraints. This setting should have been in the box since Windows NT4, but it wasn’t. Microsoft have done the right thing and addressed the problem based on customer feedback and exploits. They have fixed a hole and further tightened the attack surface of a Windows server. Good Job. Anyone who cares about host security or platform hygiene knows it makes sense.

But it breaks my SharePoint, dang it!
Yup, it does. But only if you are attempting to access a Web Application from a server hosting it (i.e. locally). You shouldn’t be doing that. Well, you shouldn’t be doing it in production. If you or your company admins are testing locally you have bigger problems than a pesky security fix.

The trouble is there are also scenarios where this fix will break normal operations of SharePoint.

  • Search Indexing.
    This applies if you are hosting the WSS Web Application Service on your Indexer for the purposes of having a “Dedicated Crawl Front End” and avoiding a network hop. This is common in small scale “Medium Server Farms”. Because the Indexer is crawling itself, the crawl log will fill up with 401s and your content won’t get indexed.
  • Web Application “Warm Ups”.
    If you are running a scheduled task or timer job to hit the Web Application to avoid the start up lag after an application pool recycle, the “warm up” will fail with a 401.
  • Custom Code using SharePoint Web Services.
    If you have custom code, either in SharePoint or outside it, that leverages SharePoint Web Services (such as using the ExcelService API), these requests will fail with a 401.

Okay, so get to the point what should I do?
Microsoft’s KB Article 896861 details two workarounds. One is to disable the Loopback Check entirely – and this is commonly promoted as the thing to do on all your SharePoint Servers. The second is to add a list of addresses to exclude from the check. Both of these are accomplished by means of a registry key in the LSA hive.

So which one should you use? The answer, of course is, it depends.

If you are working on a development environment or on just a single MOSS box – go for it - disable it completely. You need to debug and test locally and it’s likely you don’t know what addresses you will use ahead of time. I as a matter of course disable the check as part of my sysprep build for all my development and test machines. I never hit the problem because my base image is all sorted as I want it. I recommend you do the same.

However, for production environments, DO NOT DISABLE this feature. You are unpicking a serious security check of the OS. If that environment underwent a security audit by a competent security engineer, it would be flagged. You should add a list of addresses you wish to exclude. This makes your scenario work whilst retaining the security check (which is important if you are handing over the environment to your customer’s admins who may decide to browse the interwebz from the console :)).

It’s not that big of a deal to figure out which addresses you need to exclude, and on what machines you need to apply the change. If you can’t do this quickly, you don’t understand your topology or services.

This change can of course be applied by Group Policy, so should you need to do it on a bunch of boxes (e.g. five WFEs with custom code using SharePoint Web Services) the overhead is avoided. Of course this assumes you are using GPOs to manage your SharePoint Servers (you should be). GPOs are also useful if you need to add additional addresses later on.

Another advantage of using the list is that the change does not require a reboot of the server to take effect, just a restart of the IISAdmin service.
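
For the exclusion-list approach, the change boils down to one multi-string registry value. Here is a minimal Python sketch that builds the reg add command line for it. The value name BackConnectionHostNames and its location under the MSV1_0 key are my recollection of KB 896861 rather than something spelled out in this post, so verify them against the KB article before use. The host names and the function name are hypothetical.

```python
# Key location as recalled from KB 896861 (an assumption; verify before use).
LSA_MSV1_0 = r"HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0"


def back_connection_command(host_names):
    """Build the argument list for `reg add` that sets the exclusion list."""
    if not host_names:
        raise ValueError("at least one host name is required")
    # reg.exe treats the two characters \0 as the REG_MULTI_SZ separator.
    data = r"\0".join(host_names)
    return ["reg", "add", LSA_MSV1_0,
            "/v", "BackConnectionHostNames",
            "/t", "REG_MULTI_SZ",
            "/d", data,
            "/f"]  # /f overwrites an existing value without prompting


# Hypothetical host names for illustration.
command = back_connection_command(["intranet.example.com", "search.example.com"])
print(" ".join(command))
```

The sketch only prints the command; on a server it could be passed to subprocess.run from an elevated prompt. Note that /f replaces the whole value, so include every address you want excluded, not just the new one.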

Of course this is a lifecycle issue and like anything else, you should weigh the security risk versus additional complexity on a per implementation basis to make the right call. More important still, you need to document the configuration.

In addition, as a consultant or otherwise, if you are hitting 401s that don’t make sense – check this, along with Kernel Mode AuthN and Local Intranet Zone, before anything else. Just like everyone else, I have had this problem leave me scratching my head for a couple of hours. So remember this issue when diagnosing the dreaded 401!

Conclusion
The loopback check is a good thing, not a bad thing, if you care about security and platform hygiene. Do not disable this feature in production. Feel free to disable it in development/test environments. Todd is also planning to spend some time on this subject in his next netcast.