HttpWebRequest.Abort() in .NET Compact Framework 2 doesn't work when m_connection is null

Wednesday, March 31, 2010

Note: This post applies to CF 2.0 and below. It is fixed in later versions.

I literally spent days poring over our app's code to figure out what was going wrong. Eventually I gave up and blamed our failure on the Compact Framework code, and it seems I was correct. It appears that in CF 2.0, HttpWebRequest.Abort() is broken in some cases and causes subsequent requests to misbehave in ways that are very difficult to detect.

Create and send an HttpWebRequest:

HttpWebRequest r = (HttpWebRequest)WebRequest.Create("");
r.BeginGetResponse(new AsyncCallback(SomeFunction), null);

Now, sometime later but before the request has completed, try to abort it:

r.Abort();

It's obvious what this code is supposed to do, and the API documentation clearly states what is supposed to happen: the request is cancelled. However, due to a CF bug, that doesn't always work correctly.

If you Abort() the request early enough, the private m_connection member of the HttpWebRequest object may not have been set yet. In that case, the Abort() call fails to abort the connection. The HttpWebRequest appears to have been aborted, but its underlying Connection object (and that Connection's Socket) is still sitting around. Because the Connection objects used by HttpWebRequest are pooled, this is very bad: the Connection is left in an unaborted state while its owning HttpWebRequest has been aborted, so when a different request reuses that Connection, it still carries the old state and behaves in unexpected ways.

Consider the following scenario:

  1. Client creates a new HttpWebRequest and begins a request for HTTP resource A.
  2. Someone calls Abort() on that request before m_connection is set.
  3. Abort will silently fail and the underlying Connection and Socket will be left in an invalid state.
  4. Client creates a new HttpWebRequest and begins a request for HTTP resource B.
  5. Since HttpWebRequests are not pooled, HttpWebRequest B is a different object from the now-defunct HttpWebRequest A. However, their underlying Connection objects are pooled. HttpWebRequest B tries to get a Connection from the pool, and may end up with the Connection that A left in an invalid state.
  6. HttpWebRequest B sends things through its underlying Connection and the Socket owned by the Connection. But the Socket is still connected to resource A.
  7. HttpWebRequest B thinks it successfully sent request B. The server returns content from A, since that is what the Socket is associated with.
  8. HttpWebRequest B gets a response stream from the server. It thinks that this contains data from resource B, when in fact it is data from resource A.

So you end up with random data corruption. Worse, it's silent: the condition is very difficult to detect, and most of the time the data looks legitimate. For example, if you're downloading 100 images and image #50 actually contains the data that was supposed to go into image #49, tough luck, because it still looks like a legitimate image.
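To make the scenario concrete, here is a toy model of it in plain C#. This is only a sketch: ToyConnection, ToyPool and ToyRequest are made-up stand-ins, not the real CF classes, and the "response left over on the connection" behaviour is my simplification of what the steps above describe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a pooled connection (not the real CF Connection).
class ToyConnection
{
    // Response that has been requested on this connection but not yet read.
    private string pending;

    public void Send(string resource)
    {
        // Assumption: a leftover response from an earlier request is read
        // before anything the new request asked for.
        if (pending == null)
            pending = "data for " + resource;
    }

    public string Read()
    {
        string data = pending;
        pending = null;
        return data;
    }

    public void Reset() { pending = null; }
}

class ToyPool
{
    private readonly Stack<ToyConnection> idle = new Stack<ToyConnection>();

    public ToyConnection Acquire()
    {
        return idle.Count > 0 ? idle.Pop() : new ToyConnection();
    }

    public void Release(ToyConnection connection) { idle.Push(connection); }
}

class ToyRequest
{
    public ToyConnection Connection; // plays the role of m_connection

    public void Abort()
    {
        // Mirrors the bug: with no connection attached there is nothing to reset.
        if (Connection != null)
            Connection.Reset();
    }
}

static class Demo
{
    public static string Run()
    {
        ToyPool pool = new ToyPool();

        // Steps 1-3: A's request is in flight, but A does not hold the connection yet.
        ToyRequest a = new ToyRequest();
        ToyConnection inFlight = pool.Acquire();
        inFlight.Send("A");
        a.Abort();               // Connection is null, so the connection keeps its state
        pool.Release(inFlight);  // and goes back to the pool as if it were clean

        // Steps 4-8: B picks up the same connection and reads A's leftover data.
        ToyRequest b = new ToyRequest();
        b.Connection = pool.Acquire();
        b.Connection.Send("B");
        return b.Connection.Read();
    }

    static void Main()
    {
        Console.WriteLine(Demo.Run()); // prints "data for A"
    }
}
```

Request B never sees an error. It just gets A's data, which is exactly why the real bug is so hard to spot.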

I spent forever trying to come up with solutions to this, but could only come up with ones that don't work:

  • Do not call Abort() on HttpWebRequests whose m_connection member is null.

    Since this bug occurs iff m_connection == null when aborting, if we avoid doing that then this bug can never occur. The problem is that Abort() is the only way to kill the request, and it may not be possible for your app to completely avoid aborting requests. In addition, since the Connection objects are pooled and there are very few of them in the pool, you can't leave these requests hanging around or your app will quickly run out of connections. (If Abort() is called and this bug is triggered, the Connection will be returned to the pool and not block subsequent requests; it will just have unpredictable behavior next time.)

  • When aborting, check m_connection, and if it's null then store the Connection which is about to become invalid into a data structure somewhere. When an HttpWebRequest returns data, check to see whether it came from a broken Connection; if so disregard the data and retry.

    This would work... if it were possible to get the HttpWebRequest's Connection object. But the bug only occurs when m_connection == null, and the Connection is created deep within the innards of the code, passed around as a local variable, and not known to the HttpWebRequest until a callback some time later. So at abort time it's not possible to get a reference to the Connection object, not even with reflection.

    Also, since HttpWebRequests aren't pooled, keeping track of those is not useful.

  • Inherit from HttpWebRequest. Tag each one with a unique ID, and when the response comes back, check that the ID is what you expect.

    Since HttpWebRequests aren't pooled this doesn't work. It's the underlying Connection which is pooled and left in an invalid state, not the request itself.

  • Set a timeout on the underlying Connection or the Socket that it owns so they can dispose themselves.

    Those two objects don't have timeouts, at least not in CF, and even if they did, there's no way to get a reference to them until m_connection is set (and by that point the bug can no longer occur). HttpWebRequest has a timeout, but it is useless here: it simply calls Abort() when the timer fires, so it exhibits the same buggy behavior.
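For reference, the m_connection check that the first two ideas depend on could look roughly like the sketch below. It assumes the private field really is an instance field named m_connection declared on HttpWebRequest, as described above. ConnectionProbe and HasConnection are names I made up, and on any runtime other than CF 2.0 the field may simply not exist.

```csharp
using System;
using System.Net;
using System.Reflection;

static class ConnectionProbe
{
    // "m_connection" is the private field name discussed in this post. It is an
    // implementation detail of CF 2.0 and is assumed here, not documented anywhere.
    private static readonly FieldInfo ConnectionField =
        typeof(HttpWebRequest).GetField(
            "m_connection", BindingFlags.Instance | BindingFlags.NonPublic);

    // Returns true only when the field exists and currently holds a connection.
    public static bool HasConnection(HttpWebRequest request)
    {
        if (ConnectionField == null)
            return false; // unknown runtime layout: nothing can be concluded
        return ConnectionField.GetValue(request) != null;
    }
}
```

A request for which HasConnection returns false would have to be left running, or polled until the field is set, before Abort() is safe to call. That is the same limitation that makes the first idea impractical.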

The fix is to use CF 3.5, but I have to support 2.0. Still looking for a workaround...

Tags: cf, httpwebrequest, .net, winmo | Posted at 23:28