
.ul files with URLs containing "&" character crash WUD


DarkShadows


I suspect that there may be a workaround for this, but I do not know what it is.

I created my own .ul and .ulz files so that I can use WUD to manage a number of downloads for my Windows XP and Vista installations. The files come from various sources, not just from Microsoft.com.

Many websites use URLs like the one shown below (which points directly to JCarle's post in the WUD > Bugs sticky thread):

See the end of this message for details on invoking 
just-in-time (JIT) debugging instead of this dialog box.

************** Exception Text **************
System.Xml.XmlException: An error occurred while parsing EntityName. Line 100, position 118.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
at System.Xml.XmlTextReaderImpl.ParseEntityName()
at System.Xml.XmlTextReaderImpl.ParseAttributeValueSlow(Int32 curPos, Char quoteChar, NodeData attr)
at System.Xml.XmlTextReaderImpl.ParseAttributes()
at System.Xml.XmlTextReaderImpl.ParseElement()
at System.Xml.XmlTextReaderImpl.ParseElementContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlTextReader.Read()
at System.Xml.XmlLoader.LoadNode(Boolean skipOverWhitespace)
at System.Xml.XmlLoader.ReadCurrentNode(XmlDocument doc, XmlReader reader)
at System.Xml.XmlDocument.ReadNode(XmlReader reader)
at System.Data.DataSet.ReadXml(XmlReader reader, Boolean denyResolving)
at System.Data.DataSet.ReadXml(String fileName)
at WUD.UpdateListManager..ctor(String path)
at WUD.formMain.refreshULs()
at WUD.formMain.buttonRefreshULs_Click(Object sender, EventArgs e)
at System.Windows.Forms.Control.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.ButtonBase.WndProc(Message& m)
at System.Windows.Forms.Button.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)


************** Loaded Assemblies **************
mscorlib
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.1433 (REDBITS.050727-1400)
CodeBase: file:///C:/WINDOWS/Microsoft.NET/Framework/v2.0.50727/mscorlib.dll
----------------------------------------
WUD
Assembly Version: 2.30.980.0
Win32 Version: 2.30.980.0
CodeBase: file:///C:/Program%20Files/Windows%20Updates%20Downloader/WUD.exe
----------------------------------------
System.Windows.Forms
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.1433 (REDBITS.050727-1400)
CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Windows.Forms/2.0.0.0__b77a5c561934e089/System.Windows.Forms.dll
----------------------------------------
System
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.1433 (REDBITS.050727-1400)
CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System/2.0.0.0__b77a5c561934e089/System.dll
----------------------------------------
System.Drawing
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.1433 (REDBITS.050727-1400)
CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Drawing/2.0.0.0__b03f5f7f11d50a3a/System.Drawing.dll
----------------------------------------
System.Data
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.1433 (REDBITS.050727-1400)
CodeBase: file:///C:/WINDOWS/assembly/GAC_32/System.Data/2.0.0.0__b77a5c561934e089/System.Data.dll
----------------------------------------
System.Xml
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.1433 (REDBITS.050727-1400)
CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Xml/2.0.0.0__b77a5c561934e089/System.Xml.dll
----------------------------------------
System.Web
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.1433 (REDBITS.050727-1400)
CodeBase: file:///C:/WINDOWS/assembly/GAC_32/System.Web/2.0.0.0__b03f5f7f11d50a3a/System.Web.dll
----------------------------------------
System.Configuration
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.1433 (REDBITS.050727-1400)
CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Configuration/2.0.0.0__b03f5f7f11d50a3a/System.Configuration.dll
----------------------------------------

************** JIT Debugging **************
To enable just-in-time (JIT) debugging, the .config file for this
application or computer (machine.config) must have the
jitDebugging value set in the system.windows.forms section.
The application must also be compiled with debugging
enabled.

For example:

<configuration>
<system.windows.forms jitDebugging="true" />
</configuration>

When JIT debugging is enabled, any unhandled exception
will be sent to the JIT debugger registered on the computer
rather than be handled by this dialog box.

Q: Is there some alternate way to enter an ampersand "&", into a .ul file where the URL will still work?



This Tech Republic Article discusses the difficulties with using the ampersand in an XML file.

I tried using "&amp;" in place of "&".

The good news is that WUD will not crash with such a .ul file. Also, using "&amp;" in place of "&" works in the article="" attribute of the <UPDATE></UPDATE> tags.

The bad news is that using "&amp;" in place of "&" does not work as part of the <URL></URL> tags--WUD will not download a file linked with such a URL correctly. It seems WUD does not translate "&amp;" back to a "&" when it navigates to the download URL.

Edited by DarkShadows

I tried using "&amp;" in place of "&".

That's indeed how an ampersand is escaped in XML. It should work on anything that uses XML for any purpose, 100% transparently (the XML parser handles it all).
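To illustrate that the parser undoes the escaping on its own, here is a minimal sketch in Python rather than the .NET code WUD actually uses. The `<update>`/`<url>` layout is a simplified, hypothetical stand-in for a real .ul file, and `read_url` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified .ul fragment; the real WUD schema may differ.
UL_FRAGMENT = """<update article="Samples &amp; tools">
  <url>http://www.codeplex.com/Project/Download/FileDownload.aspx?ProjectName=MSFTDBProdSamples&amp;DownloadId=36602</url>
</update>"""


def read_url(xml_text: str) -> str:
    """Parse the fragment and return the text of its <url> element."""
    root = ET.fromstring(xml_text)
    # The parser has already turned "&amp;" back into "&" at this point.
    return root.find("url").text.strip()


if __name__ == "__main__":
    print(read_url(UL_FRAGMENT))
```

The application never sees "&amp;" at all: by the time the URL is read from the parsed document it contains a plain "&" again.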

The bad news is that using "&amp;" in place of "&" does not work as part of the <URL></URL> tags--WUD will not download a file linked with such a URL correctly. It seems WUD does not translate "&amp;" back to a "&" when it navigates to the download URL.

I just tried it and it works just fine. The proper request is made. For example, here's a download URL for some sample database off of CodePlex:

http://www.codeplex.com/Project/Download/FileDownload.aspx?ProjectName=MSFTDBProdSamples&DownloadId=36602

You'd use:

<url>http://www.codeplex.com/Project/Download/FileDownload.aspx?ProjectName=MSFTDBProdSamples&amp;DownloadId=36602</url>

in the .ul file (& -> &amp;). And it works as intended just as I expected. Here's the actual HTTP request being made by WUD:

GET /Project/Download/FileDownload.aspx?ProjectName=MSFTDBProdSamples&DownloadId=36602 HTTP/1.1
Host: www.codeplex.com
Connection: Keep-Alive

So the problem isn't caused by WUD. What's likely happening here is that the web server checks for a "proper" HTTP referer (yes, "referer" and not "referrer", I know... blame RFC 2616!) and, if it's not there, returns an error HTML page instead of your download (a much smaller file). Some pages also force you to POST (with form data) instead of GET to download a file. Similarly, they can check for a cookie. Try with the URL above, as it does exactly that: you'll download a file, but it's really a web page, not the .msi file we were expecting. They want to force you to click on "I agree" before downloading a file.

Try that URL in a web browser, it probably won't work either. Or rename the downloaded file to .htm and open it.
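Instead of renaming the file by hand, the same check can be sketched in a few lines. This is only a rough heuristic in Python, not something WUD does; `looks_like_html` is a made-up helper name, and the 512-byte window is an arbitrary choice.

```python
def looks_like_html(data: bytes) -> bool:
    """Heuristic: does the payload start like an HTML page?"""
    # Skip a UTF-8 byte order mark and leading whitespace, then compare
    # the first bytes against common ways an HTML document begins.
    head = data[:512].lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    return head.startswith((b"<!doctype html", b"<html", b"<head", b"<body"))


if __name__ == "__main__":
    login_page = b"<!DOCTYPE html>\n<html><body>Please log in</body></html>"
    # .msi files are OLE compound documents and start with this signature.
    msi_start = bytes.fromhex("d0cf11e0a1b11ae1") + b"\x00" * 16
    print(looks_like_html(login_page), looks_like_html(msi_start))
```

A downloader could run a check like this on the first bytes it receives and warn that the "download" is really a login or licence page.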


Hmmm...okay, it is more than likely a web site login issue. (I get a small file that is really an .html doc for a login screen.) The strange thing is, I can paste the same URL (without changing the ampersands) in Internet Download Manager and download the file just fine. But if I use the same URL in WUD (after changing the ampersands), it does not work; I get the smaller HTML file (renamed as my target download).


You can PM me the URL (or paste it here), and I'll see exactly why it works or doesn't. It could be a few other things, like redirects (which a browser or "advanced" download managers would follow, but WUD probably not).


Looks like Crahak got to you before I could... :P

Although I should put in some error handling to deal with the crash caused by the ampersand, he's correct that it is indeed invalid. The XML standard does indeed state that you must escape ampersands and use their escaped equivalent, thus, &amp;.

If either one of you figures out what's wrong with the download URL, I can make the relevant changes to WUD. Let me know, I'll be glad to help accommodate.


Although I should put in some error handling to deal with the crash caused by the ampersand

Either some error handling for XML parser errors (letting you know your XML isn't well formed, i.e. catch (XmlException e) { MessageBox.Show("Your XML is teh suck!"); } ), or perhaps trying to escape all ampersands that aren't followed immediately by "amp;" using a simple regular expression (i.e. anything matched by &[^a] isn't escaped for sure, although that isn't perfect). Even blindly replacing all "&" by "&amp;" and immediately after all "&amp;amp;" by "&amp;" (fixing those that might already have been good) would work in most cases, as ghetto as that may be. Or then again doing it right and checking if they all are escaped before replacing. Lots of options :)

If either one of you figures out what's wrong with the download URL, I can make the relevant changes to WUD. Let me know, I'll be glad to help accommodate.
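The "checking if they are escaped before replacing" option could look roughly like the sketch below. It is written in Python for illustration (WUD is .NET), and it uses a negative lookahead instead of the &[^a] pattern mentioned above, so that "&lt;", "&quot;" and numeric references are left alone too. `escape_bare_ampersands` is a made-up name.

```python
import re

# Matches an "&" that does NOT start one of the five predefined XML
# entities or a decimal/hex character reference.
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text: str) -> str:
    """Escape only the ampersands that are not already part of an entity."""
    return BARE_AMP.sub("&amp;", text)


if __name__ == "__main__":
    raw = "<url>http://example.invalid/get?a=1&b=2&amp;c=3</url>"
    print(escape_bare_ampersands(raw))
```

Because already-escaped ampersands are skipped, running the function twice gives the same result as running it once. It is still a text-level patch applied before parsing, so it would also rewrite ampersands inside comments or CDATA sections, which a real fix might want to leave alone.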

His download is a forum attachment, and you need valid cookie information in the HTTP request to download it (you have to be logged in) :( Adding the "Cookie: ..." part to the request is trivial (using CookieContainer), but the data has to come from somewhere. You could try to get the cookie data from the browsers, but that might be tricky. IE, Firefox, and Opera all store them in different places/formats. And even just for IE, v6, 7 and 8 likely store them in different places. XP/Vista use different locations for sure, and in protected mode on Vista, it's a different folder too. Similarly, Fx 2 and 3 use different formats (a SQLite DB as of v3). And that still wouldn't work unless you had already logged in to the site on that computer, using a supported browser (and perhaps a specific version of it), and the cookie hasn't expired since.
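To show how small the "adding the Cookie header" part is compared with obtaining the cookie values, here is a sketch using Python's urllib in place of .NET's CookieContainer. The URL, the cookie names and their values are all invented for the example, and `build_request` is a hypothetical helper.

```python
import urllib.request


def build_request(url: str, cookies: dict) -> urllib.request.Request:
    """Build a GET request carrying the given cookies in a Cookie header."""
    header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    headers = {"Cookie": header} if header else {}
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    # Placeholder values; real ones would have to come from a logged-in session.
    req = build_request(
        "http://forum.example.invalid/attachment?id=1&post=2",
        {"session_id": "abc123", "member_id": "42"},
    )
    print(req.get_header("Cookie"))
```

The request is only built here, not sent. The hard part described above, finding a valid and unexpired session cookie on the user's machine, is not addressed by this sketch at all.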

Most of those issues can't easily be worked around, unless you're willing to really make your XML schema (and code) more complicated so your app can send HTTP referer, cookies, user agents, HTTP POST with form data, etc. Personally, I don't think it's worth making it all over-complicated, just for a handful of special cases where people went out of their way to make something not easily downloadable. But ultimately it's your call...

Supporting 30x server redirects (HttpStatusCode.Moved, HttpStatusCode.MovedPermanently, HttpStatusCode.Redirect, etc) would be a good idea though, if you're looking for something to keep you busy ;)


I just wanted to add a purely user perspective about the URL I was trying to use. It was always my intention, and my understanding, that I would first have to log into the forum web site (saving my login information inside a cookie) before trying to use WUD to download a number of files from that forum.

As I stated above, the same technique worked in Internet Download Manager (even after the browser window was closed). But, if handling such a thing is bigger than a breadbox, then don't bother with it.


Yea, I think I'll code in some exception handling for XML malformations.
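The general shape of such handling, sketched in Python rather than in WUD's actual .NET code: catch the parser's exception and turn it into a message that tells the user where the file is broken, instead of letting it surface as an unhandled-exception dialog. `load_update_list` is a hypothetical function name.

```python
import xml.etree.ElementTree as ET


def load_update_list(xml_text: str):
    """Return (root, None) on success or (None, message) if parsing fails."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as exc:
        # ParseError.position is a (line, column) tuple.
        line, column = exc.position
        message = (
            f"Update list is not well-formed XML "
            f"(line {line}, column {column}): {exc}"
        )
        return None, message


if __name__ == "__main__":
    root, error = load_update_list("<url>http://example.invalid/?a=1&b=2</url>")
    print(error if root is None else root.text)
```

A caller can then show the message and skip the broken file while still loading the other update lists.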

As for cookie support, I'm afraid crahak's pretty much nailed the complications associated with doing that (in terms of retrieving an already existing cookie for, say, the forum login). The reason why (to my understanding, maybe crahak could confirm this) download managers achieve this is because they're basically browser plugins, so they have access to the browser cookies.

Finally, handling individual status codes is not necessary, as it is the default behavior of the HttpWebRequest class to automatically follow URL redirection.


The reason why (to my understanding, maybe crahak could confirm this) download managers achieve this is because they're basically browser plugins, so they have access to the browser cookies.

I can't really confirm or deny anything unfortunately. It would depend a lot on which download manager it is. Some might have BHO's or plugins or such, some are standalone, and some are merely extensions for a browser.

Finally, handling individual status codes is not necessary, as it is the default behavior of the HttpWebRequest class to automatically follow URL redirection.

Nice, that's good to know :) I hadn't actually tested it (I don't write any downloaders either as there's already hundreds of them, many of which fill my specific needs, e.g. wget, DownThemAll!, etc)

