Posted

I guess the only problem with that string is that it'd stick out like a sore thumb to Web sites that "fingerprint" their visitors (the better to track them). But hopefully it'll be pretty compatible. Let us know if you run into any problems with specific Web sites using it.

One of the things I liked about Opera (at least Opera 12; I haven't tried this with the newer Chromium-based versions) is that you could set your user agent string on a site-by-site basis to report as Opera, Firefox, or IE, so you could work around sites that insist on IE or Firefox because they never heard of Opera. Come to think, I wouldn't be surprised if there's a Firefox or Chrome add-in that does something similar (although I haven't looked).
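
To make the site-by-site idea concrete, here is a minimal sketch of how such an override table might work. This is not Opera's actual implementation; the hostnames, the `SITE_OVERRIDES` table and the `userAgentFor` helper are all made up for illustration, and the strings are just typical examples.

```javascript
// Hypothetical default string for the browser (an Opera 12-style UA).
const DEFAULT_UA = "Opera/9.80 (Windows NT 5.1) Presto/2.12.388 Version/12.18";

// Hypothetical per-site table: hostname -> user agent string to report there.
const SITE_OVERRIDES = new Map([
  ["ie-only.example", "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko"],
  ["firefox-only.example", "Mozilla/5.0 (Windows NT 6.1; rv:52.0) Gecko/20100101 Firefox/52.0"],
]);

// Return the string to send for a URL: the site's override if one exists,
// otherwise the browser's default.
function userAgentFor(url) {
  const host = new URL(url).hostname;
  return SITE_OVERRIDES.get(host) ?? DEFAULT_UA;
}
```

With this table, `userAgentFor("http://ie-only.example/page")` reports the IE-style string, and any site not listed gets the default.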


Posted

Here is one example of where changing the user agent string doesn't help when spoofing later versions of IE. If your browser isn't really IE10 or IE11, it doesn't do anything with this HTML:

<meta http-equiv="X-UA-Compatible" content="IE=10"/>

Which is an instruction to the browser (if it is IE10 or IE11) to handle things according to IE10 rules, whatever they are. Ideally, this is not some throwaway code to have on a web page, but as we all know, many websites are not written very well. It is impossible to know for sure whether the page author added this because it was required for something to work, or because they just copied it from some example, or used a generator that added it. BUT, in the cases where it is required in order for a page to work properly in Internet Explorer, then this alone may be responsible for delivering non-working markup to your non-IE browser with an IE user agent.

The content value on that meta tag can specify a wide variety of IE versions, including Edge or even IE5.5! :wub:
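
To show how little there is to that tag, here is a rough sketch of reading its content value. The `parseXUACompatible` helper is hypothetical, and it assumes directives are separated by semicolons or commas; it is not how IE itself parses the tag.

```javascript
// Hypothetical helper: pull the requested IE modes out of an
// X-UA-Compatible content value such as "IE=10" or "IE=edge".
// Returns the modes as lowercase strings, in the order listed.
function parseXUACompatible(content) {
  return content
    .split(/[;,]/)                          // assumed separators between directives
    .map((part) => part.split("="))
    .filter(([key, value]) => key.trim().toLowerCase() === "ie" && value)
    .map(([, value]) => value.trim().toLowerCase());
}

console.log(parseXUACompatible("IE=10"));   // [ '10' ]
console.log(parseXUACompatible("IE=edge")); // [ 'edge' ]
```

A non-IE browser never acts on any of this, which is why spoofing the user agent can't make up for it.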

Posted

Wouldn't the code above mean that the page would ONLY work with IE?

If so, wouldn't that be foolish of a web developer to only code for IE?

Posted

No it doesn't mean that. It means that if the browser is IE, then force that ruleset. With modern IE, it has the compatibility mode. If you ever went to a website and it only worked in compatibility mode, then likely that website did not use that tag.

Posted

I have already found a "glitch" (or a feature). With the standard IE 11 string, http://detectmybrowser.com/ detects the browser as Mozilla 11.

Why would this happen? That is the user agent for IE 11, not Mozilla 11.

dmb1.jpg

dmb2.jpg

Posted

It is a problem with that site and how it identifies the browser! For example, for me, it says:

"You're using browser version 1 on Operating System"

We can see exactly how it is giving you the information, from their .js: http://detectmybrowser.com/javascripts/browser.js

		{
			string: navigator.userAgent,
			subString: "Gecko",
			identity: "Mozilla",
			versionSearch: "rv"
		},

It shows Mozilla, simply because it sees Gecko in the string. I am not certain if it is doing a "select" or if the function immediately exits after finding the first match. It could be tested by having two of the things it is looking for in the user agent. See the substring values in the DataBrowser section of the js to see the values it recognizes.

BUT, this is how a website uses a User Agent in most cases. They look for certain text and may provide different code (whole pages, includes, css, whatever) based on what it finds. And it will vary on a site by site basis. As in this example, we only see it says "Mozilla" because that is what that particular web author has classified any user agent with Gecko to be. It will be different for other websites and site authors.
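
Here is a small sketch of that kind of substring matching, to show how an IE 11 string can come out as "Mozilla 11". Only the Gecko rule is taken from the site's script; the other two rules, the rule order and the first-match behaviour are my assumptions, not the site's confirmed logic.

```javascript
// Hypothetical rule table. Only the "Gecko" entry mirrors the quoted script;
// the others are illustrative.
const rules = [
  { subString: "Firefox", identity: "Firefox", versionSearch: "Firefox" },
  { subString: "MSIE", identity: "Explorer", versionSearch: "MSIE" },
  { subString: "Gecko", identity: "Mozilla", versionSearch: "rv" },
];

// Return the first rule whose subString occurs in the user agent, plus the
// number that follows its versionSearch token (assumed first-match-wins).
function detect(userAgent) {
  for (const rule of rules) {
    if (!userAgent.includes(rule.subString)) continue;
    const at = userAgent.indexOf(rule.versionSearch);
    const version = at === -1
      ? undefined
      : parseFloat(userAgent.slice(at + rule.versionSearch.length + 1));
    return { identity: rule.identity, version };
  }
  return { identity: "unknown", version: undefined };
}

// A typical IE 11 string: no "MSIE" token, but it ends in "like Gecko".
const ie11 = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko";
console.log(detect(ie11)); // { identity: 'Mozilla', version: 11 }
```

The IE 11 string has no "MSIE" in it, so the only rule that matches is the Gecko one, and the "rv:11.0" token supplies the version.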

And we can probably guess why your example returns the results it did for me. For testing a user agent the way that you have, you would not use a site like the one you linked, because it isn't telling you what you want to know. You would use a site that actually shows you your user agent. There are only two reasons to use these sites:

1. You want to verify that your browser is actually reporting the user agent correctly, or to verify against typos. (Because obviously, you can see it where you edited it on the client)

2. You want to know what some other device's user agent is because you want to copy its string for use on another browser.

Here is a site that tells you exactly what you want to know:

http://whatsmyuseragent.org/

(It is an example)

Posted

dencorso has a valid point ... but I am interested in keeping this discussion going for awhile ... some interesting information being posted.

I went to What is my Browser.com and got this with my XP setup. I have three different UAgents : FFox XP, FFox Win7 and DuckDuckBot.

-------------------------------------------------------------------------
 

Whoa! We can't figure out what browser you're using!

We're working hard to write detection code for all the different types of web browsers, but it looks like we haven't figured yours out yet.

And occasionally, either because of a problem or a changed configuration, sometimes web browsers don't provide the necessary information for us to detect exactly what you're using.

Hopefully soon we can detect your web browser, until then; check out your User Agent string:

DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)

...

Posted
1 hour ago, monroe said:

dencorso has a valid point ... but I am interested in keeping this discussion going for awhile ... some interesting information being posted.

No probs there: but the hijacking posts would be reasonably on-topic in @sdfox7's thread about browser spoofing...  :angel

On 7/4/2017 at 6:00 PM, sdfox7 said:

@Mathwiz

Did you see my thread in May about spoofing Firefox on Windows 2000/XP?

Most of these "obsolete" browsers run fine, and will not be blocked, if you just change the user agent. @jumper tested it with Windows 98:

Posted

I wasn't clear in my last post ... I wanted the User Agent discussion to continue for awhile but not in this thread started by Roffen. He was posting about something else.

Thanks for the links to the other threads dealing with user agents ... that should work just fine.

Posted
17 hours ago, monroe said:

dencorso has a valid point ... but I am interested in keeping this discussion going for awhile ... some interesting information being posted.

I went to What is my Browser.com and got this with my XP setup. I have three different UAgents : FFox XP, FFox Win7 and DuckDuckBot.

-------------------------------------------------------------------------
 

Whoa! We can't figure out what browser you're using!

We're working hard to write detection code for all the different types of web browsers, but it looks like we haven't figured yours out yet.

And occasionally, either because of a problem or a changed configuration, sometimes web browsers don't provide the necessary information for us to detect exactly what you're using.

Hopefully soon we can detect your web browser, until then; check out your User Agent string:

DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)

...

So that website is doing the same thing as the earlier one. It is looking at user agents on a list of possible responses and returning the information according to the list. For that specific one you tried, it did not have it and then listed the "unknown" type response. BUT here is the other problem. That website will never put a browser response to that User Agent because it is not a browser. It is a spider identifier.

So we see here perhaps a bit of laziness. In an ideal situation, a spider shouldn't receive that response from the website. It should get something useful to a search engine spider (such as a site description) instead of a message saying the site doesn't know what browser it is. A spider is designed with no person on the other end, so there is no reason to expect it would want to know what its browser is, because it likely isn't using a browser in the first place.

I can't think of any reason to use a search engine spider user agent unless you were testing something on your own website regarding them.
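
As a rough sketch of what a less lazy site could do, it might sort user agents into crawler, browser or unknown before deciding what to send back. The token list and the `classify` function are hypothetical; a real site would need a much longer, maintained list.

```javascript
// Illustrative crawler tokens only; not a complete list.
const CRAWLER_TOKENS = ["Googlebot", "DuckDuckBot"];

// Classify a user agent string as "crawler", "browser" or "unknown".
function classify(userAgent) {
  const ua = userAgent.toLowerCase();
  // Check crawler tokens first, since some crawler strings also start with "Mozilla/5.0".
  if (CRAWLER_TOKENS.some((token) => ua.includes(token.toLowerCase()))) return "crawler";
  if (/firefox|chrome|trident|opera/.test(ua)) return "browser";
  return "unknown";
}

console.log(classify("DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)")); // crawler
```

A site using something like this could hand a crawler a plain description page and keep the "we can't figure out your browser" message for the unknown case.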

Posted

I decided last year to try some bots. I read this article 'Top 10 Web Crawlers and Bots' and just discovered it has been updated from last year to June 2017.

The Google bot didn't seem to work very well ... many sites or web pages refused my attempt to connect but DuckDuckBot seems to work just fine. I have only had two sites (as I remember) refuse to let me read something. When that happens, I just change my UA to 'Firefox Win 7' or 'XP'. I just don't like 'giving out' a lot of information if I don't have to. I also use Proxomitron but sometimes (rarely) have to set it on 'Bypass'. I just tried this last year to see how this might work at some web sites ... if they might be collecting information ... I have no idea if it really works. I set up one K-Meleon browser with DuckDuckBot and the other one with Win 7 or Win XP.

In the article below I just lifted some portion of it for here. It has a lot of information.

https://www.keycdn.com/blog/web-crawlers/

Web Crawlers and User-Agents – Top 10 Most Popular

Brian Jackson  |  Updated: June 6, 2017

When it comes to the world wide web there are both bad bots and good bots. The bad bots you definitely want to avoid as these consume your CDN bandwidth, take up server resources, and steal your content. Good bots (also known as web crawlers) on the other hand, should be handled with care as they are a vital part of getting your content to index with search engines such as Google, Bing, and Yahoo. Read more below about some of the top 10 web crawlers and user-agents to ensure you are handling them correctly.

Web Crawlers

Web crawlers, also known as web spiders or internet bots, are programs that browse the web in an automated manner for the purpose of indexing content.

Crawlers can look at all sorts of data such as content, links on a page, broken links, sitemaps, and HTML code validation.

Search engines like Google, Bing, and Yahoo use crawlers to properly index downloaded pages so that users can find them faster and more efficiently when they are searching. Without web crawlers, there would be nothing to tell them that your website has new and fresh content. Sitemaps also can play a part in that process. So web crawlers, for the most part, are a good thing. However there are also issues sometimes when it comes to scheduling and load as a crawler might be constantly polling your site. And this is where a robots.txt file comes into play. This file can help control the crawl traffic and ensure that it doesn’t overwhelm your server.

Top 10 Web Crawlers and Bots

There are hundreds of web crawlers and bots scouring the internet, but below is a list of 10 popular web crawlers and bots that we have collected based on ones that we see on a regular basis within our web server logs.

1. GoogleBot

Googlebot is obviously one of the most popular web crawlers on the internet today as it is used to index content for Google’s search engine. Patrick Sexton wrote a great article about what a Googlebot is and how it pertains to your website indexing. One great thing about Google’s web crawler is that they give us a lot of tools and control over the process.

User-Agent

User-agent: Googlebot

Full User-Agent String

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

4. DuckDuckBot

DuckDuckBot is the Web crawler for DuckDuckGo, a search engine that has become quite popular lately as it is known for privacy and not tracking you. It now handles over 12 million queries per day. DuckDuckGo gets its results from over four hundred sources. These include hundreds of vertical sources delivering niche Instant Answers, DuckDuckBot (their crawler) and crowd-sourced sites (Wikipedia). They also have more traditional links in the search results, which they source from Yahoo!, Yandex and Bing.

User-Agent

DuckDuckBot

Full User-Agent String

DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)

--- I just chose # 1 and 4 ... there are still 8 more there in the article.

Maybe this isn't a 'good thing' to do ... Win XP doesn't get any credit when I go to a web page I guess ... but on my end, I want to be under the radar as much as possible ... if that really works anyway.

Posted

Under normal circumstances, the user-agent string is more or less fixed: it only changes when your browser gets updated. So it doesn't leak a lot of info to the Web pages you visit; basically just your browser and OS.

So, to stay "under the radar" as much as possible, I'd say you want to choose a common user-agent string rather than a rare one. You want to look just like millions of other folks browsing the Web.

Also, for maximum compatibility, you probably shouldn't misidentify your browser too much. Web sites use the user-agent string to figure out what Javascript code, e.g., to send to your browser.

So given the above, I'd probably lie about my OS (e.g., say it's Windows 10 or at least 7 instead of XP). The only place that would likely matter is microsoft.com. But I'd mostly tell the truth about my browser, unless it's a rare one. I might tell Opera to pretend it's Chrome or Seamonkey to pretend it's Firefox, for instance; and probably report the latest version of those browsers, since most users would be running the latest version.

The only pitfall would be if a Web site sent code intended for the latest version, that doesn't run correctly on the actual browser version I'm running. But even if I reported my correct browser version, I suspect most of those Web sites wouldn't send compatible code anyway - they'd probably just tell me to upgrade my browser!
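
As an illustration of lying about the OS while telling the truth about the browser, here is a small sketch that swaps only the Windows version token. The `spoofWindowsVersion` helper is hypothetical, and the Firefox 52 string is just an example of an XP-era user agent.

```javascript
// Replace the Windows NT version token in a user agent string.
// "5.1" is Windows XP, "6.1" is Windows 7, "10.0" is Windows 10.
function spoofWindowsVersion(userAgent, ntVersion) {
  return userAgent.replace(/Windows NT \d+\.\d+/, `Windows NT ${ntVersion}`);
}

const real = "Mozilla/5.0 (Windows NT 5.1; rv:52.0) Gecko/20100101 Firefox/52.0";
console.log(spoofWindowsVersion(real, "6.1"));
// Mozilla/5.0 (Windows NT 6.1; rv:52.0) Gecko/20100101 Firefox/52.0
```

The browser and version tokens are untouched, so a site that picks its Javascript by browser should still send something that runs.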

Posted

Back in the day, many websites offered a simple, non-interactive, "printer-friendly" version. Using a bot UA is one way to request that now. Using a mobile UA is another way to request a simpler page. Most sites probably don't even react to the UA, but for those that do, "GoogleBot" or "BingBot" might be the best way to get a simple or browser-agnostic page.

Posted
On 7/17/2017 at 5:38 PM, Mathwiz said:

I'd say you want to choose a common user-agent string rather than a rare one. You want to look just like millions of other folks browsing the Web.

 Mathwiz ... that holds merit ... sort of like a large 'school of small fish' darting around, in and out. It's supposed to be confusing to a larger predator ... but I'm sure some get eaten ... but 'safety' is just keep moving and hope with the 'large numbers' of small shiny fish.

Either at the Proxomitron forum or the K-Meleon forum ... there was a discussion about user agents some years back and not giving out any more information than you actually have to. So maybe having several User Agents available ... a 'barebones' type and a few loaded with some information to work at certain trouble sites.

One of the members posted the UA he uses (barebones) and I used it but along the way I have lost it. It may not even work that well in today's world. Still wish I could locate that old UA.

So in reference to your earlier post ... what do you think would be a good common 'school of fish' User Agent worth trying?

 jumper ... that's good information about a BOT UA. I had more trouble with sites when I used the Google Bot UA ... with the DuckDuckGo Bot ... only once in awhile am I refused entry ... like once or twice a month. Then I just switch to a common FFox Win7 UA for that site.

...
