The user-agent header
As mentioned in our overview of HTTP request headers,
the user-agent header specifies information about the browser or robot that made
the request. Here are some examples of user-agent strings:
1. Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 1.0.3705)
2. Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/418.9 (KHTML, like Gecko) Safari/419.3
3. Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
4. Opera/9.27 (Windows NT 5.1; U; en)
5. Opera/9.50 (J2ME/MIDP; Opera Mini/4.1.11320/534; U; en)
6. BlackBerry8100/4.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/125
7. T-Mobile Dash Mozilla/4.0 (compatible; MSIE 4.01; Windows CE; Smartphone; 320x240)
8. msnbot/1.1 (+http://search.msn.com/msnbot.htm)
9. libwww-perl/5.800
As you can see, there is some variety in the format of user-agent
headers. However, the following generally apply:
- Browser, client and sometimes module names can be suffixed with a slash followed by
the version number, e.g. Opera/9.27, Safari/419.3;
- In general, we can take the first substring outside brackets of the form xxxx/yyyy as being the client and version except in the case of Mozilla/yyyy, which many agents insert anyway;
- Examples like [2] are a problem, however: we need to know in advance
that "Safari" is the name of the browser and not "AppleWebKit";
- Robots may insert + and a URL that is intended to be the "home page" of the
robot, giving information about its purpose etc; genuine robots generally provide this;
- Elements are generally semicolon separated;
- There's no strict interpretation we can make of the brackets (compare [6] and [7] above,
or [2]);
- There's no easy way to tell a plugin/module name from an operating system name: we
just have to know some common cases to help us make a guess (see the list below).
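The heuristics above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not a complete parser: the class name UserAgentSketch, its method names and the short KNOWN_BROWSERS list are all hypothetical choices made for this example. It removes the bracketed sections, collects the name/version tokens that remain, prefers a name it already knows to be a browser (which is what resolves cases like [2]), and otherwise falls back to the first token that is not Mozilla/yyyy. It also picks out a robot's "+URL" if one is present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UserAgentSketch {
    // A bracketed section such as "(Windows NT 5.1; U; en)"
    private static final Pattern BRACKETED = Pattern.compile("\\([^)]*\\)");
    // A token of the form name/version, e.g. Opera/9.27
    private static final Pattern PRODUCT = Pattern.compile("[^\\s/()]+/[^\\s()]+");
    // A robot "home page": a plus sign followed directly by a URL
    private static final Pattern ROBOT_URL = Pattern.compile("\\+(https?://[^\\s;)]+)");
    // Assumption: a deliberately short, illustrative list of names we
    // "know in advance" to be browsers rather than modules or engines.
    private static final Set<String> KNOWN_BROWSERS = Set.of("Firefox", "Safari", "Opera");

    /** Returns the name/version tokens found outside brackets, in order. */
    public static List<String> productTokens(String userAgent) {
        String outside = BRACKETED.matcher(userAgent).replaceAll(" ");
        List<String> tokens = new ArrayList<>();
        Matcher m = PRODUCT.matcher(outside);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    /** Best guess at the client token, or empty if only Mozilla/yyyy is present. */
    public static Optional<String> guessClient(String userAgent) {
        List<String> tokens = productTokens(userAgent);
        // First preference: a name we already know is a browser.
        for (String token : tokens) {
            String name = token.substring(0, token.indexOf('/'));
            if (KNOWN_BROWSERS.contains(name)) {
                return Optional.of(token);
            }
        }
        // Fallback: the first token that is not the ubiquitous Mozilla/yyyy.
        for (String token : tokens) {
            if (!token.startsWith("Mozilla/")) {
                return Optional.of(token);
            }
        }
        return Optional.empty();
    }

    /** Returns the URL a robot advertises after a '+', if any. */
    public static Optional<String> robotUrl(String userAgent) {
        Matcher m = ROBOT_URL.matcher(userAgent);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

With the example strings above, guessClient gives Safari/419.3 for [2] and Opera/9.27 for [4], while [1] yields an empty result because the real client name (MSIE) sits inside the brackets. Handling that case would need further rules that this sketch does not attempt.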
Common client strings in the user-agent header
For reference, the following commonly occur in the non-bracketed section
of the user-agent string. I've indicated the approximate frequency of given
agents in requests to my server (taken over a sample of requests from approximately 12,000 unique
clients). Obviously, these figures could differ according to the target audience of your site.
Agent name | % occurrence | Type | Explanation |
Mozilla | 99.5 % | Compatibility | Essentially meaningless nowadays. Originally, this agent name denoted either a version of Netscape or the version of Netscape with which another agent wanted to claim compatibility. Nowadays, even the most trivial home-made bot tends to include Mozilla/4.0 or Mozilla/5.0 at the start of the user-agent string. |
Gecko | 24.8 % | Compatibility | Gecko is the rendering engine used by Firefox and a handful of other Mozilla spin-offs such as SeaMonkey. In principle, you could take notice of this field to alter page contents to work around some rendering bug, or offer enhanced content if you were aware of some feature that Gecko rendered better than other engines. In practice, I suspect practically all servers ignore this field. The Safari web browser amusingly includes the string (KHTML, like Gecko) in its user-agent string. |
Firefox | 24.4 % | Browser | Identifies the Firefox web browser. |
Safari | 7.0 % | Browser | Apple's Mac OS browser. More information on the user-agent string sent by Safari is given in Apple's Safari FAQ. |
AppleWebKit | 7.0 % | Compatibility | What Apple describe as their "web technology framework", used by the Safari browser. |
Version | 6.1 % | Compatibility | Used by version 3.0 or greater of Apple's Safari browser to indicate the "Safari family version number" (see entry for Safari above). |
msnbot | < 1 % | Robot | Microsoft's search bot for their MSN Live Search engine. |
Profile | < 1 % | Compatibility | Sent by some Java-compatible mobile devices to indicate which Java Mobile Information Device Profile (MIDP) they support. A typical specification would be Profile/MIDP-2.0. BlackBerries and Nokia and Sony Ericsson mobile phones (possibly among others) send this tag. |
Configuration | < 1 % | Compatibility | Sent by some Java-compatible mobile devices to indicate which version of the Java Connected Limited Device Configuration they support. |
BlackBerryXXXX | < 1 % | OS / Browser | BlackBerry devices appear to send as their browser/OS identifier a string that includes the model number (e.g. a BlackBerry 8310 with OS version 4.3 would send BlackBerry8310/4.3.0). |
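As a rough illustration of how the table above might be used in code, the following Java sketch maps an agent name to the "Type" column. The class name AgentTokenTypes, the method typeOf and the "Unknown" fallback are hypothetical choices for this example. The entries are just the rows listed above, so a real server would need a longer list suited to its own traffic.

```java
import java.util.Map;

final class AgentTokenTypes {
    // The "Agent name" and "Type" columns of the table above.
    private static final Map<String, String> TYPES = Map.of(
        "Mozilla", "Compatibility",
        "Gecko", "Compatibility",
        "Firefox", "Browser",
        "Safari", "Browser",
        "AppleWebKit", "Compatibility",
        "Version", "Compatibility",
        "msnbot", "Robot",
        "Profile", "Compatibility",
        "Configuration", "Compatibility");

    /**
     * Classifies a token such as "Safari/419.3" (or a bare name such as
     * "Safari"). Returns "Unknown" for names that are not in the table;
     * that label is an assumption of this sketch.
     */
    public static String typeOf(String token) {
        int slash = token.indexOf('/');
        String name = slash < 0 ? token : token.substring(0, slash);
        // BlackBerry devices append the model number to the name,
        // so they are matched by prefix rather than by exact name.
        if (name.startsWith("BlackBerry")) {
            return "OS / Browser";
        }
        return TYPES.getOrDefault(name, "Unknown");
    }
}
```

For instance, typeOf("Safari/419.3") gives "Browser" and typeOf("BlackBerry8100/4.2.1") gives "OS / Browser", while a name not listed above, such as libwww-perl, comes back as "Unknown".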
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.