Advanced Application-Level OS Fingerprinting: Practical Approaches and Examples
29 Oct. 2008
Dan presents an alternate approach to application-level OS fingerprinting. The general idea is that there are certain requests that can be made to cross-platform applications which result in OS-dependant responses.
The current state of OS fingerprinting involves, for the most part, layer 3 & 4 requests and responses. This includes tools like nmap, nessus, p0f, and sinFP. These tools make specific queries and examine the response for things like TCP/IP stack settings, TCP option support, the number of syn+ack packets sent before a timeout, the time interval between syn+ack packets, and the presence of a rst packet sent at timeout. Many of these things are manipulated or deformed by intermediary devices, and much of it can be fairly easily spoofed using freely available tools like IP Personality and Security Cloak.
Application-level OS fingerprinting, on the other hand, relies on data gleaned from an application running on the target host. The current methods include the identification of OS-specific applications, such as Remote Desktop, and "banner grabbing", which involves connecting to a cross-platform application and recording what operating system it claims to be running, usually in the welcome banner (thus the name). Banner grabbing is trivial to fool, usually requiring nothing more complicated than a change in a configuration file. It's not unheard of, for instance, for Apache servers to report that they are running on a Commodore 64.
Considering this, Dan presents an alternate approach to application-level OS fingerprinting. The general idea is that there are certain requests that can be made to cross-platform applications which result in OS-dependent responses. This can be in the form of the data returned by the application, the lack of response, or timing differences in the response, although this is almost certainly not a comprehensive list. One example of existing usage of this technique is in . Based on known differences in responses and known OSes associated with specific responses, one can determine the OS of a host running a service for which such signatures exist.
There are, of course, many differences in operating systems, but only some of them are actually feasible to measure remotely (in general scenarios). Here's a non-comprehensive list:
0x0d -> Classic Mac OS
0x0d0a -> win32, DOS, OS/2, SymbianOS
0x0a -> *nix
0x15 -> EBCDIC-based
Data alignment, word size, the existence of OS-specific binaries, and processor feature support are more criteria that can be used, but are less commonly determinable remotely.
Additionally, there are differences that are specific to an application that can be used. This varies widely from application to application, but often, these differences take one of a couple of forms. They can be vulnerabilities, or code written to patch those vulnerabilities that exist only on certain platforms (Fyodor has already discussed exploit chronology in  but does not discuss testing the existence of mitigation code). Certain features of programs will not be supported on some OSes, and will on others. The presence of these features eliminates unsupported OSes as a possibility. Finally, certain applications will have features which give out system details very freely. As a part of a default Apache installation, a test perl cgi script, printenv.pl, is placed in the cgi-bin directory. This script, when run, prints all environment variables. This is more or less advanced banner grabbing, but it makes a nice example of leaky features that can be found in certain applications.
But enough talk. Let's get to some real examples!
Example 1 - Apache 2.2.9
-URL must not be URL-encoded: PuTTY or an intercepting http-proxy can be used to ensure this
-Will return a 404 Not Found error
-Again, no URL-encoding
-Will return 200 OK and load front page
(This is an example of mitigation code that exists only on Win32 installations of Apache. Apache, when compiled for Windows, will convert backslashes to slashes. On *nix, it will not. This would work even if there wasn't specific mitigation code for Win32, but the fact that Apache on *nix doesn't change the backslashes to slashes means that backslashes will be interpreted as a part of a filename, and will just be extraneous slashes to Win32, resulting in \\\. being interpreted as a filename on *nix, and as a reference to the root dir of the website on a win32 machine.)
Example 2 - Apache 2.2.9
-Returns 404 Not Found
-Not a windows based system
(This works because Apache won t have read access to the bit bucket nothing should! On DOS-style systems, special files like nul and con exist and can be accessed from all directories. This should also work on other cross-platform web servers, ftp daemons, etc but Dan hasn't tested it)
Example 3 - Apache 2.2.9
-404 Not Found
(Apache on Win32 doesn't appreciate EOF markers being stuck into URIs Funny enough, it doesn't like anything but GL codes(0x20-0x7f). Dan doesn't know where in the code this happens. Dan thinks it is likely a limitation of the system-level functions failing with characters outside a given range. Other operating systems aren't as picky.)
-Has drive and pathnames in file
-No drive or pathnames in file
-Unique to XP Media Center
-Auto-generated image thumbnail database
-Exists in every dir with images (or certain other files) viewed in Windows Explorer with thumbnails on (even if images are later deleted)
-Generated on 98, ME, 2K, XP, 2003 (Maybe more, documentation is very sparse)
-Differs in contents between 98/ME/2K and XP/2003
-Win2k will use alternate data streams for thumbnail storage on NTFS volumes and thumbs.db on FAT partitions
-Windows XP Media Center Edition will also create ehthumbs.db for videos
(table expanded from )
System | Win98 | WinME | Win2k | WinXP | Win2k3
Drive | Yes | Yes | Yes | No | No
Filename| Yes | Yes | Yes | Yes | Yes
Path | Yes | Yes | Yes | No | No
Last Mod| Yes | Yes | Yes | Yes | Yes
(One note about this: Dan was confronted with a problem with this method by Mike Eddington of Leviathan Security, who pointed out the possibility that a thumbs.db file could be uploaded to a webserver along with the corresponding images. After some thought, Dan realized that you could check the last modified date of the thumbs.db file as sent by the web server and the last modified date as recorded in the file. If they match, it was updated by the server itself!)
Example 5 - IIS 
This one's pretty simple. IIS versions correspond to specific versions of Windows. Enumerate the IIS version, and you get the OS version.
IIS version | OS version
1.0 | NT 3.51 SP3
2.0 | NT 4.0
3.0 | NT 4.0 SP3
4.0 | NT 4.0 SP3
5.0 | Win2k
5.1 | XP Pro
6.0 | Server 2003
7.0 | Vista, Server 2008
(Dan hasn't figured out ways to determine between IIS versions, although exploit chronology and feature support would likely be good candidates, and the version of IIS being reported should give a good indication. Dan suspects, however, that it should be easy for an admin to spoof.)
Example 6 - default FTP daemons
Run the raw ftp commands rnfr . and then rnto . against a default FTP daemon on a writable directory. It will generally spit back "350 File exists, ready for destination name" followed by a message about what happened with the operation
550 rename: Is a directory.
250 RNTO command successful
550 rename: Invalid argument.
Ubuntu 8.04 server
550 rename: Device or resource busy.
More quick examples
Win32 Apache doesn't like colons in urls, other OSes don't care so much
Win32 Apache will accept, for example, /BLAHBL~1.HTM as a valid replacement for blahblahblah.html where it will not on other OSes
(This one kinda sounds like security hell)
Other (less useful) signatures
The presence of $MFT in the root of a volume suggests an NTFS volume
Accessing a directory as a file will work on BSD systems
-FreeBSD and NetBSD will spit back binary data
-OpenBSD will return nothing but won t complain
Windows can only use one audio input source at a time
(Dan came up with these too and wanted to include them for completeness, but they're so case-specific that Dan doesn't want to give them too much time.)
If you'd like to find some signatures yourself, try grepping for #ifdef and #ifndef in the sources of any given application (if you have sources and they're C). "Linux", "BSD", "MacOS", etc are also nice candidates. Additionally, the basis for many of the examples I've given here should apply to many different applications which do similar operations or make similar system calls. Requesting nul from any application that serves or queries for files should quickly identify a Windows system.
In conclusion, application-level OS fingerprinting using multi-platform applications is possible and plausible without using banner-grabbing. OS identity info leakage in applications is not well considered (Dan based this on the fact that Dan was able to find 3 leaks with the latest version of Apache in 1 hour of searching). This method is user, and can be combined with other fingerprinting methods, which is where Dan feels it would be most effective, though with enough signatures, this method could stand entirely on its own.
As for further work that can be done in this field, there are definitely timing differences in application responses that could be used for OS fingerprinting. It should also be possible to find OS-version-specific responses in platform-specific applications, much like the IIS version-to-OS version example. Also, these techniques could be coded into a tool, but I'm not a great coder by any means. Dan hopes to release some python scripts for a few of these examples shortly after the paper is released. And finally, there's more applications out there to use for fingerprinting!