How to Detect Joomla Websites

Have you ever wondered if there are ways to detect Joomla websites? Or in other words: are there any reliable indications that an arbitrary website is powered by Joomla? And if so, what would these look like?

This post is about exactly that.

The reason for this is explained in my post about scanning the Internet for CMS websites. There is also a corresponding post for WordPress.

Markup

The first thing that comes to mind is analyzing the actual markup of a particular website. And yes, there are quite a few things that might give away whether or not the website is a Joomla website.

Generator meta Element

The easiest catch is the generator meta element. Just like Drupal and WordPress, Joomla by default renders an HTML element with meta data about the software used to generate the document (i.e., Joomla 😉 ). The corresponding markup could look like one of the following.

Without version information:

<meta name="generator" content="Joomla! - Open Source Content Management" />

With version information (old versions only):

<meta name="generator" content="Joomla! 1.5 - Open Source Content Management" />

With version information (more recent versions):

<meta name="generator" content="Joomla! - Open Source Content Management - Version 3.6.5" />

If you spot any of these somewhere, you can assume the website is powered by Joomla.

However, the generator is no absolute proof, because it is easy to provide a customized or empty content value, or to not render the meta element at all.
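As a minimal sketch of how this check could be automated, the following Python snippet parses a page's markup with the standard library and looks for a generator value starting with "Joomla!". The class and function names are made up for illustration, and the version extraction simply assumes that the first dotted number in the content value is the Joomla version.

```python
import re
from html.parser import HTMLParser


class GeneratorMetaParser(HTMLParser):
    """Collects the content value of every <meta name="generator"> element."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "generator":
            self.generators.append(attributes.get("content") or "")


def detect_joomla_generator(html):
    """Return (is_joomla, version_or_None) based on the generator meta element."""
    parser = GeneratorMetaParser()
    parser.feed(html)
    for content in parser.generators:
        if content.startswith("Joomla!"):
            # Assumption: the first dotted number is the version, if present.
            match = re.search(r"(\d+(?:\.\d+)+)", content)
            return True, (match.group(1) if match else None)
    return False, None
```

A result of (False, None) only means that no Joomla generator was found, not that the website is not powered by Joomla.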

Assets

Analyzing the asset files referenced in the markup might also hint at Joomla. Common path fragments are, for example, the following:

  • /media/jui/(css|js)/
  • /media/media/(css|images|js)/
  • /media/system/(css|images|js)/
  • /templates/*/(css|images|js)/

In the above list, the asterisk, *, is a wildcard for a single folder name, and the expression (this|that) means either “this” or “that” (i.e., any of the included words separated by a pipe, |). I did not use regular expressions because not everyone is familiar with them. Valid file paths covered by the given patterns are, for example, /media/jui/js/jquery.min.js or /templates/protostar/css/template.css.

Of course, finding one or more of these path fragments in the markup does not necessarily mean the respective website is powered by Joomla. Any other (custom) CMS might use, for example, a /templates/ folder to manage templates/themes, and thus have references to some /templates/*/css/*.css or /templates/*/js/*.js files. Also, there might be Joomla websites that use different names for one or more of the (base) folders, and/or do not reference any system assets at all.

Last but not least, template authors are not forced to have the exact same subfolders from the above list. They can easily decide for /templates/*/img/ instead of /templates/*/images/, or favor /templates/*/javascript/ over /templates/*/js/.
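For an automated check, the path fragments from the list above translate quite directly into regular expressions. The following Python sketch is one possible translation. The names are hypothetical, and the character class used for the template folder name (letters, digits, underscore, dot, hyphen) is an assumption on my side.

```python
import re

# One regular expression per path fragment from the list above.
# Assumption: template folder names consist of word characters, dots and hyphens.
JOOMLA_ASSET_PATTERNS = [
    re.compile(r"/media/jui/(?:css|js)/"),
    re.compile(r"/media/media/(?:css|images|js)/"),
    re.compile(r"/media/system/(?:css|images|js)/"),
    re.compile(r"/templates/[\w.-]+/(?:css|images|js)/"),
]


def find_joomla_asset_hints(html):
    """Return the patterns (as strings) that occur somewhere in the markup."""
    return [p.pattern for p in JOOMLA_ASSET_PATTERNS if p.search(html)]
```

The more patterns match, the stronger the hint, but as explained above, even several matches are no proof.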

body Classes

Unless heavily customized or disabled, most Joomla websites have several of the following HTML classes assigned to the body element:

  • site
  • com_*
  • view-*
  • (no-)layout(-*)
  • (no-)task(-*)
  • itemid-*

Just like with the assets, these class names are no hard evidence. There are Joomla websites that do not have any of the classes, and there are also lots of non-Joomla websites that have one or more of them. However, if you happen to find a combination of the above classes—and even more if com_*, which denotes a specific component, is included—this might mean the website is powered by Joomla.

Update: while I was at JoomlaCamp 2017, I learned that the body classes are generated by the template itself. In other words, Joomla does not include something like the body_class() function that you might know from WordPress. As a result, the above list of classes is not really of any help. If you happen to find several of these classes, you might be in luck and have landed on a Joomla website with a template that follows the example of the default one. But one shouldn’t expect too many websites out there with these classes.

Deep Link URLs

The previous section was not aimed at specific pages or files, but rather at the front end in general. In addition, there are several files—or folders, but in the end it is some index file anyway—that might help in detecting whether or not a website is powered by Joomla.

Admin Panel

The Joomla site administration can usually be reached under /administrator. If the request to this URL is successful—this means you are on the login page—you can do some further investigation. One thing is to check the path of the included assets. In addition to files referenced on the front end, the login page also might use administrator-specific asset files. Path fragments you might want to test are described by the following pattern:

  • /administrator/templates/*/(css|images|img|js)/

Classes on the body element are, by default, these:

  • site
  • com_login
  • view-login
  • layout-default
  • task-
  • itemid-

Interestingly, the task-* and itemid-* classes come without a name and an ID, respectively.

In the markup, the default login page has the following HTML code (fragments) that you can test for:

  • <input name="username" ... type="text" ... />
  • <a href="*?option=com_users&view=remind" ... >
  • <input name="passwd" ... type="password" .../>
  • <a href="*?option=com_users&view=reset" ... >
  • <input type="hidden" name="option" value="com_login"/>
  • <input type="hidden" name="task" value="login"/>
  • <input type="hidden" name="return" value="*"/>

Of course, you might also find references to “joomla” and/or “Joomla”.

Last but not least, you can also try to request /administrator/help/helpsites.xml. If it exists and if you are allowed to access it, you might find a <joshelp> tag inside, and again references to “joomla” and/or “Joomla”.
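The hidden form fields of the default login page listed above lend themselves to a simple automated test. Here is a minimal Python sketch, with hypothetical names, that collects all hidden inputs from the markup and checks for the option and task values. It assumes you have already fetched the markup of /administrator yourself.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs of all <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.hidden[attributes["name"]] = attributes.get("value") or ""


def looks_like_joomla_login(html):
    """Check for the hidden option/task fields of the default login form."""
    parser = HiddenInputParser()
    parser.feed(html)
    return (
        parser.hidden.get("option") == "com_login"
        and parser.hidden.get("task") == "login"
    )
```

A customized login template might render different fields, so a negative result does not rule out Joomla.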

Component Files

In the components/ directory, there are lots of XML files that you can access. This might change soon as XML files are to be blocked via .htaccess (for Apache web servers, that is). Also, most of the files are almost empty and without a reference to Joomla.

However, as long as you can access the files, the following two are quite interesting, because they contain quite a few references to “joomla” and/or “Joomla”, amongst other things:

  • components/com_mailto/mailto.xml
  • components/com_wrapper/wrapper.xml

.htaccess Sample File

Directly in the root folder, there is an htaccess.txt file waiting to either be renamed (into .htaccess), or removed. Oftentimes, however, the file stays where it is, as it is. This means you can request and read it. And if this is a Joomla website, you will find references to “Joomla”.

Language Files

Similar to the component XML files, there are also language-specific files that you might want to try. For example, language/en-GB/en-GB.ini, language/en-GB/en-GB.xml and language/en-GB/install.xml might contain useful information, if the files exist, of course. As mentioned before, XML files might become inaccessible from remote soon, and maybe INI files as well.

RSS Feed

By default, Joomla responds with an RSS/Atom feed to URLs that include format=feed in the query string (e.g., https://developer.joomla.org/?format=feed). Other content management systems might provide a feed for /?format=feed as well, so a successful request alone is no proof that the website is powered by Joomla.

However, most of the feeds come with a generator comment, like so:

<!-- generator="Joomla! - Open Source Content Management" -->

And maybe you even get the version:

<!-- generator="Joomla! - Open Source Content Management - Version 3.6.5" -->

If you happen to find this in a feed response, you can safely assume the website to be powered by Joomla.

In case the above request does not result in an XML response, this might be due to the front page not being a content page. News feeds are only available for category blog, category list and featured articles. But we can get around this by manually querying a category by requesting /?option=com_content&view=category&id=1&format=feed, with 1 being an arbitrary category ID.
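Both steps, building the category feed URL and looking for the generator comment, can be sketched in a few lines of Python. The function names are made up for this example, and fetching the feed itself is left out on purpose.

```python
import re
from urllib.parse import urlencode, urljoin

# Matches the generator comment shown above and captures its value.
GENERATOR_COMMENT = re.compile(r'<!--\s*generator="(Joomla![^"]*)"\s*-->')


def category_feed_url(base_url, category_id=1):
    """Build the feed URL for a single category, relative to the site root."""
    query = urlencode({
        "option": "com_content",
        "view": "category",
        "id": category_id,
        "format": "feed",
    })
    return urljoin(base_url, "/") + "?" + query


def joomla_feed_generator(feed_xml):
    """Return the Joomla generator string from a feed, or None if not found."""
    match = GENERATOR_COMMENT.search(feed_xml)
    return match.group(1) if match else None
```

If a category with the given ID does not exist, you could simply try a few more IDs.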

Template Details

Depending on the template, you may be able to access further information via a dedicated file, templateDetails.xml. The file, if it exists, is located in the template root, so for example, at /templates/protostar/templateDetails.xml. Trying to access and then analyze the file is helpful if you were able to detect one or more /templates/* links, but could not reliably verify the website to be powered by Joomla.

You should be able to find a few references to Joomla in this file, one of which is usually the document type declaration, like so:

<!DOCTYPE install PUBLIC "-//Joomla! 2.5//DTD template 1.0//EN" "https://www.joomla.org/xml/dtd/2.5/template-install.dtd">
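To illustrate, the following Python sketch derives candidate templateDetails.xml paths from /templates/* links found in the markup, and tests a fetched file for a Joomla document type declaration. The names are hypothetical, and the set of characters allowed in a template folder name is an assumption.

```python
import re

# Assumption: template folder names consist of letters, digits, "_" and "-".
TEMPLATE_LINK = re.compile(r"/templates/([A-Za-z0-9_-]+)/")
JOOMLA_DOCTYPE = re.compile(r'<!DOCTYPE\s+\w+\s+PUBLIC\s+"-//Joomla!')


def template_details_paths(html):
    """Return one templateDetails.xml path per template folder seen in the markup."""
    names = sorted(set(TEMPLATE_LINK.findall(html)))
    return [f"/templates/{name}/templateDetails.xml" for name in names]


def has_joomla_doctype(xml_text):
    """Check whether the document type declaration refers to Joomla."""
    return JOOMLA_DOCTYPE.search(xml_text) is not None
```

Not every template includes such a declaration, so it is worth looking for other references to Joomla in the file as well.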

Web.config

Joomla comes with a file that is to be used with Microsoft IIS servers: the Web.config. Several Joomla websites running on a Linux server, however, still have the web.config.txt file, so you can try to request it.

Of course, this wouldn’t be a big deal if the file did not give away concrete indications of Joomla being used. It does, by default. The config file includes two rules that each have a name starting with “Joomla!”, so you can simply look for the string <rule name="Joomla! in the response. If the file exists, and if you find the above string, you can assume this is a Joomla website.
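In code, this check boils down to a plain substring search. A tiny Python sketch, with made-up names, that counts the matching rules in the content of a fetched web.config.txt:

```python
# The string to look for, as described above.
JOOMLA_RULE_MARKER = '<rule name="Joomla!'


def count_joomla_rules(config_text):
    """Count rewrite rules whose name starts with "Joomla!"."""
    return config_text.count(JOOMLA_RULE_MARKER)
```

Any count greater than zero is a hint. The default file should yield two.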

HTTP Headers

In general, HTTP headers do not really give away if a website is powered by Joomla. However, some might send an X-Content-Encoded-By HTTP header that includes “Joomla!”, and maybe even the version number. Of course, this depends both on the Joomla version and the configuration of the web server.

In addition, there might be (popular) Joomla extensions that add one or more relevant headers indicating that the respective website is a Joomla website. If you happen to know good examples, please tell me. 🙂

In case you want to try this out yourself, you can easily display a website’s HTTP headers in your browser.

Update: I just learned that, since Joomla 3.5, there is a hint at Joomla included in the response headers, somewhat hidden in plain sight. It is the (second) Expires header field that always has the value Wed, 17 Aug 2005 00:00:00 GMT, which is the date Joomla was founded. 🙂
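The header checks from this section can be sketched in Python as follows. The function works on a list of (name, value) pairs instead of a dictionary, because a response may contain more than one Expires header. All names are made up for this example.

```python
# The Expires value mentioned above.
JOOMLA_EXPIRES = "Wed, 17 Aug 2005 00:00:00 GMT"


def joomla_header_hints(headers):
    """Return a list of human-readable hints found in (name, value) header pairs."""
    hints = []
    for name, value in headers:
        lowered = name.lower()  # header names are case-insensitive
        if lowered == "x-content-encoded-by" and "joomla!" in value.lower():
            hints.append(f"X-Content-Encoded-By: {value}")
        elif lowered == "expires" and value.strip() == JOOMLA_EXPIRES:
            hints.append(f"Expires: {JOOMLA_EXPIRES}")
    return hints
```

With the standard library, http.client.HTTPResponse.getheaders() returns the headers as such a list of pairs.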

Something Else?

Did I miss something here?

Is there anything for which I failed to mention the necessary circumstances?

Please, let me know.


Thanks to David Jardin and all the other participants in my discussion at JoomlaCamp 2017 for providing valuable feedback. Thanks to Brian Teeman for the hint about the Expires header.
