Since version 8.5, OpenCms has provided two features for search engine optimization (SEO): aliases and SEO files. Aliases let you define alternative URLs for pages; these URLs do not have to correspond to actual paths in the VFS. SEO files can be used to automatically generate XML sitemaps and robots.txt files for your site.

Aliases

1.1 Kinds of aliases

There are two types of aliases which can be defined: simple aliases and rewrite aliases.

1.1.1 Simple aliases

Simple aliases consist of an alias path, a target resource, and an action that determines what happens when the path is requested from OpenCms. Simple aliases are directly connected to their target resource, so they continue to point to that resource even if it is moved or renamed. A valid alias path consists of one or more segments, each made up of a / followed by one or more characters. This means that an alias path may not end with a trailing /. There are three possible actions which can be performed if the path of an alias is requested from OpenCms:

Actions available for simple aliases
Temporary redirect (302)

A temporary redirect will be sent to the browser, with the current link to the target resource as the new URL.

Permanent redirect (301)

A permanent redirect will be sent to the browser, with the current link to the target resource as the new URL.

Show page

OpenCms will try to load the target resource directly in the current request.
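
The rule for valid alias paths given above can be expressed as a regular expression. The following is a minimal sketch in Java, not the check OpenCms itself performs. The class AliasPathCheck and the method isValidAliasPath are hypothetical names, and the sketch assumes that the "characters" of a segment are any characters other than /. OpenCms may apply further restrictions that are not modeled here.

```java
import java.util.regex.Pattern;

class AliasPathCheck {

    // One or more segments, each a '/' followed by at least one non-'/' character.
    // Assumption: a segment may contain any character except '/'.
    private static final Pattern ALIAS_PATH = Pattern.compile("(/[^/]+)+");

    /** Returns true if the whole string has the shape of a valid alias path. */
    static boolean isValidAliasPath(String path) {
        return path != null && ALIAS_PATH.matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidAliasPath("/news"));      // true
        System.out.println(isValidAliasPath("/news/2013")); // true
        System.out.println(isValidAliasPath("/news/"));     // false: trailing '/'
        System.out.println(isValidAliasPath("news"));       // false: no leading '/'
    }
}
```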

1.1.2 Rewrite aliases

While a simple alias maps a single path to a single resource, rewrite aliases can be used to define aliases for whole classes of paths by specifying a regular expression to match a path, and a replacement string to apply if the pattern matches. OpenCms will test an incoming path against rewrite alias patterns and apply the first matching rewrite alias. There is no order defined for the matching, so you should not define rewrite aliases with overlapping patterns.

The pattern for a rewrite alias follows standard Java regular expression syntax, while the replacement string follows the syntax of the parameter of the method java.util.regex.Matcher.replaceFirst(), i.e., the content of capture groups from the regular expression can be accessed using dollar syntax ($1, $2, ...). The pattern is always matched against the whole path. Rewrite aliases have precedence over simple aliases, i.e., if a rewrite alias matches the current request's path, a simple alias for that path will not match.
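
To illustrate how a pattern and a replacement string interact, here is a small sketch that uses only the standard java.util.regex classes. It is an illustration under stated assumptions, not OpenCms code: the class RewriteAliasDemo and the method rewrite are hypothetical names, and the pattern and replacement are sample values.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RewriteAliasDemo {

    /**
     * Returns the rewritten path, or null if the pattern does not match
     * the whole path (hypothetical helper, for illustration only).
     */
    static String rewrite(String pattern, String replacement, String path) {
        Matcher matcher = Pattern.compile(pattern).matcher(path);
        // matches() succeeds only if the pattern covers the entire path.
        if (!matcher.matches()) {
            return null;
        }
        // replaceFirst() resets the matcher and substitutes the first match;
        // for the patterns used here, that match is the whole path.
        // $1 in the replacement refers to the first capture group.
        return matcher.replaceFirst(replacement);
    }

    public static void main(String[] args) {
        String pattern = "/demo/([a-z]+).html";
        String replacement = "/demo/$1-old.html";
        System.out.println(rewrite(pattern, replacement, "/demo/about.html"));     // /demo/about-old.html
        System.out.println(rewrite(pattern, replacement, "/demo/sub/about.html")); // null
    }
}
```

The second call returns null because [a-z]+ cannot match the / inside "sub/about", so the pattern does not cover the whole path.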

There are three possible actions for rewrite aliases:

Actions available for rewrite aliases
Temporary redirect (302)

A temporary redirect will be sent to the browser, with the substitution result as the new URL.

Permanent redirect (301)

A permanent redirect will be sent to the browser, with the substitution result as the new URL.

Passthrough

The path resulting from the pattern replacement will be passed to the next resource handler configured in the <resourceinit>-element after the alias handler.

1.2 Internals on aliases

All alias matching for incoming requests is handled by the class org.opencms.main.CmsAliasResourceHandler, which is by default configured in the opencms-system.xml configuration file in the <resourceinit>-element. Even if this handler is removed from the configuration, it is still possible to edit aliases through the user interface, although they will have no effect.

For paths which actually exist in the VFS, and for paths that start with /system/, the alias resource handler will not be used. Note that aliases are only active for a single site. For example, it is possible to use the same alias path for different simple aliases on different sites. Note that, due to the limitations of redirects, request parameters will not be preserved if a POST request's path matches an alias path and a redirect action is performed.
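
The rules described so far (existing VFS paths and /system/ paths are skipped, and rewrite aliases take precedence over simple aliases) can be summarized in a simplified sketch. This is not the implementation of CmsAliasResourceHandler. The class AliasResolutionSketch, the method resolve and the use of plain collections are assumptions made for illustration. In particular, real simple aliases point to a resource rather than to a path string, and alias actions are ignored here.

```java
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AliasResolutionSketch {

    /**
     * Returns the path an alias resolves to for one site, or null if no alias applies.
     * existingVfsPaths, rewriteAliases (pattern -> replacement) and
     * simpleAliases (alias path -> target path) are hypothetical stand-ins for site data.
     */
    static String resolve(String path, Set<String> existingVfsPaths,
            Map<String, String> rewriteAliases, Map<String, String> simpleAliases) {
        // Aliases are not consulted for existing resources or for /system/ paths.
        if (existingVfsPaths.contains(path) || path.startsWith("/system/")) {
            return null;
        }
        // Rewrite aliases come first; their matching order is undefined,
        // which is why overlapping patterns should be avoided.
        for (Map.Entry<String, String> alias : rewriteAliases.entrySet()) {
            Matcher matcher = Pattern.compile(alias.getKey()).matcher(path);
            if (matcher.matches()) {
                return matcher.replaceFirst(alias.getValue());
            }
        }
        // A simple alias is only considered if no rewrite alias matched.
        return simpleAliases.get(path);
    }
}
```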

1.3 The SEO options dialog

A "SEO options" context menu entry is available in the page editor's context menu and in the context menu for sitemap entries.

When this item is selected from the Context menu, the SEO options dialog opens:

Fig. [SEO options dialog]: The SEO options dialog

The SEO options dialog allows you to change the Title, Description and Keywords properties, and to edit the simple aliases for the resource. Note that changes will only be applied once you click the "Save" button.

You can edit an existing alias by changing its path in the text box or changing the action using the action select box. The "Delete" button can be used to remove the alias. You can add a new alias by entering the alias path in the text box for new alias paths, selecting the action from the select box, and clicking the "Add" button.

1.4 The edit aliases dialog

The "Edit aliases" dialog allows you to view or edit all aliases (simple and rewrite) for the current site. You can open it from the sitemap editor's main context menu.

Fig. [Open edit aliases dialog]: Opening the edit aliases dialog

When this item is selected from the context menu, the "Edit aliases" dialog opens. This dialog allows you to create new aliases, edit existing aliases, and import and export aliases for the current site from and to a CSV file.

Fig. [Edit aliases dialog]: The edit aliases dialog

1.4.1 Creating a new alias

You can create a new alias by entering the alias path and target page (or pattern and replacement, respectively), selecting an action and clicking the "Add" button.

1.4.2 Editing existing aliases

Existing aliases (either simple or rewrite aliases) can be edited in their respective tables. You can change an existing alias's attributes by clicking on the corresponding table entry. If you enter an invalid alias path, the "Status" field of the corresponding table will display an "Error" label; hover your mouse over this label to see a more detailed message.

You can delete existing aliases by selecting the checkboxes in the "X" column of the corresponding table rows, and then clicking the "Delete" button above the table.

1.4.3 Exporting and importing aliases

You can click the "Export" button to download a CSV file containing the currently defined aliases for the site. Note that this will not include any unsaved alias changes from the dialog.
Clicking the "Import" button will open another dialog that allows you to import a previously exported CSV file.

Fig. [Import aliases dialog]: The import aliases dialog

First click the "Select File" button to select a CSV file to import. Then click the "Import" button. Because the CSV file may have been generated by a program that uses a field separator other than a comma (,), you can also set the field separator to use when parsing the file before clicking "Import". The aliases will then be imported, and status messages will be displayed in the text field at the bottom of the dialog. An import CSV file looks like the following example:

"/news-alias-1","/demo/about/index.html","redirect"
"/news-alias-2","/demo/about/index.html"
"/demo/([a-z]+).html","/demo/$1-old.html","permanentRedirect","rewrite"

Each line consists of two, three or four fields. If the line contains 3 fields, it is interpreted as a simple alias, with the following field definitions:

  • Field 1: The alias path
  • Field 2: The site path of the alias target
  • Field 3: The action for the alias
    • page: Show the page
    • permanentRedirect: Send permanent redirect
    • redirect: Send temporary redirect

Lines with 2 fields will also be interpreted as simple alias paths, with the alias mode implicitly set to "permanentRedirect".

If the line contains 4 fields, it is interpreted as a rewrite alias, with the following field definitions:

  • Field 1: The alias pattern
  • Field 2: The replacement string
  • Field 3: The action for the rewrite
    • passthrough: Pass the rewritten path to the next resource handler
    • permanentRedirect: Send permanent redirect
    • redirect: Send temporary redirect
  • Field 4: Must be the constant value "rewrite"
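
The field rules above can be sketched as a small classifier for a single import line. This is an illustration only, not the OpenCms import code. The class AliasCsvLineSketch and its methods are hypothetical, and the sketch assumes that every field is enclosed in double quotes and contains no embedded quote characters, as in the example file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AliasCsvLineSketch {

    // Assumption: each field is wrapped in double quotes without embedded quotes.
    private static final Pattern QUOTED_FIELD = Pattern.compile("\"([^\"]*)\"");
    private static final List<String> SIMPLE_ACTIONS =
        Arrays.asList("page", "permanentRedirect", "redirect");
    private static final List<String> REWRITE_ACTIONS =
        Arrays.asList("passthrough", "permanentRedirect", "redirect");

    /** Extracts the quoted fields of a line, whatever separator lies between them. */
    static List<String> parseFields(String line) {
        List<String> fields = new ArrayList<>();
        Matcher matcher = QUOTED_FIELD.matcher(line);
        while (matcher.find()) {
            fields.add(matcher.group(1));
        }
        return fields;
    }

    /** Describes how a line would be interpreted according to the field rules. */
    static String describe(String line) {
        List<String> f = parseFields(line);
        if (f.size() == 2 || f.size() == 3) {
            // Two fields: the action defaults to permanentRedirect.
            String action = f.size() == 3 ? f.get(2) : "permanentRedirect";
            if (!SIMPLE_ACTIONS.contains(action)) {
                throw new IllegalArgumentException("Unknown simple alias action: " + action);
            }
            return "simple alias " + f.get(0) + " -> " + f.get(1) + " (" + action + ")";
        }
        if (f.size() == 4 && "rewrite".equals(f.get(3))) {
            if (!REWRITE_ACTIONS.contains(f.get(2))) {
                throw new IllegalArgumentException("Unknown rewrite alias action: " + f.get(2));
            }
            return "rewrite alias " + f.get(0) + " -> " + f.get(1) + " (" + f.get(2) + ")";
        }
        throw new IllegalArgumentException("Unsupported line: " + line);
    }

    public static void main(String[] args) {
        System.out.println(describe("\"/news-alias-1\",\"/demo/about/index.html\",\"redirect\""));
        System.out.println(describe("\"/news-alias-2\",\"/demo/about/index.html\""));
        System.out.println(describe(
            "\"/demo/([a-z]+).html\",\"/demo/$1-old.html\",\"permanentRedirect\",\"rewrite\""));
    }
}
```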

Automatic robots.txt and XML sitemap generation

OpenCms provides automatic generation of XML sitemaps and robots.txt files which reference those sitemaps. Both of these are provided by the new seo_file resource type. When a resource of this type is requested from OpenCms, OpenCms will dynamically generate the sitemap or robots.txt data and send them to the client.

2.1 XML sitemap generation

Create a new resource of the type "SEO file" in the folder for which you want to generate the XML sitemap (via the workplace's editor, the "SEO file" is found under "Structured content" in the "New"-dialog). Open this file with the content editor. Make sure the mode "XML sitemap" is selected.

Fig. [SEO XML content]: Editing a SEO XML content

By default, the XML sitemap will contain all navigation entries in and below the current folder, the URLs for all detail pages whose base container pages are in or below the current folder, and all aliases whose alias paths are below the current folder and whose action is defined as "Show page".

"Include folder" entries allow you to define folders whose whole contents, including resources from subfolders, will be included in the XML sitemap.

"Exclude folder" entries allow you to define folders whose whole contents, including resources from subfolders, will be excluded from the XML sitemap.

If you check the "Container page dates" option, the last modification date of container pages will be computed from the last modification date of their contents.

Note that the XML sitemap output will always correspond to the online project state of resources, and will only contain pages which are visible to the Guest user.

2.2 Generation of robots.txt

To use this feature, create a new file of type "SEO file" in the root folder of your site. Using this feature only makes sense if your host is set up so that the OpenCms site is mapped to the root of your domain, i.e., that http://yourhost.com/robots.txt will be retrieved from the OpenCms site's root folder.

Edit the SEO file and set the mode to "robots.txt". By default, when the robots.txt file is requested from OpenCms, it will only contain references to the existing XML sitemaps of your site. To add additional entries, you need to add them to the "robots.txt content"-field of your SEO file.