April 22, 2015

Apache 2 ProxyPassReverse URL Replacement

It is very common to use Apache virtual hosts as a proxy to reach other application servers running in your environment. I have such a setup with Jira for my open source projects at this URL: http://ferris-jira.ddns.net. This domain name resolves to my server and hits an Apache virtual host for that domain, which uses ProxyPass to send the requests along to the Jira instance. The response from Jira goes back through Apache, where ProxyPassReverse adjusts the redirect headers, and on to the user's browser. This works fine, as long as links generated by the application are all relative.

Unfortunately, Jira generates some fully-qualified links which start with http://localhost:8080.  Once such a link gets to the user's browser it obviously won't work, because localhost now refers to the user's own machine.  To solve this problem I used a combination of Apache directives to search through the response from Jira and replace any of the localhost URLs.  Let's take a look.

# Turn compression off in order for the Substitute to work.
RequestHeader unset Accept-Encoding
# http://httpd.apache.org/docs/2.4/mod/mod_proxy.html#examples
ProxyPass / http://localhost:8080/
ProxyPassReverse / http://localhost:8080/
Substitute "s|http://localhost:8080|https://ferris-jira.ddns.net|n"
# In order for the substitute module to work we have to add it to the filter chain.
FilterDeclare Substitute
FilterProvider Substitute SUBSTITUTE "%{REQUEST_URI} =~ m#^/#"
FilterChain +Substitute

The numbers in parentheses below refer to lines of the configuration above. The first thing we need to do is use RequestHeader to remove the Accept-Encoding header, which turns off compression (#2). If the response from Jira is compressed, we can't do any search and replace, so the response needs to be plain text.  Next are the typical ProxyPass (#4) and ProxyPassReverse (#5). These get the request to Jira and the response from Jira. Next we define a Substitute (#6) for what we want to replace. You'll recognize this syntax from the sed command; the trailing n flag tells Apache to treat the pattern as a fixed string rather than a regular expression. Finally we have a filter chain to get the Substitute to run on the response (#8-10). The filter matches on the URI of the request made to Apache (#9), and is typically configured to the application's context.  In the example above, I have Jira running as the root context /, which is why ProxyPass and ProxyPassReverse have http://localhost:8080/ and the FilterProvider matches the regular expression m#^/#.  If Jira were running on the "jira" context, the configuration would be ProxyPass and ProxyPassReverse with http://localhost:8080/jira and FilterProvider with the regular expression m#^/jira#.
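Since the Substitute pattern borrows its syntax from sed, a quick way to see what it does is to run the same search and replace with sed on a sample line. This is only a sketch: the HTML snippet and the DEMO-1 issue key are made up for illustration, and sed needs its own g flag here because the n flag is specific to Apache's Substitute directive.

```shell
# Hypothetical sample of a response body containing an absolute link.
body='<a href="http://localhost:8080/browse/DEMO-1">DEMO-1</a>'

# Same search and replace as the Substitute directive, written for sed.
# "g" replaces every match on a line; the Apache "n" flag (fixed string)
# has no sed equivalent and is not needed for this pattern.
rewritten=$(printf '%s\n' "$body" | sed 's|http://localhost:8080|https://ferris-jira.ddns.net|g')

printf '%s\n' "$rewritten"
```

The printed link should now point at https://ferris-jira.ddns.net instead of localhost.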

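So that's it.  Put this into your Apache <VirtualHost>, make sure the modules behind these directives (mod_proxy, mod_proxy_http, mod_headers, mod_substitute and mod_filter) are enabled, restart Apache and you'll be good to go.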

References
ArtemGr. (Oct 15, 2013). Apache 2.4 reverse proxy with URL substitution. Retrieved April 10, 2015 from https://gist.github.com/ArtemGr/6993113

Enjoy!

Unix find tip: Excluding files

Sometimes you want to find files on a Unix file system but exclude files either by specific name or by pattern.  The best way I have found to do this is by using a combination of find and grep.  Let's take a look.

find . -type f | grep -P '^(.(?!Thumbs\.db))*$'

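This example uses the find command in the current directory (.) and limits the search to just files (-type f) instead of both files and directories. The result is then piped to grep. The regular expression (-P '^(.(?!Thumbs\.db))*$') uses a negative lookahead to filter out every result from the find command whose path contains Thumbs.db, which in practice means every file with that name. The -P option turns on Perl-compatible regular expressions, which GNU grep supports.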
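To try this out safely, here is a small sketch that runs the same pipeline against a throwaway directory. The directory layout and file names are made up for the demo, and it assumes a grep that supports -P (GNU grep does).

```shell
# Build a throwaway directory tree (hypothetical file names).
demo=$(mktemp -d)
mkdir -p "$demo/photos"
touch "$demo/photos/a.jpg" "$demo/photos/Thumbs.db" "$demo/notes.txt"

# Same find | grep pipeline, run inside the demo directory.
# sort is added only to make the output order predictable.
kept=$(cd "$demo" && find . -type f | grep -P '^(.(?!Thumbs\.db))*$' | sort)
printf '%s\n' "$kept"

# Clean up the throwaway directory.
rm -rf "$demo"
```

Only ./notes.txt and ./photos/a.jpg should be listed; the Thumbs.db file is filtered out.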

So this is a quick and easy way to exclude files by a specific name.  Next let's take a look at how to exclude files by pattern.

find . -type f | grep -P '^(.(?!\.ffs_db))*$'

This example uses a slightly more complicated regular expression (-P '^(.(?!\.ffs_db))*$') to filter out all results from the find command whose path contains .ffs_db. Since that string normally only shows up as a file name extension, this is essentially excluding by extension.
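Strictly speaking, the lookahead pattern drops any path that has .ffs_db anywhere in it, not just at the end. If that difference matters to you, one alternative (not the approach used in this post) is grep -v with a pattern anchored to the end of the line. The sketch below compares the two on a few made-up paths.

```shell
# Hypothetical paths, fed in directly instead of running find.
paths=$(printf '%s\n' './sync.ffs_db' './sync.ffs_db.bak' './notes.txt')

# Lookahead pattern from the example: drops any path containing ".ffs_db".
lookahead=$(printf '%s\n' "$paths" | grep -P '^(.(?!\.ffs_db))*$')

# grep -v with an end anchor: drops only paths that end with ".ffs_db".
anchored=$(printf '%s\n' "$paths" | grep -v '\.ffs_db$')

printf 'lookahead:\n%s\n' "$lookahead"
printf 'anchored:\n%s\n' "$anchored"
```

The lookahead version keeps only ./notes.txt, while the anchored version also keeps ./sync.ffs_db.bak.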

Now what if you want to exclude multiple file names or file types?  Just pipe together more grep commands.

find . -type f | grep -P '^(.(?!Thumbs\.db))*$' | grep -P '^(.(?!\.ffs_db))*$' | grep -P '^(.(?!\.ffs_gui))*$'

In this final example, I am excluding files named Thumbs.db, files with the .ffs_db extension and files with the .ffs_gui extension.
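For comparison, find can also do this kind of exclusion on its own with negated -name tests. This is a different technique from the grep pipeline above, shown here only as a sketch; the demo directory and file names are made up. Note that -name looks at the file name only, not the whole path.

```shell
# Build a throwaway directory with one file to keep and three to exclude
# (hypothetical file names).
demo=$(mktemp -d)
touch "$demo/report.txt" "$demo/Thumbs.db" "$demo/sync.ffs_db" "$demo/sync.ffs_gui"

# "! -name PATTERN" keeps only files whose name does NOT match the pattern.
found=$(cd "$demo" && find . -type f ! -name 'Thumbs.db' ! -name '*.ffs_db' ! -name '*.ffs_gui' | sort)
printf '%s\n' "$found"

# Clean up the throwaway directory.
rm -rf "$demo"
```

Only ./report.txt should be listed.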

References
Retrieved April 13, 2015 from http://fineonly.com/solutions/regex-exclude-a-string

Enjoy!