<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Frank Hecker &#187; site</title>
	<atom:link href="http://frankhecker.com/category/site/feed/" rel="self" type="application/rss+xml" />
	<link>http://frankhecker.com</link>
	<description>Trying to unite civility and truth in a few long blog posts</description>
	<lastBuildDate>Wed, 11 Jan 2012 13:03:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='frankhecker.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Frank Hecker &#187; site</title>
		<link>http://frankhecker.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://frankhecker.com/osd.xml" title="Frank Hecker" />
	<atom:link rel='hub' href='http://frankhecker.com/?pushpress=hub'/>
		<item>
		<title>Changing my (blog) name, plus Plus</title>
		<link>http://frankhecker.com/2011/10/30/changing-my-blog-name-plus-plus/</link>
		<comments>http://frankhecker.com/2011/10/30/changing-my-blog-name-plus-plus/#comments</comments>
		<pubDate>Sun, 30 Oct 2011 04:13:12 +0000</pubDate>
		<dc:creator>hecker</dc:creator>
				<category><![CDATA[art]]></category>
		<category><![CDATA[blosxom]]></category>
		<category><![CDATA[education]]></category>
		<category><![CDATA[howardcounty]]></category>
		<category><![CDATA[misc]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[music]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[site]]></category>

		<guid isPermaLink="false">http://frankhecker.com/?p=5165</guid>
		<description><![CDATA[For those following this blog, note that I&#8217;ve changed the canonical site name from blog.hecker.org to frankhecker.com. Any links and feed URLs referencing the previous domain name will still work for the foreseeable future, but if and when you have time you may want to update your bookmark list, RSS newsreaders, and related information to [...]]]></description>
			<content:encoded><![CDATA[<p>For those following this blog, note that I&#8217;ve changed the canonical site name from blog.hecker.org to frankhecker.com. Any links and feed URLs referencing the previous domain name will still work for the foreseeable future, but if and when you have time you may want to update your bookmark list, RSS newsreaders, and related information to reflect the new name.</p>
<p>A little history by way of background: I was around when the Internet was first being commercialized, and I had the opportunity to register hecker.com for myself if I really wanted to. However I passed because I didn&#8217;t have a server to associate with it and I thought I needed to be running an actual server in order to register the name (though I&#8217;m not sure that was the case even then). When I finally got around to having a personal server in the late 1990s I found that hecker.com had already been taken by a company that registered thousands of surname domains so that they could offer a shared domain service in which multiple people could have their own personal subdomains under a top-level domain: jane.smith.com, john.smith.com, and so on. So I settled on the next best thing and registered hecker.org instead for use as my primary domain, at the same time registering frankhecker.com (as well as the .org and .net variants) to prevent anyone else from getting it.</p>
<p>When I first started a blog I hosted it at hecker.org using custom blogging software. I later got tired of the management hassles involved, and moved my blog to WordPress.com, using the subdomain blog.hecker.org because I was still hosting other things at hecker.org and couldn&#8217;t afford to dedicate the entire domain just to my blog. Since then though the blog has assumed more importance as my public face to the world, and I regretted having a somewhat unusual domain name for it. I&#8217;ve therefore decided to adopt the conventional approach and use frankhecker.com as my primary blog name. (As noted above the old name of blog.hecker.org will continue to work, thanks to the magic of HTTP redirects.)</p>
<p>Note that my primary personal email address remains hecker@hecker.org; I have no plans to change that. However I can also receive email at frankhecker.com, so for example sending email to frank@frankhecker.com will get to the same inbox as hecker@hecker.org. I may switch over completely to frankhecker.com for all uses in future, but in the meantime there&#8217;s no need to update your address book.</p>
<p>In other news, I&#8217;m now on Google Plus so you can add me to one of your circles if you&#8217;d like. I&#8217;ve been meaning to try Google Plus out before now, but I use Google Apps for my email and related services, and Google Plus wasn&#8217;t added to Google Apps until this week. I&#8217;ll publish notices of new blog posts to Google Plus, and maybe some other stuff from time to time.</p>
]]></content:encoded>
			<wfw:commentRss>http://frankhecker.com/2011/10/30/changing-my-blog-name-plus-plus/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/523287496a16cae22d6337ab1aae4491?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">hecker</media:title>
		</media:content>
	</item>
		<item>
		<title>URI rewriting and canonical URIs</title>
		<link>http://frankhecker.com/2004/11/18/uri-rewriting-and-canonical-uris/</link>
		<comments>http://frankhecker.com/2004/11/18/uri-rewriting-and-canonical-uris/#comments</comments>
		<pubDate>Thu, 18 Nov 2004 07:10:19 +0000</pubDate>
		<dc:creator>hecker</dc:creator>
				<category><![CDATA[blosxom]]></category>
		<category><![CDATA[site]]></category>

		<guid isPermaLink="false">http://blog.hecker.org/2004/11/18/uri-rewriting-and-canonical-uris/</guid>
		<description><![CDATA[NOTE: This post refers to the older Blosxom-based version of my blog. I&#8217;ve left the post as is in case it&#8217;s of interest to anyone running Blosxom. Here I document the way in which I use URI rewriting (along with redirection and a couple of Blosxom plugins) to help implement my personal design philosophy for [...]]]></description>
			<content:encoded><![CDATA[<p>NOTE: This post refers to the older Blosxom-based version of my blog. I&#8217;ve left the post as is in case it&#8217;s of interest to anyone running Blosxom.</p>
<p>Here I document the way in which I use URI rewriting (along with redirection and a couple of Blosxom plugins) to help implement my personal <a href="http://hecker.org/site/design-philosophy">design philosophy</a> for my web site. My goal is to create a unified URI space within which static and dynamic content can transparently co-exist, with publicly-visible URIs for human-readable content (i.e., HTML pages) having a canonical form that omits file extensions or other content type specifiers.</p>
<p>Achieving this goal requires solving two separate problems: intermixing dynamic and static content (in my case, Blosxom-generated and non-Blosxom content) in the same URI hierarchy, and recognizing and enforcing preferred canonical forms for URIs.</p>
<h3>Intermixing dynamic and static content</h3>
<p>As an example of freely intermixing dynamically-generated content and static content (e.g., images) within the same URI hierarchy, consider a Blosxom-based web site where <code>http://www.example.com/foo</code> is a Blosxom category, <code>/foo/bar.html</code> is the HTML page displayed for an individual Blosxom entry in that category, and <code>/foo/baz.jpg</code> is an image referenced by that entry. (We assume that the site is already using one of the <a href="http://www.blosxom.com/faq/cgi/hide_cgi_bit.htm" title="How do I hide the /cgi-bin/blosxom.cgi bit of my URL?">suggested techniques</a> for hiding the <code>/cgi-bin/blosxom.cgi</code> part of the URI.) With Blosxom there are at least two possible approaches to support such intermixing.</p>
<p>One approach would be to invoke Blosxom (i.e., <code>blosxom.cgi</code>) for each and every URI processed, and then to use the <a href="http://www.blosxom.com/plugins/display/binary.htm">binary plugin</a> or the <a href="http://www.blosxom.com/plugins/files/static_file.htm">static_file plugin</a> based on it; these Blosxom plugins check to see if a requested URI corresponds to an existing file in the file system and, if so, they return the file&#8217;s contents as the output (as opposed to trying to generate a Blosxom page).</p>
<p>The second possible approach is the reverse: Have Apache serve up existing files (and indices for existing directories), and invoke Blosxom only when the URI references a file (or directory) that doesn&#8217;t exist. This is the approach I&#8217;ve taken, for various reasons; in particular, I wanted to avoid the overhead of invoking Blosxom for each and every URI. This approach is also compatible with a strategy of converting all or part of the Blosxom-managed content into static files. For example, one could run <code>blosxom.cgi</code> in static mode to generate files and directories under the Apache document root; Apache would then serve up those files and directories just as it would non-Blosxom content.</p>
<h3>Recognizing and enforcing canonical URIs</h3>
<p>We want to enforce the following rules for how URIs should be represented:</p>
<ul>
<li>URIs used to access HTML content for directories, Blosxom categories, and Blosxom date-based archive pages should have no <code>index.html</code> component and one (and only one) trailing slash:
<pre><code>http://www.example.com/foo/

http://www.example.com/foo/2004/11/14/

</code></pre>
</li>
<li>URIs used to access HTML content for static web pages and individual Blosxom entries should have no <code>.html</code> extension and no trailing slash:
<pre><code>http://www.example.com/foo/bar
</code></pre>
</li>
<li>URIs used to access all other (non-HTML) content should have filename extensions; <code>index.*</code> components should be included for directories, Blosxom categories, and Blosxom date-based archives, etc.:
<pre><code>http://www.example.com/foo/baz.png

http://www.example.com/foo/index.rss

http://www.example.com/foo/2004/11/14/index.rss

</code></pre>
</li>
</ul>
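<p>The three rules above can be sketched in Python as follows. This is only an illustration and not part of the Apache or Blosxom setup described here; the function name <code>canonical_path</code> and the <code>kind</code> labels are hypothetical, chosen just for this example.</p>

```python
import re

def canonical_path(path, kind):
    """Return the canonical form of a URI path.

    kind is a hypothetical label: "collection" for directories, categories
    and date-based archives, "page" for HTML pages and individual entries,
    and "other" for non-HTML content.
    """
    if kind == "collection":
        # Drop any index.html component, then keep exactly one trailing slash.
        path = re.sub(r"/index\.html/*$", "", path)
        return path.rstrip("/") + "/"
    if kind == "page":
        # Drop trailing slashes and the .html extension.
        path = path.rstrip("/")
        return re.sub(r"\.html$", "", path)
    # Non-HTML content keeps its filename extension; only stray slashes go.
    return path.rstrip("/")

print(canonical_path("/foo/index.html", "collection"))
print(canonical_path("/foo/bar.html/", "page"))
print(canonical_path("/foo/baz.png/", "other"))
```

<p>The sketch assumes the kind of content is already known; working that out is the harder part, as discussed next.</p>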
<p>Enforcing these conventions is relatively straightforward: if a requested URI does not follow the above rules then we simply force a redirect to the canonical URI. However determining whether a URI is already in the proper canonical form or not is less straightforward, especially when we intermix Blosxom and non-Blosxom content.</p>
<p>For example, if the URI <code>http://www.example.com/foo/bar</code> is requested then we have to do the following checks:</p>
<ul>
<li>Is <code>/foo/bar</code> an existing directory?</li>
<li>Is there an existing HTML file <code>/foo/bar.html</code>?</li>
<li>Is <code>/foo/bar</code> a Blosxom category?</li>
<li>Is there an individual Blosxom entry <code>/foo/bar.html</code>?</li>
</ul>
<p>In practice we do these checks in the order shown, both to eliminate ambiguity (for example, if there&#8217;s a directory <code>/foo/bar</code> and also a Blosxom entry <code>/foo/bar.html</code>) and also because it&#8217;s easier to implement: The first two checks are done by Apache in the course of doing URI rewriting, and the second two checks are done by a Blosxom plugin once the URI has been passed off to Blosxom for processing.</p>
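<p>The ordering of the four checks can be sketched in Python. This is a simplified illustration rather than the real implementation (which is split between Apache and a Blosxom plugin); the function name <code>classify</code> and the <code>categories</code> and <code>entries</code> sets are hypothetical stand-ins for whatever Blosxom knows about.</p>

```python
import os

def classify(path, docroot, categories, entries):
    """Apply the four checks in order and report the first that succeeds.

    categories and entries are hypothetical sets of paths known to Blosxom,
    e.g. {"/foo"} and {"/foo/bar"}.
    """
    fs_path = os.path.join(docroot, path.strip("/"))
    # Checks 1 and 2 look at the file system (Apache's part of the job).
    if os.path.isdir(fs_path):
        return "directory"
    if os.path.isfile(fs_path + ".html"):
        return "static page"
    # Checks 3 and 4 consult Blosxom (the plugin's part of the job).
    key = "/" + path.strip("/")
    if key in categories:
        return "blosxom category"
    if key in entries:
        return "blosxom entry"
    return "not found"
```

<p>Because the directory check comes first, a path that is both an existing directory and a Blosxom entry is always treated as a directory, which is how the ambiguity mentioned above is resolved.</p>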
<h3>Strategy</h3>
<p>Here&#8217;s the overall strategy we follow in doing URI rewriting:</p>
<ul>
<li>We divide requests into those that can be satisfied by returning a static file or directory index and those that are for dynamically-generated content; the latter are rewritten into URIs that invoke the necessary CGI script (<code>blosxom.cgi</code>). Note that in some cases we can make an immediate determination as to whether content is static or dynamic, while in other cases we have to wait for the results of subsequent URI rewriting rules.</li>
<li>If a request references an existing directory then if necessary we force a redirect to the canonical URI for the directory, namely a URI with one (and only one) trailing slash after the directory name.</li>
<li>If a request is for an existing <code>index.html</code> file then we force a redirect to the canonical URI for the directory in which that file is located.</li>
<li>If a request is for any other existing HTML file then we allow and require that the <code>.html</code> file extension be omitted. If the <code>.html</code> file extension is included in the requested URI then we force a redirect to the canonical URI for the file, namely a URI that omits the <code>.html</code> file extension and has no trailing slashes.</li>
<li>If a request is for an existing file (or a symbolic link to a file) then it is handled by Apache in the normal way. Any other requests are passed to Blosxom for processing.</li>
<li>Blosxom (actually, a Blosxom plugin) then performs its own set of checks on URIs, and rewrites URIs and/or forces redirects if needed.</li>
</ul>
<h3>URI rewriting quirks</h3>
<p>Before I discuss the URI rewriting rules themselves, here are various points to keep in mind when reading the rules; some of these points are not necessarily immediately apparent from reading the documentation for the <a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html"><code>mod_rewrite</code> module</a> and the Apache <a href="http://httpd.apache.org/docs-2.0/misc/rewriteguide.html">URL Rewriting Guide</a>:</p>
<ul>
<li>In our case we have root access to the system and can put our rewriting rules in the master Apache configuration file (<code>httpd.conf</code>). These rules will <em>not</em> work as is if you do not have root access and have to put your rewriting rules in a <code>.htaccess</code> file. (I don&#8217;t have time to revise the rules to work for the <code>.htaccess</code> case, but perhaps someone else can do so; unfortunately URI rewriting in a .htaccess file is much more complicated than when done in <code>httpd.conf</code>.)</li>
<li><a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewriterule">RewriteRule</a> directives are evaluated in order looking for a match of the URI against the left-hand side of the RewriteRule directive. <a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewritecond">RewriteCond</a> directives are looked at <em>only</em> if the corresponding RewriteRule directive matches. If the left-hand side of the RewriteRule directive matches and the corresponding RewriteCond directives (if any) evaluate true, then the URI is rewritten into the form specified by the right-hand side of the RewriteRule directive. Otherwise Apache just goes on to the next RewriteRule.</li>
<li>The URI matched against the RewriteRule directives is not the full URI but rather just the path component of the URI; it does <em>not</em> include either the hostname part or any query string. Thus, for example, if the original request was for <code>http://www.example.com/foo/bar?flav=baz</code> then the RewriteRule directives will be matched against <code>/foo/bar</code>. (At least this is true in my case; this works slightly differently if you have to put your rules in a <code>.htaccess</code> file.)</li>
<li>Rule matching is done using regular expressions modeled on those in Perl. However Apache doesn&#8217;t support the full range of Perl regular expressions, and in particular doesn&#8217;t support any options to address the &#8220;greedy matching&#8221; problem. For example, if you match URIs against an expression like <code>^(.*)//?$</code> (for example, to detect excess trailing slashes) then for a URI like <code>/foo//</code> Apache will assign <code>$1</code> to be <code>/foo/</code> instead of <code>/foo</code> as we&#8217;d like. We have to hack around this problem as described in the next section.</li>
<li>Once we have rewritten a URI to our satisfaction we normally want to stop URI rewriting at that point, and use the <code>L</code> (&#8220;last&#8221;) flag to do this; otherwise Apache would continue evaluating RewriteRule directives looking for further matches. However using <code>L</code> by itself is normally not sufficient to get Apache to do the right thing; typically we have to include other flags as discussed below.</li>
<li>Normally once rewriting ends Apache will simply take the (possibly rewritten) path component of the URI and append it to the defined &#8220;document root&#8221; value (i.e., the directory where static web content is located). This causes a problem if the URI actually refers to a path for which an Apache <a href="http://httpd.apache.org/docs-2.0/mod/mod_alias.html#alias">Alias</a> or <a href="http://httpd.apache.org/docs-2.0/mod/mod_alias.html#scriptalias">ScriptAlias</a> directive is defined. (For example, on my server <code>/cgi-bin</code> is not in the main document root, but is located elsewhere as defined by a ScriptAlias directive.) To fix this we use the <code>PT</code> (&#8220;pass through&#8221;) flag where needed to tell Apache to first pass the URI through to the <a href="http://httpd.apache.org/docs-2.0/mod/mod_alias.html"><code>mod_alias</code> module</a>.</li>
<li>If the URI refers to an existing directory then after the first round of rewriting ends the Apache <code>mod_dir</code> module will initiate so-called &#8220;subrequests&#8221; for URIs with <code>index.html</code>, etc., appended, and each subrequest will start a new round of rewriting.  We don&#8217;t want to have any of our rewriting rules invoked in that case (among other things, this can lead to problems with looping), so we also use the <code>NS</code> (&#8220;no subrequest&#8221;) flag to note that our rules should not be invoked for such internal subrequests.</li>
<li>In some cases we don&#8217;t simply want to rewrite the URI, we want to correct perceived mistakes in how the URI was originally requested. (For example, we want all URIs referring to existing directories to end in one&#8211;and only one&#8211;trailing slash.) For these cases we use the <code>R</code> flag to tell Apache to redirect the user&#8217;s browser to a new and corrected URI. (We actually use <code>R=301</code> to specify the HTTP code returned, in this case a code meaning &#8220;the URI has permanently moved&#8221;.) In combination with the <code>L</code> flag this immediately ends rewriting for the original URI; a new round of rewriting starts once the browser requests the new URI.</li>
</ul>
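<p>The &#8220;greedy matching&#8221; behaviour mentioned in the list above can be reproduced with Python&#8217;s <code>re</code> module. Note that this is a different regular expression engine from the one Apache uses, so the snippet is only meant to show what greedy matching does to the captured group; the lazy <code>.*?</code> form is included purely for contrast.</p>

```python
import re

uri = "/foo//"

# Greedy: (.*) grabs as much as it can, so it keeps one of the slashes.
greedy = re.match(r"^(.*)//?$", uri).group(1)

# Lazy: (.*?) grabs as little as it can, leaving both slashes outside.
lazy = re.match(r"^(.*?)//?$", uri).group(1)

print(greedy)  # /foo/
print(lazy)    # /foo
```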
<h3>The rewriting rules</h3>
<p>Here are the actual Apache URI rewriting rules we use, in order of evaluation and application:</p>
<ol>
<li>We first enable the rewriting engine and specify a location for logging rewriting actions:
<pre><code>RewriteLog logs/hecker_error_log
RewriteLogLevel 0
RewriteEngine on
</code></pre>
<p>Note that <a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewriteloglevel">RewriteLogLevel</a> should be set to a non-zero value to enable logging; a loglevel value of 9 produces lots of output and can be very useful when debugging rewriting rules and/or learning how rewriting works.</p>
<p>Also note that the <a href="http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewritebase">RewriteBase</a> directive is not needed here because we are specifying rewriting rules in the <code>httpd.conf</code> configuration file; RewriteBase is typically needed only when specifying rewriting rules in a <code>.htaccess</code> file.</p></li>
<li>We start the process of handling trailing slashes by stripping all but the last two trailing slashes off of all URIs:
<pre><code>RewriteRule  ^(.*)///  $1//  [N,NS]
</code></pre>
<p>The form of this rule is a hack to get around the &#8220;greedy matching&#8221; problem described above, which prevents us from using the regular expression <code>^(.*)//+$</code> for this purpose. For any URI ending in at least three slashes, we rewrite the URI to remove a trailing slash and then use the <code>N</code> (&#8220;next round&#8221;) flag to restart rewriting from the beginning. Since this is the first rule this simply repeats the rule until no more than two trailing slashes remain on the URI.</p>
<p>Why stop at two, as opposed to removing all but one slash? Because we want to keep track of URIs with excess trailing slashes and force a redirect further on down; hence we leave at least one excess slash on the URI.</p></li>
<li>We protect against potential XST (cross-site tracing) attacks by rejecting all HTTP requests with either the TRACE or TRACK methods. (See the relevant <a href="http://www.whitehatsec.com/press_releases/WH-PR-20030120.pdf">WhiteHat Security press release</a> and a related <a href="http://archives.neohapsis.com/archives/vulnwatch/2003-q1/0035.html">VulnWatch list posting</a>.) The <code>F</code> flag causes Apache to immediately send back an HTTP response of 403 (&#8220;FORBIDDEN&#8221;). (The <code>L</code> flag is not needed here because the <code>F</code> flag causes rewriting to stop automatically.)
<pre><code>RewriteCond  %{REQUEST_METHOD}  ^(TRACE|TRACK)
RewriteRule  .*  -  [F,NS]
</code></pre>
</li>
<li>We eliminate duplicate trailing slashes at the end of URIs by redirecting to a new URI with only one trailing slash, to enforce canonical forms for directory URIs and also get rid of some potential problems when requesting Blosxom pages. Note that the first rule above guarantees that by the time we get to this rule we&#8217;ll never see more than two trailing slashes on a URI.
<pre><code>RewriteRule  ^(.*)//$  $1/  [L,R=301,NS]
</code></pre>
</li>
<li>We don&#8217;t attempt to do rewriting of URIs that are handled separately and do not correspond to either Blosxom content or content in our document root. In particular, we don&#8217;t attempt to rewrite <code>/cgi-bin/...</code> or <code>/usage/...</code> URIs (for which we define aliases for the virtual host associated with the site) or <code>/icons/...</code> URIs (for which <code>httpd.conf</code> defines a server-wide alias). Instead we just pass these URIs through to be handled by <code>mod_alias</code>.
<p>Note that there is also a server-wide alias for <code>/manual/</code> but we ignore this since we aren&#8217;t serving up a copy of the Apache documentation. (We include <code>/icons/</code> because it is needed for auto-generated directory listings.)</p>
<pre><code>RewriteRule  ^/(cgi-bin|icons|usage)(/.*)?$  -  [L,PT,NS]
</code></pre>
</li>
<li>We have Blosxom override the directory index for <code>/</code> (the home page) and other existing directories that should be treated as Blosxom categories instead. (References to other stuff under those directories, including <code>index.html</code> pages, are handled below.) Note that these rewrite rules must come first in order to handle these special cases before we check for existing directories in general. The URIs may or may not have a trailing slash; we need to handle both cases. We pass any trailing slash on to Blosxom, which can then force a redirect if the trailing slash is omitted; we do it this way (rather than forcing a redirect here) so that such redirection can be done consistently for all Blosxom categories.
<p>Also note that in order for Blosxom to produce the correct results the directories in question should <em>not</em> have any <code>index.*</code> files within the directory. Further on down we explicitly handle the case where an <code>index.html</code> page is requested (by redirecting to the canonical URI without <code>index.html</code>) but references to other <code>index.*</code> pages (e.g., <code>/misc/index.rss</code>) will not return the corresponding Blosxom page if a file of that name is already present in the directory.</p>
<pre><code>RewriteRule  ^(/?)$  /cgi-bin/blosxom.cgi$1  [L,PT,NS]
RewriteRule  ^/(blosxom|misc|mozilla)(/?)$  /cgi-bin/blosxom.cgi/$1$2  [L,PT,NS]
</code></pre>
</li>
<li>We next check to see if the URI corresponds to an existing directory under the document root. If so and the URI has a trailing slash then we simply stop rewriting and use the URI as is; otherwise we add a trailing slash and force a redirect to the new URI. Note that we can&#8217;t use <code>%{REQUEST_FILENAME}</code> in the directory existence check because that variable has not yet been set; hence we have to explicitly append the URI to the document root pathname.
<pre><code>ReWriteCond  %{DOCUMENT_ROOT}/$1  -d
RewriteRule  ^/(.*)/$  -  [L,NS]

ReWriteCond  %{DOCUMENT_ROOT}/$1  -d
RewriteRule  ^/(.*)$  /$1/  [L,R=301,NS]
</code></pre>
</li>
<li>If the URI is explicitly requesting an <code>index.html</code> file for an existing directory (e.g., <code>/foo/index.html</code>) then we force a redirect to the canonical URI for that directory (e.g., <code>/foo/</code>). Note that we catch the case where the URI (incorrectly) includes a trailing slash (e.g., <code>/foo/index.html/</code>), and redirect that to the canonical URI as well. Note also that skipping this rule on internal subrequests is particularly important; otherwise we&#8217;d cause major looping problems with URIs requesting directory indices.
<pre><code>ReWriteCond  %{DOCUMENT_ROOT}$1  -d
RewriteRule  ^(.*)/index.html(/?)$  $1/  [L,R=301,NS]
</code></pre>
</li>
<li>We check to see if the URI is explicitly requesting an existing HTML file (e.g., <code>/foo/bar.html</code> where <code>bar.html</code> exists). If so then we force a redirect to the canonical URI for the file, without the <code>.html</code> file extension (e.g., <code>/foo/bar</code>). As with the previous rule, we properly handle the case where the URI (incorrectly) includes a trailing slash, and we make sure to avoid problems with internal subrequests for directory index URIs.
<p>Finally, note a minor bug in the rule: It incorrectly rewrites a URI like <code>/foo/.html</code> that references a hidden file named <code>.html</code>; I chose to ignore this uncommon (and arguably nonsensical) case.</p>
<pre><code>ReWriteCond  %{DOCUMENT_ROOT}/$1.html  -f
RewriteRule  ^/(.*).html/?$  /$1  [L,R=301,NS]
</code></pre>
</li>
<li>We check to see if the URI corresponds to an existing HTML file after adding a <code>.html</code> extension. If so, we rewrite the URI to include the extension and pass it through to Apache.
<p>Again we properly handle the case where a trailing slash has been incorrectly added. Also note that we use the <code>OR</code> flag on the RewriteCond directive because we are checking to see if the URI references a regular file <em>or</em> a symbolic link; by default all RewriteCond directives must evaluate true in order for the corresponding rewriting rule to be invoked.</p>
<pre><code>RewriteCond  %{DOCUMENT_ROOT}/$1.html  -f [OR]
RewriteCond  %{DOCUMENT_ROOT}/$1.html  -l
RewriteRule  ^/(.*)/$  /$1.html  [L,R=301,NS]

RewriteCond  %{DOCUMENT_ROOT}/$1.html  -f [OR]
RewriteCond  %{DOCUMENT_ROOT}/$1.html  -l
RewriteRule  ^/(.*)$  /$1.html  [L,NS]
</code></pre>
</li>
<li>We check to see if the URI corresponds to any other (non-HTML) existing file or symbolic link and, if so, we force a redirect if a trailing slash is present.
<pre><code>RewriteCond  %{DOCUMENT_ROOT}/$1  -f [OR]
RewriteCond  %{DOCUMENT_ROOT}/$1  -l
RewriteRule  ^/(.*)/$  /$1  [L,R=301,NS]
</code></pre>
</li>
<li>Finally, we pass the URI on to Blosxom if it does not correspond to an existing (non-HTML) file or symlink.
<pre><code>RewriteCond  %{DOCUMENT_ROOT}/$1  !-f
RewriteCond  %{DOCUMENT_ROOT}/$1  !-l
RewriteRule  ^/(.*)$  /cgi-bin/blosxom.cgi/$1  [L,PT,NS]
</code></pre>
</li>
</ol>
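<p>To see how the rules interact, the chain can be approximated in Python. This is a rough simulation under several stated assumptions, not a substitute for testing against Apache itself: the document root is replaced by two hypothetical in-memory sets (<code>DIRS</code> and <code>FILES</code>), symbolic links are treated like files, the TRACE/TRACK rule and internal subrequests are left out, and Python&#8217;s <code>re</code> module stands in for Apache&#8217;s regular expression engine. The function names <code>rewrite</code> and <code>resolve</code> are made up for this example.</p>

```python
import re

# Hypothetical contents of the document root (paths relative to the root).
DIRS = {"/", "/foo", "/misc"}
FILES = {"/foo/bar.html", "/foo/baz.png", "/foo/index.html"}

def rewrite(path):
    """Process one request; return (action, path).

    action is "redirect", "alias", "blosxom" or "serve" (Apache serves the
    file or directory itself).
    """
    # Strip all but the last two trailing slashes, one slash per pass.
    while re.search(r"^(.*)///", path):
        path = re.sub(r"^(.*)///", r"\1//", path)
    # A doubled trailing slash is redirected to a single one.
    m = re.match(r"^(.*)//$", path)
    if m:
        return "redirect", m.group(1) + "/"
    # Aliased paths are left for mod_alias.
    if re.match(r"^/(cgi-bin|icons|usage)(/.*)?$", path):
        return "alias", path
    # The home page and selected directories are Blosxom categories.
    m = re.match(r"^(/?)$", path)
    if m:
        return "blosxom", "/cgi-bin/blosxom.cgi" + m.group(1)
    m = re.match(r"^/(blosxom|misc|mozilla)(/?)$", path)
    if m:
        return "blosxom", "/cgi-bin/blosxom.cgi/" + m.group(1) + m.group(2)
    # Existing directories: serve with a trailing slash, else redirect.
    m = re.match(r"^/(.*)/$", path)
    if m and "/" + m.group(1) in DIRS:
        return "serve", path
    m = re.match(r"^/(.*)$", path)
    if m and "/" + m.group(1) in DIRS:
        return "redirect", "/" + m.group(1) + "/"
    # An explicit index.html redirects to its directory.
    m = re.match(r"^(.*)/index\.html(/?)$", path)
    if m and (m.group(1) or "/") in DIRS:
        return "redirect", m.group(1) + "/"
    # An explicit .html extension on an existing file is redirected away.
    m = re.match(r"^/(.*)\.html/?$", path)
    if m and "/" + m.group(1) + ".html" in FILES:
        return "redirect", "/" + m.group(1)
    # An extensionless URI for an existing HTML file gets .html added.
    m = re.match(r"^/(.*)/$", path)
    if m and "/" + m.group(1) + ".html" in FILES:
        return "redirect", "/" + m.group(1) + ".html"
    m = re.match(r"^/(.*)$", path)
    if m and "/" + m.group(1) + ".html" in FILES:
        return "serve", "/" + m.group(1) + ".html"
    # Any other existing file loses a stray trailing slash.
    m = re.match(r"^/(.*)/$", path)
    if m and "/" + m.group(1) in FILES:
        return "redirect", "/" + m.group(1)
    # Whatever is not an existing file goes to Blosxom.
    m = re.match(r"^/(.*)$", path)
    if m and "/" + m.group(1) not in FILES:
        return "blosxom", "/cgi-bin/blosxom.cgi/" + m.group(1)
    return "serve", path

def resolve(path):
    """Follow redirects until the request is finally handled."""
    hops = [path]
    action, path = rewrite(path)
    while action == "redirect":
        hops.append(path)
        action, path = rewrite(path)
    return action, path, hops

for uri in ["/foo////", "/foo/bar.html", "/misc", "/2004/11/14"]:
    print(uri, resolve(uri))
```

<p>The list of hops returned by <code>resolve</code> shows each redirect the browser would be sent through before the request is finally served by Apache or handed to Blosxom.</p>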
<p>This concludes the rewriting rules.</p>
<h3>After (Apache) rewriting ends</h3>
<p>Once Apache has concluded all rewriting (including further rewriting done for new requests due to redirects) all URIs should be in one of the following forms, and are handled as indicated:</p>
<ul>
<li>Canonical URIs for existing directories (e.g., <code>/foo/</code>). Each such URI is subsequently processed by the <code>mod_dir</code> module, which will attempt to look for a directory index file as specified by the <a href="http://httpd.apache.org/docs-2.0/mod/mod_dir.html#directoryindex">DirectoryIndex</a> directive. The <code>mod_dir</code> module may generate internal subrequests, e.g., for <code>/foo/index.html</code>, but these will not invoke our rewriting rules. If <code>mod_dir</code> cannot find an index file then a directory listing will be generated by the <a href="http://httpd.apache.org/docs-2.0/mod/mod_autoindex.html"><code>mod_autoindex</code> module</a> if the <code>Indexes</code> option is specified for the <a href="http://httpd.apache.org/docs-2.0/mod/core.html#options">Options</a> directive.</li>
<li>URIs for existing HTML files (e.g., <code>/foo/bar.html</code>) rewritten from the canonical form of such URIs (e.g., <code>/foo/bar</code>). Each such URI is handled normally by Apache (<em>without</em> going through <code>mod_alias</code>, since there&#8217;s no need to do so).</li>
<li>Canonical URIs for existing non-HTML files (e.g., <code>/foo/baz.png</code>). Each such URI is handled normally by Apache (again without going through <code>mod_alias</code>).</li>
<li>URIs to be handled by Blosxom (e.g., <code>/cgi-bin/blosxom.cgi/foo/</code>). Each such URI is passed to <code>mod_alias</code> (to determine the location of the <code>cgi-bin</code> directory) and then the <code>blosxom.cgi</code> script is invoked with the path specified (e.g., <code>/foo</code>). Note that the path in question is not necessarily (yet) in canonical form, except that it is guaranteed not to have more than one trailing slash.</li>
<li>Other (non-Blosxom) <code>cgi-bin</code> URIs or other URIs needing further translation (e.g., <code>/icon</code> URIs for images in auto-generated directory listings). Each such URI is passed to <code>mod_alias</code>.</li>
</ul>
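<p>For reference, the directives involved in this post-rewriting handling might look something like the following minimal sketch. The file system paths (and the <code>/icons/</code> prefix) are assumptions for illustration only and are not taken from the actual server configuration.</p>
<pre><code># mod_dir: index file to look for in an existing directory
DirectoryIndex  index.html

# mod_autoindex: allow a generated listing when no index file exists
Options  +Indexes

# mod_alias: map /cgi-bin/ URIs (including blosxom.cgi) to the CGI
# directory (hypothetical path)
ScriptAlias  /cgi-bin/  /usr/local/apache2/cgi-bin/

# mod_alias: map the icon URIs used in generated directory listings
# (hypothetical path)
Alias  /icons/  /usr/local/apache2/icons/
</code></pre>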
<h3>Canonical URIs in Blosxom</h3>
<p>Once a URI is passed to Blosxom, we have to do the same sorts of URI checks, rewriting, and/or redirection as are done by the Apache URI rewriting rules. We divide this work into two separate plugins:</p>
<ul>
<li>The <a href="http://hecker.org/blosxom/extensionless">extensionless plugin</a> checks requests for which the requested URI lacks a flavour extension (e.g., <code>/foo/bar</code>) and adds the appropriate flavour extension (<code>.html</code> in our case) to the variable <code>$blosxom::path_info</code> if there is an individual entry corresponding to that URI.</li>
<li>The <a href="http://blog.hecker.org/2004/11/19/enforcing-canonical-uris-for-blosxom-pages/">canonicaluri plugin</a> checks URIs to see if they are in canonical form and forces a redirect if necessary.</li>
</ul>
<p>For more information see the documentation for those plugins.</p>
]]></content:encoded>
			<wfw:commentRss>http://frankhecker.com/2004/11/18/uri-rewriting-and-canonical-uris/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/523287496a16cae22d6337ab1aae4491?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">hecker</media:title>
		</media:content>
	</item>
		<item>
		<title>Accessibility statement for www.hecker.org</title>
		<link>http://frankhecker.com/2004/10/20/accessibility-statement-for-wwwheckerorg/</link>
		<comments>http://frankhecker.com/2004/10/20/accessibility-statement-for-wwwheckerorg/#comments</comments>
		<pubDate>Wed, 20 Oct 2004 12:00:41 +0000</pubDate>
		<dc:creator>hecker</dc:creator>
				<category><![CDATA[site]]></category>
		<category><![CDATA[accessibility]]></category>

		<guid isPermaLink="false">http://hecker.wordpress.com/?p=187</guid>
		<description><![CDATA[NOTE: This post refers to the old Blosxom-based version of my blog. I’ve left it unchanged since it may still be of historical interest. I&#8217;ve tried to make this site accessible to as many people as possible; here I describe the accessibility features of this site. (This statement is based on Mark Pilgrim&#8217;s accessibility statement.) [...]]]></description>
			<content:encoded><![CDATA[<p>NOTE: This post refers to the old Blosxom-based version of my blog. I’ve left it unchanged since it may still be of historical interest.</p>
<p>I&#8217;ve tried to make this site accessible to as many people as possible; here I describe the accessibility features of this site. (This statement is based on <a href="http://diveintomark.org/about/accessibility/">Mark Pilgrim&#8217;s accessibility statement</a>.) If you have any questions or comments about the accessibility of this site, feel free to email me at <a href="mailto:hecker@hecker.org">hecker@hecker.org</a>.</p>
<h4>Access keys</h4>
<p>Most browsers support jumping to specific links by typing special key combinations defined on the web site.  On Windows, you can press <acronym>ALT</acronym> + an access key; on Macintosh, you can press <acronym>Control</acronym> + an access key.</p>
<p>The home page and all archives define the following access keys:</p>
<dl>
<dt>Access key 1</dt>
<dd>Home page</dd>
<dt>Access key 4</dt>
<dd>Search box</dd>
<dt>Access key 9</dt>
<dd>Feedback</dd>
<dt>Access key 0</dt>
<dd>Accessibility statement</dd>
</dl>
<p>(Note that I didn&#8217;t define an access key to skip to the main content because the main content is already the first thing on the page.)</p>
<h4>Standards compliance</h4>
<ol>
<li>I intend to ensure that all pages are Bobby AAA approved. More on that later as I complete the necessary work.</li>
<li>I intend to ensure that all pages are Section 508 compliant. More on that later as I complete the necessary work.</li>
<li>The home page and blog archives validate as HTML 4.01 Strict. (Some older pages on the site have not yet been modified to validate properly.)</li>
<li>The home page, blog archives, and other pages use structured semantic markup.  For example, on pages with more than one entry, H2 tags are used for individual post titles, so that JAWS users can skip to the next post with ALT+INSERT+2.</li>
</ol>
<h4>Navigation aids</h4>
<ol>
<li>All blog archive pages have <code>rel=home</code> links to aid navigation in text-only browsers and screen readers; I may add <code>rel=prev</code>, <code>next</code>, and <code>up</code> links in the future. (Unfortunately <code>prev</code> and <code>next</code> in particular are not simple to implement in Blosxom, the blogging system I&#8217;m using.) Mozilla users can take advantage of this feature by selecting the View menu, Show/Hide, Site Navigation Bar, Show Only As Needed (or Show Always).  Opera 7 has similar functionality.</li>
<li>The home page and all archive pages include a search box (access key 4).</li>
</ol>
<h4>Links</h4>
<ol>
<li>Many links have title attributes which describe the link in greater detail, unless the text of the link already fully describes the target (such as the headline of an article).</li>
<li>Whenever possible, links are written to make sense out of context. Many browsers (such as JAWS, Home Page Reader, Lynx, and Opera) can extract the list of links on a page and allow the user to browse the list separately from the page.</li>
<li>Link text is never reused for different targets: two links with the same link text always point to the same address.</li>
<li>There are no <q><code>javascript:</code></q> pseudo-links.  All links can be followed in any browser, even if scripting is turned off.</li>
<li>There are no links that open new windows without warning.</li>
</ol>
<h4>Images</h4>
<ol>
<li>With one exception (a photo for my biography page) this site does not use images at all.</li>
</ol>
<h4>Visual design</h4>
<p>This site and all its archives use cascading style sheets for visual layout.</p>
<ol>
<li>The style sheets for this site do not specify a base font size, and use relative font sizes to specify the appearance of headings and related text. Text on this site should be resizable in any browser that permits text resizing.</li>
<li>If your browser or browsing device does not support stylesheets at all, the content of each page is still readable.</li>
</ol>
<h4>References</h4>
<p>In creating this site I made use of Mark Pilgrim&#8217;s <q><a title="30 days to a more accessible web site" href="http://diveintoaccessibility.org/">Dive Into Accessibility</a></q> book and related materials. See the book for a complete list of other references.</p>
]]></content:encoded>
			<wfw:commentRss>http://frankhecker.com/2004/10/20/accessibility-statement-for-wwwheckerorg/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/523287496a16cae22d6337ab1aae4491?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">hecker</media:title>
		</media:content>
	</item>
	</channel>
</rss>
