Browsing the archives for the SEO tag.


SEO Task #2 Completed: Remove Duplicate URLs

Blogging, SEO, Software

Following on the heels of my SEO Task #1, Complete Permalinks, knowledgeable commenter Vanessa Fox said:

You want exactly one URL to every post. Pages can have multiple URLs pointing [to] them. You want to avoid [this] by making sure there’s a unique URL for a page and that any other potential URLs for that page 301 redirect.

In my case the following two URLs pointed to the exact same content:

http://thepursuitofalife.com/2008/11/16/two-great-pictures/
and
http://thepursuitofalife.com/two-great-pictures/

I had enabled permalinks in WordPress, with a custom structure of /%postname%/. For some reason, though, the year/month/day/postname/ version of the URL still resolved to the same post as /postname/ instead of 301-redirecting to it.
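
To make the goal concrete: a dated path should map onto the bare /postname/ path, and a path that is already canonical should be left alone. Below is a minimal sketch of that mapping, assuming PHP 7.1 or later; `canonical_path` is a hypothetical helper name used only for illustration, not a WordPress function.

```php
<?php
// Hypothetical helper (not part of WordPress): strips a leading
// /YYYY/MM/DD date prefix from a request path.
function canonical_path(string $path): ?string
{
    // The named group "title" captures everything after the date prefix.
    $pattern = '#^/\d{4}/\d{2}/\d{2}(?P<title>/.*)$#';
    if (preg_match($pattern, $path, $groups) === 1) {
        return $groups['title'];
    }
    return null; // Not a dated URL, so no redirect is needed.
}

var_dump(canonical_path('/2008/11/16/two-great-pictures/')); // string "/two-great-pictures/"
var_dump(canonical_path('/two-great-pictures/'));            // NULL
```

Returning null for paths that are already canonical gives the caller a clear signal not to redirect, which is what prevents a redirect loop.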

The following custom 404 PHP script fixed the problem: (whitespace removed for HTML purposes)

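// The query string is expected to carry the originally requested full URL.
$qs = $_SERVER['QUERY_STRING'];
$found = 0;
// Locate the scheme separator, then the first '/' after the host name.
$pos = strrpos($qs, '://');
$pos = strpos($qs, '/', $pos + 4);
// Everything from that slash onward is the requested path.
$uri = substr($qs, $pos);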
$pattern = '/(\d{4})\/(\d{2})\/(\d{2})(?P&lt;title&gt;(\/.*))/';<br />
if (preg_match($pattern, $uri, $groups)) {<br />
$uri = $groups['title'];<br />
$found = 1;<br />
}<br />
if ($found == 1) {<br />
$host = $_SERVER['HTTP_HOST'];<br />
$host = 'http://' . $host;<br />
$uri = $host . $uri;<br />
$loc = 'Location: ' . $uri;<br />
header( "HTTP/1.1 301 Moved Permanently" );<br />
header( "Status: 301 Moved Permanently" );<br />
header( $loc ) ;<br />
exit(0); // Optional but suggested, to avoid any accidental output<br />
} else {<br />
$_SERVER['REQUEST_URI'] = $uri;<br />
$_SERVER['PATH_INFO'] = $_SERVER['REQUEST_URI'];<br />
include('index.php');<br />
}<br />
?></code></p>
<p>Originally I hadn’t included the <strong>if ($found == 1)</strong> check; without it the script redirected every request, including ones already at the canonical URL, so Firefox reported an endless redirect loop and couldn’t complete the request.</p>
<p>Now the incoming URL for the year/month/day/postname version has a 301 redirect to the “correct” version.</p>
<p>Fairly pleased so far.</p>
</div> <div class="post-foot"> <div class="post-comments"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/18.png" width="16" height="16" align="left" alt="" border="0" style="margin-right:4px;" /><a href="http://thepursuitofalife.com/seo-task-2-completed-remove-duplicate-urls/#respond" title="Comment on SEO Task #2 Completed: Remove Duplicate URLs">No Comments</a> </div> <span class="post-edit"></span> <span class="post-tags"><img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/36.png" width="16" height="16" align="left" title="Tags" alt="Tags" border="0" style="margin-right:4px;" /> <a href="http://thepursuitofalife.com/tag/canonicalization/" rel="tag">canonicalization</a>, <a href="http://thepursuitofalife.com/tag/permalinks/"
rel="tag">Permalinks</a>, <a href="http://thepursuitofalife.com/tag/seo/" rel="tag">SEO</a>, <a href="http://thepursuitofalife.com/tag/urls/" rel="tag">urls</a>, <a href="http://thepursuitofalife.com/tag/wordpress/" rel="tag">WordPress</a></span> </div> </div> <div class="sep"></div> <!--/post --> <div class="post" id="comments"> </div> <!-- post --> <div class="post" id="post-923"> <div class="post-title"> <h1><a href="http://thepursuitofalife.com/pathable-needs-to-work-on-seo/" rel="bookmark">Pathable Needs To Work On SEO</a></h1> </div> <div class="post-sub"> <div class="post-date"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/24.png" width="16" height="16" align="left" alt="" title="Date" border="0" style="margin-right:4px;"> Nov 23, 2008 </div> <!-- // post author, remove comments if you want it displayed <div class="post-author"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/39.png" width="16" height="16" align="left" alt="" title="Author" border="0" style="margin-right:3px;"> <a href="http://thepursuitofalife.com/author/admin/" title="Posts by admin">admin</a> </div> --> <div class="post-cat"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/34.png" width="16" height="16" align="left" alt="" title="Category" border="0" style="margin-right:4px;"> <a href="http://thepursuitofalife.com/category/seo/" title="View all posts in SEO" rel="category tag">SEO</a> </div> </div> <div class="post-text"> <p>Does anyone else think it’s weird that you can enter the terms</p> <blockquote><p>pathable seattle mind camp</p></blockquote> <p>into Google, and get four pages of results, NOT ONE of which is this page?</p> <p><a href="http://pathable.com/events/seattle-mind-camp">http://pathable.com/events/seattle-mind-camp</a></p> <p>The Pathable home page for Seattle Mind Camp 5 has had over 100 attendees sign up, probably thousands of visits over the last couple weeks, and STILL doesn’t show up in 
Google – at all.</p> <p><strong>UPDATE:</strong> It gets stranger. A search on the terms “pathable seattle bar camp” returns the Pathable BarCamp page as the #1 result. Weird. I wonder what the difference is?</p> </div> <div class="post-foot"> <div class="post-comments"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/18.png" width="16" height="16" align="left" alt="" border="0" style="margin-right:4px;" /><a href="http://thepursuitofalife.com/pathable-needs-to-work-on-seo/#comments" title="Comment on Pathable Needs To Work On SEO">1 Comment</a> </div> <span class="post-edit"></span> <span class="post-tags"><img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/36.png" width="16" height="16" align="left" title="Tags" alt="Tags" border="0" style="margin-right:4px;" /> <a href="http://thepursuitofalife.com/tag/pathable/" rel="tag">Pathable</a>, <a href="http://thepursuitofalife.com/tag/seattle-mind-camp/" rel="tag">Seattle Mind Camp</a>, <a href="http://thepursuitofalife.com/tag/seo/" rel="tag">SEO</a></span> </div> </div> <div class="sep"></div> <!--/post --> <div class="post" id="comments"> </div> <!-- post --> <div class="post" id="post-210"> <div class="post-title"> <h1><a href="http://thepursuitofalife.com/good-article-on-sitemap-xml/" rel="bookmark">Good Article on Sitemap XML</a></h1> </div> <div class="post-sub"> <div class="post-date"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/24.png" width="16" height="16" align="left" alt="" title="Date" border="0" style="margin-right:4px;"> Jun 1, 2008 </div> <!-- // post author, remove comments if you want it displayed <div class="post-author"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/39.png" width="16" height="16" align="left" alt=""
title="Author" border="0" style="margin-right:3px;"> <a href="http://thepursuitofalife.com/author/anthony-stevens/" title="Posts by Anthony Stevens">Anthony Stevens</a> </div> --> <div class="post-cat"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/34.png" width="16" height="16" align="left" alt="" title="Category" border="0" style="margin-right:4px;"> <a href="http://thepursuitofalife.com/category/web/" title="View all posts in Web" rel="category tag">Web</a> </div> </div> <div class="post-text"> <p>The <a href="http://www.techstars.org/community/2008/05/the-importance-of-submitting-a-complete/">article itself, written by Bryan Crow</a>, doesn’t describe a lot, but the comments section is really helpful and Bryan includes some great links.</p> <p>It’s nice to get pointers from people who really know this stuff.</p> <p>(h/t <a href="http://andrewhyde.net/techstars-community/">Andrew Hyde</a>, via Twitter)</p> </div> <div class="post-foot"> <div class="post-comments"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/18.png" width="16" height="16" align="left" alt="" border="0" style="margin-right:4px;" /><a href="http://thepursuitofalife.com/good-article-on-sitemap-xml/#respond" title="Comment on Good Article on Sitemap XML">No Comments</a> </div> <span class="post-edit"></span> <span class="post-tags"><img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/36.png" width="16" height="16" align="left" title="Tags" alt="Tags" border="0" style="margin-right:4px;" /> <a href="http://thepursuitofalife.com/tag/seo/" rel="tag">SEO</a>, <a href="http://thepursuitofalife.com/tag/sitemaps/" rel="tag">Sitemaps</a></span> </div> </div> <div class="sep"></div> <!--/post --> <div class="post" id="comments"> </div> <!-- post --> <div
class="post" id="post-653"> <div class="post-title"> <h1><a href="http://thepursuitofalife.com/visual-jquery-meet-seo/" rel="bookmark">Visual JQuery, meet SEO</a></h1> </div> <div class="post-sub"> <div class="post-date"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/24.png" width="16" height="16" align="left" alt="" title="Date" border="0" style="margin-right:4px;"> Feb 13, 2008 </div> <!-- // post author, remove comments if you want it displayed <div class="post-author"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/39.png" width="16" height="16" align="left" alt="" title="Author" border="0" style="margin-right:3px;"> <a href="http://thepursuitofalife.com/author/anthony-stevens/" title="Posts by Anthony Stevens">Anthony Stevens</a> </div> --> <div class="post-cat"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/34.png" width="16" height="16" align="left" alt="" title="Category" border="0" style="margin-right:4px;"> <a href="http://thepursuitofalife.com/category/web/" title="View all posts in Web" rel="category tag">Web</a> </div> </div> <div class="post-text"> <p>What I would give for an SEO-friendly <a href="http://visualjquery.com">Visual JQuery website</a>. You always get hits when you’re searching Google for JQuery terms; that’s not the problem. 
Rather, when you click on a link to take you to the website, you start out at the home page EVERY TIME.</p> <p>*urge to throttle* <img src='http://thepursuitofalife.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p> <p>mod_rewrite anyone?</p> </div> <div class="post-foot"> <div class="post-comments"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/18.png" width="16" height="16" align="left" alt="" border="0" style="margin-right:4px;" /><a href="http://thepursuitofalife.com/visual-jquery-meet-seo/#comments" title="Comment on Visual JQuery, meet SEO">3 Comments</a> </div> <span class="post-edit"></span> <span class="post-tags"><img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/36.png" width="16" height="16" align="left" title="Tags" alt="Tags" border="0" style="margin-right:4px;" /> <a href="http://thepursuitofalife.com/tag/seo/" rel="tag">SEO</a>, <a href="http://thepursuitofalife.com/tag/visual-jquery/" rel="tag">Visual JQuery</a></span> </div> </div> <div class="sep"></div> <!--/post --> <div class="post" id="comments"> </div> <!-- post --> <div class="post" id="post-69"> <div class="post-title"> <h1><a href="http://thepursuitofalife.com/65/" rel="bookmark">Google is finally indexing me</a></h1> </div> <div class="post-sub"> <div class="post-date"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/24.png" width="16" height="16" align="left" alt="" title="Date" border="0" style="margin-right:4px;"> Nov 8, 2007 </div> <!-- // post author, remove comments if you want it displayed <div class="post-author"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/39.png" width="16" height="16" align="left" alt="" title="Author" border="0" style="margin-right:3px;"> <a
href="http://thepursuitofalife.com/author/anthony-stevens/" title="Posts by Anthony Stevens">Anthony Stevens</a> </div> --> <div class="post-cat"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/34.png" width="16" height="16" align="left" alt="" title="Category" border="0" style="margin-right:4px;"> <a href="http://thepursuitofalife.com/category/computing/" title="View all posts in Computing" rel="category tag">Computing</a> </div> </div> <div class="post-text"> <p>I just checked Google for a term from a recent post, “Spire Viro”, and came up second in the search results. Not bad! I had unwittingly turned off indexing for my blog when I created it, and turned it back on only a couple of days ago. I am steeling myself for the inevitable onslaught of trolls, flames, and Viagra advertisements.</p> <p><img src="http://xidey.files.wordpress.com/2007/11/110907-0332-1.png" /></p> </div> <div class="post-foot"> <div class="post-comments"> <img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/18.png" width="16" height="16" align="left" alt="" border="0" style="margin-right:4px;" /><a href="http://thepursuitofalife.com/65/#respond" title="Comment on Google is finally indexing me">No Comments</a> </div> <span class="post-edit"></span> <span class="post-tags"><img src="http://thepursuitofalife.com/wp-content/themes/disciple/images/36.png" width="16" height="16" align="left" title="Tags" alt="Tags" border="0" style="margin-right:4px;" /> <a href="http://thepursuitofalife.com/tag/google/" rel="tag">Google</a>, <a href="http://thepursuitofalife.com/tag/seo/" rel="tag">SEO</a></span> </div> </div> <div class="sep"></div> <!--/post --> <div class="post" id="comments"> </div> <div class="post"> <div style="float:left;"></div> <div style="float:right;"></div> </div> <!-- /main column -->
</div> </div> </div> </body> </html>