  • Anthony Stevens

SEO Task #2 Completed: Remove Duplicate URLs

Blogging, SEO, Software

Following up on my SEO Task #1, Complete Permalinks, knowledgeable commenter Vanessa Fox said:

You want exactly one URL to every post. Pages can have multiple URLs pointing [to] them. You want to avoid [this] by making sure there’s a unique URL for a page and that any other potential URLs for that page 301 redirect.

In my case the following two URLs pointed to the exact same content:

http://thepursuitofalife.com/2008/11/16/two-great-pictures/
and
http://thepursuitofalife.com/two-great-pictures/
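
As a rough illustration of how the two forms relate, the sketch below strips a leading year/month/day prefix from a path to obtain the /postname/ form. The function name canonical_path is hypothetical, and the sketch assumes the date prefix always has the shape /yyyy/mm/dd/; it is not the script used on this blog.

```php
<?php
// Hypothetical helper (not from the original post): strip a leading
// /yyyy/mm/dd date prefix from a path and return the /postname/ form.
// Returns null when the path has no date prefix, i.e. needs no redirect.
function canonical_path(string $path): ?string
{
    // Anchored at the start so only a leading date prefix is removed.
    if (preg_match('#^/\d{4}/\d{2}/\d{2}(?P<title>/.*)$#', $path, $m)) {
        return $m['title'];
    }
    return null;
}

echo canonical_path('/2008/11/16/two-great-pictures/'), PHP_EOL; // prints /two-great-pictures/
```

Returning null for a path that is already in /postname/ form makes it easy for a caller to tell "redirect" apart from "serve as-is".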

I had enabled permalinks in WordPress, with a custom structure of /%postname%/. For some reason, though, the year/month/day/postname/ version of the URL still resolved to the same post as /postname/.

The following custom 404 PHP script fixed the problem:

<?php
// The query string carries the originally requested URL; find its scheme separator.
$qs = $_SERVER['QUERY_STRING'];
$pos = strrpos($qs, '://');
$found = 0;
// Skip past the host to the first '/' of the path.
$pos = strpos($qs, '/', $pos + 4);
$uri = substr($qs, $pos);
// Match /yyyy/mm/dd/title and capture the title part, leading slash included.
$pattern = '/(\d{4})\/(\d{2})\/(\d{2})(?P<title>(\/.*))/';
if (preg_match($pattern, $uri, $groups)) {
    $uri = $groups['title'];
    $found = 1;
}
if ($found == 1) {
    // Dated URL: send a permanent redirect to the /postname/ version.
    $host = $_SERVER['HTTP_HOST'];
    $host = 'http://' . $host;
    $uri = $host . $uri;
    $loc = 'Location: ' . $uri;
    header("HTTP/1.1 301 Moved Permanently");
    header("Status: 301 Moved Permanently");
    header($loc);
    exit(0); // Optional but suggested, to avoid any accidental output
} else {
    // Anything else: hand the original path to WordPress unchanged.
    $_SERVER['REQUEST_URI'] = $uri;
    $_SERVER['PATH_INFO'] = $_SERVER['REQUEST_URI'];
    include('index.php');
}
?>

Originally I hadn’t put in the if ($found == 1) part, but without it Firefox told me I was in an endless loop and couldn’t complete the request.

Now a request for the year/month/day/postname version of a URL gets a 301 redirect to the “correct” version.

Fairly pleased so far.
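
One way to confirm the redirect is to inspect the response headers for the dated URL. The sketch below is only an illustration: the helper name is_permanent_redirect_to is hypothetical, and it assumes the headers come from a single response that was fetched without following redirects.

```php
<?php
// Hypothetical helper (not from the original post): given the header lines
// of one unfollowed HTTP response, report whether it is a 301 whose
// Location header equals $expected.
function is_permanent_redirect_to(array $headers, string $expected): bool
{
    // The first line is the status line, e.g. "HTTP/1.1 301 Moved Permanently".
    if (!isset($headers[0]) || !preg_match('#^HTTP/\S+\s+301\b#', $headers[0])) {
        return false;
    }
    foreach ($headers as $line) {
        // Header names are case-insensitive.
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:'))) === $expected;
        }
    }
    return false;
}

// Possible usage (needs network access, so left commented out):
// $ctx = stream_context_create(['http' => ['follow_location' => 0]]);
// $headers = get_headers('http://thepursuitofalife.com/2008/11/16/two-great-pictures/', false, $ctx);
// var_dump(is_permanent_redirect_to($headers, 'http://thepursuitofalife.com/two-great-pictures/'));
```

Turning off follow_location matters here: otherwise the headers of the final page would be mixed in with those of the redirect itself.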