SEO Werkt!


Archive for the ‘search ranking’ tag

Online Marketing News: Olympics Have Gone Social, Facebook Clicks From Bots, Improve Search Ranking Via Social Media, Download Twitter History


OTL: Twitter Olympics

The broadcasting of this year’s Olympics has been nothing short of disastrous. NBC is not reporting the Olympic Games in real time, and part of the reason is to protect the interests of advertisers and sponsors. The term #NBCFAIL has even begun trending on Twitter. Social media is riddled with the mistakes and blunders made by the company in the past weeks. There are now discussions about placing restrictions on Olympians’ tweets, to avoid the “cat getting out of the bag” before the news is broadcast.

Top Online News of the Week

Company Says 80% of Their Facebook Ad Clicks Were From Bots
After becoming frustrated with Facebook advertising, Limited Run has decided to ditch its Facebook page altogether. Limited Run found that it could verify only about 20% of the clicks that were supposedly being converted to users showing up on its website. Via Cnet.

How Microsoft Will Profit Off Webmail Without Reading Your Inbox
Microsoft is taking on Gmail! In order to target users effectively, Gmail scans emails and presents ads related to the content of the messages. This practice has alienated some potential users. Find out what Microsoft is doing to battle the webmail giant. Via Wired.

LinkedIn on Password Theft 
According to LinkedIn CEO Jeff Weiner, the recent thefts of users’ passwords haven’t hurt the company’s growth. The numbers seem to back up the CEO’s claims. Within the last quarter alone, LinkedIn has added 13 million new members, bringing its total to 174 million members. Via TechCrunch.

Facebook: 85% of Our Users Are Creating Content
Facebook is now claiming to have roughly 955 million active monthly users.  In a recent interview, Robert D’Onofrio shared that about 85% of those monthly active users are creating some form of content.  Via AdAge.

Gold Medal News From The TopRank Team

Mike Yanke – A Trojan Horse to Infiltrate Facebook?  Google Purchases Wildfire
As the overall perception of Google+ continues to position it in the ‘social media also-ran’ category, and as Google continues to be unable to connect to the APIs of popular social networks, the search giant may have found another way to play socially. In this story by Kate Kaye of ClickZ, we learn more about Google’s acquisition of social media campaign management firm Wildfire. While it is not yet clear exactly what will come of this relationship, Kaye suggests that Wildfire could serve as a Trojan horse into Facebook, acting as the bridge between Google and the social data it craves. Via ClickZ.

Sam Giehll – Winning the Election is Just a Matter of Four Screens
In an election year, we all tend to be more aware of the news buzz than usual, no matter what our political affiliations are. Find out more about how that “buzz” translates into our digital world in an age of tablets, smart phones and laptops and how political combatants should be using them to their advantage in a constantly connected era.  Via Marketing Pilgrim.

Shawna Kenyon – Facebook testing ‘Save for Later’ Feature
Facebook is testing a new feature for both desktop and mobile. Save for Later will allow a user to add stories to a ‘Save’ folder, similar to adding a tweet to your Favorites on Twitter. However, Facebook users won’t be notified when and if something they post is saved. As with other features Facebook has tested, there is no assurance at this point that this will be rolled out beyond the test phase. Via Mashable.

Kodi Osmond – Grammar, Social Media, and You:  11 Common Mistakes to Avoid
I admit that in the heat of feverish text messaging and sharing, I go against my own grain, use improper grammar and punctuation, and bank on people getting the gist. This epidemic has been the subject of several news stories lately. Not only has the age of texting shown itself on grammar tests in schools, but bad grammar can prevent you from getting a job, overshadow the level of knowledge we as professionals should convey, and show (or result in) a lack of respect from your audience. Let’s take a moment to reflect on what we learned in school and make our English teachers proud! Via InformationWeek.

Thom Craver – Facebook Casebook: A Free Facebook Case Study Resource
Last week Facebook reported that 84% of its overall revenues come from advertising. As Facebook strives to add products and services to meet the demands of advertisers, many brands are using Facebook in a variety of ways to engage customers, generate sales leads, and boost sales revenue. Via ClickZ.

Brian Larson – Tips for Improving Search Ranking Through Social Media
Post-Panda, many online marketers have de-emphasized traditional link acquisition efforts in favor of social links. The question for many is ‘how do I use social to improve my search rankings?’ Social Media Examiner aims to answer that question in this post. Spoiler alert: quality content is the foundation. Via Social Media Examiner.

Jolina Pettice – Twitter Home to Majority of 10 Million+ Monthly Global Big Brand Mentions [Report]
Find out which 10 brands have the most mentions worldwide, which channels are experiencing the most growth among the Fortune 100, and why companies are starting to create several accounts within the same network. Via Search Engine Watch.

Sara Duane-Gladden – You’ll Soon Be Able to Download ALL Of Your Tweets Directly From Twitter
Do you have a lot of tweets? If you’re a prolific tweeter like I am, you may have noticed that you can’t view more than a few thousand of your tweets. Soon, Twitter will be offering a way for you to download all of your tweets directly from Twitter. Via mediabistro.

Time to Weigh In: Do you think that a ban should be placed on Olympians which would restrict them from posting information about wins or losses? If you’re an avid Gmail user, are you bothered by the ads that appear in the banner of your inbox? Will you be taking advantage of Twitter’s new option to download your tweets, and what do you see as the benefits of this option?



© Online Marketing Blog, 2012. |
Online Marketing News: Olympics Have Gone Social, Facebook Clicks From Bots, Improve Search Ranking Via Social Media, Download Twitter History | http://www.toprankblog.com

3 Ways to Use Social Media to Improve Your Search Rankings


Are you happy with the traffic coming to your blog or website?

Have you kept up with the changes in search engine optimization (SEO)?

This is essential for most businesses today.

Keep reading to learn how Google changes are making social media more important and what you can do about it.

Recent Google Changes Put Spotlight on Social Media

On April 24, 2012, webmasters around the world were dinged by the Google “Penguin” update, one of the latest in a series of algorithm modifications designed to weed out low-value results from the natural search engine results pages (SERPs).


Google's Penguin algorithm update was a recent search engine change designed to eliminate spam from the natural search results.

Whether or not your website’s traffic flow was affected by this update, it’s important to understand what this change entails, why it came about and how you can compensate for it using social media marketing—as Google’s heightened focus on web spam is likely to result in similar updates in the future.

The Google Penguin update aimed to minimize the SEO impact of three web marketing elements:

  • Low-quality and manufactured website backlinks
  • On-page over-optimization
  • “Black hat” (or illicit) SEO techniques

Of primary interest to webmasters who use social networking platforms to promote their websites is the first item on this list—the presence of artificially created backlinks. In a post-Penguin environment, building natural backlinks should be a primary objective for webmasters.

Adapt your social media marketing with the following techniques to improve the SEO of your website.

#1: Use Social Networks to Generate Strong Content Ideas

Digital marketing today needs to be “natural.” Ideally, instead of building manufactured backlinks (as was a primary focus of past SEO best practices), your site should acquire links in a natural way—as it would if you did absolutely no promotional work whatsoever.

The key to building natural backlinks is to publish high-quality content that people will be inclined to share naturally on social media networks. After all, if you produce mediocre content, there’s no incentive for readers to link back to or share your website, making it even harder to get the backlinks needed to rank well in the new natural search results.

Social networking websites are a great place to find ideas for future content marketing pieces. Charlene Kingston outlines a great process for uncovering content ideas in your retweets in her “8 Ways to Discover Content Ideas From Your Readers” post, but you can also enter a series of prompts into Twitter’s search bar to uncover the topics your community has demonstrated an interest in.


Results for a search "How do I SEO?"

As an example, the above image shows a few of the results displayed for the search query “How do I SEO?” which highlights several possible topics for future website articles.

Further ideas could be generated by entering any of the following prompts into Twitter search:

  • “How to” + industry keyword
  • “Why is” + industry keyword
  • “Question” + industry keyword
  • “?” + industry keyword

Pay special attention to search results that don’t include outbound links, as tweets that do include links are often, but not always, self-promotional in nature. In addition, look for results that have been retweeted by others, as these demonstrate audience appeal.

Once you’ve identified a few potential article topics, produce quality content on the subject and promote your posts using the technique described in Step #2.

#2: Encourage Backlink Building Via Social Media Presence

Social networking websites are a great place to build natural backlinks to your website, as both Google and Bing confirmed that they track publicly shared links on Facebook and Twitter.

To encourage the creation of these valuable links, you’ll want to take the following actions:

  • Build your networks on Facebook and Twitter, as more followers result in more opportunities for link shares.
  • Tie your blog to your Facebook and Twitter accounts (or use a tool designed specifically for this purpose) so that a link to each of your new posts is automatically created on your profile.
  • Use an update-scheduling tool like Buffer to create repeat announcements of new blog posts that will allow you to reach users who are active at different times.

In addition, take the time to set up your Google+ page. There’s some indication that the number of “+1” votes your articles receive is a factor that’s weighted in the natural search ranking algorithms. Boosting your presence here may help to ensure stronger SEO for your website.


Google+ pages can be created for different types of businesses.

#3: Build a Social Media Following to Reduce Reliance on Natural Search Traffic

One final thing to consider when using social media websites to protect yourself from future Penguin-like Google updates is the potential for social networking traffic to minimize your reliance on visitors from Google and the other search engines.

To see just how important this can be, take a look at the following list of some of the biggest post-Penguin losers (in terms of natural search visibility), as compiled by web data firm Searchmetrics:


Losers in Google's Penguin update experienced significant drops in search visibility.

In particular, take a look at the impact on great-quotes.com, which experienced a 94% decline in SEO visibility. If natural search traffic from Google was the site’s primary source of visitors, this single algorithm update could have dramatically decreased the company’s revenue.

To diversify your traffic base and increase the number of website visitors you receive from social media platforms, take the following actions:

  • Include prominent social sharing buttons at both the top and bottom of each blog post on your website (or use a scrolling option that moves down the page alongside your readers).
  • Add a direct appeal to your readers at the end of each blog post or email newsletter to encourage them to share your articles on their social networking profiles if they’ve found them useful.
  • Use social media to brand yourself as an authority figure within your industry.

A blog article includes both a call to action and social sharing buttons.

This final recommendation serves two purposes. In the article referenced earlier on how Google and Bing weight social signals in their ranking algorithms, both search engines assert that they attempt to quantify the relative “authority” of a user as a part of their measurement of social signals.

However, being recognized as one of the “go-to” resources in your industry offers potent benefits from a traffic generation standpoint as well. When you’re recognized as a niche thought leader, your site will naturally attract repeat visitors and referrals—both of which can be great sources of traffic that aren’t subject to Google’s changing whims.

What do you think? What social media marketing actions are you taking to minimize the impact of search engine updates? Share them in the comments section below!

Strong words: How to craft premium content


Penguins and pandas, oh my!

There’s no way around the fact that the Google Panda and Penguin updates were a big blow to the way that many businesses executed their online marketing strategies. The move took many businesses by surprise, especially those that thought they only needed to bang out a few keyword-rich articles a day in order to stay up on the Google search rankings.

Web content now has to be more focused and substantial, featuring a minimal number of links and less emphasis on SEO. It’s Google’s way of leveling the playing field for online businesses and websites that didn’t stand a chance against bigger players before. Now anyone who can produce and market solid content in their niche has a chance at upping their Google search ranking. That’s a good thing, and the sooner you believe that, the sooner you can get started on a new content production model.

Strong and engaging web content looks like this

Before Panda and Penguin, businesses could get by with a lot of questionable content. A post only needed to be keyword rich and uploaded on the right sites, and voila! Your ranking would get bumped up. But those days are over, and it’s time to recognize the power of strong content.

I’m talking about posts that engage the reader on real and pressing issues related to the particulars of your business. Strong web content is all about striking a chord with your customer base, and that means writing about topics that they will care about. As a business, you should be on top of all the current events and news alerts pertaining to your field; this is the kind of information that makes for strong content.

Let’s consider an example of strong content production in action.

Say you run an online tech business that specializes in apps and software for financial management. In terms of content production, you’d want to post articles in a site blog, or guest blog on related tech sites, about the latest innovations in your business and in the industry at large. New features in your business’s financial management app, commentary on the financial crises in Europe or the handwringing in the US, how-to guides on money management and saving strategies: these are all completely viable ways of generating content. The point is that you write substantial material on topics that directly relate to your business model and its field of interest.

Where you’ll get in trouble

What you DON’T want to do is write about subjects that have no bearing on your business whatsoever. Writing filler content just for the sake of posting material online with your company’s name on it will not only get you in trouble with Google, it’ll ultimately drive down your ranking on the search engine. To continue the above example with the online tech company, let’s say that you try to produce more content by throwing caution to the wind and writing about whatever interests you at that moment. Celebrity gossip, pet care, college sports, whatever: you think that by writing about popular subjects outside of your business, more people might see your articles and follow the backlinks to your main site.

Unfortunately, the Penguin and Panda updates have done away with such practices. That example would be flagged and regarded as spam. And why would readers treat it as anything else anyway? After all, you’re a tech company writing about celebrity gossip; why should anyone take you seriously?

In Conclusion

Success in the post-Penguin era will be defined by those businesses and blogs that produce great content about their niche for people who care about their niche. It’ll be a big change for some businesses, but you should just accept the change as another test of the strength of your enterprise. If you can master the art of content creation in this era, then your business will surely go far.

Have any comments or questions? Please let me know! Katheryn Rivas is a freelance education writer and blogger. She loves to dabble in a variety of education topics, although her main interests include online learning and trends. She welcomes your comments at katherynrivas87@gmail.com.

How To Optimize Your Headline for Better Search Ranking


When matching search terms to articles, Google looks more closely at headlines than the rest of the text, so it’s important to know how to make your headline more noticeable.

First, make sure your keywords are in the headline, and as close to the beginning as possible. “The importance of a keyword exponentially decays the further to the right it shows up in a title,” said David Wolf, an SEO expert and CEO of InBusiness, Inc.

Next, make sure your headlines and subheads are specific and on-point. “Headlines that are not specific enough do not come up in searches,” said Lisa Hickey, publisher of The Good Men Project, an online men’s magazine. “You want your headline to communicate one simple idea, specific enough so that people know what the post is about. This will not only help SEO, but will also help make the article more sharable. And the more it’s shared, the more search engines will see it as a post worthy of showing up in searches.”

Get more tips in 5 Ways to Improve Your Article’s SEO. [subscription required]




Google chairman visits European Commissioner to discuss antitrust allegations


Google chairman Eric Schmidt paid a visit to Brussels today to meet with European Commissioner Joaquín Almunia in person.

Google has been prancing in a complicated minuet with the European Commission over its business practices, which some European companies say violate antitrust laws.

While some sources say the EC is getting ready to slap the search giant with a 400-page nastygram, Google is still wearing its poker face and stating it does not expect any formal objections from regulators.

The EC started a wide-ranging investigation of the search giant’s business practices in November 2010. At that time, several parties were alleging that Google was taking unfair advantage of what they called “a dominant position in online search.”

These parties stated that Google was “lowering the search ranking of unpaid search results of competing services” that competed with Google’s own offerings (for example, lowering the ranking of a shopping and product search website while raising the ranking of Google Shopping results). Another allegation is that Google set a lower Quality Score for its competitors’ sponsored links (Quality Scores help the company to set its ad prices; a lower score would mean a lower ad price) and that Google “imposes exclusivity obligations on advertising partners, preventing them from placing certain types of competing ads on their web sites, as well as on computer and software vendors, with the aim of shutting out competing search tools.”

The EC has previously said that although its investigation is quite formal and the allegations correspond quite specifically to EU laws, “This initiation of proceedings does not imply that the Commission has proof of any infringements. It only signifies that the Commission will conduct an in-depth investigation of the case as a matter of priority.”

So far, Google has turned over thousands of documents in compliance with the investigation.

Some sources, such as the Financial Times, are reporting that the EC’s investigation is about to yield a formal statement of objection, which would significantly escalate the import of the proceedings.

However, to date, Google has had no clear indication that a statement of objection is forthcoming.

A Googler told us that Schmidt’s visit with the Commissioner was “uneventful.”

“We frequently meet with policy makers and regulators around the world. We’re always happy to discuss issues affecting our industry and explain how our business works,” the company said in an official statement.

Image courtesy of Jolie O’Dell.

Filed under: VentureBeat



Introduction To URL Rewriting


Many Web companies spend hours and hours agonizing over the best domain names for their clients. They try to find a domain name that is relevant and appropriate, sounds professional yet is distinctive, is easy to spell and remember and read over the phone, looks good on business cards and is available as a dot-com.

Or else they spend thousands of dollars to purchase the one they really want, which just happened to be registered by a forward-thinking and hard-to-find squatter in 1998.

They go through all that trouble with the domain name but neglect the rest of the URL, the element after the domain name. It, too, should be relevant, appropriate, professional, memorable, easy to spell and readable. And for the same reasons: to attract customers and improve in search ranking.

Fortunately, there is a technique called URL rewriting that can turn unsightly URLs into nice ones, with a lot less agony and expense than picking a good domain name. It enables you to fill out your URLs with friendly, readable keywords without affecting the underlying structure of your pages.

This article covers the following:

  1. What is URL rewriting?
  2. How can URL rewriting help your search rankings?
  3. Examples of URL rewriting, including regular expressions, flags and conditionals;
  4. URL rewriting in the wild, such as on Wikipedia, WordPress and shopping websites;
  5. Creating friendly URLs;
  6. Changing page names and URLs;
  7. Checklist and troubleshooting.

What Is URL Rewriting?

If you were writing a letter to your bank, you would probably open your word processor and create a file named something like lettertobank.doc. The file might sit in your Documents directory, with a full path like C:\Windows\users\julie\Documents\lettertobank.doc. One file path = one document.

Similarly, if you were creating a banking website, you might create a page named page1.html, upload it, and then point your browser to http://www.mybanksite.com/page1.html. One URL = one resource. In this case, the resource is a physical Web page, but it could be a page or product drawn from a CMS.

URL rewriting changes all that. It allows you to completely separate the URL from the resource. With URL rewriting, you could have http://www.mybanksite.com/aboutus.html taking the user to …/page1.html or to …/about-us/ or to …/about-this-website-and-me/ or to …/youll-never-find-out-about-me-hahaha-Xy2834/. Or to all of these. It’s a bit like shortcuts or symbolic links on your hard drive. One URL = one way to find a resource.

With URL rewriting, the URL and the resource that it leads to can be completely independent of each other. In practice, they’re usually not wholly independent: the URL usually contains some code or number or name that enables the CMS to look up the resource. But in theory, this is what URL rewriting provides: a complete separation.

How Does URL Rewriting Help?

Can you guess what this Web page sells?

http://www.diy.com/diy/jsp/bq/nav.jsp?action=detail&fh_secondid=11577676

B&Q went to all the trouble and expense of acquiring diy.com and implementing a stock-controlled e-commerce website, but left its URLs indecipherable. If you guessed “brown guttering,” you might want to consider playing the lottery.

Even when you search directly for “miniflow gutter brown” on Google UK, B&Q’s page comes up only seventh in the organic search results, below much smaller companies, such as a building supplier with a single outlet in Stirlingshire. B&Q has 300+ branches and so is probably much bigger in budget, size and exposure, so why is it not doing as well for this search term? Perhaps because the other search results have URLs like http://www.prof…co.uk/products/brown-miniflo-gutter-148/; that is, the URL itself contains the words in the search term.


Almost all of these results on Google have the search term in their URLs (highlighted in green). The one at the bottom does not.

Looking at the URL from B&Q, you would (probably correctly) assume that a file named nav.jsp within the directory /diy/jsp/bq/ is used to display products when given their ID number, 11577676 in this case. That is the resource intimately tied to this URL.

So, how would B&Q go about turning this into something more recognizable, like http://www.diy.com/products/miniflow-gutter-brown/11577676, without restructuring its whole website? The answer is URL rewriting.

Another way to look at URL rewriting is as a thin layer that sits on top of a website, translating human- and search-engine-friendly URLs into actual URLs. Doing it is easy because it requires hardly any changes to the website’s underlying structure: no moving files around or renaming things.

URL rewriting basically tells the Web server that
/products/miniflow-gutter-brown/11577676 should show the Web page at: /diy/jsp/bq/nav.jsp?action=detail&fh_secondid=11577676,
without the customer or search engine knowing about it.

Many factors (or “signals”), of course, determine the search ranking for a particular term, over 200 of them according to Google. But friendly and readable URLs are consistently ranked as one of the most important of those factors. They also help humans to quickly figure out what a page is about.

The next section describes how this is done.

How To Rewrite URLs

Whether you can implement URL rewriting on a website depends on the Web server. Apache usually comes with the URL rewriting module, mod_rewrite, already installed. The set-up is very common and is the basis for all of the examples in this article. ISAPI Rewrite is a similar module for Windows IIS but requires payment (about $100 US) and installation.

The Simplest Case

The simplest case of URL rewriting is to rename a single static Web page, and this is far easier than the B&Q example above. To use Apache’s URL rewriting function, you will need to create or edit the .htaccess file in your website’s document root (or, less commonly, in a subdirectory).

For instance, if you have a Web page about horses named Xu8JuefAtua.htm, you could add these lines to .htaccess:

RewriteEngine On
RewriteRule   horses.htm   Xu8JuefAtua.htm

Now, if you visit http://www.mywebsite.com/horses.htm, you’ll actually be shown the Web page Xu8JuefAtua.htm. Furthermore, your browser will remain at horses.htm, so visitors and search engines will never know that you originally gave the page such a cryptic name.

Introducing Regular Expressions

In URL rewriting, you need only match the path of the URL, not including the domain name or the first slash. The rule above essentially tells Apache that if the path contains horses.htm, then show the Web page Xu8JuefAtua.htm. This is slightly problematic, because you could also visit http://www.mywebsite.com/reallyfasthorses.html, and it would still work. So, what we really need is this:

RewriteEngine On
RewriteRule   ^horses.htm$   Xu8JuefAtua.htm

The ^horses.htm$ is not just a search string, but a regular expression, in which special characters (such as ^ . + * ? ( ) [ ] { } and $) have extra significance. The ^ matches the beginning of the URL’s path, and the $ matches the end. This says that the path must begin and end with horses.htm. So, only horses.htm will work, and not reallyfasthorses.htm or horses.html. (Strictly speaking, the unescaped dot matches any single character, so writing horses\.htm would be more precise.) This is important for search engines like Google, which can penalize what they view as duplicate content: identical pages that can be reached via multiple URLs.
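To see the difference the anchors make, the matching can be imitated outside Apache. What follows is only a rough Python sketch, not Apache itself: it assumes that a rewrite pattern is tested as an unanchored regular-expression search against the path, and the helper name matches is made up for this illustration.

```python
import re

def matches(pattern: str, path: str) -> bool:
    """Imitate a rewrite pattern test: an unanchored regex search on the path.

    Assumption for illustration: the path arrives without the domain name
    or the leading slash, as described in the text above.
    """
    return re.search(pattern, path) is not None

UNANCHORED = "horses.htm"    # pattern from the first rule
ANCHORED = "^horses.htm$"    # pattern from the improved rule

# Print how each pattern treats a few candidate paths.
for path in ["horses.htm", "reallyfasthorses.html", "horses.html"]:
    print(path, matches(UNANCHORED, path), matches(ANCHORED, path))
```

The unanchored pattern accepts all three paths, while the anchored one accepts only horses.htm.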

Without File Endings

You can make this even better by ditching the file ending altogether, so that you can visit either http://www.mywebsite.com/horses or http://www.mywebsite.com/horses/:

RewriteEngine On
RewriteRule   ^horses/?$   Xu8JuefAtua.htm  [NC]

The ? indicates that the preceding character is optional. So, in this case, the URL would work with or without the slash at the end. These would not be considered duplicate URLs by a search engine, but would help prevent confusion if people (or link checkers) accidentally added a slash. The stuff in brackets at the end of the rule gives Apache some further pointers. [NC] is a flag that means that the rule is case insensitive, so http://www.mywebsite.com/HoRsEs would also work.
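The effect of the optional slash and the [NC] flag can be checked the same way. This is a hedged Python sketch rather than Apache's own behaviour: re.IGNORECASE stands in for [NC], and the function name is_horses is hypothetical.

```python
import re

# Stand-in for the pattern "^horses/?$" with the [NC] flag;
# re.IGNORECASE plays the role of [NC] in this illustration.
HORSES = re.compile(r"^horses/?$", re.IGNORECASE)

def is_horses(path: str) -> bool:
    """Return True if the path would be matched by the rule's pattern."""
    return HORSES.search(path) is not None

# Print which of a few candidate paths the pattern accepts.
for path in ["horses", "horses/", "HoRsEs", "horses//", "horses.htm"]:
    print(path, is_horses(path))
```

The first three paths match; horses// and horses.htm do not, because /? allows at most one trailing slash and the $ anchor rules out anything after it.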

Wikipedia Example

We can now look at a real-world example. Wikipedia appears to use URL rewriting, passing the title of the page to a PHP file. For instance…

http://en.wikipedia.org/wiki/Barack_obama

… is rewritten to:

http://en.wikipedia.org/w/index.php?title=Barack_obama

This could well be implemented with an .htaccess file, like so:

RewriteEngine On
#Look for the word "wiki" followed by a slash, and then the article title
RewriteRule   ^wiki/(.+)$   w/index.php?title=$1   [L]

The previous rule had /?, which meant zero or one slashes. If it had said /+, it would have meant one or more slashes, so even http://www.mywebsite.com/horses//// would have worked. In this rule, the dot (.) matches any character, so .+ matches one or more of any character (that is, essentially anything). And the parentheses, ( ), ask Apache to remember what the .+ matched. The rule above, then, tells Apache to look for wiki/ followed by one or more of any character and to remember what it is. This is remembered and then substituted for $1. So, when the rewriting is finished, wiki/Barack_obama becomes w/index.php?title=Barack_obama

Thus, the page w/index.php is called, passing Barack_obama as a parameter. The w/index.php is probably a PHP page that runs a database lookup (like SELECT * FROM articles WHERE title='Barack obama') and then outputs the HTML.
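The capture-and-substitute step can be imitated in a few lines of Python. This is a sketch under assumptions, not how Wikipedia or Apache actually implements it: the function name rewrite_wiki is hypothetical, and m.group(1) plays the part of $1.

```python
import re

def rewrite_wiki(path: str) -> str:
    """Imitate the rule ^wiki/(.+)$ -> w/index.php?title=$1.

    Returns the rewritten path, or the original path if the pattern
    does not match.
    """
    m = re.search(r"^wiki/(.+)$", path)
    if m is None:
        return path
    # group(1) is whatever the parentheses remembered (Apache's $1).
    return "w/index.php?title=" + m.group(1)

print(rewrite_wiki("wiki/Barack_obama"))
print(rewrite_wiki("about"))
```

A path such as wiki/ on its own is left unchanged, because .+ needs at least one character after the slash.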


You can also view Wikipedia entries directly, without the URL rewriting.

Comments and Flags

The example above also introduced comments. Anything after a # is ignored by Apache, so it’s a good idea to explain your rewriting rules so that future generations can understand them. The [L] flag means that if this rule matches, Apache can stop now. Otherwise, Apache would continue applying subsequent rules, which is a powerful feature but unnecessary for all but the most complex rule sets.
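The difference the [L] flag makes can be pictured with a small model of a rule list. This Python sketch is deliberately simplified and is not how Apache is implemented: the apply_rules helper, the second rule and the archive/ path are invented for illustration, the templates use Python's \1 where Apache uses $1, and it ignores the fact that in an .htaccess context Apache may process the rewritten URL again.

```python
import re

def apply_rules(path, rules):
    """Apply (pattern, template, last) rules in order.

    A matching rule replaces the whole path with its expanded template.
    If the rule is marked last (like the [L] flag), processing stops.
    """
    for pattern, template, last in rules:
        m = re.search(pattern, path)
        if m:
            path = m.expand(template)
            if last:
                break
    return path

def make_rules(stop_after_first):
    # The second rule is hypothetical; it only shows what runs without [L].
    return [
        (r"^wiki/(.+)$", r"w/index.php?title=\1", stop_after_first),
        (r"^w/(.*)$", r"archive/\1", False),
    ]

print(apply_rules("wiki/Horse", make_rules(True)))
print(apply_rules("wiki/Horse", make_rules(False)))
```

With the first rule marked as last, the result is w/index.php?title=Horse. Without it, the second rule also fires on the already rewritten path and produces archive/index.php?title=Horse.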

Implementing the B&Q Example

The recommendation for B&Q above could be implemented with an .htaccess file, like so:

RewriteEngine On
#Look for the word "products" followed by slash, product title, slash, id number
RewriteRule  ^products/.*/([0-9]+)$   diy/jsp/bq/nav.jsp?action=detail&fh_secondid=$1 [NC,L]

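Here, the .* matches zero or more of any character, so nothing or anything. And the [0-9] matches a single numerical digit, so [0-9]+ matches one or more digits. Only the digits are in parentheses, so only the ID number is remembered as $1; the product title is matched but then discarded.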
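To check which paths such a rule would accept, the pattern can be run in Python. This is a sketch under assumptions: the product paths are invented examples, and re.IGNORECASE stands in for [NC].

```python
import re

# Same pattern as the rule above; the number is the only captured group.
RULE = re.compile(r"^products/.*/([0-9]+)$", re.IGNORECASE)


def rewrite_product(path):
    """Return the rewritten path, or None if the rule does not match."""
    match = RULE.search(path)
    if match is None:
        return None
    return "diy/jsp/bq/nav.jsp?action=detail&fh_secondid=" + match.group(1)


# Hypothetical product paths, for illustration only.
for path in ["products/cordless-drill/9372016", "products//42", "products/42"]:
    print(path, "->", rewrite_product(path))
```

The second path matches because .* is allowed to match nothing. The third does not, because the pattern needs a slash between the title and the number.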

The next section covers a couple of more complex conditional examples. You can also read the Apache rewriting guide for much more information on all that URL rewriting has to offer.

Conditional Rewriting

URL rewriting can also include conditions and make use of environment variables. These two features make for an easy way to redirect requests from one domain alias to another. This is especially useful if a website changes its domain, from mywebsite.co.uk to mywebsite.com for example.

Domain Forwarding

Most domain registrars allow for domain forwarding, which redirects all requests from one domain to another domain, but which might send requests for www.mywebsite.co.uk/horses to the home page at www.mywebsite.com and not to www.mywebsite.com/horses. With URL rewriting, you can forward the domain and keep the rest of the URL intact:

RewriteEngine On
RewriteCond   %{HTTP_HOST}   !^www\.mywebsite\.com$       [NC]
RewriteRule   (.*)           http://www.mywebsite.com/$1  [L,R=301]

The second line in this example is a RewriteCond, rather than a RewriteRule. It is used to compare an Apache environment variable on the left (such as the host name in this case) with a regular expression on the right. Only if this condition is true will the rule on the next line be considered.

In this case, %{HTTP_HOST} represents www.mywebsite.co.uk, the host (i.e. domain) that the browser is trying to visit. The ! means “not.” This tells Apache, if the host does not begin and end with www.mywebsite.com, then remember and rewrite zero or more of any character to www.mywebsite.com/$1. This converts www.mywebsite.co.uk/anything-at-all to www.mywebsite.com/anything-at-all. And it will work for all other aliases as well, like www.mywebsite.biz/anything-at-all and mywebsite.com/anything-at-all.
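The condition-then-rule logic can be sketched as a small Python function. It is an approximation, not Apache's behaviour in every detail: the function name forward is invented, the path is assumed to arrive without its leading slash (as it does in an .htaccess file), and the dots in the host pattern are escaped so that they match only literal dots.

```python
import re

# The canonical host; re.IGNORECASE mimics the [NC] flag on the condition.
CANONICAL_HOST = re.compile(r"^www\.mywebsite\.com$", re.IGNORECASE)


def forward(host, path):
    """Return (status, location) for a redirect, or None if no redirect is needed."""
    if CANONICAL_HOST.search(host):
        return None  # the condition "host is NOT canonical" is false
    captured = re.search(r"(.*)", path).group(1)  # what Apache calls $1
    return 301, "http://www.mywebsite.com/" + captured


print(forward("www.mywebsite.co.uk", "horses"))
print(forward("mywebsite.com", "anything-at-all"))
print(forward("www.mywebsite.com", "horses"))
```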

The flag [R=301] is very important. It tells Apache to do a 301 (i.e. permanent) redirect. Apache will send the new URL back to the browser or search engine, and the browser or search engine will have to request it again. Unlike all of the examples above, the new URL will now appear in the browser’s location bar. And search engines will take note of the new URL and update their databases. [R] by itself is the same as [R=302] and signifies a temporary redirect.

File Existence and WordPress

Smashing Magazine runs on the popular blogging software WordPress. WordPress enables the author to choose their own URL, called a “slug.” Then, it automatically prepends the date, such as http://coding.smashingmagazine.com/2011/09/05/getting-started-with-the-paypal-api/. In your pre-URL rewriting days, you might have assumed that Smashing Magazine’s Web server was actually serving up a file located at …/2011/09/05/getting-started-with-the-paypal-api/index.html. In fact, WordPress uses URL rewriting extensively.

[Screenshot: WordPress enables the author to choose their own URL for an article.]

WordPress’ .htaccess file looks like this:

RewriteEngine On
RewriteBase /  
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

The -f means “this is a file” and -d means “this is a directory.” This tells Apache, if the requested file name is not a file, and the requested file name is not a directory, then rewrite everything (i.e. any path containing any character) to the page index.php. If you are requesting an existing image or the log-in page wp-login.php, then the rule is not triggered. But if you request anything else, like /2011/09/05/getting-started-with-the-paypal-api/, then the file index.php jumps into action.
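The decision that these two conditions and the rule make together can be sketched in Python. This is a simplified model, not WordPress or Apache code: resolve is a made-up name, and a temporary directory stands in for the document root.

```python
import os
import tempfile


def resolve(document_root, request_path):
    """Approximate the !-f and !-d conditions followed by: RewriteRule . /index.php [L]."""
    filename = os.path.join(document_root, request_path.lstrip("/"))
    if os.path.isfile(filename) or os.path.isdir(filename):
        return request_path  # an existing file or directory is served as-is
    return "/index.php"  # everything else is handed to index.php


# Demo against a throwaway directory standing in for the document root.
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "wp-login.php"), "w").close()
    for path in ["/wp-login.php", "/2011/09/05/getting-started-with-the-paypal-api/"]:
        print(path, "->", resolve(root, path))
```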

Internally, index.php (probably) looks at the environment variable $_SERVER['REQUEST_URI'] and extracts the information that it needs to find out what it is looking for. This gives it even more flexibility than Apache’s rewrite rules and enables WordPress to mimic some very sophisticated URL rewriting rules. In fact, when administering a WordPress blog, you can go to Settings → Permalinks on the left side, and choose the type of URL rewriting that you would like to mimic.
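What such a script might do with the request URI can be illustrated as follows. This is purely a sketch of the idea and not WordPress's actual code: the function name, the date-based pattern and the returned dictionary are all assumptions.

```python
import re
from urllib.parse import urlsplit

# Assumed permalink shape: /year/month/day/slug/ with an optional final slash.
PERMALINK = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})/([^/]+)/?$")


def parse_permalink(request_uri):
    """Pull the date and slug out of a request URI, ignoring any query string."""
    path = urlsplit(request_uri).path
    match = PERMALINK.match(path)
    if match is None:
        return None  # not a date-based permalink
    year, month, day, slug = match.groups()
    return {"year": int(year), "month": int(month), "day": int(day), "slug": slug}


print(parse_permalink("/2011/09/05/getting-started-with-the-paypal-api/"))
print(parse_permalink("/about/"))
```

With the pieces extracted, the script could look the article up by date and slug, which is work that would be awkward to express in rewrite rules alone.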

[Screenshot: WordPress’ permalink settings, letting you choose the type of URL rewriting that you would like to mimic.]

Rewriting Query Strings

If you are hired to recreate an existing website from scratch, you might use URL rewriting to redirect the 20 most popular URLs on the old website to the locations on the new website. This could involve redirecting things like prod.php?id=20 to products/great-product/2342, which itself gets rewritten internally to the actual product page.

Apache’s RewriteRule applies only to the path in the URL, not to parameters like id=20. To do this type of rewriting, you will need to refer to the Apache environment variable %{QUERY_STRING}. This can be accomplished like so:

RewriteEngine On
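#Apply the next rule only when the query string is exactly id=20
RewriteCond   %{QUERY_STRING}           ^id=20$
#Permanently redirect the old URL; the trailing ? drops the old query string
RewriteRule   ^prod\.php$               /products/great-product/2342?   [L,R=301]
#Internally rewrite the new URL, passing the number (the second group) as the ID
RewriteRule   ^products/(.*)/([0-9]+)$  productview.php?id=$2           [L]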

In this example, the first RewriteRule triggers a permanent redirect from the old website’s URL to the new website’s URL. The second rule rewrites the new URL to the actual PHP page that displays the product.
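The two-step flow can be modelled in Python to make the order of events clearer. This is a sketch with assumptions: handle is an invented name, the condition is taken to guard only the rule directly after it, and the number at the end of the new URL is assumed to be the ID that the product page expects.

```python
import re


def handle(path, query_string):
    """Return ("redirect", url), ("rewrite", target) or ("pass", path)."""
    # Condition plus first rule: old URL with id=20 gets a permanent redirect.
    if re.search(r"^id=20$", query_string) and re.search(r"^prod\.php$", path):
        return "redirect", "/products/great-product/2342"
    # Second rule: the friendly URL is mapped internally to the PHP page.
    match = re.search(r"^products/(.*)/([0-9]+)$", path)
    if match:
        return "rewrite", "productview.php?id=" + match.group(2)
    return "pass", path


print(handle("prod.php", "id=20"))                   # first request
print(handle("products/great-product/2342", ""))     # follow-up request
print(handle("prod.php", "id=21"))                   # other IDs are untouched
```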

Examples Of URL Rewriting On Shopping Websites

For complex content-managed websites, there is still the issue of how to map friendly URLs to underlying resources. The simple examples above did that mapping by hand, manually associating a URL like horses.htm with the file or resource Xu8JuefAtua.htm. Wikipedia looks up the resource based on the title, and WordPress applies some complex internal rule sets. But what if your data is more complex, with thousands of products in hundreds of categories? This section shows the approach that Amazon and many other shopping websites take.

If you’ve ever come across a URL like this on Amazon, http://www.amazon.co.uk/High-Voltage-AC-DC/dp/B00008AJL3, you might have assumed that Amazon’s website has a subdirectory named /High-Voltage-AC-DC/dp/ that contains a file named B00008AJL3.

This is very unlikely. You could try changing the name of the top-level “directory” and you would still arrive on the same page, http://www.amazon.co.uk/Test-Voltage-AC-DC/dp/B00008AJL3.

The bit at the end is what really matters. Looking down the page, you’ll see that B00008AJL3 is this AC/DC album’s ASIN (Amazon Standard Identification Number). If you change that, you’ll get a “Page not found” or an entirely different product: http://www.amazon.co.uk/High-Voltage-AC-DC/dp/B003BEZ7HI.

The /dp/ also matters. Changing this leads to a “Page not found.” So, the B00008AJL3 probably tells Amazon what to display, and the dp tells the website how to display it. This is URL rewriting in action, with the original URL possibly ending up getting rewritten to something like:
http://www.amazon.co.uk/displayproduct.php?asin=B00008AJL3.

Features of an Amazon URL

This introduces some important features of Amazon’s URLs that can be applied to any website with a complex set of resources. It shows that the URL can be automatically generated and can include up to three parts:

  1. The words. In this case, the words are based on the album and artist, and all non-alphanumeric characters are replaced. So, the slash in AC/DC becomes a hyphen. This is the bit that helps humans and search engines.
  2. An ID number. Or something that tells the website what to look up, such as B00008AJL3.
  3. An identifier. Or something that tells the website where to look for it and how to display it. If dp tells Amazon to look for a product, then somewhere along the line, it probably triggers a database statement such as SELECT * FROM products WHERE id='B00008AJL3'.
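The three parts listed above can be pulled apart with a single pattern. The Python sketch below is a guess at the shape of such URLs for illustration only; it is not how Amazon parses them, and the name split_url is hypothetical.

```python
import re

# Assumed shape: /words/identifier/ID, e.g. /High-Voltage-AC-DC/dp/B00008AJL3
URL_PARTS = re.compile(r"^/([^/]+)/([a-z]+)/([A-Z0-9]+)$")


def split_url(path):
    """Split a path into its words, identifier and ID, or return None."""
    match = URL_PARTS.match(path)
    if match is None:
        return None
    words, identifier, item_id = match.groups()
    return {"words": words, "identifier": identifier, "id": item_id}


print(split_url("/High-Voltage-AC-DC/dp/B00008AJL3"))
print(split_url("/Test-Voltage-AC-DC/dp/B00008AJL3"))
```

Both paths yield the same identifier and ID, which is why changing the words does not change the page that is displayed.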

Other Shopping Examples

Many other shopping websites have URLs like this, each containing an ID number and, usually, a (suspected) identifier:

  • http://www.ebay.co.uk/itm/Ian-Rankin-Set-Darkness-Rebus-Novel-/140604842997
  • http://www.kelkoo.com/c-138201-lighting/brand/caravan
  • http://www.ciao.co.uk/Fridge_Freezers_5266430_3
  • http://www.gumtree.com/p/for-sale/boys-bmx-bronx-blaze/97669042
  • http://www.comet.co.uk/c/Televisions/LCD-Plasma-LED-TVs/1844

A significant benefit of this type of URL is that the actual words can be changed, as shown below. As long as the ID number stays the same, the URL will still work. So products can be renamed without breaking old links. More sophisticated websites (like Ciao above) will redirect the changed URL back to the real one and thus avoid creating the appearance of duplicate content (see below for more on this topic).

[Screenshot: Websites that use URL rewriting are more flexible with their URLs: the words can change but the page will still be found.]

Friendly URLs

Now you know how to map nice friendly URLs to their underlying Web pages, but how should you create those friendly URLs in the first place?

If we followed the current advice, we would separate words with hyphens rather than underscores and capitalize consistently. Lowercase might be preferable because most people search in lowercase. Punctuation such as dots and commas should also be turned into hyphens, otherwise they would get turned into things like %2C, which look ugly and might break the URL when copied and pasted. You might want to remove apostrophes and parentheses entirely for the same reason.

Whether to replace accented characters is debatable. URLs with accents (or any non-Roman characters) might look bad or break when rendered in a different character format. But replacing them with their non-accented equivalents might make the URLs harder for search engines to find (and even harder if replaced with hyphens). If your website is for a predominately French audience, then perhaps leave the French accents in. But substitute them if the French words are few and far between on a mainly English website.

This PHP function succinctly handles all of the above suggestions:

function GenerateUrl ($s) {
  //Convert accented characters, and remove parentheses and apostrophes
  $from = explode (',', "ç,æ,œ,á,é,í,ó,ú,à,è,ì,ò,ù,ä,ë,ï,ö,ü,ÿ,â,ê,î,ô,û,å,e,i,ø,u,(,),[,],'");
  $to = explode (',', 'c,ae,oe,a,e,i,o,u,a,e,i,o,u,a,e,i,o,u,y,a,e,i,o,u,a,e,i,o,u,,,,,,');
  //Do the replacements, and convert each run of other non-alphanumeric characters to a single hyphen
  $s = preg_replace ('~[^\w\d]+~', '-', str_replace ($from, $to, trim ($s)));
  //Remove a - at the beginning or end and make lowercase
  return strtolower (preg_replace ('/^-/', '', preg_replace ('/-$/', '', $s)));
}

This would generate URLs like this:

echo GenerateUrl ("Pâtisserie (Always FRESH!)"); //returns "patisserie-always-fresh"

Or, if you wanted a link to a $product variable to be pulled from a database:

$product = array ('title'=>'Great product', 'id'=>100);
echo '<a href="' . GenerateUrl ($product['title']) . '/' . $product['id'] . '">';
echo $product['title'] . '</a>';
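For comparison, here is a rough Python sketch of the same idea. It is not a port of the PHP function: instead of a hand-written replacement table, it uses Unicode decomposition to strip accents, and it maps a few letters that do not decompose (æ, œ, ø) by hand. The name generate_url is made up, and unlike the PHP version it also turns underscores into hyphens.

```python
import re
import unicodedata


def generate_url(text):
    """Turn a title into a lowercase, hyphen-separated URL fragment."""
    text = text.strip().lower()
    # These letters have no accent to strip, so replace them explicitly.
    for source, target in (("æ", "ae"), ("œ", "oe"), ("ø", "o")):
        text = text.replace(source, target)
    # Split accented letters into base letter + accent, then drop the accents.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove parentheses, brackets and apostrophes entirely.
    text = re.sub(r"[()\[\]']", "", text)
    # Every remaining run of non-alphanumeric characters becomes one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")


print(generate_url("Pâtisserie (Always FRESH!)"))
print(generate_url("Crème brûlée's"))
```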

Changing Page Names

Search engines generally ignore duplicate content (i.e. multiple pages with the same information). But if they think they are being manipulated, search engines will actively penalize the website, so avoid this where possible. Google recommends using 301 redirects to send users from old pages to new ones.

When a URL-rewritten page is renamed, the old URL and new URL should both still work. Furthermore, to avoid any risk of duplication, the old URL should automatically redirect to the new one, as WordPress does.

Doing this in PHP is relatively easy. The following function looks at the current URL, and if it’s not the same as the desired URL, it redirects the user:

function CheckUrl ($s) {
  // Get the current URL without the query string, with the initial slash
  $myurl = preg_replace ('/\?.*$/', '', $_SERVER['REQUEST_URI']);
  //If it is not the same as the desired URL, then redirect
  if ($myurl != "/$s") {Header ("Location: /$s", true, 301); exit;}
}

This would be used like so:

$producturl = GenerateUrl ($product['title']) . '/' . $product['id'];
CheckUrl ($producturl); //redirects the user if they are at the wrong place

If you would like to use this function, be sure to test it in your environment first and with your rewrite rules, to make sure that it does not cause any infinite redirects. This is what that would look like:

[Screenshot: This is what happens when Google Chrome visits a page that redirects to itself.]
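One way to catch such loops before a browser does is to follow the redirects yourself and stop as soon as a URL repeats. The Python sketch below shows the idea with redirects held in a plain dictionary; follow_redirects and the example URLs are hypothetical, and in a real test the lookup would be replaced by requests to your own server.

```python
def follow_redirects(start, next_location, max_hops=10):
    """Follow redirects from start and return (final_url, looped)."""
    seen = {start}
    url = start
    for _ in range(max_hops):
        target = next_location(url)  # None means "no redirect, page is served"
        if target is None:
            return url, False
        if target in seen:
            return target, True  # we have been here before: a loop
        seen.add(target)
        url = target
    return url, True  # too many hops is treated as a loop as well


# Hypothetical redirect tables, for illustration only.
healthy = {"/old-name/100": "/great-product/100"}
broken = {"/great-product/100": "/great-product/100"}

print(follow_redirects("/old-name/100", healthy.get))
print(follow_redirects("/great-product/100", broken.get))
```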

Checklist And Troubleshooting

Use the following checklist to implement URL rewriting.

1. Check That It’s Supported

Not all Web servers support URL rewriting. If you put up your .htaccess file on one that doesn’t, it will be ignored or will throw up a “500 Internal Server Error.”

2. Plan Your Approach

Figure out what will get mapped to what, and how the correct information will still get found. Perhaps you want to introduce new URLs, like my-great-product/p/123, to replace your current product URLs, like product.php?id=123, and to substitute new-category/c/12 for category.php?id=12.

3. Create Your Rewrite Rules

Create an .htaccess file for your new rules. You can initially do this in a /testing/ subdirectory and use the [R] flag, so that the rewritten URL shows up in the browser’s location bar and you can see where things go:

RewriteEngine On
RewriteRule   ^.+/p/([0-9]+)   product.php?id=$1    [NC,L,R]
RewriteRule   ^.+/c/([0-9]+)   category.php?id=$1    [NC,L,R]

Now, if you visit www.mywebsite.com/testing/my-great-product/p/123, you should be sent to www.mywebsite.com/testing/product.php?id=123. You’ll get a “Page not found” because product.php is not in your /testing/ subdirectory, but at least you’ll know that your rules work. Once you’re satisfied, move the .htaccess file to your document root and remove the [R] flag. Now www.mywebsite.com/my-great-product/p/123 should work.
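Before uploading anything, the patterns themselves can be checked offline. The following Python sketch approximates the two rules above; it says nothing about how your server is configured, and the rewrite function is an invented helper.

```python
import re

# The two patterns from the rules above; re.IGNORECASE mimics [NC].
RULES = [
    (re.compile(r"^.+/p/([0-9]+)", re.IGNORECASE), "product.php?id="),
    (re.compile(r"^.+/c/([0-9]+)", re.IGNORECASE), "category.php?id="),
]


def rewrite(path):
    """Return the target of the first matching rule, or None if none match."""
    for pattern, target in RULES:
        match = pattern.search(path)
        if match:
            return target + match.group(1)  # [L]: stop at the first match
    return None


for path in ["my-great-product/p/123", "new-category/c/12", "p/123"]:
    print(path, "->", rewrite(path))
```

Note that these patterns have no $ at the end, so a path with extra characters after the number would still match. Adding /?$ to each pattern would tighten them.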

4. Check Your Pages

Test that your new URLs bring in all the correct images, CSS and JavaScript files. For example, the Web browser now believes that your Web page is named 123 in a directory named my-great-product/p/. If the HTML refers to a file named images/logo.jpg, then the Web browser would request the image from www.mywebsite.com/my-great-product/p/images/logo.jpg and would come up with a “File not found.”

You would need to also rewrite the image locations or make the references absolute (like <img src="/images/logo.jpg"/>) or put a base href at the top of the <head> of the page (<base href="/product.php"/>). But if you do that, you would need to fully specify any internal links that begin with # or ? because they would now go to something like product.php#details.
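The way a browser resolves these references can be reproduced with Python's standard urljoin function, which follows the same rules for relative URLs. The page address below is the example from the text; the resolve helper is just a thin wrapper for illustration.

```python
from urllib.parse import urljoin

# The address that the browser believes it is looking at.
PAGE = "http://www.mywebsite.com/my-great-product/p/123"


def resolve(reference, base=PAGE):
    """Resolve a reference found in the HTML against a base URL."""
    return urljoin(base, reference)


# A relative image path ends up inside the friendly URL's "directory".
print(resolve("images/logo.jpg"))
# An absolute path is unaffected by the rewriting.
print(resolve("/images/logo.jpg"))

# With <base href="/product.php"/>, references resolve against that instead.
base_href = resolve("/product.php")
print(resolve("images/logo.jpg", base=base_href))
print(resolve("#details", base=base_href))
```

The last line shows the side effect mentioned above: with a base href in place, a link to #details points at product.php rather than at the friendly URL.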

5. Change Your URLs

Now find all references to your old URLs, and replace them with your new URLs, using a function such as GenerateUrl to consistently create the new URLs. This is the only step that might require looking deep into the underlying code of your website.

6. Automatically Redirect Your Old URLs

Now that the URL rewriting is in place, you probably want Google to forget about your old URLs and start using the new ones. That is, when a search result brings up product.php?id=123, you’d want the user to be visibly redirected to my-great-product/p/123, which would then be internally rewritten back to product.php?id=123.

This is the reverse of what your URL rewriting already does. In fact, you could add another rule to .htaccess to achieve this, but if you get the rules in the wrong order, then the browser would go into a redirect loop.

Another approach is to do the first redirect in PHP, using something like the CheckUrl function above. This has the added advantage that if you rename the product, the old URL will immediately become invalid and redirect to the newest one.

7. Update and Resubmit Your Site Map

Make sure to carry through your new URLs to your site map, your product feeds and everywhere else they appear.

Conclusion

URL rewriting is a relatively quick and easy way to improve your website’s appeal to customers and search engines. We’ve tried to explain some real examples of URL rewriting and to provide the technical details for implementing it on your own website. Please leave any comments or suggestions below.

(al)


© Paul Tero for Smashing Magazine, 2011.