C# Code: How to transform Åäö to Aao
In one project I have extended the Friendly URL Rewriter to rewrite all URLs that follow a specific pattern to a search page.
Search Engine Optimization (SEO) with Friendly URL Rewriter
Instead of having to use a URL that looks like this:
http://www.example.com/Search.aspx?country=åland&type=hotel
We accept URLs in a friendlier format and internally rewrite them to the parameterized format above:
http://www.example.com/åland-hotel.html
No file with that name exists on the server; instead, we recognize the pattern with a regular expression like this one:
^(?<country>.+)-(?<type>.+)\.html$
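As a rough illustration of how such a pattern can drive the rewrite, here is a minimal sketch. The class name FriendlyUrlSketch and the method TryRewrite are hypothetical and not part of the Friendly URL Rewriter itself; the sketch only shows matching the file name and building the internal Search.aspx URL, with the captured values percent-encoded.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Friendly URL Rewriter.
public static class FriendlyUrlSketch
{
    // Pattern from the text: "<country>-<type>.html"
    private static readonly Regex Pattern =
        new Regex(@"^(?<country>.+)-(?<type>.+)\.html$", RegexOptions.IgnoreCase);

    // Returns true and sets internalUrl when fileName fits the pattern.
    public static bool TryRewrite(string fileName, out string internalUrl)
    {
        Match m = Pattern.Match(fileName);
        if (!m.Success)
        {
            internalUrl = string.Empty;
            return false;
        }

        // Percent-encode the captured values so non-ASCII letters survive the query string.
        string country = Uri.EscapeDataString(m.Groups["country"].Value);
        string type = Uri.EscapeDataString(m.Groups["type"].Value);
        internalUrl = "/Search.aspx?country=" + country + "&type=" + type;
        return true;
    }
}
```

Because the first group is greedy, a name with several hyphens is split at the last hyphen. A file name that does not fit the pattern makes TryRewrite return false, so the request can be left untouched.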
Google AdWords does not allow diacritic marks in URLs
Google AdWords does not allow letters with accents, rings or umlauts in a URL, so we needed a generic matching algorithm that would recognize both “aland” and “åland” when comparing search parameters.
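There is a nice .NET method, String.Normalize(), that makes it very easy to transform unwanted characters into something allowed. Normalizing to FormD splits each accented letter into its base letter followed by separate combining marks, and those marks can then be filtered out.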
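To make the effect of normalization visible, the following small sketch lists the code units of a string after it has been normalized to FormD. The class NormalizeDemo and the method DescribeDecomposed are names made up for this illustration.

```csharp
using System;
using System.Text;

// Hypothetical demo class, for illustration only.
public static class NormalizeDemo
{
    // Lists the UTF-16 code units of the decomposed (FormD) string,
    // formatted as "U+XXXX" and separated by spaces.
    public static string DescribeDecomposed(string s)
    {
        string decomposed = s.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder();
        foreach (char c in decomposed)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

For "Å" (U+00C5) this gives "U+0041 U+030A": a plain "A" followed by the combining ring above, which is the non-spacing mark that gets dropped.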
Fabrice wrote a snippet of code to transform Åäö to Aao that I used:
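using System;
using System.Globalization;
using System.Text;

public static String RemoveDiacritics(string s)
{
    // FormD splits each accented letter into a base letter plus combining marks.
    string normalizedString = s.Normalize(NormalizationForm.FormD);
    StringBuilder stringBuilder = new StringBuilder();
    for (int i = 0; i < normalizedString.Length; i++)
    {
        char c = normalizedString[i];
        // Keep everything except the combining marks (accents, rings, umlauts).
        if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            stringBuilder.Append(c);
    }
    return stringBuilder.ToString();
}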
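To show how this can be used for the parameter matching described earlier, here is a minimal sketch that treats “aland” and “åland” as the same value. The class DiacriticMatching and the methods StripMarks and SameParameter are hypothetical names; StripMarks is a LINQ-based variant of the same idea as RemoveDiacritics, repeated here only so the sketch is self-contained. Ignoring case as well is an assumption on my part.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper, for illustration only.
public static class DiacriticMatching
{
    // Decomposes the string and drops all non-spacing marks.
    public static string StripMarks(string s)
    {
        return new string(s.Normalize(NormalizationForm.FormD)
            .Where(c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            .ToArray());
    }

    // True when two search parameters are equal once marks and case are ignored.
    public static bool SameParameter(string a, string b)
    {
        return string.Equals(StripMarks(a), StripMarks(b), StringComparison.OrdinalIgnoreCase);
    }
}
```

With this, SameParameter("åland", "aland") is true, while two genuinely different values such as "aland" and "island" still compare as different.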