Simple Way To Remove Special Characters From C# String

One of our clients has an ASP.NET C# application that pulls string data from another source. The text it uses is usually copied and pasted by the user. This seems fine on the surface and from the user's perspective, but it can prove problematic for developers.

Sometimes, when someone copies from Microsoft Word or another text source, hidden characters come along with the text. Even though you don't see them initially, they can appear as a ? or a small dot when the text is displayed in the application. This confuses the user and anyone else looking at the site.
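Before removing anything, it can help to see which hidden characters actually came along with the pasted text. The following is a minimal sketch, not part of the client's application: the class name `HiddenCharInspector`, the method `FindNonPrintable`, and the sample string are all made up for illustration. It assumes that "hidden" means anything outside printable ASCII (U+0020 through U+007E).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class HiddenCharInspector
{
    // Lists the position and code point of every character
    // outside printable ASCII (U+0020 through U+007E).
    public static List<string> FindNonPrintable(string text)
    {
        var findings = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c < '\u0020' || c > '\u007E')
            {
                findings.Add($"index {i}: U+{(int)c:X4}");
            }
        }
        return findings;
    }

    public static void Main()
    {
        // Sample input (an assumption): "Hello", a non-breaking space,
        // "World", then a zero-width space.
        string pasted = "Hello\u00A0World\u200B";
        foreach (string finding in FindNonPrintable(pasted))
        {
            Console.WriteLine(finding);
        }
        // Prints:
        // index 5: U+00A0
        // index 11: U+200B
    }
}
```

Logging the code points this way shows whether you are dealing with non-breaking spaces, zero-width characters, or typographic quotes, which helps when deciding how aggressive the cleanup should be.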

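Here is some code we used to 'clean up' copied and pasted text by removing the hidden characters. The first line strips HTML tags and &nbsp; entities; the second removes every character outside the basic ASCII range, which takes the special characters out of the C# string.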

 

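[code]

// Requires: using System.Text.RegularExpressions;
// Strip HTML tags and &nbsp; entities, then trim surrounding whitespace.
markup = Regex.Replace(markup, @"<[^>]+>|&nbsp;", "").Trim();

// Remove everything outside printable ASCII (space through tilde).
markup = Regex.Replace(markup, @"[^\u0020-\u007E]", String.Empty);

[/code]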
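If you need the same cleanup in more than one place, the two replacements can be wrapped in a small helper. This is only a sketch under a few assumptions: the class name `TextCleaner`, the method `Clean`, and the sample input are hypothetical, and null or empty input is assumed to map to an empty string. The character range stops at U+007E so that the DEL control character (U+007F) is stripped as well.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper for illustration only.
public static class TextCleaner
{
    // Matches HTML tags or the literal entity &nbsp;
    private static readonly Regex TagsAndNbsp = new Regex(@"<[^>]+>|&nbsp;");

    // Matches any character outside printable ASCII (U+0020 through U+007E).
    private static readonly Regex NonPrintableAscii = new Regex(@"[^\u0020-\u007E]");

    public static string Clean(string markup)
    {
        // Assumption: treat null or empty input as an empty result.
        if (string.IsNullOrEmpty(markup))
        {
            return string.Empty;
        }

        string withoutTags = TagsAndNbsp.Replace(markup, "").Trim();
        return NonPrintableAscii.Replace(withoutTags, string.Empty);
    }

    public static void Main()
    {
        // Sample input (an assumption): a typographic apostrophe (U+2019)
        // and a trailing zero-width space (U+200B) inside a paragraph tag.
        string pasted = "<p>It\u2019s fine\u200B</p>";
        Console.WriteLine(Clean(pasted)); // prints "Its fine"
    }
}
```

Note that this approach is deliberately blunt: accented letters, typographic quotes, tabs, and line breaks are removed along with the truly hidden characters, and &nbsp; is deleted rather than turned into a regular space. If your text needs to keep any of those, the patterns would have to be adjusted.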
