Generating Random Unicode Strings in C#

January 14, 2009 – 11:05 pm

I have to admit this is quite a dull subject, but I thought it might help some guys (and gals!) out there.

I am working on a C# project for uni, and while fighting some pesky bugs, I decided I should get more organized and have a small “unit testing” framework built for it.

I needed randomly generated Unicode strings, and some quick googling turned up no results. Instead of searching more broadly, I decided I could learn more by writing my own code:

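using System;
using System.Text;

class RandomUnicodeString
{
    // One generator per object, reused across calls to GetString.
    private readonly Random _r;

    public RandomUnicodeString()
    {
        _r = new Random();
    }

    public string GetString(int length)
    {
        // Two bytes per character: every value below 0xD800 is a single UTF-16 code unit.
        byte[] str = new byte[length * 2];

        for (int i = 0; i < length * 2; i += 2)
        {
            // Next's upper bound is exclusive, so this yields U+0000..U+D7FF,
            // stopping just below the surrogate range.
            int chr = _r.Next(0xD800);
            str[i + 1] = (byte)((chr & 0xFF00) >> 8); // high byte
            str[i] = (byte)(chr & 0xFF);              // low byte
        }

        // Encoding.Unicode is UTF-16 little-endian, so the low byte goes first.
        return Encoding.Unicode.GetString(str);
    }
}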

At first I generated the string in the constructor, but since I was creating the objects in a loop, I kept getting repeated strings: the newly created Random objects were all using the same seed, so each one replayed the same sequence. More on this can be found in Guy’s blog.
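To make the seed problem concrete, here is a minimal sketch. It uses an explicit seed (42, picked arbitrarily) to stand in for two generators that happened to end up with the same seed; the `SeedDemo` class and its method names are made up for this example.

```csharp
using System;

static class SeedDemo
{
    // Two generators built from the same seed replay the same sequence.
    public static bool SameSeedRepeats()
    {
        var a = new Random(42); // 42 is an arbitrary seed for illustration
        var b = new Random(42);
        for (int i = 0; i < 10; i++)
        {
            if (a.Next() != b.Next()) return false;
        }
        return true;
    }

    // The fix: create one generator and keep drawing from it,
    // instead of creating a new one per string.
    public static int[] DrawFromShared(Random shared, int count)
    {
        var values = new int[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = shared.Next(0xD800); // upper bound is exclusive
        }
        return values;
    }
}
```

This is why the generator lives in a field and the string is built in a method: one Random per object, many strings from it.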

If you want to limit the string to a specific language, there’s a reference of Unicode block ranges on Wikipedia; just change the _r.Next call to use those limits.
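As a rough sketch of what that could look like, the variant below takes the first and last character of the range as parameters. The `RangedRandomString` class is hypothetical, not part of the code above, and it builds a char array directly rather than going through bytes. Keep in mind that some blocks contain unassigned code points, and that a range crossing U+D800..U+DFFF would produce lone surrogates.

```csharp
using System;

class RangedRandomString
{
    private readonly Random _r = new Random();

    // Returns 'length' characters drawn from [first, last], both ends inclusive.
    public string GetString(int length, char first, char last)
    {
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        if (first > last) throw new ArgumentException("first must not exceed last");

        char[] chars = new char[length];
        for (int i = 0; i < length; i++)
        {
            // Next's upper bound is exclusive, hence the + 1.
            chars[i] = (char)_r.Next(first, last + 1);
        }
        return new string(chars);
    }
}
```

For example, `new RangedRandomString().GetString(8, '\u0400', '\u04FF')` would give eight characters from the Cyrillic block (U+0400 to U+04FF).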

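There’s probably a better way of converting the bytes to Unicode chars, such as casting each random value straight to a char. Byte order shouldn’t be a problem, though: as far as I can tell, Encoding.Unicode is always little-endian UTF-16 regardless of the machine, which matches the low-byte-first order the code writes.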
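Here is a small sketch for poking at the byte-order question and for trying the byte-free route. The `ByteOrderDemo` class and its method names are invented for illustration. It relies on two real .NET encodings, Encoding.Unicode (UTF-16 little-endian) and Encoding.BigEndianUnicode, and on the assumption that every value below 0xD800 fits in one char.

```csharp
using System;
using System.Text;

static class ByteOrderDemo
{
    // Encoding.Unicode puts the low byte first.
    public static byte[] LittleEndianBytes()
    {
        return Encoding.Unicode.GetBytes("A"); // U+0041 -> 0x41, 0x00
    }

    // Encoding.BigEndianUnicode puts the high byte first.
    public static byte[] BigEndianBytes()
    {
        return Encoding.BigEndianUnicode.GetBytes("A"); // U+0041 -> 0x00, 0x41
    }

    // Skipping bytes altogether: cast each random value straight to a char.
    public static string RandomChars(Random r, int length)
    {
        char[] chars = new char[length];
        for (int i = 0; i < length; i++)
        {
            chars[i] = (char)r.Next(0xD800); // U+0000..U+D7FF, no surrogates
        }
        return new string(chars);
    }
}
```

With the char version there are no bytes to order, so the endianness question goes away entirely.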

If you got here from Google and this solved your coding problem, please post a comment! =)

2 Responses to “Generating Random Unicode Strings in C#”

  1. Beautiful solution! Thank you =)

    By kat on Feb 9, 2009

  2. Brilliant! This is exactly what I am looking for, and the character limitation to certain languages is really cool!

    By mark on Mar 12, 2009
