July 5, 2012

Generate word-like sentences

Randomizing data is fun, so I decided to write a program that automatically fills objects with data.
I started out with totally random data, but I realized that 'lhghhjfdr' is not much fun to put in a string, so I have been writing generators.

There are now generators for Lorem ipsum, random names, superhero names, street names, titles, cities... and generating random stuff started to be fun again.

One thing I wanted to do was to generate word-like words: strings that have no meaning but are easy to read and pronounce. Pseudo-random text.
I found this and it gave me a basic algorithm; I spiced it up with this to give each letter a statistical weight.

The random word is made of a random number of parts that are concatenated. Each part is either v, cv or cvc, where c stands for a consonant and v stands for a vowel. Which part type I choose is of course random.

The probability of each letter follows the letter frequency in the English language:


private void LoadLetters()
{
    // Weights follow English letter frequency (roughly per mille).
    Vowels = new List<WeightedLetter>();
    Consonants = new List<WeightedLetter>();

    Vowels.Add(new WeightedLetter() { Letter = 'a', Weight = 81 });
    Vowels.Add(new WeightedLetter() { Letter = 'e', Weight = 127 });
    Vowels.Add(new WeightedLetter() { Letter = 'i', Weight = 70 });
    Vowels.Add(new WeightedLetter() { Letter = 'o', Weight = 75 });
    Vowels.Add(new WeightedLetter() { Letter = 'u', Weight = 28 });

    Consonants.Add(new WeightedLetter() { Letter = 'b', Weight = 15 });
    Consonants.Add(new WeightedLetter() { Letter = 'c', Weight = 28 });
    Consonants.Add(new WeightedLetter() { Letter = 'd', Weight = 43 });
    Consonants.Add(new WeightedLetter() { Letter = 'f', Weight = 23 });
    Consonants.Add(new WeightedLetter() { Letter = 'g', Weight = 20 });
    Consonants.Add(new WeightedLetter() { Letter = 'h', Weight = 60 });
    Consonants.Add(new WeightedLetter() { Letter = 'j', Weight = 2 });
    Consonants.Add(new WeightedLetter() { Letter = 'k', Weight = 7 });
    Consonants.Add(new WeightedLetter() { Letter = 'l', Weight = 40 });
    Consonants.Add(new WeightedLetter() { Letter = 'm', Weight = 25 });
    Consonants.Add(new WeightedLetter() { Letter = 'n', Weight = 67 });
    Consonants.Add(new WeightedLetter() { Letter = 'p', Weight = 19 });
    Consonants.Add(new WeightedLetter() { Letter = 'q', Weight = 1 });
    Consonants.Add(new WeightedLetter() { Letter = 'r', Weight = 60 });
    Consonants.Add(new WeightedLetter() { Letter = 's', Weight = 63 });
    Consonants.Add(new WeightedLetter() { Letter = 't', Weight = 90 });
    Consonants.Add(new WeightedLetter() { Letter = 'v', Weight = 10 });
    Consonants.Add(new WeightedLetter() { Letter = 'w', Weight = 24 });
    Consonants.Add(new WeightedLetter() { Letter = 'x', Weight = 2 });
    Consonants.Add(new WeightedLetter() { Letter = 'y', Weight = 20 });
    Consonants.Add(new WeightedLetter() { Letter = 'z', Weight = 1 });
}
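The post does not show how GetWeightedLetter turns these weights into a choice, so here is a minimal sketch of one common way to do it: roll a number below the sum of all weights and walk the list until the roll falls inside a letter's slice. The WeightedLetter class below is a stand-in for the one used above, and WeightedPicker.Pick is a hypothetical name, not the original implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the post's WeightedLetter type (assumed shape).
public class WeightedLetter
{
    public char Letter { get; set; }
    public int Weight { get; set; }
}

// Hypothetical helper; the original GetWeightedLetter may work differently.
public static class WeightedPicker
{
    // Returns one letter, chosen with probability Weight / (sum of all weights).
    public static string Pick(IReadOnlyList<WeightedLetter> letters, Random rng)
    {
        if (letters.Count == 0)
            throw new ArgumentException("No letters to pick from.", nameof(letters));

        int total = letters.Sum(l => l.Weight);
        int roll = rng.Next(total); // 0 <= roll < total when total > 0

        foreach (var l in letters)
        {
            // Each letter owns a slice of the range as wide as its weight.
            if (roll < l.Weight)
                return l.Letter.ToString();
            roll -= l.Weight;
        }

        throw new InvalidOperationException("Weights must be positive.");
    }
}
```

With the vowel weights above the total is 381, so 'e' (weight 127) comes up about one time in three, while 'u' (weight 28) comes up only about 7% of the time.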


Choosing parts:



private static string GeneratePart()
{
    var data = GeneratorData.Instance;
    int partType = data.Randomizer.Next(8); // 0..7

    if (partType == 0)
    {
        // v: 1 out of 8, fewer single vowels
        return data.GetWeightedLetter(data.Vowels);
    }
    else if (partType < 4)
    {
        // cv: 3 out of 8 seems good
        return data.GetWeightedLetter(data.Consonants) +
               data.GetWeightedLetter(data.Vowels);
    }
    else
    {
        // cvc: the remaining 4 out of 8
        return data.GetWeightedLetter(data.Consonants) +
               data.GetWeightedLetter(data.Vowels) +
               data.GetWeightedLetter(data.Consonants);
    }
}
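The last missing piece is gluing a random number of parts into a word. The post does not say how many parts a word may have, so the sketch below assumes one to three; WordBuilder.BuildWord is a hypothetical name, and the part generator is passed in as a delegate so the example stands on its own.

```csharp
using System;
using System.Text;

// Hypothetical helper, not the post's original code.
public static class WordBuilder
{
    // Concatenates between minParts and maxParts parts (both inclusive).
    // The default range of 1..3 is an assumption; the post gives no limits.
    public static string BuildWord(Func<string> generatePart, Random rng,
                                   int minParts = 1, int maxParts = 3)
    {
        if (minParts < 1 || maxParts < minParts)
            throw new ArgumentOutOfRangeException(nameof(minParts));

        // Random.Next(min, max) excludes max, hence the + 1.
        int count = rng.Next(minParts, maxParts + 1);

        var word = new StringBuilder();
        for (int i = 0; i < count; i++)
            word.Append(generatePart());

        return word.ToString();
    }
}
```

Something like WordBuilder.BuildWord(GeneratePart, new Random()) would then produce one word, and joining several of them with spaces gives a sentence.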

Creating some text returns this:

hix ridholo nilroclel ananlu lishavho sehehef casho tiba tendutose senoset tevsehro atota rignu getil hamoko votrof riheu deta oa reue sorfartim rocdino atehhi sesraw mit etos wereh ret teqe moletoh tata e hecladi dadug lepxu dehdon hey tifoi usencaw bi hepnis ita hemeleyna nohni wahi coyitif xor nahtan sohaa wenora no sateto tapef larnemacag tiyco cagsa yer husra ta fa

Isn't it beautiful?
