I tweaked my Markov text generator a bit, but by far the most interesting thing is how much the structure of the output depends on how strongly the sampler prefers higher-order context. In pic related, the left and right sides have identical training and generation parameters; the only difference is the function I use to update the probability of using higher-order statistics (e.g. P('d' | "Lor") vs. P('d' | "or") vs. P('d' | "o") vs. P('d')) when sampling a character.
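To make the fire/hold mechanism concrete, here's a minimal sketch of a single sampling step, assuming just one coin flip between two adjacent orders (the real generator presumably chains this across all orders). rand01, sample_char, dist_higher, and dist_lower are hypothetical names, not from the actual code:

#include <stdlib.h>

// Hypothetical uniform float in [0, 1).
static float rand01(void) { return (float)rand() / ((float)RAND_MAX + 1.f); }

char sample_char(const float *dist_higher, const float *dist_lower,
                 float p_fire, int alphabet_size)
{
    // With probability p_fire, "fire": sample from the higher-order
    // distribution; otherwise "hold" and use the lower-order one.
    const float *dist = (rand01() < p_fire) ? dist_higher : dist_lower;

    // Standard inverse-CDF sampling over the chosen distribution.
    float r = rand01();
    float cum = 0.f;
    for (int i = 0; i < alphabet_size; ++i) {
        cum += dist[i];
        if (r < cum) return (char)i;
    }
    return (char)(alphabet_size - 1); // guard against rounding error
}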
The left side uses this:
// no preference for higher vs lower order sampling
// if they're equal it's 50/50, chance to fire is proportional to the difference in probability
p_target = 0.5f * (p_fire_correct - p_hold_correct) + 0.5f;
While the right side uses:
// always prefer higher order sampling unless lower order is more accurate
// if they're equal always use higher order
p_target = min(1.f, 1.f + (p_fire_correct - p_hold_correct));
Here, p_fire_correct is the probability that "firing" (moving up to higher-order sampling) would have produced the observed character, and p_hold_correct is the probability that holding at the current-order context would have produced it.
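In other words, once the actual next character is known, both values are just lookups into the two conditional distributions. A rough sketch, assuming simple count tables (target_prob, higher_counts, lower_counts, and the totals are hypothetical names):

// Sketch: derive the target from count tables after observing character c.
float target_prob(const unsigned *higher_counts, unsigned higher_total,
                  const unsigned *lower_counts, unsigned lower_total,
                  unsigned char c)
{
    // p_fire_correct: P(observed char | higher-order context)
    float p_fire_correct = higher_total
        ? (float)higher_counts[c] / (float)higher_total : 0.f;
    // p_hold_correct: P(observed char | this-order context)
    float p_hold_correct = lower_total
        ? (float)lower_counts[c] / (float)lower_total : 0.f;
    // Left-side rule from above; the right side would instead return
    // min(1.f, 1.f + (p_fire_correct - p_hold_correct)).
    return 0.5f * (p_fire_correct - p_hold_correct) + 0.5f;
}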
The firing probability is basically a moving average of prior target probabilities:
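A sketch of that update, assuming a plain exponential moving average (update_fire_prob and the smoothing rate alpha are hypothetical; the exact rule may differ):

// Sketch: EMA of target probabilities, alpha in (0, 1].
void update_fire_prob(float *p_fire, float p_target, float alpha)
{
    // Each new target nudges the stored firing probability toward it.
    *p_fire += alpha * (p_target - *p_fire);
}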