Phonotactics: the recipes for building new words.
So someone created a fake language. But they died, or got a real job, or otherwise abandoned it. How do you move it forward if they didn’t document the phonotactics?
There are many word generators, and most of them use very small domain-specific languages (DSLs). For example:
C = XYZ
V = AEI
Valid Patterns: CV, CVCV, CCVV, VCV
Usually there is also some shorthand to cut down the number of patterns, e.g. (C)VCV means that VCV can optionally start with a consonant, so one line covers both VCV and CVCV.
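To make the notation concrete, here is a minimal sketch of how such a DSL could be interpreted. The names `CLASSES`, `expand` and `matches` are made up for illustration and do not come from any particular word generator; the letters are the ones from the example above.

```python
import re
from itertools import product

# Hypothetical rule set mirroring the example above.
CLASSES = {"C": "XYZ", "V": "AEI"}

def expand(pattern):
    """Expand optional groups, e.g. '(C)VCV' -> ['VCV', 'CVCV']."""
    # Each match is either an optional group "(...)" or a single required symbol.
    parts = re.findall(r"\(([^)]*)\)|([^()])", pattern)
    # An optional part has two choices (absent or present), a required part has one.
    choices = [("", opt) if opt else (req,) for opt, req in parts]
    return ["".join(combo) for combo in product(*choices)]

def matches(word, pattern):
    """True if every letter of the word belongs to the class named at that position."""
    return len(word) == len(pattern) and all(
        letter in CLASSES[cls] for letter, cls in zip(word, pattern)
    )

def is_valid(word, patterns):
    """True if the word fits at least one expanded pattern."""
    return any(matches(word, p) for raw in patterns for p in expand(raw))
```

With these definitions, `is_valid("XAYE", ["CV", "(C)VCV"])` holds because (C)VCV expands to include CVCV, while `is_valid("XXA", ["CV", "(C)VCV"])` does not.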
Now the values of C, V and “Valid Patterns” are all fairly simple. So why not generate rule sets at random and then score each one by how many of the existing words it can account for? To optimize further, mutate the best sets or genetically cross them: take half of the rules from each of two high-scoring rule sets and check how well the merged rule set performs.
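A rough sketch of that search might look like the following. Several things here are assumptions rather than part of the idea as stated: every letter is assigned to exactly one of C or V, the number of patterns per rule set is kept small (so a rule set cannot score well just by allowing everything), and the top third of each generation survives unchanged. All function names and parameter defaults are hypothetical.

```python
import random

def random_pattern(rng, max_len):
    """A random string over 'C' and 'V', e.g. 'CVCV'."""
    return "".join(rng.choice("CV") for _ in range(rng.randint(1, max_len)))

def random_ruleset(alphabet, rng, n_patterns, max_len):
    """Assign every letter to C or V and pick a few random patterns."""
    classes = {ch: rng.choice("CV") for ch in alphabet}
    patterns = {random_pattern(rng, max_len) for _ in range(n_patterns)}
    return classes, patterns

def score(ruleset, words):
    """Fraction of sample words whose C/V shape is one of the valid patterns."""
    classes, patterns = ruleset
    shapes = ("".join(classes[ch] for ch in word) for word in words)
    return sum(shape in patterns for shape in shapes) / len(words)

def mutate(ruleset, rng, max_len):
    """Flip one letter's class, or swap one pattern for a fresh random one."""
    classes, patterns = dict(ruleset[0]), set(ruleset[1])
    if rng.random() < 0.5:
        ch = rng.choice(sorted(classes))
        classes[ch] = "V" if classes[ch] == "C" else "C"
    else:
        patterns.discard(rng.choice(sorted(patterns)))
        patterns.add(random_pattern(rng, max_len))
    return classes, patterns

def crossover(a, b, rng):
    """Take each letter's class from either parent, and half of each parent's patterns."""
    classes = {ch: rng.choice((a[0][ch], b[0][ch])) for ch in a[0]}
    patterns = set()
    for parent in (a, b):
        pool = sorted(parent[1])
        patterns.update(rng.sample(pool, (len(pool) + 1) // 2))
    return classes, patterns

def evolve(words, generations=100, pop_size=30, n_patterns=4, seed=0):
    """Return the best-scoring rule set found for a non-empty list of sample words."""
    rng = random.Random(seed)
    alphabet = sorted(set("".join(words)))
    max_len = max(len(w) for w in words)
    population = [random_ruleset(alphabet, rng, n_patterns, max_len)
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda r: score(r, words), reverse=True)
        best = population[: pop_size // 3]  # the top third survives unchanged
        children = []
        while len(best) + len(children) < pop_size:
            a, b = rng.sample(best, 2)
            children.append(mutate(crossover(a, b, rng), rng, max_len))
        population = best + children
    return max(population, key=lambda r: score(r, words))
```

Note that nothing in this search knows which class is "really" the consonants: a rule set with the C and V labels swapped throughout scores exactly the same, so the labels in the result are arbitrary.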
This would allow you to provide a list of sample words, generate a rule set from them, and then generate a list of potential new words.
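The last step, turning a rule set into candidate words, could be sketched like this. The function name `generate_words`, its parameters, and the cap on attempts are illustrative assumptions; the rule set in the usage note is the toy one from the DSL example.

```python
import random

def generate_words(classes, patterns, count, known=(), seed=0):
    """Generate up to `count` distinct words that fit the patterns and are not already known.

    classes:  maps a class symbol to its letters, e.g. {"C": "XYZ", "V": "AEI"}
    patterns: list of pattern strings over those symbols, e.g. ["CV", "CVCV"]
    known:    existing words to exclude from the output
    """
    rng = random.Random(seed)
    seen = set(known)
    found = []
    # Bounded number of attempts, so a tiny rule set cannot loop forever.
    for _ in range(count * 50):
        if len(found) == count:
            break
        pattern = rng.choice(patterns)
        word = "".join(rng.choice(classes[slot]) for slot in pattern)
        if word not in seen:
            seen.add(word)
            found.append(word)
    return found
```

For example, `generate_words({"C": "XYZ", "V": "AEI"}, ["CV", "CVCV", "CCVV", "VCV"], 5, known=["XA"])` returns five new words such as might fit the sample language, never including XA.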
What this won’t do: it won’t account for dependencies between sounds within a word. For example, in CVCV words the two vowels often end up similar or identical, because people have lazy tongues. But with enough computation, defects like these might become unimportant.