Let us assume that you have a magical machine that can instantly optimize all parameters of an AI model so that it is as "intelligent" as possible. Let us ignore that there is no agreed-upon metric for what that means, and just assume that we have some magical metric that measures intelligence correctly.
Given all that, how good would a, say, 4B parameter model be? What is the ceiling on how intelligent a small model like this can be with the established transformer architecture?
>>107824208
>wants to know how "intelligent" a model can become
>can't measure intelligence