Interesting paper on “small” vs “large” LLMs.
2408.16737v1.pdf (1.1 MB)
Interesting. Do I get it right that fine-tuning on synthetic data from smaller models resulted in better performance because smaller models can generate more samples for the same compute budget?
That’s the way I read it. It will probably spark a lot of debate.
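The back-of-the-envelope arithmetic behind that reading can be sketched roughly like this (illustrative numbers and a hypothetical helper, not taken from the paper; the ~9B/~27B sizes are just an example of a 3x parameter gap):

```python
# Compute-matched sampling sketch: generating one token costs roughly
# 2 * P FLOPs of forward-pass compute for a P-parameter model, so at a
# fixed FLOP budget a smaller model yields proportionally more samples.

def samples_per_budget(budget_flops, params, tokens_per_sample):
    # hypothetical helper: how many full solutions fit in the budget
    return budget_flops // (2 * params * tokens_per_sample)

budget = 1e18   # hypothetical sampling budget in FLOPs
tokens = 512    # hypothetical tokens per sampled solution

small = samples_per_budget(budget, 9e9, tokens)    # ~9B-parameter model
large = samples_per_budget(budget, 27e9, tokens)   # ~27B-parameter model
print(small, large)  # the smaller model yields ~3x more samples
```

So the same budget buys roughly 3x as many training samples from the smaller model, and the question the paper raises is whether that extra coverage outweighs the lower per-sample quality.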
Agree - it would be interesting to see whether there is a gap (and if so, how big it is) when the same number of samples is taken from a “stronger” LLM.
Crazy how quickly things develop in this space these days… I’m currently playing around with some of the older and newer vision models and am really blown away by how good even “smaller” models (~7B parameters) have become…