MTP Support?

#2
by LeePapa - opened

Does the model support multi-token prediction (MTP)? If so, how do you configure it within inference engines like vLLM or llama.cpp?
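
For reference (not an answer from the model authors), below is a minimal sketch of how MTP-style speculative decoding is typically enabled in recent vLLM releases via the `speculative_config` argument. The `method` value, token count, and model id are assumptions and depend on the specific model and vLLM version, so check the vLLM speculative-decoding docs for the exact names this model supports.

```python
# Minimal sketch, assuming a recent vLLM release that exposes MTP-style
# speculative decoding through the `speculative_config` argument.
# The "method" string and model id below are placeholders/assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="org/this-model",              # hypothetical model id
    speculative_config={
        "method": "mtp",                 # assumption: MTP draft method name
        "num_speculative_tokens": 1,     # draft tokens proposed per decode step
    },
)

outputs = llm.generate(
    ["Explain multi-token prediction in one sentence."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```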