limits?

#32
by tcporco - opened

What are the limits on prompt length and number of output tokens? I'm finding that the model takes an inordinately long time to respond. Are there other limitations, such as memory requirements, that we should know about?

Also, when I ran the model with llama-server and just said "Hi", it started talking to itself.
