limits? #32
opened by tcporco
What are the limits on prompt length and on the number of output tokens? I am finding that the model takes an inordinately long time to respond. Are there other limits, such as memory requirements, we should know about?
Also, when I ran the model on llama server and sent just "Hi", it started talking to itself.
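For reference, I'm launching the server roughly like this (a sketch assuming llama.cpp's `llama-server`; the model path is a placeholder and flag names may differ by version):

```shell
# Hypothetical launch sketch for llama.cpp's server; model.gguf is a placeholder path.
# -c sets the context window (prompt plus generated tokens must fit within it);
# -n caps how many tokens are generated per response, which may be relevant
# to the runaway "talking to itself" behavior described above.
./llama-server -m model.gguf -c 4096 -n 256
```

Is there a recommended setting for these limits with this model?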