Context Size Error But Size is Less Than Limit

Seeing this error from time to time:

Context(debugDescription: "Content contains 4089 tokens, which exceeds the maximum allowed context size of 4096.", underlyingErrors: [])

Of course, 4089 is less than 4096 so what is this telling me and how do I work around it? Is the limit actually lower than 4096?

Looking around, sounds like perhaps the 4096 also needs to hold tokens beyond the content?

If correct, it would be nice to have this broken out better in the error or something similar because just reading this makes it sound like a framework bug.

Great question @Hunter!

You're completely correct, this error message is very confusing:

Context(debugDescription: "Content contains 4089 tokens, which exceeds the maximum allowed context size of 4096.", underlyingErrors: [])

We're actively investigating the issue, but until we have the error message fixed, here's what's going on:

  • The context window size is always 4096 tokens
  • When the model throws this error at a number close to the limit (4089 above), it means the model doesn't have enough tokens left to generate its response. In other words, it had 4096 - 4089 = 7 tokens left, but it needs more than 7 tokens to produce its response.
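
To make the arithmetic above concrete, here is a minimal sketch in plain Swift. It is not framework code: `ContextBudget`, `remainingForResponse`, and `canFit(responseReserve:)` are hypothetical names made up for illustration, and the reserve of 200 tokens is an arbitrary assumption, since the thread does not say how many tokens a response needs.

```swift
// Hypothetical helper for illustration only; not part of any framework API.
struct ContextBudget {
    let windowSize: Int     // total tokens the context window holds (4096 per the answer above)
    let contentTokens: Int  // tokens already consumed by the content (4089 in the error above)

    /// Tokens left over for the model's response.
    var remainingForResponse: Int {
        max(0, windowSize - contentTokens)
    }

    /// True when at least `reserve` tokens remain for the response.
    /// `reserve` is an assumed value chosen by the caller.
    func canFit(responseReserve reserve: Int) -> Bool {
        remainingForResponse >= reserve
    }
}

let budget = ContextBudget(windowSize: 4096, contentTokens: 4089)
print(budget.remainingForResponse)          // 7
print(budget.canFit(responseReserve: 200))  // false
```

Under these assumptions, the content fits inside the window, but the 7 remaining tokens fall short of what the response needs, which is the situation the error is reporting.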