Empirical Research · June 2026
Empirical analysis of file upload token consumption in Claude LLM conversation sessions. The cost isn't in what you say — it's in what you upload.
"Most people are optimizing their prompts while the real cost is quietly sitting in the attachments tab."
Abstract
Large language models such as Anthropic's Claude operate on a context window: a fixed-size buffer of tokens that holds everything the model processes in a single inference call. When users upload files to a Claude session, those files are tokenized and injected into this context window, where they persist for the entire conversation, so their tokens are counted as input again on every turn. This paper presents an empirical investigation into the token cost of file uploads relative to total session token consumption. Across five file types (PDF, DOCX, XLSX, plain text, and source code), four conversation lengths (5, 10, 20, and 40 turns), and controlled prompt sizes, we measure and report the proportion of tokens attributable to uploaded content versus user-generated messages. Our findings show that uploaded files account for approximately 95–99% of cumulative token usage in typical multi-turn sessions. These results carry significant financial implications for both API users and SaaS consumers, and they motivate a set of session design recommendations (prompt caching, selective context injection, and document pre-summarization) to minimize unnecessary token expenditure.
Core Metric
$T_{\text{file}}$ = token count of the uploaded file (constant per turn)
$N_{\text{turns}}$ = number of conversation turns
$T_{\text{total}}$ = cumulative input tokens for the session
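The definitions above can be made concrete with a small accounting sketch. It is not the paper's measurement code. It assumes a simplified model in which every turn re-sends the full file plus the whole message history as input, with no prompt caching. The function names, the fixed per-turn prompt and reply sizes, and the example numbers are hypothetical and chosen only for illustration.

```python
def cumulative_input_tokens(t_file: int, n_turns: int,
                            t_prompt: int, t_reply: int) -> int:
    """Cumulative input tokens (T_total) under a simplified, assumed model.

    Assumption: each turn's input is the full file (t_file) plus all
    messages exchanged so far. Prompt and reply sizes are held constant.
    """
    total = 0      # running T_total
    history = 0    # tokens of all user and assistant messages so far
    for _ in range(n_turns):
        history += t_prompt          # the new user message joins the history
        total += t_file + history    # input for this inference call
        history += t_reply           # the reply is re-sent on later turns
    return total


def file_share(t_file: int, n_turns: int,
               t_prompt: int, t_reply: int) -> float:
    """Fraction of cumulative input tokens attributable to the file."""
    total = cumulative_input_tokens(t_file, n_turns, t_prompt, t_reply)
    return (t_file * n_turns) / total


if __name__ == "__main__":
    # Hypothetical inputs, not values reported in the paper.
    total = cumulative_input_tokens(10_000, 5, 50, 200)
    share = file_share(10_000, 5, 50, 200)
    print(f"T_total = {total}, file share = {share:.1%}")
```

With these illustrative inputs the file contributes 50,000 of 52,750 cumulative input tokens. The share depends heavily on file size, reply length, and turn count, so the sketch is useful for reasoning about the metric rather than for reproducing the reported figures.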
Research Figures