
How to efficiently send data to LLM models: reduce token usage and save costs

  • Writer: Aakash Walavalkar
  • Nov 2
  • 3 min read

For organizations, developers, and creative professionals, large language models (LLMs) such as Google Gemini 2.5 Pro and GPT-4o have become indispensable tools. Whether you're building chatbots, summarizing documents, or extracting insights, these models process your data as tokens. And since most LLM APIs charge by token usage, streamlining the data you send can drastically reduce your expenses.


Drawing on Tech-Aakash's open-source GitHub project "How to Efficiently Send Data to LLM Models," this article explains how to send data to LLM models efficiently. You'll discover the results of the study, why format matters, and how minor adjustments can have a significant impact.


[Image: Azure OpenAI test with JSON & YAML]

[Image: Google Gemini 2.5 Pro test with JSON & YAML]


"Efficiently send data to LLM models": What Does It Mean?

Every time you interact with an AI model, the text you send (called a prompt) is converted into tokens: small chunks of words or characters.


  • Every model prices and processes data according to tokens.

  • More tokens mean longer response times and higher costs.


Therefore, if your data is verbose or carries extra formatting, each request costs more. Efficient data formatting can fix that.


The Problem: JSON Uses More Tokens Than Necessary


Most developers use JSON (JavaScript Object Notation) to structure data. JSON is great for machines but comes with extra characters like {}, "", and , — which all count toward your token usage.

Example:

{
  "Name": "John Smith",
  "Role": "Data Scientist",
  "Visa": "H1B"
}


These quotes, commas, and brackets increase token counts unnecessarily. When you send such data to LLMs repeatedly, the extra tokens add up, and so do the costs.


The Alternative: YAML – A Cleaner, More Compact Format


The repository proposes a simple experiment: compare JSON with YAML (YAML Ain’t Markup Language). YAML uses indentation instead of punctuation to organize information.

Example:

Name: John Smith
Role: Data Scientist
Visa: H1B


Same information, fewer characters — and therefore, fewer tokens.
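As a rough illustration, the two examples above can be compared with nothing but the Python standard library. Character count is only a proxy for token count, since tokenizers split text in their own way, but the overhead of JSON's punctuation is already visible at this level (the YAML string is written by hand here; a real project would use a YAML library such as PyYAML):

```python
import json

record = {"Name": "John Smith", "Role": "Data Scientist", "Visa": "H1B"}

# JSON form, as it would typically be embedded in a prompt
as_json = json.dumps(record, indent=2)

# YAML form of the same flat record: just "key: value" lines
as_yaml = "\n".join(f"{key}: {value}" for key, value in record.items())

# The YAML string is noticeably shorter than the JSON string
print(len(as_json), len(as_yaml))
```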


The Experiment: Measuring Token Usage Across Models


The GitHub project used two major AI platforms:

  • Azure OpenAI GPT-4o

  • Google Gemini 2.5 Pro


Two files containing identical data, one in JSON and one in YAML, were tested to count how many tokens each format consumed.


Tools Used

  • tiktoken (for counting tokens in OpenAI models)

  • Google’s token counter (for Gemini models)

  • Python Jupyter notebooks for easy visualization
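A minimal sketch of the token-counting step, assuming tiktoken is installed (`pip install tiktoken`). If the library (or its GPT-4o encoding) is unavailable, the sketch falls back to the common rule of thumb of roughly four characters per token for English text:

```python
import json

try:
    import tiktoken  # OpenAI's tokenizer library
    _encoding = tiktoken.encoding_for_model("gpt-4o")

    def count_tokens(text: str) -> int:
        return len(_encoding.encode(text))
except Exception:
    # tiktoken missing, or this version doesn't know the model:
    # fall back to a rough ~4 characters-per-token heuristic
    def count_tokens(text: str) -> int:
        return max(1, len(text) // 4)

record = {"Name": "John Smith", "Role": "Data Scientist", "Visa": "H1B"}
json_tokens = count_tokens(json.dumps(record, indent=2))
yaml_tokens = count_tokens("\n".join(f"{k}: {v}" for k, v in record.items()))
print(json_tokens, yaml_tokens)
```

Counting locally like this, before any API call, is what lets you compare formats without spending a cent.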


The Results: YAML Wins by About 30%

Model            JSON Tokens   YAML Tokens   Savings
GPT-4o           59            42            28.8% fewer tokens
Gemini 2.5 Pro   75            52            30.7% fewer tokens

That’s a consistent 30% reduction in token usage just by changing the file format!

Imagine cutting your API costs and latency by nearly a third, without altering any content or losing information.
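The savings column follows directly from the raw counts; a quick sanity check with the token counts taken from the table above:

```python
# Token counts from the experiment's results table
json_tokens = {"GPT-4o": 59, "Gemini 2.5 Pro": 75}
yaml_tokens = {"GPT-4o": 42, "Gemini 2.5 Pro": 52}

# Percentage of tokens saved by switching from JSON to YAML
savings = {
    model: round((count - yaml_tokens[model]) / count * 100, 1)
    for model, count in json_tokens.items()
}
print(savings)  # {'GPT-4o': 28.8, 'Gemini 2.5 Pro': 30.7}
```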


Why YAML Sends Data to LLM Models More Efficiently


  • Compact syntax: YAML removes extra punctuation like commas and quotes.

  • Better readability: It’s easier for humans to read and edit than JSON.

  • Same meaning: LLMs understand YAML as easily as JSON, so no information is lost.

  • Lower cost per request: Fewer tokens = lower bills on paid APIs.


When you efficiently send data to LLM models, you optimize both speed and cost, a win-win for developers and organizations alike.


Real-World Impact


For companies using LLMs heavily (chatbots, automation tools, or data pipelines), a 30% token reduction translates directly into cost savings. For example:


  • If your LLM usage costs $1,000 per month, switching to YAML could save around $300 monthly.

  • Faster processing means shorter wait times for users and smoother workflows.
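The dollar figure above is simple proportional scaling; written out (the $1,000 monthly spend and the ~30% reduction are the article's example numbers, not a guarantee for any particular workload):

```python
monthly_cost = 1000.00    # example monthly LLM API spend, in dollars
token_reduction = 0.30    # ~30% fewer tokens after switching to YAML

monthly_savings = monthly_cost * token_reduction
print(f"${monthly_savings:.2f} saved per month")  # $300.00 saved per month
```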


Future Scope of Efficient Data Sending


The research also suggests future directions:


  • Testing other LLMs such as Claude or Mistral.

  • Exploring multi-line YAML blocks for complex data.

  • Building automated converters to transform JSON into optimized YAML for LLM prompts.


These ideas could lead to even more savings and performance improvements.
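The converter idea can be sketched in a few lines of standard-library Python. This minimal version handles flat and nested dictionaries plus lists of scalars; it is an illustration of the concept rather than a full YAML emitter (a real converter should use a library such as PyYAML to handle quoting, multi-line strings, and edge cases):

```python
import json

def json_to_yaml(value, indent=0):
    """Render parsed JSON (dicts, lists, scalars) as compact YAML-style text."""
    pad = "  " * indent
    if isinstance(value, dict):
        lines = []
        for key, item in value.items():
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}{key}:")
                lines.append(json_to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}{key}: {item}")
        return "\n".join(lines)
    if isinstance(value, list):
        return "\n".join(f"{pad}- {item}" for item in value)
    return f"{pad}{value}"

raw = '{"Name": "John Smith", "Role": "Data Scientist", "Visa": "H1B"}'
print(json_to_yaml(json.loads(raw)))
```

Dropping a converter like this in front of your prompt-building code means the rest of your pipeline can keep producing JSON while the model only ever sees the leaner format.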


