Tejas Kumar: The future of AI isn’t LLMs, but affordable small language models
Tejas Kumar, an AI DevRel Engineer at DataStax, took the stage at the Infobip Shift conference with a no-hype, straight-to-the-point talk on AI.
He broke down what AI engineering looks like today, sharing techniques for cutting costs, avoiding hallucinations, and what’s going to be key for building the next wave of AI systems.
RAG solves the top 3 AI limitations
The main limitations developers face today when working with AI are hallucinations, knowledge cutoffs, and finite context windows. Tejas believes that these three “flies” can be swatted in one strike using a technique called Retrieval-Augmented Generation (RAG), which combines pre-trained language models with a real-time data retrieval system:
With RAG, you fetch data from an authoritative source and use it to enhance or alter the generated text from an LLM. This data reaches the LLM through prompt engineering.
Tejas demonstrated how RAG works with a simple example, in just a few clicks: he fed a webpage into an embedding model, which encoded the data numerically and stored it in a vector database.
A similarity search then pulls the relevant information from that database to answer the user’s question. This grounds responses in the most up-to-date information, greatly reducing the hallucinations common in LLMs like GPT.
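The end-to-end flow (embed the data, retrieve by similarity, pass the context to the LLM through the prompt) can be sketched in a few lines. This is a toy illustration: the bag-of-words `embed` function stands in for a real embedding model, and the plain list of chunks stands in for a vector database.

```python
# Minimal RAG sketch (hypothetical, stdlib only). A real system would use
# a trained embedding model and a vector database instead of bag-of-words.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for an embedding model: a bag-of-words term-count "vector".
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Similarity search: rank stored chunks against the query embedding.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # The retrieved context reaches the LLM through the prompt itself.
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Shift 2024 takes place in Zadar, Croatia.",
    "RAG combines retrieval with a pre-trained language model.",
    "Embeddings encode text as vectors for similarity search.",
]
print(build_prompt("What is RAG?", chunks))
```

Swapping `embed` for a real embedding model and `chunks` for a vector store leaves the shape of the pipeline unchanged, which is exactly the point of RAG: the LLM itself is untouched.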
Chatbots are boring – AI should feel real
AI chatbots are everywhere today, but Tejas believes they’re mostly boring. They serve a purpose, but that purpose is very narrowly defined. That’s why he offered an example of how a chatbot can be used more broadly – like searching Netflix.
Tejas entered “movies with a strong female lead” into Netflix’s search system, which traditionally might return incorrect or no results. However, if a search system uses semantic AI in the background – understanding the meaning of the user’s query rather than just keywords – the user experience can be significantly enhanced:
With semantic search, we improve search results and generate interactive user interfaces that understand user intent on demand.
Tejas illustrated how DataStax developed a tool for semantic search that not only delivers accurate results for such queries but can generate an interactive user interface (UI) on demand. This means that by typing “movies with a strong female lead,” Netflix could present relevant movie posters and trailers. This kind of interactive UI represents the future of AI, where developers can use tools like Langflow to integrate AI into applications without disrupting the user experience, Tejas emphasized:
As developers, we have a responsibility to our users. We must build AI experiences beyond simple chatbots and deliver real, purposeful interactions.
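One way to picture this “generative UI” idea: the AI layer returns a structured component spec instead of prose, and the front end renders it. The sketch below is purely illustrative; the catalog, the tag-based matching, and component names like `PosterGrid` are invented stand-ins, not a DataStax or Langflow API.

```python
# Hypothetical "generative UI" sketch: the AI layer emits a component spec
# that the client renders as posters and trailers, not a wall of chat text.
import json

def semantic_results(query: str) -> list[dict]:
    # Stand-in for semantic search over a movie catalog; a real system would
    # embed the query and run a vector similarity search.
    catalog = [
        {"title": "Alien", "tags": ["strong female lead", "sci-fi"]},
        {"title": "Kill Bill", "tags": ["strong female lead", "action"]},
        {"title": "Die Hard", "tags": ["action"]},
    ]
    return [m for m in catalog if query.lower() in " ".join(m["tags"])]

def generate_ui(query: str) -> dict:
    movies = semantic_results(query)
    # Emit a UI spec, not prose: the front end decides how to draw it.
    return {
        "component": "PosterGrid",
        "items": [
            {"title": m["title"], "actions": ["Play", "Trailer"]}
            for m in movies
        ],
    }

print(json.dumps(generate_ui("Strong female lead"), indent=2))
```

The design choice worth noting is the contract: the model’s output is machine-readable structure, so the application stays in control of the actual rendering.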
SSMs instead of LLMs?
Looking ahead, Tejas sees a shift from general-purpose LLMs to small specialized models (SSMs), which is his (unofficial) term for AI systems tailored to specific tasks:
What if, instead of models like GPT-4 with 600 billion parameters, we had a smaller model with 7 billion specialized parameters? That’s the future, and that’s where we should invest.
Tejas believes companies will turn to smaller models focused on individual needs. That way developers will drastically cut costs while maintaining good product performance.
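The cost argument can be roughed out with a simple router, using invented model names and per-token prices: send a request to a small task-specific model when one covers it, and fall back to a large general model otherwise.

```python
# Hypothetical cost-cutting sketch. Model names and per-1K-token prices
# are invented for illustration only.
import math

SMALL_MODELS = {
    # task -> (model name, cost per 1K tokens in USD)
    "summarize": ("summarizer-7b", 0.0002),
    "classify": ("classifier-7b", 0.0001),
}
LARGE_MODEL = ("general-llm", 0.01)

def route(task: str) -> tuple:
    # Prefer a small specialized model; fall back to the general one.
    return SMALL_MODELS.get(task, LARGE_MODEL)

def estimated_cost(task: str, tokens: int) -> float:
    _, price = route(task)
    return price * tokens / 1000

# A 10K-token summarization job on the small model vs. the large one:
print(estimated_cost("summarize", 10_000))   # routed to summarizer-7b
print(estimated_cost("open-ended", 10_000))  # falls back to general-llm
```

With these made-up prices the specialized route is 50x cheaper for the same token count, which is the shape of the saving Tejas is pointing at, even if real numbers differ.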
Building Responsible AI Must Come First
AI must be developed ethically, and one of the key things to watch out for is what Tejas calls “authority bias” – where users assume that results generated by AI are always correct simply because they come from an authoritative-sounding source:
We need to be transparent about the data used to train LLMs. AI should be able to say, “Hey, this data might be wrong.”
The future of AI is in creating tools that allow models to recognize the limits of their capabilities. When AI can’t provide an answer, it should be able to use external tools or APIs to retrieve the necessary information to ensure accuracy.
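That fallback behavior can be sketched as a simple dispatch: answer from known knowledge when possible, call an external tool when one applies, and otherwise admit uncertainty. Everything here (the fact store, the tool registry, and the weather stub) is hypothetical.

```python
# Sketch of a model that recognizes its limits: defer to a tool or admit
# ignorance instead of guessing. All names below are invented stand-ins.

KNOWN_FACTS = {"capital of france": "Paris"}

def fetch_weather(city: str) -> str:
    # Stand-in for a call to a real external weather API.
    return f"Forecast for {city}: sunny"

TOOLS = {"weather": fetch_weather}

def answer(query: str) -> str:
    q = query.lower()
    if q in KNOWN_FACTS:
        return KNOWN_FACTS[q]
    if "weather" in q:
        # The model defers to an external tool rather than hallucinating.
        return TOOLS["weather"]("Zadar")
    return "I don't know, and I won't guess."  # transparent about its limits

print(answer("Capital of France"))
print(answer("what's the weather like?"))
print(answer("who wins the 2026 world cup?"))
```

The last branch is the one Tejas is arguing for: an explicit “I don’t know” is a feature, not a failure.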
In conclusion, Tejas encourages developers to think beyond simple chatbots because he believes the future of AI is tied to combining the power of LLMs with specialized models and dynamic interfaces that enhance user experiences.
AI won’t replace developers, but some skills will disappear
Tejas also appeared on the panel “AI-Powered Development Tools: Enhancing or Replacing Human Developers?” on the ShiftMag stage, where he was joined by Simi Olabisi, an AI expert from Microsoft, in a discussion moderated by our executive editor Antonija Bilić Arar.
AI will do the opposite of what people expect. It won’t replace developers; it will make them better at their job, says Simi:
The tools we’re building at Microsoft are designed to handle repetitive tasks, allowing developers to focus on more complex and creative activities.
This brings us to the question of juniors and how they will learn. Simi believes they won’t need to spend time mastering basic tasks:
Just like floppy disks became obsolete, some fundamental skills may become less important to master, but that doesn’t mean they’ll skip important lessons. They’ll face challenging tasks early in their careers.
Think of this as an evolution from the paintbrush to the camera. Tejas pointed out that AI tools may already cover 70 to 80% of coding tasks, but human oversight and creativity remain essential. Simi concluded that we’re not facing any dramatic change within the next five years: tools will advance, and AI will continue to enhance our abilities, but developers will remain a key part of the entire process.