Launching GAT – An SAT for your AI
- What are the possible internal & external use-cases if I had an accurate AI system on top of our data & systems?
- For each use-case, what is the concrete business impact of this AI system at different levels of reliability and performance?
- Given how rapidly the AI ecosystem is evolving, how do I make the right strategic choice of vendor or technique to build the AI solution for my use-case?
- Given the use-case, what is the right form factor for my AI solution? What is the right entry point for my stakeholders to maximize impact?
- Given the use-case, how do I plan a roadmap for my AI project in terms of both scope and accuracy?
- How can I objectively evaluate the progress of my AI project, whether it's a vendor product or something we're building in-house?
- What is a GAT?
- Why are existing GenAI assessment methods failing?
- The GAT methodology
- Start a GenAI assessment with us
What is a GAT?
- An evalset with the business impact noted against each eval. (Read more about what evals are and how AI researchers use them to build AI here.) This helps you define the scope, roadmap and expected ROI of your AI project, and measure progress concretely as the project goes from evaluation to pilot to rollout.
- A decision-making framework that maps your use-case into a 3x3 AI strategy matrix. This helps you choose the right technique to customize generic LLMs to your domain, and the right model family that aligns with your workload.
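To make the evalset concrete, here is a minimal sketch of what one might look like as a data structure. The field names and entries are purely illustrative assumptions, not a prescribed GAT schema; the point is that each eval pairs a test case with an explicit note on its business impact, so a pass rate can be read as ROI coverage rather than an abstract score.

```python
from collections import Counter

# Hypothetical evalset: every field name and entry below is illustrative.
evalset = [
    {
        "id": "eval-001",
        "input": "What is the refund window for annual plans?",
        "expected": "30 days from purchase, per the refund policy doc.",
        "business_impact": "high",
        "impact_note": "Wrong answers here create support escalations.",
    },
    {
        "id": "eval-002",
        "input": "Draft a renewal reminder for account #1042.",
        "expected": "Email citing the correct renewal date from the CRM.",
        "business_impact": "high",
        "impact_note": "Errors go directly to customers.",
    },
    {
        "id": "eval-003",
        "input": "Summarize last quarter's churn report.",
        "expected": "Summary matching the figures in the source report.",
        "business_impact": "medium",
        "impact_note": "Internal use; mistakes are caught in review.",
    },
]

def impact_profile(evals):
    """Count evals at each business-impact level, showing where the
    evalset concentrates its coverage."""
    return Counter(e["business_impact"] for e in evals)

print(impact_profile(evalset))
```

A profile like this is what lets you state a roadmap as "hit X% on high-impact evals before pilot" instead of a vague accuracy target.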
Why are existing methods to assess GenAI failing?
The GAT methodology
- Evals are your only defense against AI snake oil: Evals are well known amongst AI researchers and engineers. However, we've realized that they are in fact the most critical asset an AI leader has for creating clarity in a GenAI project, whether it's a purchased product or a solution built in-house. Evals are the only way to define the scope of an AI system, and the only way to assess its current capability level.
- Institutional knowledge determines the approach: Every GenAI system that integrates with proprietary data and systems depends on institutional knowledge (documented, tribal or tacit). Understanding the extent of that dependence, and the lifecycle of that knowledge, is critical to choosing the right AI approach.
- Mapping workloads to LLM output types optimizes model family: Today there are three primary types of enterprise workloads (search, act & solve), and they map directly to the generative outputs that LLMs are optimized for. Understanding your workload is critical to creating a longer-term AI strategy that stays resilient to constant model updates while continuously benefiting from them.
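The workload-to-output mapping above can be sketched as a small lookup. The output descriptions are our interpretation of how search, act and solve workloads typically consume LLM output; they are illustrative assumptions, not a fixed taxonomy from the GAT framework.

```python
# Illustrative mapping of the three workload types named above to the
# style of LLM output each typically relies on (descriptions are assumed).
WORKLOAD_OUTPUTS = {
    "search": "grounded answers synthesized from retrieved documents",
    "act": "structured outputs (e.g. JSON tool calls) that drive systems",
    "solve": "multi-step reasoning that works through an open problem",
}

def output_type(workload: str) -> str:
    """Return the LLM output style associated with a workload type."""
    return WORKLOAD_OUTPUTS[workload]

print(output_type("act"))
```

Classifying a use-case this way is what makes the strategy durable: when a new model ships, you re-benchmark it against your workload's output type rather than rethinking the whole approach.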
Read more
- Evals 101 for executives: What is an eval? How does it help define scope, measure progress, create a roadmap & assess ROI?
- The 3x3 decision-making framework for GenAI assessment