    Technology

    Building an AI agent inside a 7-year-old Rails monolith

    By TechAiVerse | December 26, 2025 | 5 min read

    I’m a Director of Engineering at Mon Ami, a US-based startup building a SaaS solution for Aging and Disability case workers. We’ve built a large Ruby on Rails monolith over the last seven years.

    It’s a multi-tenant solution where data sensitivity is crucial. We have multiple layers of access checks, but to simplify the story, we’ll assume it’s all abstracted away into a Pundit policy.

    While I would not describe us as a group dealing with Big Data problems, we do have a lot of data. Looking up client records, in particular, is just not performant enough with raw database queries, so we built an Algolia index to make it work.

    Given all that: the big monolith, complicated data access rules, and the nature of the business we are in, building an AI agent has not yet been a primary concern for us.

    SF Ruby and the disconnect

    I was at SF Ruby, in San Francisco, a few weeks ago. Most of the tracks were, of course, heavily focused on AI: lots of stories from people building AI into all sorts of products using Ruby and Rails.

    They were good talks. But most of them assumed a kind of software I don’t work on — systems without strong boundaries, without multi-tenant concerns, without deeply embedded authorization rules.

    I kept thinking: this is interesting, but it doesn’t map cleanly to my world. At Mon Ami, we can’t release even a pilot unless it passes strict data access checks.

    Then I saw a talk about using the RubyLLM gem to build a RAG-like system, where the conversation (LLM call) context was augmented via function calls (tools). That’s when it clicked: I could encode my complicated access logic into a specific function call and give the LLM access to some of our data without granting it unrestricted access.

    RubyLLM

    RubyLLM is a neat gem that abstracts away the interaction with many LLM providers with a clean API.

    gem "ruby_llm"

    It is configured in an initializer with the API keys for the providers you want to use.

    RubyLLM.configure do |config|
      config.openai_api_key = Rails.application.credentials.dig(:openai_api_key)
      config.anthropic_api_key = Rails.application.credentials.dig(:anthropic_api_key)
      # config.default_model = "gpt-4.1-nano"
    
      # Use the new association-based acts_as API (recommended)
      config.use_new_acts_as = true
    
      # Increase timeout for slow API responses
      config.request_timeout = 600  # 10 minutes (default is 300)
      config.max_retries = 3        # Retry failed requests
    end
    
    # Load LLM tools from main app
    Dir[Rails.root.join('app/tools/**/*.rb')].each { |f| require f }

    It provides a Conversation model as an abstraction for an LLM thread. The Conversation contains a set of Messages. It also provides a way of defining structured responses and function calls available.

    AVAILABLE_TOOLS = [
      Tools::Client::SearchTool
    ].freeze
    
    conversation = Conversation.find(conversation_id)
    chat = conversation.with_tools(*AVAILABLE_TOOLS)
    
    chat.ask 'What is the phone number for John Snow?'

    A Conversation is initialized by passing a model (gpt-5, claude-sonnet-4.5, etc.) and has a method for chatting with it.

    conversation = Conversation.new(model: RubyLLM::Model.find_by(model_id: 'gpt-4o-mini'))

    RubyLLM comes with a neat DSL for defining accepted parameters (the descriptions are passed to the LLM as context, since it has to decide from the conversation whether a tool should be used). The tool implements an execute method returning a hash, which is then presented to the LLM. That is all the magic needed.

    class SearchTool < BaseTool
      description 'Search for clients by name, ID, or email address. Returns matching clients.'
    
      param :query,
        desc: 'Search query - can be client name, ID, or email address',
        type: :string
    
      def execute(query:)
      end
    end
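Under the hood, class-level macros like `description` and `param` simply record metadata that the gem later serializes into the provider’s function-calling schema. Here is a minimal plain-Ruby illustration of the pattern (not the gem’s actual implementation; `ToyTool` and `EchoTool` are made up for this sketch):

```ruby
# A toy version of the class-macro pattern: `description` and `param`
# just stash metadata on the class for later serialization.
class ToyTool
  class << self
    def description(text = nil)
      @description = text if text
      @description
    end

    def param(name, desc:, type: :string)
      (@params ||= {})[name] = { desc: desc, type: type }
    end

    def params
      @params || {}
    end
  end

  # What a provider-facing function-calling schema could be built from.
  def self.schema
    { description: description, parameters: params }
  end
end

class EchoTool < ToyTool
  description 'Echoes the query back.'
  param :query, desc: 'Text to echo', type: :string

  def execute(query:)
    { echo: query }
  end
end

EchoTool.schema
# => { description: "Echoes the query back.",
#      parameters: { query: { desc: "Text to echo", type: :string } } }
```

The metadata lives on each subclass, so every tool class carries its own schema alongside its `execute` implementation.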

    We’ll now build a modest function call and a messaging interface. The function call searches for a client using Algolia and ensures the result is visible to the current user (by merging in the Pundit policy scope).

    def execute(query:)
      response = Algolia::SearchClient
        .create(app_id, search_key)
        .search_single_index(Client.index_name, {
          query: query.truncate(250)
        })
    
      ids = response.hits.map { |hit| hit[:id] }.compact
    
      base_scope = Client.where(id: ids)
      # A standard Pundit scope takes the acting user plus the base scope;
      # `current_user` stands for whatever user context the tool was built with.
      client = Admin::Org::ClientPolicy::Scope.new(current_user, base_scope).resolve.first or return {}
    
      {
        id: client.id,
        ami_id: client.slug,
        slug: client.slug,
        name: client.full_name,
        email: client.email
      }
    end

    The LLM acts as the magic glue: it takes the natural language input submitted by the user, decides which tool (if any) to use to augment the context, and then responds to the user. No model should know John Snow’s phone number in a SaaS service out of the box, but this approach makes that sort of retrieval possible.
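The scoping idea is easier to see stripped of Rails and Algolia. In this self-contained sketch (all names are stand-ins: `ClientPolicyScope` mimics a Pundit scope, and an in-memory array stands in for the search index), the tool re-filters every search hit through a per-user policy before anything reaches the model:

```ruby
# Plain-Ruby sketch of the core safety idea: whatever the search backend
# returns, the tool re-filters it through a per-user policy scope before
# anything is handed to the LLM.
Client = Struct.new(:id, :org_id, :name, :email, keyword_init: true)

CLIENTS = [
  Client.new(id: 1, org_id: 10, name: 'John Snow', email: 'john@example.com'),
  Client.new(id: 2, org_id: 20, name: 'John Doe',  email: 'doe@example.com')
].freeze

# Stand-in for a Pundit policy scope: a user only sees their org's clients.
class ClientPolicyScope
  def initialize(user, scope)
    @user = user
    @scope = scope
  end

  def resolve
    @scope.select { |c| c.org_id == @user[:org_id] }
  end
end

class SearchTool
  def initialize(user)
    @user = user
  end

  # Mimics the real tool: "search" (a naive name match standing in for
  # Algolia), merge the policy scope, then shape a hash for the LLM.
  def execute(query:)
    hits = CLIENTS.select { |c| c.name.downcase.include?(query.downcase) }
    client = ClientPolicyScope.new(@user, hits).resolve.first or return {}
    { id: client.id, name: client.name, email: client.email }
  end
end
```

A user from org 10 gets John Snow back; a user whose org has no matching clients gets `{}`, so the model never sees data outside the caller’s scope.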

    The UI is built with a remote form that enqueues an Active Job.

    = turbo_stream_from @conversation, :messages
    
    .container-fluid.h-100.d-flex.flex-column
      .sticky-top
        %h2.mb-0
          Conversation ##{@conversation.id}
    
      .flex-grow-1
        = render @messages
    
      .p-3.border-top.bg-white.sticky-bottom#message-form
        = form_with url: path, method: :post, local: false, data: { turbo_stream: true } do |f|
          = f.text_area :content
          = f.submit 'Send'

    The job will process the Message.

    class ProcessMessageJob < ApplicationJob
      queue_as :default
    
      def perform(conversation_id, message)
        conversation = Conversation.find(conversation_id)
        conversation.ask message
      end
    end

    The conversation has broadcast refresh enabled to update the UI when the response is received.

    class Conversation < RubyLLM::Conversation
      broadcasts_refreshes
    end

    The form has a Stimulus controller that watches for new messages being appended and scrolls to the end of the conversation.

    A note on selecting the model

    I tried a few OpenAI models for this implementation: gpt-5, gpt-4o, and gpt-4. GPT-5 has a big context window, meaning we could have long-running conversations, but because each query involves a number of round-trips, requests needing 3+ consecutive tool calls made the agent feel sluggish.

    GPT-4, on the other hand, is interestingly very prone to hallucinations, rushing to answer queries with made-up data instead of calling the necessary tools. GPT-4o has so far struck the best balance between speed and correctness.

    Closing thoughts

    Building this tool took about 2-3 days of Claude-powered development (AIs building AIs). What surprised me most was how little difficulty and complexity were actually involved: the tool service object is essentially an API controller action, taking inputs and returning JSON.

    Before building this agent, I looked at the other gems in this space. ActiveAgent (a somewhat similar gem for interacting with LLMs) is a decent contender that moves the prompts into view files, but it didn’t fit my needs: it had no built-in support for defining tools or for long-running conversations.
