Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Spotify’s new feature makes it easier to find popular audiobooks

    This portable JBL Grip Bluetooth speaker is so good at 20% off

    ‘AI’ could dox your anonymous posts

    Facebook X (Twitter) Instagram
    • Artificial Intelligence
    • Business Technology
    • Cryptocurrency
    • Gadgets
    • Gaming
    • Health
    • Software and Apps
    • Technology
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Tech AI Verse
    • Home
    • Artificial Intelligence

      What the polls say about how Americans are using AI

      February 27, 2026

      Tensions between the Pentagon and AI giant Anthropic reach a boiling point

      February 21, 2026

      Read the extended transcript: President Donald Trump interviewed by ‘NBC Nightly News’ anchor Tom Llamas

      February 6, 2026

      Stocks and bitcoin sink as investors dump software company shares

      February 4, 2026

      AI, crypto and Trump super PACs stash millions to spend on the midterms

      February 2, 2026
    • Business

      Weighing up the enterprise risks of neocloud providers

      March 3, 2026

      A stolen Gemini API key turned a $180 bill into $82,000 in two days

      March 3, 2026

      These ultra-budget laptops “include” 1.2TB storage, but most of it is OneDrive trial space

      March 1, 2026

      FCC approves the merger of cable giants Cox and Charter

      February 28, 2026

      Finding value with AI and Industry 5.0 transformation

      February 28, 2026
    • Crypto

      Strait of Hormuz Shutdown Shakes Asian Energy Markets

      March 3, 2026

      Wall Street’s Inflation Alarm From Iran — What It Means for Crypto

      March 3, 2026

      Ethereum Price Prediction: What To Expect From ETH In March 2026

      March 3, 2026

      Was Bitcoin Hijacked? How Institutional Interests Shaped Its Narrative Since 2015

      March 3, 2026

      XRP Whales Now Hold 83.7% of All Supply – What’s Next For Price?

      March 3, 2026
    • Technology

      Spotify’s new feature makes it easier to find popular audiobooks

      March 3, 2026

      This portable JBL Grip Bluetooth speaker is so good at 20% off

      March 3, 2026

      ‘AI’ could dox your anonymous posts

      March 3, 2026

      Microsoft says new Teams location feature isn’t for ’employee tracking’

      March 3, 2026

      OpenAI got ‘sloppy’ about the wrong thing

      March 3, 2026
    • Others
      • Gadgets
      • Gaming
      • Health
      • Software and Apps
    Check BMI
    Tech AI Verse
    You are at:Home»Technology»A Basic Just-In-Time Compiler (2015)
    Technology

    A Basic Just-In-Time Compiler (2015)

    TechAiVerseBy TechAiVerseJanuary 3, 2026No Comments9 Mins Read1 Views
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp Email

    A Basic Just-In-Time Compiler (2015)


    March 19, 2015

    nullprogram.com/blog/2015/03/19/

    (The author is currently open to employment opportunities in the United
    States.)

    This article was discussed on Hacker News and on reddit.

    Monday’s /r/dailyprogrammer challenge was to write a program to
    read a recurrence relation definition and, through interpretation,
    iterate it to some number of terms. It’s given an initial term
    (u(0)) and a sequence of operations, f, to apply to the previous
    term (u(n + 1) = f(u(n))) to compute the next term. Since it’s an
    easy challenge, the operations are limited to addition, subtraction,
    multiplication, and division, with one operand each.

    For example, the relation u(n + 1) = (u(n) + 2) * 3 - 5 would be
    input as +2 *3 -5. If u(0) = 0 then,

    • u(1) = 1
    • u(2) = 4
    • u(3) = 13
    • u(4) = 40
    • u(5) = 121
    • …

    Rather than write an interpreter to apply the sequence of operations,
    for my submission (mirror) I took the opportunity to
    write a simple x86-64 Just-In-Time (JIT) compiler. So rather than
    stepping through the operations one by one, my program converts the
    operations into native machine code and lets the hardware do the work
    directly. In this article I’ll go through how it works and how I did
    it.

    Update: The follow-up challenge uses Reverse Polish
    notation to allow for more complicated expressions. I wrote another
    JIT compiler for my submission (mirror).

    Allocating Executable Memory

    Modern operating systems have page-granularity protections for
    different parts of process memory: read, write, and execute.
    Code can only be executed from memory with the execute bit set on its
    page, memory can only be changed when its write bit is set, and some
    pages aren’t allowed to be read. In a running process, the pages
    holding program code and loaded libraries will have their write bit
    cleared and execute bit set. Most of the other pages will have their
    execute bit cleared and their write bit set.

    The reason for this is twofold. First, it significantly increases the
    security of the system. If untrusted input was read into executable
    memory, an attacker could input machine code (shellcode) into the
    buffer, then exploit a flaw in the program to cause control flow to
    jump to and execute that code. If the attacker is only able to write
    code to non-executable memory, this attack becomes a lot harder. The
    attacker has to rely on code already loaded into executable pages
    (return-oriented programming).

    Second, it catches program bugs sooner and reduces their impact, so
    there’s less chance for a flawed program to accidentally corrupt user
    data. Accessing memory in an invalid way will causes a segmentation
    fault, usually leading to program termination. For example, NULL
    points to a special page with read, write, and execute disabled.

    An Instruction Buffer

    Memory returned by malloc() and friends will be writable and
    readable, but non-executable. If the JIT compiler allocates memory
    through malloc(), fills it with machine instructions, and jumps to
    it without doing any additional work, there will be a segmentation
    fault. So some different memory allocation calls will be made instead,
    with the details hidden behind an asmbuf struct.

    #define PAGE_SIZE 4096
    
    struct asmbuf {
        uint8_t code[PAGE_SIZE - sizeof(uint64_t)];
        uint64_t count;
    };
    

    To keep things simple here, I’m just assuming the page size is 4kB. In
    a real program, we’d use sysconf(_SC_PAGESIZE) to discover the page
    size at run time. On x86-64, pages may be 4kB, 2MB, or 1GB, but this
    program will work correctly as-is regardless.

    Instead of malloc(), the compiler allocates memory as an anonymous
    memory map (mmap()). It’s anonymous because it’s not backed by a
    file.

    struct asmbuf *
    asmbuf_create(void)
    {
        int prot = PROT_READ | PROT_WRITE;
        int flags = MAP_ANONYMOUS | MAP_PRIVATE;
        return mmap(NULL, PAGE_SIZE, prot, flags, -1, 0);
    }
    

    Windows doesn’t have POSIX mmap(), so on that platform we use
    VirtualAlloc() instead. Here’s the equivalent in Win32.

    struct asmbuf *
    asmbuf_create(void)
    {
        DWORD type = MEM_RESERVE | MEM_COMMIT;
        return VirtualAlloc(NULL, PAGE_SIZE, type, PAGE_READWRITE);
    }
    

    Anyone reading closely should notice that I haven’t actually requested
    that the memory be executable, which is, like, the whole point of all
    this! This was intentional. Some operating systems employ a security
    feature called W^X: “write xor execute.” That is, memory is either
    writable or executable, but never both at the same time. This makes
    the shellcode attack I described before even harder. For well-behaved
    JIT compilers
    it means memory protections need to be adjusted
    after code generation and before execution.

    The POSIX mprotect() function is used to change memory protections.

    void
    asmbuf_finalize(struct asmbuf *buf)
    {
        mprotect(buf, sizeof(*buf), PROT_READ | PROT_EXEC);
    }
    

    Or on Win32 (that last parameter is not allowed to be NULL),

    void
    asmbuf_finalize(struct asmbuf *buf)
    {
        DWORD old;
        VirtualProtect(buf, sizeof(*buf), PAGE_EXECUTE_READ, &old);
    }
    

    Finally, instead of free() it gets unmapped.

    void
    asmbuf_free(struct asmbuf *buf)
    {
        munmap(buf, PAGE_SIZE);
    }
    

    And on Win32,

    void
    asmbuf_free(struct asmbuf *buf)
    {
        VirtualFree(buf, 0, MEM_RELEASE);
    }
    

    I won’t list the definitions here, but there are two “methods” for
    inserting instructions and immediate values into the buffer. This will
    be raw machine code, so the caller will be acting a bit like an
    assembler.

    asmbuf_ins(struct asmbuf *, int size, uint64_t ins);
    asmbuf_immediate(struct asmbuf *, int size, const void *value);
    

    Calling Conventions

    We’re only going to be concerned with three of x86-64’s many
    registers: rdi, rax, and rdx. These are 64-bit (r) extensions
    of the original 16-bit 8086 registers. The sequence of
    operations will be compiled into a function that we’ll be able to call
    from C like a normal function. Here’s what it’s prototype will look
    like. It takes a signed 64-bit integer and returns a signed 64-bit
    integer.

    The System V AMD64 ABI calling convention says that the first
    integer/pointer function argument is passed in the rdi register.
    When our JIT compiled program gets control, that’s where its input
    will be waiting. According to the ABI, the C program will be expecting
    the result to be in rax when control is returned. If our recurrence
    relation is merely the identity function (it has no operations), the
    only thing it will do is copy rdi to rax.

    There’s a catch, though. You might think all the mucky
    platform-dependent stuff was encapsulated in asmbuf. Not quite. As
    usual, Windows is the oddball and has its own unique calling
    convention. For our purposes here, the only difference is that the
    first argument comes in rcx rather than rdi. Fortunately this only
    affects the very first instruction and the rest of the assembly
    remains the same.

    The very last thing it will do, assuming the result is in rax, is
    return to the caller.

    So we know the assembly, but what do we pass to asmbuf_ins()? This
    is where we get our hands dirty.

    Finding the Code

    If you want to do this the Right Way, you go download the x86-64
    documentation, look up the instructions we’re using, and manually work
    out the bytes we need and how the operands fit into it. You know, like
    they used to do out of necessity back in the 60’s.

    Fortunately there’s a much easier way. We’ll have an actual assembler
    do it and just copy what it does. Put both of the instructions above
    in a file peek.s and hand it to nasm. It will produce a raw binary
    with the machine code, which we’ll disassemble with nidsasm (the
    NASM disassembler).

    $ nasm peek.s
    $ ndisasm -b64 peek
    00000000  4889F8            mov rax,rdi
    00000003  C3                ret
    

    That’s straightforward. The first instruction is 3 bytes and the
    return is 1 byte.

    asmbuf_ins(buf, 3, 0x4889f8);  // mov   rax, rdi
    // ... generate code ...
    asmbuf_ins(buf, 1, 0xc3);      // ret
    

    For each operation, we’ll set it up so the operand will already be
    loaded into rdi regardless of the operator, similar to how the
    argument was passed in the first place. A smarter compiler would embed
    the immediate in the operator’s instruction if it’s small (32-bits or
    fewer), but I’m keeping it simple. To sneakily capture the “template”
    for this instruction I’m going to use 0x0123456789abcdef as the
    operand.

    mov   rdi, 0x0123456789abcdef
    

    Which disassembled with ndisasm is,

    00000000  48BFEFCDAB896745  mov rdi,0x123456789abcdef
             -2301
    

    Notice the operand listed little endian immediately after the
    instruction. That’s also easy!

    long operand;
    scanf("%ld", &operand);
    asmbuf_ins(buf, 2, 0x48bf);         // mov   rdi, operand
    asmbuf_immediate(buf, 8, &operand);
    

    Apply the same discovery process individually for each operator you
    want to support, accumulating the result in rax for each.

    switch (operator) {
        case '+':
            asmbuf_ins(buf, 3, 0x4801f8);   // add   rax, rdi
            break;
        case '-':
            asmbuf_ins(buf, 3, 0x4829f8);   // sub   rax, rdi
            break;
        case '*':
            asmbuf_ins(buf, 4, 0x480fafc7); // imul  rax, rdi
            break;
        case '/':
            asmbuf_ins(buf, 3, 0x4831d2);   // xor   rdx, rdx
            asmbuf_ins(buf, 3, 0x48f7ff);   // idiv  rdi
            break;
    }
    

    As an exercise, try adding support for modulus operator (%), XOR
    (^), and bit shifts (<, >). With the addition of these
    operators, you could define a decent PRNG as a recurrence relation. It
    will also eliminate the closed form solution to this problem so
    that we actually have a reason to do all this! Or, alternatively,
    switch it all to floating point.

    Calling the Generated Code

    Once we’re all done generating code, finalize the buffer to make it
    executable, cast it to a function pointer, and call it. (I cast it as
    a void * just to avoid repeating myself, since that will implicitly
    cast to the correct function pointer prototype.)

    asmbuf_finalize(buf);
    long (*recurrence)(long) = (void *)buf->code;
    // ...
    x[n + 1] = recurrence(x[n]);
    

    That’s pretty cool if you ask me! Now this was an extremely simplified
    situation. There’s no branching, no intermediate values, no function
    calls, and I didn’t even touch the stack (push, pop). The recurrence
    relation definition in this challenge is practically an assembly
    language itself, so after the initial setup it’s a 1:1 translation.

    I’d like to build a JIT compiler more advanced than this in the
    future. I just need to find a suitable problem that’s more complicated
    than this one, warrants having a JIT compiler, but is still simple
    enough that I could, on some level, justify not using LLVM.

      Share. Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Telegram Email
      Previous ArticleProving Liveness with TLA
      Next Article Adventure 751 (1980)
      TechAiVerse
      • Website

      Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he deliver clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.

      Related Posts

      Spotify’s new feature makes it easier to find popular audiobooks

      March 3, 2026

      This portable JBL Grip Bluetooth speaker is so good at 20% off

      March 3, 2026

      ‘AI’ could dox your anonymous posts

      March 3, 2026
      Leave A Reply Cancel Reply

      Top Posts

      Ping, You’ve Got Whale: AI detection system alerts ships of whales in their path

      April 22, 2025703 Views

      Lumo vs. Duck AI: Which AI is Better for Your Privacy?

      July 31, 2025286 Views

      6.7 Cummins Lifter Failure: What Years Are Affected (And Possible Fixes)

      April 14, 2025164 Views

      6 Best MagSafe Phone Grips (2025), Tested and Reviewed

      April 6, 2025124 Views
      Don't Miss
      Technology March 3, 2026

      Spotify’s new feature makes it easier to find popular audiobooks

      Spotify’s new feature makes it easier to find popular audiobooks Image: Spotify Summary created by…

      This portable JBL Grip Bluetooth speaker is so good at 20% off

      ‘AI’ could dox your anonymous posts

      Microsoft says new Teams location feature isn’t for ’employee tracking’

      Stay In Touch
      • Facebook
      • Twitter
      • Pinterest
      • Instagram
      • YouTube
      • Vimeo

      Subscribe to Updates

      Get the latest creative news from SmartMag about art & design.

      About Us
      About Us

      Welcome to Tech AI Verse, your go-to destination for everything technology! We bring you the latest news, trends, and insights from the ever-evolving world of tech. Our coverage spans across global technology industry updates, artificial intelligence advancements, machine learning ethics, and automation innovations. Stay connected with us as we explore the limitless possibilities of technology!

      Facebook X (Twitter) Pinterest YouTube WhatsApp
      Our Picks

      Spotify’s new feature makes it easier to find popular audiobooks

      March 3, 20262 Views

      This portable JBL Grip Bluetooth speaker is so good at 20% off

      March 3, 20262 Views

      ‘AI’ could dox your anonymous posts

      March 3, 20261 Views
      Most Popular

      7 Best Kids Bikes (2025): Mountain, Balance, Pedal, Coaster

      March 13, 20250 Views

      VTOMAN FlashSpeed 1500: Plenty Of Power For All Your Gear

      March 13, 20250 Views

      Best TV Antenna of 2025

      March 13, 20250 Views
      © 2026 TechAiVerse. Designed by Divya Tech.
      • Home
      • About Us
      • Contact Us
      • Privacy Policy
      • Terms & Conditions

      Type above and press Enter to search. Press Esc to cancel.