Next Generation Emulation banner

Dynamic Recompilation - An Introduction

34805 Views 85 Replies 21 Participants Last post by  Padremayi
In another thread sniff_381 asked me to explain what dynamic recompilation (dynarec for short) is.
Since most NG emus have a dynarec now, I thought it might be a good idea to cover the topic in a separate thread.

I have to admit that I haven't programmed a dynarec myself yet, but I have decent knowledge of the basics and some details. I'm not totally sure how much I should go into depth anyway. I guess I'll see if there is enough interest to talk about details.

First of all, the term "dynamic recompilation" is a bit odd, because "to recompile" often means to compile the source code of a program again, but the older term "binary translation" is more precise, since the binary code of a game or application is translated - not the source code - and "dynamic" only means that it is done during runtime and on demand.

So what's the difference to "traditional" or "interpretive" emulation?
An interpretive emulator always picks the instruction the program counter (PC) points to, decodes it, and executes it, just like a real processor would do it. So every time the emulator comes across it same instruction it has to do all the steps again.
In his article How To Write a Computer Emulator (http://fms.komkon.org/EMUL8/HOWTO.html) Marat Fayzullin uses this pseudo C-code sequence to describe the process:
Code:
Counter=InterruptPeriod;
PC=InitialPC;

for( ;; )
{
  OpCode=Memory[PC++];
  Counter-=Cycles[OpCode];

  switch(OpCode)
  {
    case OpCode1:
    case OpCode2:
    ...
  }

  if(Counter<=0)
  {
    /* Check for interrupts and do other */
    /* cyclic tasks here                 */
    ...
    Counter+=InterruptPeriod;
    if(ExitRequired) break;
  }
}
Dynamic recompilation deviates from this procedure by working with whole blocks of code instead of single instructions, and that those blocks are translated into the machine language of the processor the emulator is running on. There wouldn't be a speed advantage if the translated blocks weren't cached and simply recalled as soon as the program counter enters that block again.
Here is some sample code from my (unfortunately still unfinished) DRFAQ (http://www.dynarec.com/~mike/drfaq.html):
Code:
/* the following line defines a 'function pointer',               */
/* which can be used to call the code generated by the translator */
/* CTX is the context of the processor, ie. the register values   */

int (*dyncode)(Context *CTX);


/* the following simplyfied loop is often called the "dispatcher" */

for( ;; ) {

  /* try to find the current address of the PC in the translation cache */
  address = block_translated(CTX->PC);

  /* nothing found, ie. first translate the code block starting at the PC address */
  if (address == NULL)
    /* do the translation and add it to the translation cache */                                      
    address = translate_block(CTX->PC);

  /* point the function pointer to the address of the translated code */
  dyncode = (int(*)(Context*)) address;

  /* call the translated code with the current context */
  status = (*dyncode)(CTX);

  /* handle interrupts and other events here */                                 
}
That's basically how a dynarec works, only that I still haven't explained how the translation cache and of course the translation are handled.

I spoke of code blocks several times, and it might be a good idea to define the term, since not all will be into compiler theory...
In compilers the smallest block of cohesive instructions is called a basic block. Such a block has a starting point and ends with the next conditional jump or branch, ie. as soon as there is a possibility that the program counter changes apart from pointing to the next instruction the block ends. It's also important that no other code block can jump into the middle of the basic block, only at the starting address, because only that way the compiler can see it as a separate collection of code that can be optimized in every possible way.
Most dynarecs probably work with basic blocks, but some use end the block with the next unconditional jump or branch, which leads to larger blocks, often called translation units. This leads to faster code, because all conditional branches can jump in the translated code without having to go through the dispatcher loop first, but it can be problematic to handle interrupts since you don't have a guarantee that the code returns to the dispatcher. Of course that could be handled within the generated code, but that makes things more complicated.

I think that's enough for an introduction. If there are any questions feel free to ask, and if there is interest in extending the parts I haven't covered yet, I could write something about the translation cache, some translation problems, register allocation, the difference to threaded interpretation, etc.
See less See more
1 - 9 of 86 Posts
I've been trying to get this tutorial to work
Tutorials - Introduction To Dynamic Recompilation

but there seems to be some bugs. I've fixed them except for this
ModRM(0, DISP32, to);

which occurs in a few places, it doesn't match the written function. Documentation isn't all that clear on what the function does too.

Anyone know the answer?
C++ has a fit too, I had to change that.

the function call doesn't match the header.

in a few places it makes calls like this
ModRM(0, to, DISP32)

where DISP32=5 and to is an X86RegisterType

but the function is written
void ModRM(unsigned char mod, unsigned char rm, X86RegisterType reg);

so I'm a bit confused. Thought maybe it was just a typo, but it doesn't work even if I call ModRM(0, DISP32,to).

so I'm not really sure where the problem is.

Guess I start reading that page now.
See less See more
well I'm gong to take a look at the intel manuals but for now I have a question again, if I just call the Ret function shouldn't that work when I attempt to call the BlockFunction?

I run the program through the debugger and it just crashes as soon as it tries to call the code.
By the way, if anyone was getting an access violation from the OP's first Dynarec code (On windows 7), then here is the modified code that works. Credit given to Martyn.Rae on dreamincode.net

Code:
#include <windows.h>
#include <stdio.h>

/* In the beginning we'll have to define the function pointer.  */ 
/* I called the function 'dyncode' and gave it an int argument  */ 
/* as well as an int return value just to show what's possible. */ 
 
int (*dyncode)(int);  /* prototype for call of dynamic code */ 
 
/* The following char array is initialized with some binary code */ 
/* which takes the first argument from the stack, increases it,  */ 
/* and returns to the caller.                                    */ 
/* Just very simple code for testing purposes...                 */ 
 
unsigned char code[] = {0x8B,0x44,0x24,0x04,  /* mov eax, [esp+4] */ 
                        0x40,                 /* inc eax          */ 
                        0xC3                  /* ret              */ 
                       }; 
 
 
/* Include the prototypes of the functions we are using... */ 
 
 
int main() 
{ 
        /* To show you that the code can be dynamically generated    */ 
        /* although I defined static data above, the code is copied  */ 
        /* into an allocated memory area and the starting address is */ 
        /* assigned to the function pointer 'dyncode'.               */ 
        /* The strange stuff in front of the malloc is just to cast  */ 
        /* the address to the same format the function pointer is    */ 
        /* definded with, otherwise you'd get a compiler warning.    */ 
 
        dyncode = (int (*)(int)) VirtualAlloc(NULL, sizeof(code),
                                              MEM_COMMIT, PAGE_EXECUTE_READWRITE);

        /* We now have a page of memory that is readable, writeable    */
        /* and executable. so the memcpy will work without any         */
        /* problems!                                                   */

        memcpy(dyncode, code, sizeof(code)); 
 
        /* To show that the code works it is called with the argument 41 */ 
        /* and the retval sould be 42, obviously.                        */ 
 
        /* This code will now execute correctly!                         */

        printf("retval = %d\n", (*dyncode)(41) );  /* call the code and print the return value */   

        /* Freeing the page allocated.                                   */

        VirtualFree(dyncode, NULL, MEM_RELEASE);
        return 0;
}
:thumb:
THANK YOU!

I had given up on getting this working. This works!
I'm trying to make the code work in linux now and I'm having issues.

I used mprotect to let me call the function without a segmentation fault but now I just keep getting zero back from the function.

Not really sure how I can debug this effectively

Code:
/* In the beginning we'll have to define the function pointer.  */
/* I called the function 'dyncode' and gave it an int argument  */
/* as well as an int return value just to show what's possible. */

int (*dyncode)(int);  /* prototype for call of dynamic code */

/* The following char array is initialized with some binary code */
/* which takes the first argument from the stack, increases it,  */
/* and returns to the caller.                                    */
/* Just very simple code for testing purposes...                 */

unsigned char code[] = {0x8B,0x44,0x24,0x04,  /* mov eax, [esp+4] */
                        0x40,                 /* inc eax          */
                        0xC3                  /* ret              */
                       };


/* Include the prototypes of the functions we are using... */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int foo(int);
int main(void)
{
  /* To show you that the code can be dynamically generated    */
  /* although I defined static data above, the code is copied  */
  /* into an allocated memory area and the starting address is */
  /* assigned to the function pointer 'dyncode'.               */
  /* The strange stuff in front of the malloc is just to cast  */
  /* the address to the same format the function pointer is    */
  /* definded with, otherwise you'd get a compiler warning.    */

  dyncode = (int (*)(int)) malloc(sizeof(code) + 4095);
  dyncode = (int (*)(int))(((long long)dyncode + 4095 & ~(4095)));

  int t = mprotect(dyncode, sizeof(code), PROT_READ | PROT_WRITE | PROT_EXEC);
  if(t == -1)
    perror("test");
  memcpy(dyncode, code, sizeof(code));

  unsigned char *t2 = (unsigned char *)dyncode;
  int i = 0;
  for(i = 0; i < 10; i++)
  {
    printf("%x\n", *(t2+i));
  }
  /* To show that the code works it is called with the argument 41 */
  /* and the retval sould be 42, obviously.                        */
  int test = dyncode(41);
  //always returns 0
  printf("retval = %d %d %d\n", test, dyncode(42), (*dyncode)(43));/* call the code and print the return value */  

  return 0;
}
See less See more
yes I had that thought last night. rather obvious in hingsight.

i was getting sick of ubuntu anyway so I just installed a 32 bit version of arch.

the above code works fine now.
I just got my first opcode to translate.

addiu rt, rs, immediate

it's a mips r4400 CPU. I bet someone here can figure out what I'm trying to emulate ;)
1 - 9 of 86 Posts
This is an older thread, you may not receive a response, and could be reviving an old thread. Please consider creating a new thread.
Top