How does the emulator work?

Does anyone besides @fmanga know how the emulator really works? If so, can you please explain it to me? I’ll make it worth your while :slight_smile:

Hey, I only just saw this post now, here’s a bit of an explanation. I assume you’re asking specifically about how the emulation part works, not the general webpage bit.

For starters, things might seem a bit overly-complicated/engineered/rubbish, and in some parts they really are. ProjectABE started off as an experiment without goals beyond simply testing things to see if they were a good idea. In retrospect, some of those ideas were good, some awful, and some outright silly.

Exhibit A: “Can you write a CPU emulator using regular expressions at its core?”
Turns out the answer is “Yes… but you shouldn’t.”

And that brings me back to the subject of how it actually works. ProjectABE isn’t a classic interpreter-based emulator. So that we’re on the same page, I’m calling a classic interpreter-based emulator something that looks like this:

void updateCPU(){
  auto op = read(reg[PC]++);
  // == binds tighter than & in C/C++, so the mask needs its own parentheses
  if ((op & ADD_MASK) == ADD_OP) {
    reg[op & LHS_MASK] += reg[op & RHS_MASK];
    cycles += 1;
    return;
  }
  if ((op & MOV_MASK) == MOV_OP) { ...  }
  if ((op & B_MASK) == B_OP) { ... }
}

Depending on the CPU running the emulator and the CPU being emulated, it might make more sense to write the chain of if statements as one or more switch statements or function pointer tables. In C/C++ the best way isn’t obvious: it depends on what the compiler will do with the huge switch, on how much overhead a function call has, if/which functions get inlined, if/which things fit in cache memory.
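To make the function-table option concrete, here is a minimal JavaScript sketch of a classic interpreter that dispatches through an array of handlers. The toy instruction encoding (opcode in the top four bits, destination and source in the two low nibbles) and all the names are invented for this example; it is not AVR and not ProjectABE’s code.

```javascript
// Toy machine: 16-bit instruction words and 4 byte-sized registers.
// Made-up encoding: bits 15..12 = opcode, bits 7..4 = dest, bits 3..0 = source.
const toyCpu = { reg: new Uint8Array(4), pc: 0, cycles: 0 };

// One handler per opcode; the array index is the opcode number.
const toyHandlers = [
  /* 0: ADD */ (cpu, d, s) => { cpu.reg[d] += cpu.reg[s]; cpu.cycles += 1; },
  /* 1: MOV */ (cpu, d, s) => { cpu.reg[d] = cpu.reg[s]; cpu.cycles += 1; },
];

// Fetch, decode, dispatch: exactly one instruction per call.
function stepToyCpu(cpu, program) {
  const op = program[cpu.pc++];
  toyHandlers[op >> 12](cpu, (op >> 4) & 0xF, op & 0xF);
}

const toyProgram = [0x0001, 0x1020]; // add r0, r1 ; mov r2, r0
toyCpu.reg[0] = 2;
toyCpu.reg[1] = 3;
stepToyCpu(toyCpu, toyProgram);
stepToyCpu(toyCpu, toyProgram);
console.log(toyCpu.reg[2], toyCpu.cycles); // 5 2
```

The decode work (shifts, masks, table lookup) is repeated every time an instruction runs, which is the cost that recompilation tries to pay only once.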
Since ProjectABE is in JavaScript, that brings its own set of unknowns. The only way to know for sure what is best is to implement and test. There was already a classic interpreter for the Gamebuino Classic, which has a very similar processor, so I figured I’d do something else with ProjectABE and we’d have two implementations to compare side-by-side. That “something else” being recompilation.

Since I didn’t want to write yet-another-gamebuino-emulator, I made ProjectABE modular: by separating the core from peripherals like timers/uart/etc and being able to reconfigure what each pin is connected to (screen, buttons, sound), it is capable of emulating 3 systems.

With recompilation, the idea is that you generate a function that contains a block of pre-translated instructions:

void block0A00() {
  // add r0, r1
  reg[0] += reg[1];
  // mov r2, r0
  reg[2] = reg[0];

  reg[PC] += 4;
  cycles += 2; // add + mov
}

void updateCPU() {
  blockTable[reg[PC]]();
}
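In JavaScript, one way to build such a block at runtime is to generate source text and compile it once with `new Function`. The sketch below assumes a toy instruction list and made-up names (`compileBlock`, `recompiledBlocks`); whether ProjectABE compiles its blocks this exact way is not something this thread states.

```javascript
// Hypothetical translator: turns decoded toy instructions into JavaScript
// source, then compiles that source a single time with new Function.
function compileBlock(startPc, instructions) {
  const lines = instructions.map(({ op, d, s }) => {
    if (op === "add") return `reg[${d}] = (reg[${d}] + reg[${s}]) & 0xFF;`;
    if (op === "mov") return `reg[${d}] = reg[${s}];`;
    throw new Error("unknown op " + op);
  });
  // In this toy, every instruction is 2 bytes long and takes 1 cycle.
  lines.push(`cpu.pc = ${startPc + 2 * instructions.length};`);
  lines.push(`cpu.cycles += ${instructions.length};`);
  return new Function("cpu", "reg", lines.join("\n"));
}

// Table of compiled blocks, keyed by the address each block starts at.
const recompiledBlocks = new Map();
recompiledBlocks.set(0xA00, compileBlock(0xA00, [
  { op: "add", d: 0, s: 1 },
  { op: "mov", d: 2, s: 0 },
]));

const blockCpu = { pc: 0xA00, cycles: 0 };
const blockRegs = [2, 3, 0];
recompiledBlocks.get(blockCpu.pc)(blockCpu, blockRegs); // run the block
console.log(blockRegs, blockCpu.pc.toString(16), blockCpu.cycles);
```

After the call, `blockRegs` holds `[5, 3, 5]`, `blockCpu.pc` is `0xA04` and two cycles have been counted, all without decoding anything at run time.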

Initially I experimented with taking advantage of the small flash and the fact that you can’t run code in RAM. Instead of having a table of blocks, it could simply translate everything in one large block. Turns out browsers didn’t like that at all and would often crash, hence ABE using multiple blocks and a table.

One thing that complicates matters is that you might want to jump to an arbitrary address inside a block. To deal with this, each block has a switch that jumps to the exact address needed. Something like this:

void block0A00() {
  switch(reg[PC]) {
  case 0xA00:
  // add r0, r1
  reg[0] += reg[1];
  cycles += 1;
  // fall through

  case 0xA02:
  // mov r2, r0
  reg[2] = reg[0];
  cycles += 1;
  }

  // Set rather than add: entry may have been at 0xA00 or at 0xA02
  reg[PC] = 0xA04;
}
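The same idea in runnable JavaScript, as a hand-written stand-in for a generated block. The names are invented, and the sketch assumes the caller only invokes the block when `cpu.pc` is one of its addresses. Cycles are counted per case so that entering in the middle is charged correctly.

```javascript
// Stand-in for a generated block covering addresses 0xA00..0xA03.
// Execution enters at whatever address cpu.pc holds, then falls through.
function runBlockA00(cpu, reg) {
  switch (cpu.pc) {
    case 0xA00: // add r0, r1
      reg[0] = (reg[0] + reg[1]) & 0xFF;
      cpu.cycles += 1;
    // falls through
    case 0xA02: // mov r2, r0
      reg[2] = reg[0];
      cpu.cycles += 1;
  }
  cpu.pc = 0xA04; // first address after this block
}

const enterAtStart = { pc: 0xA00, cycles: 0 };
const regsFromStart = [2, 3, 0];
runBlockA00(enterAtStart, regsFromStart); // runs add, then mov

const enterInMiddle = { pc: 0xA02, cycles: 0 };
const regsFromMiddle = [2, 3, 0];
runBlockA00(enterInMiddle, regsFromMiddle); // skips add, runs only mov

console.log(regsFromStart, enterAtStart.cycles);   // [ 5, 3, 5 ] 2
console.log(regsFromMiddle, enterInMiddle.cycles); // [ 2, 3, 2 ] 1
```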

I’ve been using C++ for pseudocode so far, but of course ProjectABE generates JavaScript and the contents of the switch look more like this (debugger-related code omitted for clarity/brevity):

case 3710:
	this.pc = 3710;
	if( (this.tick += 2) >= this.endTick ) break;

// (STACK) ← Rd
memory[sp--] =  (reg[(15)]>>>0); if(sp<this.minStack) this.minStack = sp; 

case 3711:
	this.pc = 3711;
	if( (this.tick += 1) >= this.endTick ) break;

// Rd = Rr
r = reg[(12)] = (reg[(22)]>>>0)

// Rd+1 = Rr+1
r = reg[(12)+1] = (reg[(22)+1]>>>0)


case 3712:
	this.pc = 3712;
	if( (this.tick += 1) >= this.endTick ) break;

// Rd = Rr
r = reg[(14)] = (reg[(24)]>>>0)

// Rd+1 = Rr+1
r = reg[(14)+1] = (reg[(24)+1]>>>0)

(In case anyone is wondering, the >>>0 is JavaScript’s way of casting a value into an unsigned 32-bit integer)
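A quick way to see what `>>> 0` does, runnable in a browser console or Node. The `asUnsigned` helper name is only for this example.

```javascript
// >>> 0 applies ToUint32: the result is always in the range 0 .. 2^32 - 1.
const asUnsigned = (value) => value >>> 0;

console.log(asUnsigned(-1));          // 4294967295
console.log(asUnsigned(3.7));         // 3 (the fraction is dropped)
console.log(asUnsigned(2 ** 32 + 5)); // 5 (wraps modulo 2^32)
console.log(-1 | 0);                  // -1 (| 0 yields a signed 32-bit value instead)
```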

You might have noticed the memory variable above. Generally, accessing anything in the address space involves going through a function that checks what that address belongs to. It can be RAM, a register for one of the peripherals, or even a CPU register. Reading/writing one of these registers could then trigger something else, like a byte getting forwarded to the screen or a button state being read. It would be really weird for C/C++ code to try to access a register through the stack, so here ProjectABE considers it safe to inline a raw memory access and skips all of the extra checks.
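A minimal sketch of the difference between the checked path and the raw path. The address layout, the hook at `0x4E` and every name here are invented for illustration and do not come from ProjectABE.

```javascript
// Illustrative address space backed by one flat byte array.
const memoryBytes = new Uint8Array(0x1000);
const sentToScreen = [];

// Addresses whose writes have side effects (hypothetical "screen data" port).
const writeHooks = { 0x4E: (value) => sentToScreen.push(value) };

// Checked path: every store looks for a peripheral hook first.
function writeByte(addr, value) {
  memoryBytes[addr] = value;
  const hook = writeHooks[addr];
  if (hook) hook(value);
}

writeByte(0x4E, 0xAB);   // hits the hook: the byte is forwarded to the "screen"
writeByte(0x0AFF, 0x12); // plain RAM, but it still pays for the lookup

// Raw path, the way an inlined PUSH could do it: no lookup at all.
let stackPtr = 0x0AFE;
memoryBytes[stackPtr--] = 0x34;

console.log(sentToScreen, memoryBytes[0x0AFE], stackPtr.toString(16));
```

The raw store is only correct because the stack is assumed never to point at a hooked address, which is the trade-off described above.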

The tick increments and endTick checks ensure the block executes for a given number of cycles and bails if the limit is reached.
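Here is a small sketch of how a cycle budget lets a caller run a block in slices and resume where it stopped. It is a simplified variant with invented names: unlike the generated code quoted above, it checks the budget before charging the cycle, purely to keep the accounting easy to follow.

```javascript
// Stand-in for a generated block: three 1-cycle instructions at 100..102.
const budgetCpu = {
  pc: 100,
  tick: 0,
  endTick: 0,
  executed: [], // records which instructions actually ran
  runBlock() {
    switch (this.pc) {
      case 100:
        this.pc = 100; // remember where to resume if we bail here
        if (this.tick >= this.endTick) break;
        this.tick += 1;
        this.executed.push(100);
      // falls through
      case 101:
        this.pc = 101;
        if (this.tick >= this.endTick) break;
        this.tick += 1;
        this.executed.push(101);
      // falls through
      case 102:
        this.pc = 102;
        if (this.tick >= this.endTick) break;
        this.tick += 1;
        this.executed.push(102);
        this.pc = 103; // block finished
    }
  },
};

budgetCpu.endTick = 2; // first slice: budget of two cycles
budgetCpu.runBlock();
const afterFirstSlice = { pc: budgetCpu.pc, ran: [...budgetCpu.executed] };

budgetCpu.endTick = 4; // second slice: resumes at the saved pc
budgetCpu.runBlock();
console.log(afterFirstSlice, budgetCpu.pc, budgetCpu.executed);
```

The first slice stops with `pc` at 102 after running two instructions; the second slice picks up from there and finishes the block.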

Now let’s take a closer look at how each opcode is implemented. Specifically, the MOVW above:

// Rd = Rr
r = reg[(14)] = (reg[(24)]>>>0)

// Rd+1 = Rr+1
r = reg[(14)+1] = (reg[(24)+1]>>>0)

This is defined in Opcodes.js like this:

{
    name: 'MOVW',
    score: 250,
    str: '00000001ddddrrrr',
    impl: [
        'Rd = Rr',
        'Rd+1 = Rr+1'
    ],
    shift: {
        d: 1,
        r: 1
    },
    print: {
        r: r => "r" + r + ":r" + (r+1),
        d: d => "r" + d + ":r" + (d+1)
    }
}
  • name is used by the disassembler
  • print is used by the disassembler so that it knows how to format each parameter.
  • score is used for sorting the definitions so that more frequent opcodes are checked first when decoding an instruction. Really not necessary.
  • str contains the information needed for decoding each instruction’s bits into an opcode, the opcode’s mask, and so on.

Each lowercase letter in str marks the bits of one parameter. Those bits are joined together, a left shift and/or a sign extension may be applied, and the resulting value is substituted into the impl metacode.

00000001 dddd rrrr
00000001 0111 1100
d = 0111 << 1 = 01110 = 14
r = 1100 << 1 = 11000 = 24

Applying the values of `d` and `r`:
R14 = R24
R14+1 = R24+1
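The decoding steps above can be sketched in JavaScript as follows. `parsePattern` and `extractField` are hypothetical helpers written for this example, not functions from ProjectABE, and sign extension is left out.

```javascript
// Build a mask/match pair plus per-letter bit positions from a pattern string.
function parsePattern(str) {
  let mask = 0;
  let match = 0;
  const fields = {};
  [...str].forEach((ch, i) => {
    const bit = str.length - 1 - i; // leftmost character is the highest bit
    if (ch === "0" || ch === "1") {
      mask |= 1 << bit;             // fixed bit: part of the opcode
      if (ch === "1") match |= 1 << bit;
    } else {
      if (!fields[ch]) fields[ch] = [];
      fields[ch].push(bit);         // parameter bit, collected high to low
    }
  });
  return { mask, match, fields };
}

// Join a field's bits (high to low) into one number, then apply the shift.
function extractField(word, bits, shift = 0) {
  let value = 0;
  for (const bit of bits) value = (value << 1) | ((word >> bit) & 1);
  return value << shift;
}

const movwPattern = parsePattern("00000001ddddrrrr");
const movwWord = 0b0000000101111100;
const isMovw = (movwWord & movwPattern.mask) === movwPattern.match;
const fieldD = extractField(movwWord, movwPattern.fields.d, 1);
const fieldR = extractField(movwWord, movwPattern.fields.r, 1);
console.log(isMovw, fieldD, fieldR); // true 14 24
```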

Regular expressions “draw the rest of the owl” and convert that into the JavaScript you saw earlier.
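For a rough feel of that last step, here is a heavily reduced sketch that handles only lines shaped like the two MOVW ones. `translateLine` and its two patterns are my own illustration, not ProjectABE’s actual expressions, and the leading `r = ` of the real output is omitted.

```javascript
// Hypothetical translator for metacode lines such as 'Rd+1 = Rr+1'.
function translateLine(line, values) {
  return line
    // R + parameter letter + optional "+n"  ->  reg[(value)+n]
    .replace(/R([a-z])(\+\d+)?/g, (_, letter, offset = "") =>
      `reg[(${values[letter]})${offset}]`)
    // force the right-hand side to an unsigned value
    .replace(/= (.+)$/, "= ($1>>>0)");
}

const movwJs = ["Rd = Rr", "Rd+1 = Rr+1"]
  .map((line) => translateLine(line, { d: 14, r: 24 }));
console.log(movwJs.join("\n"));
// reg[(14)] = (reg[(24)]>>>0)
// reg[(14)+1] = (reg[(24)+1]>>>0)
```

The resulting strings are ordinary JavaScript statements, so they can be pasted into the body of a block and compiled along with everything else.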

Wow…
:exploding_head:

Ah, that makes sense.

That explains why I couldn’t find the usual ‘big switch’ that most emulators have.

This seems like the kind of thing LISP would be really good at.
(Now there’s something you don’t hear every day.)

Just in case anyone thinks you’re joking:

Did you manage to summon Zalgo by any chance? :P

Oh joy, another one for my list.

To me it doesn’t seem all that different compared to C++'s implicit type conversions combined with operator overloading.

Rest assured, no ͠P̯͍̭O̚​N̐Y̡ was summned while writng ProjectABE 多分he comes

rotflmao

Dude thank you so much I appreciate you! All glory to the @fmanga!