Dynamic recompiler

Discussion of development and patch submission.
rflego
Posts: 18
Joined: Thu 12 Mar, 2015 7:39 am

Re: Dynamic recompiler

Post by rflego »

I hadn't noticed that the Descent 2 I'm using is a demo version ...
[Attachment: descent2.jpg]
EDIT: OK, I installed the full version of Descent 2; sound is now working perfectly, and I re-checked all the other games.

Thanks Leilei.
Zilog
Posts: 51
Joined: Wed 13 May, 2015 8:01 pm

Re: Dynamic recompiler

Post by Zilog »

Hi,
which version of gcc do you use on Windows?

On the net I found this version:
http://sourceforge.net/projects/mingw/files/Installer/

or this version:

http://mingw-w64.org/doku.php

Thanks a lot.
Bye.
SarahWalker
Site Admin
Posts: 2054
Joined: Thu 24 Apr, 2014 4:18 pm

Re: Dynamic recompiler

Post by SarahWalker »

As of rev 312 the interpreter and dynamic recompiler are merged into a single binary. -DDYNAREC no longer exists, recompiler is an option in the configure dialogue on the CPUs that support it.
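To illustrate the difference between a build-time switch like -DDYNAREC and a run-time option, here is a minimal sketch. It is not PCem's code: `CpuModel`, `select_exec`, and the two CPU entries are hypothetical names invented for the example, and the real configure dialogue may work quite differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CpuModel:
    name: str
    supports_recompiler: bool


def run_interpreter(cycles: int) -> str:
    # Stand-in for the interpreter loop.
    return f"interpreter: {cycles} cycles"


def run_recompiler(cycles: int) -> str:
    # Stand-in for the dynamic recompiler loop.
    return f"recompiler: {cycles} cycles"


def select_exec(cpu: CpuModel, recompiler_enabled: bool) -> Callable[[int], str]:
    """Pick the execution loop at run time instead of at build time."""
    if recompiler_enabled and cpu.supports_recompiler:
        return run_recompiler
    # Fall back to the interpreter when the CPU cannot use the recompiler.
    return run_interpreter


# Hypothetical CPU entries; which real CPUs qualify is not stated here.
OLD_CPU = CpuModel("old_cpu", supports_recompiler=False)
NEW_CPU = CpuModel("new_cpu", supports_recompiler=True)
```

With a single binary, both loops are always compiled in, and the choice becomes a setting that is only honoured for CPUs flagged as supporting it.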
AnnaWu
Posts: 44
Joined: Mon 12 May, 2014 6:10 pm
Location: Germany

Re: Dynamic recompiler

Post by AnnaWu »

TomWalker wrote:As of rev 312 the interpreter and dynamic recompiler are merged into a single binary. -DDYNAREC no longer exists, recompiler is an option in the configure dialogue on the CPUs that support it.
Nice to see which CPUs support the dynarec.
leilei
Posts: 1039
Joined: Fri 25 Apr, 2014 4:47 pm

Re: Dynamic recompiler

Post by leilei »

As of the commit from 24 minutes ago, there's been a huge 1.9x performance increase in Eradicator on Cyrix 6x86 with infinite cache; it was in the 40-50% range and is now pulling 95-100% :)

EDIT: I've noticed Unreal Tournament's a bit slower now, going from 100% exec to 80% on PMMX166 infinite settings.
EDIT: I profiled on the latest and it's back to 100% now, but there is an odd hitch now and then. Also, Half-Life's software renderer crashed the emulator after a while; I don't have reproduction steps yet.
leilei
Posts: 1039
Joined: Fri 25 Apr, 2014 4:47 pm

Re: Dynamic recompiler

Post by leilei »

Lightmap blending artifacts in Half-Life's software renderer, noticed since the recent CPU updates.

Doesn't occur in the interpreter. Dunno exactly which commit started this either.
[Attachment: ltgh.png]
SarahWalker
Site Admin
Posts: 2054
Joined: Thu 24 Apr, 2014 4:18 pm

Re: Dynamic recompiler

Post by SarahWalker »

Hopefully this should be fixed in rev 586. It's a little difficult to be sure - this issue was very intermittent for me.
leilei
Posts: 1039
Joined: Fri 25 Apr, 2014 4:47 pm

Re: Dynamic recompiler

Post by leilei »

Tyrian gets a "Runtime error 216 at 00F7:E4B8" with the recompiler on 486s and Pentiums

Interpreter works.
SarahWalker
Site Admin
Posts: 2054
Joined: Thu 24 Apr, 2014 4:18 pm

Re: Dynamic recompiler

Post by SarahWalker »

Fixed in rev 597.
leilei
Posts: 1039
Joined: Fri 25 Apr, 2014 4:47 pm

Re: Dynamic recompiler

Post by leilei »

Rebel Moon Rising crashes itself to desktop (pcem survives) within a split-second of gameplay. Works on interpreter.

The Labyrinth (2000, RealNetworks/Nonstop Gaming) doesn't seem to start on the recompiler (pcem survives). Works on interpreter. (Also, the menu text and cursor don't show on Voodoo2, but that's a separate issue.)
SarahWalker
Site Admin
Posts: 2054
Joined: Thu 24 Apr, 2014 4:18 pm

Re: Dynamic recompiler

Post by SarahWalker »

Both of those games work okay for me (other than the Voodoo2 issues). What exact setup are you using?
leilei
Posts: 1039
Joined: Fri 25 Apr, 2014 4:47 pm

Re: Dynamic recompiler

Post by leilei »

430VX
S3 ViRGE/DX 4MB
Intel Mobile PMMX 300
Fast VLB/PCI
AWE32 @8MB
64MB RAM
Windows 98 (first edition)
DirectX 6
Rebel Moon Rising version off the Intel MMX Demos CD

odd that labyrinth works again for me. hm. Rebel Moon Rising's still broke though

EDIT: Compiled with just -flto fyi. Maybe the labyrinth crash was a profile-generated glitch. Was also going to mention Max Payne crashing the emulator but that seems to be an ATAPI issue with Safedisc? (a cracked max works)
SarahWalker
Site Admin
Posts: 2054
Joined: Thu 24 Apr, 2014 4:18 pm

Re: Dynamic recompiler

Post by SarahWalker »

Fixed Rebel Moon Rising issues in rev 616. That was a fun bug to track down...
terub56
Posts: 35
Joined: Mon 23 Jan, 2017 12:31 pm

Re: Dynamic recompiler

Post by terub56 »

Hi and thanks for this great emulator. I've been using it since v9.
I noticed that on the Award SiS 496/497, any CPU at 16 or 33 MHz fails to boot with the recompiler enabled.

Regards
omarsis81
Posts: 945
Joined: Thu 17 Dec, 2015 6:20 pm

Re: Dynamic recompiler

Post by omarsis81 »

terub56 wrote:Hi and thanks for this great emulator. I've been using it since v9.
I noticed that on the Award SiS 496/497, any CPU at 16 or 33 MHz fails to boot with the recompiler enabled.

Regards
Sounds like this bug to me
viewtopic.php?f=3&t=484
terub56
Posts: 35
Joined: Mon 23 Jan, 2017 12:31 pm

Re: Dynamic recompiler

Post by terub56 »

I didn't know that had already been answered.
Thanks, omarsis81.
szadycbr
Posts: 295
Joined: Mon 21 Nov, 2016 6:23 pm

Re: Dynamic recompiler

Post by szadycbr »

I have a question about the CPU emulation. I use a poor i5 2140m and can emulate something like a P100 on Windows 98; at idle it stays at 100%. I can run e.g. NFS3 and it stays at 100% during gameplay most of the time. But when I copy files, open folders or browse the CD-ROM in Windows, it easily drops to 80%, so performance only drops quickly while handling things like reads and writes. I know it must be a CPU-intensive task, but when I run benchmarks, 3D graphics etc., as heavy as I want, the performance usually doesn't drop that badly. I suspect the drop is caused by the reading and writing of the virtual file itself, among other things. It is noticeable when the Win 98 desktop boots and the sound has to play at the same time, and in many other situations.

All of these operations run on one thread and can't all run together, right? So what if you put only the CPU emulation on a second thread, I mean a separate thread, away from all the other operations and undisturbed? Of course the emulated CPU speed would not be higher, but I think it would not drop during these operations and the whole of PCem would run a lot smoother: in my case a P100 would run smoothly, for someone else a 300 MMX, etc. So, would it be very difficult to move the CPU emulation to a separate thread? Soon no one will use single-thread/single-core CPUs; they are almost dead anyway, and the older P4 CPUs will die out naturally very soon, if they are not dead already. I just think that moving the CPU emulation to a separate thread would do a lot of good. What do you think?
dreamer
Posts: 40
Joined: Wed 28 Dec, 2016 11:56 am

Re: Dynamic recompiler

Post by dreamer »

Szadycbr - if you move something to a separate thread you must synchronize, and with the amount of things PCem does, it might very well actually be slower. The performance drop could be because you are not using an SSD to run PCem, and the underlying HDD emulation is more complex, disk-access-wise, than direct access.
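The synchronization trade-off can be sketched with a toy cost model. Everything here is an assumption made for illustration: the function names, the millisecond figures, and the idea that each sync point costs a fixed stall. It says nothing about PCem's real timings.

```python
def single_thread_time(cpu_work: float, device_work: float) -> float:
    """Everything runs back to back on one thread; no handoffs needed."""
    return cpu_work + device_work


def two_thread_time(cpu_work: float, device_work: float,
                    sync_points: int, sync_cost: float) -> float:
    """CPU and devices overlap, but every sync point adds a stall."""
    return max(cpu_work, device_work) + sync_points * sync_cost


# Hypothetical numbers for one emulated frame, in milliseconds.
cpu_ms, device_ms = 9.0, 1.0

serial = single_thread_time(cpu_ms, device_ms)               # 10 ms
few_syncs = two_thread_time(cpu_ms, device_ms, 10, 0.01)     # about 9.1 ms
many_syncs = two_thread_time(cpu_ms, device_ms, 500, 0.01)   # about 14 ms
```

In this model the second thread only pays off when sync points are rare. An emulated CPU that touches device state constantly sits at the other end of that range, which is where the threaded version ends up slower than the single-threaded one.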
szadycbr
Posts: 295
Joined: Mon 21 Nov, 2016 6:23 pm

Re: Dynamic recompiler

Post by szadycbr »

Dreamer, you're right, I'm not using an SSD. Maybe just the dynarec could be on a second thread, and the main CPU emulation would stay with PCem? I'm sure the syncing could be done, and unless somebody tries it, partially or otherwise, it will stay at "maybe, might be". But thank you for crushing my hopes :cry: , ;)
SarahWalker
Site Admin
Posts: 2054
Joined: Thu 24 Apr, 2014 4:18 pm

Re: Dynamic recompiler

Post by SarahWalker »

It is fundamentally not possible to split the CPU emulation over multiple threads. The dynamic recompiler is part of the CPU emulation, so running it on a separate thread is a complete non-starter.

I covered this in general the last time you asked about this, in viewtopic.php?f=2&t=603. The answer has not changed!
szadycbr
Posts: 295
Joined: Mon 21 Nov, 2016 6:23 pm

Re: Dynamic recompiler

Post by szadycbr »

Last time I asked about splitting across multiple threads, which I now understand is not possible; this time I asked about moving it to a separate thread. The dynamic recompiler is part of the CPU emulation, I know; that was a stupid question. When I have the cash next summer I will pay someone to do it, 500 pounds. I know it's not much, but I think I will find someone to try how it works on a separate thread.
dreamer
Posts: 40
Joined: Wed 28 Dec, 2016 11:56 am

Re: Dynamic recompiler

Post by dreamer »

Szadycbr, the whole task that PCem is performing is serial. It would require not recompiling, but a complete rewrite of each and every application you want to parallelize. Going parallel is not a magic bullet. If it were, everything would be massively parallel; as it is, it's mostly used for certain tasks that are known ahead of time and can be done separately, with the results merged later. With PCem it doesn't seem feasible, especially for the CPU. I am not sure how Sarah did the Voodoo parallelization though...
szadycbr
Posts: 295
Joined: Mon 21 Nov, 2016 6:23 pm

Re: Dynamic recompiler

Post by szadycbr »

Dreamer, you are right, and it would be a massive task to rewrite all the components, but it would take much less if only one PC were changed, with only one display adapter, sound adapter etc.: not the whole of PCem, but only one complete PC with specific boards. Still a lot of work. Without Voodoo emulation the CPU could be on a 2nd thread, or with Voodoo the CPU could run on a 4th thread. And if Voodoo can work in parallel and communicate in real time, then so could the CPU; I believe it is worth trying. If it works out, future boards could easily be made that way; old PCs would work the old way and, for example, the Pentium Pro would work the new way.
PCem in gameplay runs sound, CD music, graphics, HD/memory reads and writes, and the mainboard chipset handling everything (it's way too fast BTW) at the same time, squeezing 100% from a single thread. I think that the CPU on a separate thread could also squeeze 100% alone.
A. Naim
Posts: 139
Joined: Thu 09 Jul, 2015 5:06 pm

Re: Dynamic recompiler

Post by A. Naim »

szadycbr wrote:Dreamer, you are right, and it would be a massive task to rewrite all the components, but it would take much less if only one PC were changed, with only one display adapter, sound adapter etc.: not the whole of PCem, but only one complete PC with specific boards. Still a lot of work. Without Voodoo emulation the CPU could be on a 2nd thread, or with Voodoo the CPU could run on a 4th thread. And if Voodoo can work in parallel and communicate in real time, then so could the CPU; I believe it is worth trying. If it works out, future boards could easily be made that way; old PCs would work the old way and, for example, the Pentium Pro would work the new way.
PCem in gameplay runs sound, CD music, graphics, HD/memory reads and writes, and the mainboard chipset handling everything (it's way too fast BTW) at the same time, squeezing 100% from a single thread. I think that the CPU on a separate thread could also squeeze 100% alone.
Ok, so you want each board PCem emulates to be on a separate thread, running on a separate core. That's doable. Let's talk about what would be needed to do it. Note that I haven't looked at the code for PCem, and this is "just my opinion, ok?" ;)

Also, your idea is probably going to sound good to other people who wander on here, so a good explanation could help there. :) I'm going to be thorough, while at the same time trying not to sound insulting. I am not being insulting, but I am going to have to explain some stuff, and that's always tricky. :)

Third, I'm not posting this to hate on you. You seem to honestly be trying to help. Unfortunately, Hollywood, and the internet in general, makes programming seem a lot easier than it is. Sure, you can "learn to program" in 21 hours. The same way you can "learn how to be a carpenter" in 21 hours. It won't even be hard. You'll learn how to hammer nails into walls, use a screwdriver, an electronic level (I can even teach you how to use an old-fashioned "bubble" level. It's not hard), how to pour concrete. We can look up how to run wires and place outlets, too. There's how-to's and such. :)

Except you know I'm talking complete nonsense, and anything either of us actually built would probably fall down and/or catch fire.

Except that's exactly how so many seemingly-authoritative people on the internet seem to think programming works, and how easily you can do it.

The tl;dr here is that just about anything that needs some notable amount of CPU time, which is the emulated CPU itself and the Voodoo, already is using all 4 cores of a modern, mainstream CPU, and anything else either doesn't need nearly that much processing, or would actually add a bunch more processing for no real benefit. Details below.

1 core for the CPU.
1 core for the 3D graphics board.
1 core for the 2D graphics board.
1 core for the sound board.
1 core for the motherboard.
1 core for the network board.
1 core for the RAM
1 core for the monitor.

You can fit this config on an 8-core CPU. Ryzen isn't going to bring "8 core to everyone". Yeah, the CPU we have data on has 8 cores; it's also in the same market-bracket as Intel's 8-core CPU. So, onward.

Your CPU is fine, and a little more capable - Except not really, because of synchronization. In multicore programming, "synchronize" is one of those words that means "slow everything down and cause problems, and the best you can do is make it less bad." I'll get into that below. :)

Your emulated Voodoo card has 2 fewer cores to work with, because PCem already uses 3 for its Voodoo emulation.

Your emulated 2D card is taking up an entire core drawing pixels on the screen, and doing a worse job of it than if you just passed the results of the Voodoo draw operation to Windows' built-in GUI to paste on a standard panel, since you're adding a processing step to process processed data. ;) So let's merge those. 7 cores out of 8.

Now, a sound card. Sound is cheap, processing-wise. Motherboards have built-in sound cards, and most people can't tell the difference between that and a genuine SoundBlaster - Yes, they're still going, and still making high-end sound cards. Make a dedicated room for it, and you can hear the difference...if you have high-enough quality audio files to play, for there to be a difference in the first place. Plus, sound "chunks" from the DOS to late Win98 era are measured in kilobytes. This core is idling. So merge it with another thread - the CPU thread, for reasons I will explain below. Also, the CPU thread has to send data to this thread, and also react appropriately if this thread chunks some info back. So it'd be faster and easier to just have that all local. :) 6 cores out of 8.

Your motherboard core can be covered by a class container for a set of interfaces. You don't need an entire core for that. You probably don't even need close to 100 bytes, if you measure only the pointers to the interfaces. That's 5 cores out of 8.

An entire core for networking? Nope; Windows and Linux mostly don't even touch more than the surface. All you need is to call out to your Windows or Linux host, which calls out to your network box, which does most of the actual work. Granted, you probably need buffering and other surface support, but even then, this is overkill for a single core. So merge it into the CPU core, too. 4 cores out of 8, now. Also, now the CPU thread doesn't have to synchronize with this thread. That'd be a lot of synchronization, because of *IRQs.

All the RAM does is store data. One or more arrays wrapped in an interface will do just fine, here. Sure, you could simulate a chip to refresh data, send data to the CPU, locate data in the RAM - But none of that actually improves accuracy of emulation. :) So fold this into the CPU thread, too. 3 cores out of 8, now. And your CPU is running faster, now, as it doesn't have to synchronize with a RAM thread to get data.
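A minimal sketch of "an array wrapped in an interface", assuming a flat address space with no paging, caching, or memory-mapped devices. The `Ram` class and its method names are made up for the example and are not taken from PCem.

```python
class Ram:
    """Flat guest memory backed by a single bytearray."""

    def __init__(self, size: int):
        self._data = bytearray(size)

    def read8(self, addr: int) -> int:
        return self._data[addr]

    def write8(self, addr: int, value: int) -> None:
        # Keep only the low 8 bits so callers can pass wider values.
        self._data[addr] = value & 0xFF

    def read16(self, addr: int) -> int:
        # x86 is little-endian: the low byte sits at the lower address.
        return self.read8(addr) | (self.read8(addr + 1) << 8)

    def write16(self, addr: int, value: int) -> None:
        self.write8(addr, value)
        self.write8(addr + 1, value >> 8)


ram = Ram(64 * 1024)
ram.write16(0x100, 0xBEEF)
```

Each access is a plain function call on the CPU thread, so there is nothing to wait for. Putting this behind a second thread would turn every read into a cross-thread round trip.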

The monitor itself, as you may guess, does not need a core; just slap the new results of the Voodoo cores onto the panel, and tell Windows to update your window. Windows shoves that off onto your hardware GPU, and forgets about it. :) So you're now using 1 core for CPU, and 1 core for Voodoo.

Corrections are, of course, welcome. :)

* IRQ = Interrupt Request. Basically, some peripheral grabs the CPU's metaphorical face, yells "Look at my data!" in its ear, and then shoves it face-first into the data. This is the better solution, since the other solution involves missing critical data. Yeah...Synchronization. It happens every time you press a key on your keyboard, a button on your mouse, or even move your mouse. Add in synchronizing that across multiple threads...
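Here is a rough sketch of why IRQs are cheap when the device and the CPU share a thread. The `Machine` class, its timer, and the one-cycle-per-instruction timing are all assumptions for illustration, and real interrupt handling is far more involved.

```python
class Machine:
    """Toy machine: CPU and one timer device stepped on the same thread."""

    def __init__(self, timer_period: int):
        self.cycle = 0
        self.timer_period = timer_period
        self.irq_pending = False
        self.log = []  # (cycle, event) pairs, for inspection

    def step_timer(self) -> None:
        # The device raises its interrupt line every timer_period cycles.
        if self.cycle % self.timer_period == 0:
            self.irq_pending = True

    def step_cpu(self) -> None:
        # The CPU checks the line between instructions. No lock is needed,
        # because nothing else can touch irq_pending while this runs.
        if self.irq_pending:
            self.irq_pending = False
            self.log.append((self.cycle, "irq handler"))
        else:
            self.log.append((self.cycle, "instruction"))

    def run(self, cycles: int) -> None:
        for _ in range(cycles):
            self.cycle += 1
            self.step_timer()
            self.step_cpu()


m = Machine(timer_period=4)
m.run(8)
irq_cycles = [c for c, event in m.log if event == "irq handler"]
```

The handler runs on exactly the cycle the timer fires, every time. With the timer on its own thread, the same guarantee would need a lock or a barrier on every cycle.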
szadycbr
Posts: 295
Joined: Mon 21 Nov, 2016 6:23 pm

Re: Dynamic recompiler

Post by szadycbr »

Amazing, a really good explanation, but you got this completely wrong. I meant all the components on one thread, except Voodoo, and the CPU on another thread. Emulating every component on a separate thread would be utterly stupid. By mentioning all the boards I meant to illustrate how busy a single core already is, and that separating the CPU would probably give a nice boost.
:idea: But yeah!!!! FDC on a separate thread would be helpful!!! :P
Really nice explanation, but you went toooo far. :shock:

BTW I know how to use a level. I used to do kitchens and bathrooms, with plumbing and electrics; I can build a house if you want. When I was young, from 10 up to 21, I was sleeping with computers, building them, fixing them, literally with solder in my hand. I don't know about programming, of course, because my only programming was in BASIC on a ZX Spectrum 48K when I was 10. At the same age I built a radio; no one ever helped me with anything. At 21 I realised that I was addicted to computers and I left all computer work behind, throwing everything I had out of the window (you should have seen my neighbours, it was Christmas for everybody), and I went in search of another life. I'm 35 now and I do things other than building and fixing PCs, and I have a looooot of free time, so you can see me here at any time. Good luck. :)
Last edited by szadycbr on Tue 07 Feb, 2017 6:49 am, edited 2 times in total.
SarahWalker
Site Admin
Posts: 2054
Joined: Thu 24 Apr, 2014 4:18 pm

Re: Dynamic recompiler

Post by SarahWalker »

CPU/RAM/motherboard/cache are basically part of the same task and have to run in the same thread. There's no parallelization possible here so talking about splitting them into different threads is a complete non-starter.

Voodoo is already on (multiple) threads, as is the blitter on all the 2D cards.

Sound, and the remaining 2D graphics functions, are tightly synchronised with the CPU emulation, to ensure maximum compatibility with games that do 'interesting' things. Sound has very low CPU usage for most sound cards, and even on the more complicated cards (GUS and AWE32), it's usually in the region of only a couple of percent, and not really worth worrying about. The remaining graphics functionality is usually also relatively lightweight.

Beyond that, there's not really anything that's using anything beyond a minuscule amount of CPU time (FDC is certainly using next to nothing!). I know that no one ever believes me when I say this, but CPU emulation is >90% of the work done by PCem, and there's no magic bullet that will make everything much faster, no matter how many cores you throw at it.
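The "no magic bullet" point can be put into numbers with Amdahl's law. The sketch below takes the ">90%" figure as a serial fraction of exactly 0.9 and assumes, generously, that the remaining 10% parallelizes perfectly; both are assumptions for illustration, not measurements.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Best-case speedup when only the non-serial part scales with cores."""
    if not 0.0 < serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in (0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def amdahl_limit(serial_fraction: float) -> float:
    """Speedup ceiling as the number of cores goes to infinity."""
    return 1.0 / serial_fraction


four_cores = round(amdahl_speedup(0.9, 4), 3)    # 1.081
eight_cores = round(amdahl_speedup(0.9, 8), 3)   # 1.096
ceiling = round(amdahl_limit(0.9), 3)            # 1.111
```

So even with unlimited cores, the best case under these assumptions is about an 11% gain, and any synchronization overhead eats into that.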
leilei
Posts: 1039
Joined: Fri 25 Apr, 2014 4:47 pm

Re: Dynamic recompiler

Post by leilei »

The only time there should ever be threaded CPU emulation is if the dual Pentium Pros get emulated and early SMP processing is generally useless for gaming in that era anyway (especially w/ the OS required). It'd only help old 3d apps that could thread out more lines for software rendering. Multithreaded gaming started to be a thing around 2005 and there's plenty of virtual machine software that can cover gaming from that point on already.

On topic-ish: the only game I've found lately to be unusually slow in the dynamic recompiler is Puyo Puyo 2 for Win95. I had to drop to a P133 to get 100% in that.

Haven't run across much else of concern yet, besides the MDK2 fade bug mentioned before (which also occurs on the interpreter).
omarsis81
Posts: 945
Joined: Thu 17 Dec, 2015 6:20 pm

Re: Dynamic recompiler

Post by omarsis81 »

leilei wrote:The only time there should ever be threaded CPU emulation is if the dual Pentium Pros get emulated and early SMP processing is generally useless for gaming in that era anyway (especially w/ the OS required). It'd only help old 3d apps that could thread out more lines for software rendering. Multithreaded gaming started to be a thing around 2005 and there's plenty of virtual machine software that can cover gaming from that point on already.
and we've already heard from Sarah that dual CPU emulation won't happen ever, so...
Orchidsworn
Posts: 65
Joined: Sun 22 Mar, 2015 10:16 pm

Re: Dynamic recompiler

Post by Orchidsworn »

So is waiting for faster computers the only real hope of moving forward, or is there still hope of better CPU performance? Just trying to find out what I can expect in the future.
omarsis81
Posts: 945
Joined: Thu 17 Dec, 2015 6:20 pm

Re: Dynamic recompiler

Post by omarsis81 »

Orchidsworn wrote:So is waiting for faster computers the only real hope of moving forward, or is there still hope of better CPU performance? Just trying to find out what I can expect in the future.
Please read this recent post by Sarah. She said there is room for improvement in the "support code":
viewtopic.php?f=2&t=603#p4143
Going by what she said, there is hope of seeing an early Pentium III (Katmai).

There are also weekly optimizations that make PCem run faster on the same (host) hardware.