I've only just registered, but I have been following the development of pcsx2 for quite a
while. I was disappointed that VUrec can't run on my xp 2600+, so I started
playing with the cvs sources and noticed that the code currently uses six sse2
instructions. There are two ways to go: rewrite the code
involved (the most efficient and polite option) or simply emulate the sse2 instructions
through sse and other code (dirty, but faster to code and more flexible).
I started with the latter.
I wrote some (EDIT: now working) code that replaces the SSE2_ functions with
SSE2EMU_ functions that should do roughly the same thing, just a bit slower
(especially PSHUFD, which is very heavy to emulate). Still, the benefits of
recompilation are far bigger than the penalty introduced by this code: it will
be faster than the interpreter anyway. Unfortunately I have no experience with
sse and sse2 coding, so I could have made big mistakes; moreover, it's
quite difficult for me to debug this code since I can't compare the output of the
emulated opcodes with the real ones.
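For anyone who wants to check an emulated opcode by hand, here is a minimal scalar sketch of what PSHUFD computes: each 32-bit lane of the result is picked from the source by two bits of the 8-bit immediate. This is not the SSE2EMU_ code from this thread; the names xmm_image and pshufd_ref are made up for illustration, and the struct only models the register contents.

```c
#include <stdint.h>

/* Hypothetical image of a 128-bit register: four 32-bit lanes, lane 0 is the lowest. */
typedef struct { uint32_t lane[4]; } xmm_image;

/* Reference model of PSHUFD dst, src, imm8 (illustrative name, not a PCSX2 function):
   result lane i is source lane ((imm8 >> (2 * i)) & 3). */
static xmm_image pshufd_ref(xmm_image src, uint8_t imm8)
{
    xmm_image dst;
    for (int i = 0; i < 4; i++) {
        dst.lane[i] = src.lane[(imm8 >> (2 * i)) & 3];
    }
    return dst;
}
```

With this model, an immediate of 0x1B reverses the four lanes and 0xE4 leaves them unchanged, which gives two easy cases to compare an emulated version against.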
I'm submitting this code to the developers and anyone who would like to help; please
don't flame me if you find my idea a bad one or are simply not interested
in it. I'm just trying to help anyone who, like me, can't afford an sse2-capable
cpu. I was thinking of just emailing all this to a developer, but I didn't want
to bother them more than necessary.
Thanks for your attention
edit: code updated, now working. I need someone to compare the output with
a true sse2 cvs build and 3d games!
OMG wow... what a great idea. I always thought they should have made VUrec for SSE 1 as well, since all Athlon XP/Sempron, most Celeron and P III users don't have SSE 2 and shouldn't have to spend money to see the wonderful speeds and images this great emu can show. Not everyone here has AMD 64/FX/P4 machines. All I'm saying is that kekko, you are the man... and so is the rest of the PCSX2 team for making this even remotely possible.
For all posters: to clear some things up before they get out of hand, kekko did not write all the code in the file; he has merely added some functions to it. The majority of the code in that file is the work of the PCSX2 team.
Also, to explain for the non-programmers in here: kekko has converted PCSX2 functions written specifically for SSE2-enabled CPUs to use only SSE1 instructions, so that more people can use the VUrec. What this means for users is hopefully a decent speed boost on PCs without SSE2-enabled processors.
Well, I don't know if we should, because... kekko stated that it was not working atm. I guess we will have to wait until linuzappz or another person from the PCSX2 team gets their hands on the code and has the time to enhance it. I would love to have VUrec on my AMD Sempron 2500+, but I can wait however long is needed until it's complete.
Of course he did not write all the code himself; if the person who assumed this had EVER checked the PCSX 2 source, he would know.
kekko's work covers about 9~10 pages.
@The Unknown One and other people who don't seem to know the PCSX 2 source
Have you ever browsed the COMPLETE code of the project?? You're wondering about 64 pages? LOL.... .....ROFL ...... -_-'
Here is a calculation example:
If 58 pages (kekko's example) take 90 KB uncompressed, guess how big the whole code is: 3.64 MEGABYTES.
3.64 MEGABYTES, that's the space the unzipped PCSX 2 source takes as JUST pure text, about 40 times as much. Can you even imagine the complexity of this?
...think about it... praise the L...er.. the PCSX 2 Team ^^
IMO the current problem, or rather what is REALLY important ATM:
the emu authors STILL haven't contacted him! Refraction wanted to do this? I know the other authors may be busy, but the thing is that kekko needs some specific information (see above).
I suggest you try to contact the authors via (as generalplot suggested):
IRC, server: efnet channel: #pcsx2
I think the problem is that the authors are always bothered by so much spam that it's hard to contact them anyway... but this is a situation that requires making contact as fast as possible. I hope kekko gets fast access to the developer network; the rest will then work out fine. @mods what do you think, are these ideas OK?
wbr Shin Gouki
Well, all real emu fans (not counting noobs) know how complex the PCSX2 code can get. I mean, look at what the emu has to emulate. It doesn't take a rocket scientist to see how complex the PS2 hardware really is. Not to mention that the information on the hardware (or lack thereof) makes it really hard for emu authors to code such an emulator. 64 pages is really nothing. Kekko should really get in contact with the PCSX2 team asap. His idea will make many people without SSE2 (like me) VERY happy.
enlightenment on the way!
ATM I'm VERY busy so I can't try it myself, but I looked at kekko's code and I think I could do what he's trying to do... the problem is time... so until I can do something, here is something for you:
- learn C
- read through this: http://ds9a.nl/gcc-simd/example.html
to get an impression of what SSE2 TO SSE means
As kekko pointed out above, debugging is hard, which is understandable.
I understand the problem as follows (roughly):
there are components inside PCSX 2 which call functions from VUMicro:
``external Call -- incoming ´´
|the data is probably some |
|floating point stuff and now |
|needs to be processed accordingly |
``result Call -- outgoing ´´
- The thing to do now is: get the necessary data structures for the input and the desired output (this should come from the PCSX team)
- take a look at the SSE and SSE2 functions in general and at their C implementation
- create algorithms and data structures; the algorithms must produce the same results as the SSE2 functions while using only SSE functions.
Maybe it would be nice to have a test function that a "component" solution is passed into, so you can see IF it is working and (that's important) how FAST.
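Such a test function could look roughly like the sketch below: it takes a candidate routine and a reference routine as function pointers, counts the inputs on which they disagree, and gives a rough timing. Everything here is an assumption for illustration (the xmm_image struct, the shuffle_fn type, and a 4-lane shuffle as the example operation); it is not taken from the PCSX2 source or from kekko's code.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical image of a 128-bit register: four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } xmm_image;

/* Hypothetical signature shared by the reference and the candidate routine. */
typedef xmm_image (*shuffle_fn)(xmm_image src, uint8_t imm8);

/* Reference: plain definition of a 4-lane shuffle selected by an 8-bit immediate. */
static xmm_image shuffle_ref(xmm_image src, uint8_t imm8)
{
    xmm_image dst;
    for (int i = 0; i < 4; i++) {
        dst.lane[i] = src.lane[(imm8 >> (2 * i)) & 3];
    }
    return dst;
}

/* Candidate: an unrolled variant standing in for the routine under test. */
static xmm_image shuffle_candidate(xmm_image src, uint8_t imm8)
{
    xmm_image dst;
    dst.lane[0] = src.lane[imm8 & 3];
    dst.lane[1] = src.lane[(imm8 >> 2) & 3];
    dst.lane[2] = src.lane[(imm8 >> 4) & 3];
    dst.lane[3] = src.lane[(imm8 >> 6) & 3];
    return dst;
}

/* IF it works: number of immediates (0..255) where candidate and reference differ. */
static int count_mismatches(shuffle_fn candidate, shuffle_fn reference)
{
    const xmm_image input = {{0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u}};
    int mismatches = 0;
    for (int imm = 0; imm < 256; imm++) {
        xmm_image a = candidate(input, (uint8_t)imm);
        xmm_image b = reference(input, (uint8_t)imm);
        for (int i = 0; i < 4; i++) {
            if (a.lane[i] != b.lane[i]) { mismatches++; break; }
        }
    }
    return mismatches;
}

/* How FAST: CPU seconds spent on `rounds` passes over all 256 immediates. */
static double time_shuffle(shuffle_fn fn, int rounds)
{
    xmm_image v = {{1, 2, 3, 4}};
    volatile uint32_t sink = 0;   /* keeps the compiler from dropping the loop */
    clock_t start = clock();
    for (int r = 0; r < rounds; r++) {
        for (int imm = 0; imm < 256; imm++) {
            v = fn(v, (uint8_t)imm);
            sink ^= v.lane[0];
        }
    }
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling count_mismatches(shuffle_candidate, shuffle_ref) should return 0 for a correct candidate, and comparing time_shuffle for both routines gives a first idea of the speed penalty. On a machine without SSE2 the reference has to be a plain C model like this one, since the real opcode cannot be executed there.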
... it's late on my side and I need sleep, yet I still have work to do, but roughly, VERY roughly, that's it...
Everything I wrote, kekko wrote too; he just used a kind of "short" form (the clean and "dirty" thing he meant)!
wbr Shin Gouki