Hi!
I've just registered, but I've been following the development of pcsx2 for
quite a while. I was disappointed that VUrec can't run on my xp 2600+, so I
started playing with the cvs sources and noticed that there are actually six
sse2 instructions currently used by the code. There are two ways to go: rewrite
the code involved (most efficient and polite) or simply emulate the sse2
instructions through sse and other code (dirty, but faster to code and more
flexible). I started with the latter.
I wrote some (EDIT: now working) code that replaces SSE2_ functions with
SSE2EMU_ functions that should do roughly the same thing, just a bit slower
(especially PSHUFD, which is very heavy to emulate); anyway, the benefits of
recompilation are far bigger than the penalty introduced by this code: it will
be faster than the interpreter anyway. Unfortunately I have no experience with
sse and sse2 coding, so I could have made big mistakes with it; moreover, it's
quite difficult for me to debug this code, since I can't compare the output of
the emulated opcodes with the real ones.
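For anyone who wants to check the emulated opcodes without sse2 hardware, here
is a minimal sketch of a plain C reference model for PSHUFD. It is not the
SSE2EMU_ code itself; the name pshufd_ref and the array-of-four-lanes layout are
my own assumptions for illustration. It only follows the documented behaviour
of the instruction: each pair of bits in the immediate selects which source
dword goes into the corresponding destination dword.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model of "PSHUFD dst, src, imm8" (not pcsx2 code).
 * A 128-bit register is modelled as four 32-bit lanes, lane 0 = lowest dword.
 * Bits [2*i+1 : 2*i] of imm8 pick the source lane copied into dst lane i. */
static void pshufd_ref(uint32_t dst[4], const uint32_t src[4], uint8_t imm8)
{
    uint32_t tmp[4];
    int i;

    /* Copy the source first so the model still works when dst == src. */
    memcpy(tmp, src, sizeof tmp);

    for (i = 0; i < 4; i++)
        dst[i] = tmp[(imm8 >> (2 * i)) & 3];
}
```

Feeding the same inputs to this model and to an emulated routine, and comparing
the four lanes, should at least catch lane-ordering mistakes, although it says
nothing about speed.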
I'm submitting this code to the developers and anyone who would like to help;
please don't flame me if you find my idea a bad one, or if you are simply not
interested in it. I'm just trying to help anyone who, like me, can't afford an
sse2-capable cpu. I was thinking of just emailing all this to some developer,
but I didn't want to bother them more than necessary.
Thanks for your attention
-kekko
edit: code updated. now working. I need someone to compare the output with
a true sse2 cvs build and 3d games!