Xbox 360/Hardware information/Xenos (GPU)
Latest revision as of 20:41, 31 July 2024
A repository of ROM hacking-relevant information for Xenos.
Basic Specifications
- ATI Xenos @ 500MHz
- World's first unified shading GPU
- Precursor to the TeraScale uarch
- 3 "SIMD cores" (predecessor to CUs)
- 8 ROPs
- 16 TMUs
- 240 shaders
- Direct3D 9.0c-level feature set, with some features from later versions (mainly Direct3D 10)
- MEMEXPORT allows for GPGPU compute
- Connected to 10MB of eDRAM framebuffer
- eDRAM has extremely high bandwidth and processing-in-memory (PIM) logic, giving free MSAA at 720p and low-cost MSAA at 1080p
Things to note: If data written via MEMEXPORT is read back on the CPU, you must either enable both d3d12_readback_memexport and d3d12_readback_resolve in Xenia's options or run on real hardware!
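As a rough illustration of how the 10MB eDRAM figure relates to render target size, the sketch below computes the footprint of a color plus depth/stencil target at different MSAA sample counts. The 4 bytes per sample for each of color and depth, and the helper names, are assumptions for illustration only; bandwidth cost and capacity are separate concerns, and this only covers capacity.

```python
EDRAM_BYTES = 10 * 1024 * 1024  # 10MB of eDRAM, per the specification list


def framebuffer_bytes(width, height, msaa_samples=1,
                      color_bytes=4, depth_bytes=4):
    """Footprint of color + depth/stencil storage.

    Assumes 32-bit color and 32-bit depth/stencil per sample (hypothetical
    defaults; other formats change the result).
    """
    return width * height * msaa_samples * (color_bytes + depth_bytes)


def passes_needed(width, height, msaa_samples=1):
    """Number of eDRAM-sized chunks needed to cover the whole target."""
    total = framebuffer_bytes(width, height, msaa_samples)
    return -(-total // EDRAM_BYTES)  # ceiling division


if __name__ == "__main__":
    for samples in (1, 2, 4):
        size = framebuffer_bytes(1280, 720, samples)
        print(samples, "x MSAA:", size, "bytes,",
              passes_needed(1280, 720, samples), "chunk(s)")
```

Under these assumptions a 1280x720 target fits in eDRAM in one piece only without MSAA; higher sample counts have to be rendered in several eDRAM-sized pieces.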
ucode Programming
Xenos, like other Xbox console GPUs, is able to directly ingest and execute ucode without having to be compiled by the driver first, allowing programmers to squeeze out more performance by relying on the fixed HW spec.
To do: Fix this table. Xenos ucode uses the 10 MSBs of the instruction word to select the instruction, so hex values are not ideal for grouping instructions.
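The grouping problem in the note above can be sketched as follows. The 10-bit selector width comes from the note; the 32-bit word width and the function names are assumptions made for illustration, not confirmed details of the Xenos encoding.

```python
WORD_BITS = 32    # assumption: width of one instruction word (not stated in the source)
OPCODE_BITS = 10  # from the note: the 10 MSBs select the instruction


def opcode_field(word, word_bits=WORD_BITS):
    """Return the top OPCODE_BITS bits of an instruction word."""
    return (word >> (word_bits - OPCODE_BITS)) & ((1 << OPCODE_BITS) - 1)


def same_instruction(word_a, word_b):
    """Two words select the same instruction if their selector fields match."""
    return opcode_field(word_a) == opcode_field(word_b)


if __name__ == "__main__":
    # These two words differ in their third hex digit but share a selector,
    # which is why hex prefixes group instructions poorly.
    print(same_instruction(0x12345678, 0x123FFFFF))
    print(same_instruction(0x12345678, 0x12400000))
```

Because 10 bits do not align with 4-bit hex digits, the selector boundary falls in the middle of the third hex digit of a 32-bit word, so listing the selector in binary or as a standalone field value may group instructions more cleanly.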
Instruction | Hex | Description |
---|---|---|
exec | (Unknown) | Start executing code |
cnop? | (Unknown) | NOP, end of program |
add | (Unknown) | Adds op2 and op3 and stores the result in op1 |
cndeq | (Unknown) | If op2 = 0.0f, op1 = op3, else op1 = op4 |
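The add and cndeq descriptions in the table can be read as the scalar sketch below. It models each instruction as a function returning the value stored in op1; treating operands as single floats (rather than vector registers) is a simplifying assumption, and the function names simply mirror the table.

```python
def add(op2, op3):
    """op1 = op2 + op3."""
    return op2 + op3


def cndeq(op2, op3, op4):
    """op1 = op3 if op2 == 0.0, otherwise op1 = op4."""
    return op3 if op2 == 0.0 else op4


if __name__ == "__main__":
    print(add(1.5, 2.0))
    print(cndeq(0.0, 1.0, 2.0))  # condition is zero, selects op3
    print(cndeq(0.5, 1.0, 2.0))  # condition is non-zero, selects op4
```

Note that in this sketch negative zero also compares equal to 0.0 and therefore selects op3; whether the hardware behaves the same way is not stated in the source.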
Texture Formats
Format | Description | Xenos Specific | Other |
---|---|---|---|
Uncompressed | Bitmap | No | Not recommended due to high VRAM use |
DXT1/BC1 | 4x4 pixels packed into 64 bits: c0, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); c1, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); pixelIndices[16], 2-bit indices (00 = c0, 01 = c1, 10 = c2, 11 = c3). If c0 > c1, c2 and c3 are linearly interpolated between c0 and c1. If c0 <= c1, c2 is the midpoint of c0 and c1, and c3 represents full transparency | No | |
DXT2/BC2 | 4x4 pixels packed into 128 bits: pixelAlphaTable[16], 4-bit alpha data; c0, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); c1, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); pixelIndices[16], 2-bit indices (00 = c0, 01 = c1, 10 = c2, 11 = c3). c2 and c3 are linearly interpolated between c0 and c1. Data is considered pre-multiplied by alpha | No | |
DXT3/BC2 | 4x4 pixels packed into 128 bits: pixelAlphaTable[16], 4-bit alpha data; c0, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); c1, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); pixelIndices[16], 2-bit indices (00 = c0, 01 = c1, 10 = c2, 11 = c3). c2 and c3 are linearly interpolated between c0 and c1. Data is considered NOT pre-multiplied by alpha | No | |
DXT3A | 4x4 pixels packed into 64 bits: pixelScalars[16], 4-bit scalar values | Yes | |
DXT3A as 1111 | Same as DXT3A, but each bit of scalar data represents mask for respective channel | Yes | |
DXT4/BC3 | 4x4 pixels packed into 128 bits: a0, 8-bit alpha value; a1, 8-bit alpha value; pixelAlphaIndices[16], 3-bit indices; c0, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); c1, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); pixelIndices[16], 2-bit color indices. If a0 > a1, a2-a7 are all linearly interpolated between a0 and a1; otherwise only a2-a5 are interpolated, while a6 and a7 are 0 and 255 respectively. Data is considered pre-multiplied by alpha | No | |
DXT5/BC3 | 4x4 pixels packed into 128 bits: a0, 8-bit alpha value; a1, 8-bit alpha value; pixelAlphaIndices[16], 3-bit indices; c0, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); c1, 16-bit color (RGB565: RRRRRGGGGGGBBBBB); pixelIndices[16], 2-bit color indices. If a0 > a1, a2-a7 are all linearly interpolated between a0 and a1; otherwise only a2-a5 are interpolated, while a6 and a7 are 0 and 255 respectively. Data is considered NOT pre-multiplied by alpha | No | |
DXT5A | 4x4 pixels packed into 64 bits: s0, 8-bit scalar; s1, 8-bit scalar; pixelScalarIndices[16], 3-bit indices. Scalars are linearly interpolated | Yes | |
DXN | 4x4 pixels packed into 128 bits: same as DXT5A, but with two scalar channels | Yes | |
CTX1 | 4x4 pixels packed into 64 bits: s0, 8-bit scalar channel 0; s1, 8-bit scalar channel 1; s2, 8-bit scalar channel 2; s3, 8-bit scalar channel 3; pixelScalarIndices[16], 2-bit indices, shared between both channels | Yes | |
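The DXT1 layout and the DXT4/DXT5 alpha rule from the table can be sketched in Python as below. This is a minimal sketch under stated assumptions: the block is read in the little-endian byte order used by PC DDS files (Xbox 360 texture data may need byte-swapping first), pixel 0 is taken from the lowest index bits, and integer division is used for interpolation (exact rounding may differ between decoders). All function names are hypothetical.

```python
import struct


def rgb565_to_rgb888(c):
    """Expand a 16-bit RGB565 color to 8 bits per channel."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    # Replicate the high bits into the low bits so 0x1F maps to 255.
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def decode_dxt1_block(block):
    """Decode one 8-byte DXT1/BC1 block into 16 (r, g, b, a) tuples."""
    # Assumption: little-endian c0, c1, then 32 bits of 2-bit indices.
    c0, c1, indices = struct.unpack("<HHI", block)
    p0 = rgb565_to_rgb888(c0) + (255,)
    p1 = rgb565_to_rgb888(c1) + (255,)
    if c0 > c1:
        # Four-color mode: c2 and c3 sit at 1/3 and 2/3 between c0 and c1.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(p0, p1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))
    else:
        # Three-color mode: c2 is the midpoint, c3 is fully transparent.
        p2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
        p3 = (0, 0, 0, 0)
    palette = (p0, p1, p2, p3)
    return [palette[(indices >> (2 * i)) & 0x3] for i in range(16)]


def dxt5_alpha_table(a0, a1):
    """Build the 8-entry alpha table described for DXT4/DXT5."""
    if a0 > a1:
        # a2..a7 are six evenly spaced steps between a0 and a1.
        return [a0, a1] + [((7 - k) * a0 + k * a1) // 7 for k in range(1, 7)]
    # a2..a5 are four evenly spaced steps; a6 and a7 are fixed.
    return [a0, a1] + [((5 - k) * a0 + k * a1) // 5 for k in range(1, 5)] + [0, 255]


if __name__ == "__main__":
    white_black = struct.pack("<HHI", 0xFFFF, 0x0000, 0xFFFFFFFF)
    print(decode_dxt1_block(white_black)[0])
    print(dxt5_alpha_table(255, 0))
```

The same alpha table is a reasonable starting point for experimenting with the scalar formats (DXT5A, DXN), but whether they use the identical 0/255 special case is not stated in the table above.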
Useful Links
To do: Organize these better
https://www.techpowerup.com/gpu-specs/xbox-360-gpu-90nm.c1919
https://learn.microsoft.com/en-us/windows/win32/direct3d9/dx9-graphics-reference
https://learn.microsoft.com/en-us/windows/win32/direct3d9/dx9-graphics-programming-guide
https://learn.microsoft.com/en-us/windows/win32/direct3d10/d3d10-graphics-reference
https://learn.microsoft.com/en-us/windows/win32/direct3d10/d3d10-graphics-programming-guide
https://en.wikipedia.org/wiki/Unified_shader_model
https://fileadmin.cs.lth.se/cs/Personal/Michael_Doggett/talks/unc-xenos-doggett.pdf
http://www.students.science.uu.nl/~3220516/advancedgraphics/papers/inferred_lighting.pdf
https://en.wikipedia.org/wiki/S3_Texture_Compression
http://web.archive.org/web/20100423054747/http://msdn.microsoft.com:80/en-us/library/bb313877.aspx