Xbox 360/Hardware information/Xenos (GPU): Difference between revisions

From Data Crystal
(Texture format table update)
(Texture format update. Add prelim info and links on SM 3.0 Xenos ucode (thanks TriΔng3l!))
Line 28: Line 28:
| exec || 0x10 0x2A 0x11 0x** || Start executing code
| exec || 0x10 0x2A 0x11 0x** || Start executing code
|-
|-
| (Unknown) || 0xFF 0xFF 0x03 0x** || NOP, end of program
| cnop? || 0xFF 0xFF 0x03 0x** || NOP, end of program
|-
| add || (Unknown) || Adds op2 and op3 and stores the result in op1
|-
| cndeq || (Unknown) || If op2 = 0.0f, op1 = op3, else op1 = op4
|}
|}


Line 42: Line 46:
|-
|-
| DXT1/BC1 || 4x4 pixels packed into 64 bits
| DXT1/BC1 || 4x4 pixels packed into 64 bits
c0 16-bit color (ARRRRRGGGGGGBBBBB)
'''c0''' 16-bit color (ARRRRRGGGGGGBBBBB)


c1 16-bit color (ARRRRRGGGGGGBBBBB)
'''c1''' 16-bit color (ARRRRRGGGGGGBBBBB)


pixelIndex[16] 2-bit indices, c0 = 00, c1 = 01, c2 = 10, c3 = 11
'''pixelIndices[16]''' 2-bit indices, '''c0''' = 00, '''c1''' = 01, '''c2''' = 10, '''c3''' = 11


c0 < c1:  c3 rather than representing a middle point between the two colors, represents full transparency
'''c0''' > '''c1''':  '''c2''' and '''c3''' linearly interpolate between '''c0''' and '''c1'''
 
'''c0''' < '''c1''': '''c3''' rather than representing a middle point between '''c0'''/'''c1''', represents full transparency
|| No ||  
|| No ||  
|-
|-
| DXT2/BC2 ||  || No ||  
| DXT2/BC2 || 4x4 pixels pack into 128 bits
'''pixelAlphaTable[16]''' 4-bit alpha data
 
'''c0''' 16-bit color (-RRRRRGGGGGGBBBBB)
 
'''c1''' 16-bit color (-RRRRRGGGGGGBBBBB)
 
'''pixelIndices[16]''' 2-bit indices, '''c0''' = 00, '''c1''' = 01, '''c2''' = 10, '''c3''' = 11
 
'''c2''' and '''c3''' linearly interpolate between '''c0''' and '''c1'''. Data is considered pre-multiplied by alpha
|| No ||  
|-
|-
| DXT3/BC2 ||  || No ||  
| DXT3/BC2 ||
pixelAlphaTable[16] 4-bit alpha data
 
'''c0''' 16-bit color (-RRRRRGGGGGGBBBBB)
 
'''c1''' 16-bit color (-RRRRRGGGGGGBBBBB)
 
'''pixelIndices[16]''' 2-bit indices, '''c0''' = 00, '''c1''' = 01, '''c2''' = 10, '''c3''' = 11
 
'''c2''' and '''c3''' linearly interpolate between '''c0''' and '''c1'''. Data is considered NOT pre-multiplied by alpha
|| No ||  
|-
|-
| DXT3A || || Yes ||  
| DXT3A || 4x4 pixels packed into 64 bits
'''pixelScalars[16]''' 4-bit scalar value
 
|| Yes ||  
|-
|-
| DXT3A as 1111 || || Yes ||  
| DXT3A as 1111 || Same as DXT3A, but each bit of scalar data represents mask for respective channel || Yes ||  
|-
|-
| DXT4/BC3 ||  || No ||  
| DXT4/BC3 || 4x4 pixels packed into 128 bits
'''a0''' 8-bit alpha value
 
'''a1''' 8-bit alpha value
 
'''pixelAlphaIndices[16]''' 3-bit index
 
'''c0''' 16-bit color (-RRRRRGGGGGGBBBBB)
 
'''c1''' 16-bit color (-RRRRRGGGGGGBBBBB)
 
If '''a0''' > '''a1''', all alpha indices are linearly interpolated, else only '''a2'''-'''a5''' are linearly interpolated while '''a6''' and '''a7''' are 0 and 255 respectively. Data is considered pre-multiplied by alpha
|| No ||  
|-
|-
| DXT5/BC3 ||  || No ||  
| DXT5/BC3 || 4x4 pixels packed into 128 bits
'''a0''' 8-bit alpha value
 
'''a1''' 8-bit alpha value
 
'''pixelAlphaIndices[16]''' 3-bit index
 
'''c0''' 16-bit color (-RRRRRGGGGGGBBBBB)
 
'''c1''' 16-bit color (-RRRRRGGGGGGBBBBB)
 
If '''a0''' > '''a1''', all alpha indices are linearly interpolated, else only '''a2'''-'''a5''' are linearly interpolated while '''a6''' and '''a7''' are 0 and 255 respectively. Data is considered NOT pre-multiplied by alpha
|| No ||  
|-
|-
| DXT5A || || Yes ||  
| DXT5A || 4x4 pixels packed into 64 bits
 
'''s0''' 8-bit scalar
 
'''s1''' 8-bit scalar
 
'''pixelScalarIndices[16]''' 3-bit indices
 
Scalars are linearly interpolated
|| Yes ||  
|-
|-
| DXN || || Yes ||  
| DXN || 4x4 pixels packed into 128 bits
Same as DXT5A but with two channels of scalar
|| Yes ||  
|-
|-
| CTX1 || || Yes ||  
| CTX1 || 4x4 pixels packed into 64 bits
'''s0''' 8-bit scalar channel 0
 
'''s1''' 8-bit scalar channel 1
 
'''s2''' 8-bit scalar channel 2
 
'''s3''' 8-bit scalar channel 3
 
'''pixelScalarIndices[16]''' 2-bit indices, shared between both channels
|| Yes ||  
|}
|}


==Useful Links==
==Useful Links==
{{todo|Organize these better}}


https://www.techpowerup.com/gpu-specs/xbox-360-gpu-90nm.c1919
https://www.techpowerup.com/gpu-specs/xbox-360-gpu-90nm.c1919
Line 91: Line 167:


https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/R600_Instruction_Set_Architecture.pdf
https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/R600_Instruction_Set_Architecture.pdf
https://en.wikipedia.org/wiki/S3_Texture_Compression
http://web.archive.org/web/20100423054747/http://msdn.microsoft.com:80/en-us/library/bb313877.aspx

Revision as of 01:25, 17 April 2024

A repository of information about Xenos that is relevant to ROM hacking.

Basic Specifications

  • ATI Xenos @ 500MHz
    • World's first unified shading GPU
    • TeraScale uarch
    • 3 "SIMD cores" (predecessor to CUs)
    • 8 ROPs
    • 16 TMUs
    • 240 shaders
    • Direct3D 9.0c-class feature set (Shader Model 3.0), with some features from later versions (mainly Direct3D 10)
    • MEMEXPORT allows for GPGPU compute
    • Connected to 10MB of eDRAM framebuffer
      • eDRAM has extremely high bandwidth and processing-in-memory (PIM) logic, making MSAA free at 720p and low-cost at 1080p
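
To get a feel for how far the 10MB of eDRAM goes, the footprint of a render target can be estimated with simple arithmetic. The sketch below is only an illustration: it assumes "10MB" means 10 MiB, a 32-bit color buffer plus a 32-bit depth/stencil buffer, and no alignment or padding overhead. The function names are hypothetical and are not part of any SDK.

```python
import math

# Assumption: "10MB" of eDRAM is taken as 10 MiB.
EDRAM_BYTES = 10 * 1024 * 1024


def render_target_bytes(width, height, samples=1, color_bpp=4, depth_bpp=4):
    """Estimated size of color + depth surfaces, ignoring alignment/padding."""
    return width * height * samples * (color_bpp + depth_bpp)


def edram_passes(width, height, samples=1):
    """How many eDRAM-sized pieces the estimated surfaces would occupy."""
    return math.ceil(render_target_bytes(width, height, samples) / EDRAM_BYTES)


for label, (w, h, s) in {
    "720p, no MSAA": (1280, 720, 1),
    "720p, 2x MSAA": (1280, 720, 2),
    "720p, 4x MSAA": (1280, 720, 4),
}.items():
    print(label, render_target_bytes(w, h, s), "bytes,", edram_passes(w, h, s), "piece(s)")
```

Under these assumptions a 720p target without MSAA fits in the eDRAM in one piece, while multisampled targets are larger than 10 MiB and would have to be handled in more than one piece.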

Note: if the CPU reads data written by MEMEXPORT, you must enable both d3d12_readback_memexport and d3d12_readback_resolve in Xenia's options, or run on real hardware.

ucode Programming

Xenos, like the GPUs in other Xbox consoles, can directly ingest and execute ucode (shader microcode) that does not have to be compiled by the driver first. This lets programmers squeeze out more performance by relying on the fixed hardware spec.

{|
|+ Xenos ucode
! Instruction !! Hex !! Description
|-
| exec || 0x10 0x2A 0x11 0x** || Start executing code
|-
| cnop? || 0xFF 0xFF 0x03 0x** || NOP, end of program
|-
| add || (Unknown) || Adds op2 and op3 and stores the result in op1
|-
| cndeq || (Unknown) || If op2 = 0.0f, op1 = op3, else op1 = op4
|}
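
The descriptions above can be modelled in a few lines of Python. This is a minimal sketch, not a disassembler: it treats operands as plain scalars (real shader ALU instructions work on vector registers), it assumes `0x**` in the Hex column means "any byte", and it assumes the bytes appear in a dump in the order written. `find_pattern`, `EXEC` and `CNOP` are hypothetical names introduced only for this example.

```python
def add(op2, op3):
    """add: returns the value stored in op1 (op2 + op3)."""
    return op2 + op3


def cndeq(op2, op3, op4):
    """cndeq: returns the value stored in op1 (op3 if op2 == 0.0, else op4)."""
    return op3 if op2 == 0.0 else op4


# Byte patterns from the table; None stands for the 0x** wildcard byte.
EXEC = (0x10, 0x2A, 0x11, None)
CNOP = (0xFF, 0xFF, 0x03, None)


def find_pattern(data, pattern):
    """Return every offset in data where pattern matches (None matches any byte)."""
    n = len(pattern)
    return [
        i
        for i in range(len(data) - n + 1)
        if all(p is None or data[i + j] == p for j, p in enumerate(pattern))
    ]


blob = bytes([0x00, 0x10, 0x2A, 0x11, 0x7F, 0xFF, 0xFF, 0x03, 0x00])
print(find_pattern(blob, EXEC), find_pattern(blob, CNOP))
```

A pattern search like this is only a starting point for locating candidate shader code in a dump; matches still need to be checked by hand.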


Texture Formats

{|
|+ Xenos Supported Texture Formats
! Format !! Description !! Xenos Specific !! Other
|-
| Uncompressed Bitmap || || No || Not recommended due to high VRAM use
|-
| DXT1/BC1 || 4x4 pixels packed into 64 bits

'''c0''' 16-bit color (RRRRRGGGGGGBBBBB)

'''c1''' 16-bit color (RRRRRGGGGGGBBBBB)

'''pixelIndices[16]''' 2-bit indices, '''c0''' = 00, '''c1''' = 01, '''c2''' = 10, '''c3''' = 11

'''c0''' > '''c1''': '''c2''' and '''c3''' linearly interpolate between '''c0''' and '''c1'''

'''c0''' <= '''c1''': '''c2''' is the midpoint of '''c0''' and '''c1''', and '''c3''' represents full transparency
|| No ||
|-
| DXT2/BC2 || 4x4 pixels packed into 128 bits

'''pixelAlphaTable[16]''' 4-bit alpha data

'''c0''' 16-bit color (RRRRRGGGGGGBBBBB)

'''c1''' 16-bit color (RRRRRGGGGGGBBBBB)

'''pixelIndices[16]''' 2-bit indices, '''c0''' = 00, '''c1''' = 01, '''c2''' = 10, '''c3''' = 11

'''c2''' and '''c3''' linearly interpolate between '''c0''' and '''c1'''. Data is considered pre-multiplied by alpha
|| No ||
|-
| DXT3/BC2 || 4x4 pixels packed into 128 bits

'''pixelAlphaTable[16]''' 4-bit alpha data

'''c0''' 16-bit color (RRRRRGGGGGGBBBBB)

'''c1''' 16-bit color (RRRRRGGGGGGBBBBB)

'''pixelIndices[16]''' 2-bit indices, '''c0''' = 00, '''c1''' = 01, '''c2''' = 10, '''c3''' = 11

'''c2''' and '''c3''' linearly interpolate between '''c0''' and '''c1'''. Data is considered NOT pre-multiplied by alpha
|| No ||
|-
| DXT3A || 4x4 pixels packed into 64 bits

'''pixelScalars[16]''' 4-bit scalar value
|| Yes ||
|-
| DXT3A as 1111 || Same as DXT3A, but each bit of the 4-bit scalar is a mask for the respective channel || Yes ||
|-
| DXT4/BC3 || 4x4 pixels packed into 128 bits

'''a0''' 8-bit alpha value

'''a1''' 8-bit alpha value

'''pixelAlphaIndices[16]''' 3-bit index

'''c0''' 16-bit color (RRRRRGGGGGGBBBBB)

'''c1''' 16-bit color (RRRRRGGGGGGBBBBB)

'''pixelIndices[16]''' 2-bit indices

If '''a0''' > '''a1''', '''a2'''-'''a7''' are linearly interpolated between '''a0''' and '''a1''', else only '''a2'''-'''a5''' are linearly interpolated while '''a6''' and '''a7''' are 0 and 255 respectively. Data is considered pre-multiplied by alpha
|| No ||
|-
| DXT5/BC3 || 4x4 pixels packed into 128 bits

'''a0''' 8-bit alpha value

'''a1''' 8-bit alpha value

'''pixelAlphaIndices[16]''' 3-bit index

'''c0''' 16-bit color (RRRRRGGGGGGBBBBB)

'''c1''' 16-bit color (RRRRRGGGGGGBBBBB)

'''pixelIndices[16]''' 2-bit indices

If '''a0''' > '''a1''', '''a2'''-'''a7''' are linearly interpolated between '''a0''' and '''a1''', else only '''a2'''-'''a5''' are linearly interpolated while '''a6''' and '''a7''' are 0 and 255 respectively. Data is considered NOT pre-multiplied by alpha
|| No ||
|-
| DXT5A || 4x4 pixels packed into 64 bits

'''s0''' 8-bit scalar

'''s1''' 8-bit scalar

'''pixelScalarIndices[16]''' 3-bit indices

Scalars are linearly interpolated between '''s0''' and '''s1'''
|| Yes ||
|-
| DXN || 4x4 pixels packed into 128 bits

Same as DXT5A but with two scalar channels
|| Yes ||
|-
| CTX1 || 4x4 pixels packed into 64 bits

'''s0''' 8-bit scalar channel 0

'''s1''' 8-bit scalar channel 1

'''s2''' 8-bit scalar channel 2

'''s3''' 8-bit scalar channel 3

'''pixelScalarIndices[16]''' 2-bit indices, shared between both channels
|| Yes ||
|}
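
The palette rules in the table can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a reference decoder: it takes values that have already been read out of the block as integers (byte order and texture tiling are ignored), it uses truncating integer division where real hardware may round differently, and it assumes the first pixel uses the lowest two bits of the index word. Applying the a0/a1 rule to DXT5A scalars is also an assumption. All function names are hypothetical.

```python
def rgb565_to_rgb888(c):
    """Expand a 16-bit RRRRRGGGGGGBBBBB color to 8 bits per channel."""
    r, g, b = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    # Replicate the high bits into the low bits so 31 -> 255 and 63 -> 255.
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def bc1_palette(c0, c1):
    """Four (r, g, b, a) palette entries for a DXT1/BC1 block."""
    p0, p1 = rgb565_to_rgb888(c0), rgb565_to_rgb888(c1)
    if c0 > c1:
        # c2 and c3 sit one third and two thirds of the way from c0 to c1.
        c2 = tuple((2 * a + b) // 3 for a, b in zip(p0, p1))
        c3 = tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))
        return [p0 + (255,), p1 + (255,), c2 + (255,), c3 + (255,)]
    # Otherwise c2 is the midpoint and c3 is fully transparent.
    c2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
    return [p0 + (255,), p1 + (255,), c2 + (255,), (0, 0, 0, 0)]


def bc1_decode_block(c0, c1, indices):
    """Map 16 two-bit indices (packed in a 32-bit integer) to palette entries."""
    palette = bc1_palette(c0, c1)
    return [palette[(indices >> (2 * i)) & 0b11] for i in range(16)]


def interp8_palette(v0, v1):
    """Eight-entry palette for DXT4/DXT5 alpha (assumed to also fit DXT5A)."""
    if v0 > v1:
        # Six interpolated values between v0 and v1.
        return [v0, v1] + [((7 - i) * v0 + i * v1) // 7 for i in range(1, 7)]
    # Four interpolated values, then the fixed 0 and 255 entries.
    return [v0, v1] + [((5 - i) * v0 + i * v1) // 5 for i in range(1, 5)] + [0, 255]


print(bc1_palette(0xFFFF, 0x0000))
print(interp8_palette(0, 255))
```

Keeping the palette construction separate from the index lookup makes it easy to reuse `interp8_palette` for each channel of a two-channel format such as DXN.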

Useful Links

To do: Organize these better

https://www.techpowerup.com/gpu-specs/xbox-360-gpu-90nm.c1919

https://learn.microsoft.com/en-us/windows/win32/direct3d9/dx9-graphics-reference

https://learn.microsoft.com/en-us/windows/win32/direct3d9/dx9-graphics-programming-guide

https://learn.microsoft.com/en-us/windows/win32/direct3d10/d3d10-graphics-reference

https://learn.microsoft.com/en-us/windows/win32/direct3d10/d3d10-graphics-programming-guide

https://xenia.jp/updates/2021/04/27/leaving-no-pixel-behind-new-render-target-cache-3x3-resolution-scaling.html

https://en.wikipedia.org/wiki/Unified_shader_model

https://fileadmin.cs.lth.se/cs/Personal/Michael_Doggett/talks/unc-xenos-doggett.pdf

http://www.students.science.uu.nl/~3220516/advancedgraphics/papers/inferred_lighting.pdf

https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/R600_Instruction_Set_Architecture.pdf

https://en.wikipedia.org/wiki/S3_Texture_Compression

http://web.archive.org/web/20100423054747/http://msdn.microsoft.com:80/en-us/library/bb313877.aspx