Bootrom

From GC-Forever Wiki

Gamecube bootrom reverse engineering for CubeDocumented project

This reversing does not pretend to be compilable, and can be used for educational purposes only. I don't hold any responsibility for damage to your Gamecube clock settings, memory card saves or anything else resulting from use of the code/information presented in this document.

Documented by Andrei Chestakov (org). Version 1.1. (19 Jul 2005)

Additions and corrections by tmbinc.

19 Jul 2005: Small corrections regarding "BS" acronym (see Appendix).

PART I. Reset Vector. Boot Stage.

Bootrom NTSC 1.0 version is used for reverse engineering.

See brief description of BS run flow at the end of disassembly.

After a hardware reset (POWER ON button pressed), the CPU jumps to the reset vector. The reset exception handler offset is 0x00100 by default, but Gekko comes out of reset with the MSR[IP] bit set, so the final address is 0xFFF00100. Physical addresses 0xFFFxxxxx are mapped by Flipper's memory interface to the first megabyte of the GC bootrom area. Any exception handler is entered with memory translation disabled.
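
The address arithmetic above can be written down as a minimal sketch. The function and macro names are illustrative only, and the MSR[IP] mask (0x40) is taken from the generic PowerPC MSR layout rather than from this document.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IP 0x00000040u   /* exception prefix bit (generic PowerPC layout, assumed) */

/* Physical address of an exception handler for a given vector offset
   (0x0100 for system reset), depending on MSR[IP]. */
uint32_t exception_vector(uint32_t msr, uint32_t offset)
{
    uint32_t prefix = (msr & MSR_IP) ? 0xFFF00000u : 0x00000000u;
    return prefix | offset;
}

/* Usage example: reset with IP set lands in the bootrom,
   while MSR = 0x2000 (IP clear) selects low memory vectors. */
void exception_vector_example(void)
{
    assert(exception_vector(MSR_IP, 0x0100u) == 0xFFF00100u);
    assert(exception_vector(0x00002000u, 0x0100u) == 0x00000100u);
}
```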

This part of the GC bootrom is called BS (Bootstrap Stage). The bootstrap is written in assembly language. Its main purpose is to load the second logical part of the bootrom, called BS2 or IPL (Initial Program Loader). BS is placed in the bootrom at offsets 0x000-0x800. The disassembly was obtained from the Dolwin Debugger; skipped parts are filled with zeroes and not shown. Well, let's go..

Set HID0 to 0x00110C64. This will initialize Gekko implementation specifics. Meaning of this operation :

  • Mask machine check exception (MCP line).
  • Ignore memory bus parity errors.
  • Enable dynamic power management mode.
  • Check hard reset bit (to distinguish software/hardware resets)
  • [!] Disable data and instruction cache.
  • Cache invalidation will not write-back to memory.
  • Disable memory coherency for instruction fetch.
  • Disable gathering of non-word accesses.
  • Force data cache to set invalid entries on data miss.
  • Enable branch-instruction cache (BTIC).
  • [!] Disable broadcasting. CPU now can read/write data only from data cache.
  • Enable on-chip branch history (Gekko has 512-entry BHT).
  • Enable dcbt and dcbtst instructions.
:FFF00100   lis         r4, 0x0011
:FFF00104   addi        r4, r4, 3172
:FFF00108   mtspr       HID0, r4
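
As a cross-check of the list above, the value 0x00110C64 can be split into named bits. This is a minimal sketch: the masks and names follow the HID0 layout of the PowerPC 750 family and are assumed (not confirmed by this document) to apply to Gekko unchanged; hid0_names is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* HID0 bit masks as named in the PowerPC 750 family manuals
   (assumed here to apply to Gekko unchanged). */
struct hid0_bit { uint32_t mask; const char *name; };

static const struct hid0_bit hid0_bits[] = {
    { 0x80000000u, "EMCP" }, { 0x01000000u, "PAR"    },
    { 0x00100000u, "DPM"  }, { 0x00010000u, "NHR"    },
    { 0x00008000u, "ICE"  }, { 0x00004000u, "DCE"    },
    { 0x00000800u, "ICFI" }, { 0x00000400u, "DCFI"   },
    { 0x00000200u, "SPD"  }, { 0x00000100u, "IFEM"   },
    { 0x00000080u, "SGE"  }, { 0x00000040u, "DCFA"   },
    { 0x00000020u, "BTIC" }, { 0x00000008u, "ABE"    },
    { 0x00000004u, "BHT"  }, { 0x00000001u, "NOOPTI" },
};

/* Writes the space-separated names of the set bits into out
   and returns how many named bits were set. */
size_t hid0_names(uint32_t value, char *out, size_t outlen)
{
    size_t i, count = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof hid0_bits / sizeof hid0_bits[0]; i++) {
        if (value & hid0_bits[i].mask) {
            if (count > 0)
                strncat(out, " ", outlen - strlen(out) - 1);
            strncat(out, hid0_bits[i].name, outlen - strlen(out) - 1);
            count++;
        }
    }
    return count;
}

/* Usage example with the value loaded at 0xFFF00100. */
void hid0_example(void)
{
    char buf[128];

    assert(hid0_names(0x00110C64u, buf, sizeof buf) == 7);
    assert(strcmp(buf, "DPM NHR ICFI DCFI DCFA BTIC BHT") == 0);
}
```

Under this layout ICE and DCE are clear in the initial value, which matches the "caches disabled" item in the list; they are set later by the ori with 0xC000.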

Set machine state register to 0x2000. This will initialize CPU program model. Meaning of this operation :

  • Disable power management.
  • [!] Disable interrupts and decrementer exception.
  • [!] Set supervisor privilege level.
  • Enable floating-point instructions.
  • Disable machine check exception.
  • Disable FPU exceptions.
  • [!] Set exception vectors base to point on low memory.
  • [!] Disable instruction and data address translation.
  • Disable Gekko performance monitor.
  • Clear recoverable exception flag.
  • Select big-endian mode (MSR[LE] = 0, default for PowerPC).
:FFF0010C   lis         r4, 0x0000
:FFF00110   addi        r4, r4, 8192
:FFF00114   mtmsr       r4
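
The MSR value can be decoded the same way. The sketch below uses the generic PowerPC MSR bit masks (an assumption, since the document does not list them); msr_decode and struct msr_state are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic PowerPC MSR bit masks (assumed to match Gekko). */
#define MSR_POW 0x00040000u  /* power management enable */
#define MSR_EE  0x00008000u  /* external interrupts and decrementer */
#define MSR_PR  0x00004000u  /* 1 = user mode, 0 = supervisor */
#define MSR_FP  0x00002000u  /* floating-point available */
#define MSR_ME  0x00001000u  /* machine check enable */
#define MSR_IP  0x00000040u  /* exception prefix (0xFFFxxxxx when set) */
#define MSR_IR  0x00000020u  /* instruction address translation */
#define MSR_DR  0x00000010u  /* data address translation */
#define MSR_LE  0x00000001u  /* little-endian mode */

struct msr_state {
    int power_management, interrupts, user_mode, fp_available;
    int machine_check, vectors_high, instr_translation;
    int data_translation, little_endian;
};

void msr_decode(uint32_t msr, struct msr_state *s)
{
    assert(s != NULL);
    s->power_management  = (msr & MSR_POW) != 0;
    s->interrupts        = (msr & MSR_EE) != 0;
    s->user_mode         = (msr & MSR_PR) != 0;
    s->fp_available      = (msr & MSR_FP) != 0;
    s->machine_check     = (msr & MSR_ME) != 0;
    s->vectors_high      = (msr & MSR_IP) != 0;
    s->instr_translation = (msr & MSR_IR) != 0;
    s->data_translation  = (msr & MSR_DR) != 0;
    s->little_endian     = (msr & MSR_LE) != 0;
}

/* Usage example: in 0x2000 only the FP bit is set. */
void msr_example(void)
{
    struct msr_state s;

    msr_decode(0x00002000u, &s);
    assert(s.fp_available && !s.interrupts && !s.user_mode);
    assert(!s.vectors_high && !s.instr_translation && !s.data_translation);
    assert(!s.little_endian);
}
```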

Initialize auxiliary memory (ARAM). The meaning of the bits in the 0x5012 register is still unknown. The register is set to 0x40 | 3 (0x43). According to the ARAM library, the last two bits are set only when an ARAM expansion is present. Set ARAM refresh value to 156MHz (0x501A register).

:FFF00118   lis         r4, 0x0C00
:FFF0011C   addi        r4, r4, 0x5012
:FFF00120   li          r5, 67
:FFF00124   sth         r5, 0 (r4)          ; Set 0x5012 register value to 0x43.
:FFF00128   li          r5, 156
:FFF0012C   sth         r5, 8 (r4)          ; Set ARAM refresh value.

Initialize Flipper memory interface. Meaning of 0x4026 register is still unknown. Set this register value to 0x40.

:FFF00130   lis         r3, 0x0C00
:FFF00134   ori         r3, r3, 0x4000
:FFF00138   li          r4, 64
:FFF0013C   sth         r4, 38 (r3)         ; Set 0x4026 register value to 0x40.
:FFF00140   nop
:FFF00144   nop

Enable data and instruction cache (set HID0[ICE] and HID0[DCE])

:FFF00148   mfspr       r3, HID0
:FFF0014C   ori         r4, r3, 0xC000
:FFF00150   mtspr       HID0, r4
:FFF00154   nop
:FFF00158   nop
:FFF0015C   nop
:FFF00160   isync

Initialize CPU memory model. Clear the upper BAT registers (which invalidates all BATs) and set all segment registers to 0x80000000.

:FFF00164   li          r4, 0
:FFF00168   mtspr       DBAT0U, r4
:FFF0016C   mtspr       DBAT1U, r4
:FFF00170   mtspr       DBAT2U, r4
:FFF00174   mtspr       DBAT3U, r4
:FFF00178   mtspr       IBAT0U, r4
:FFF0017C   mtspr       IBAT1U, r4
:FFF00180   mtspr       IBAT2U, r4
:FFF00184   mtspr       IBAT3U, r4
:FFF00188   isync
:FFF0018C   lis         r4, 0x8000
:FFF00190   addi        r4, r4, 0
:FFF00194   mtsr        0, r4
:FFF00198   mtsr        1, r4
:FFF0019C   mtsr        2, r4
:FFF001A0   mtsr        3, r4
:FFF001A4   mtsr        4, r4
:FFF001A8   mtsr        5, r4
:FFF001AC   mtsr        6, r4
:FFF001B0   mtsr        7, r4
:FFF001B4   mtsr        8, r4
:FFF001B8   mtsr        9, r4
:FFF001BC   mtsr        10, r4
:FFF001C0   mtsr        11, r4
:FFF001C4   mtsr        12, r4
:FFF001C8   mtsr        13, r4
:FFF001CC   mtsr        14, r4
:FFF001D0   mtsr        15, r4

Configure the memory model. Dolphin OS uses MIPS-like translation to simplify memory operations. The segmented memory model (page tables) is not used by Dolphin OS, although some applications can use page tables to simulate direct ARAM access. BAT registers 0 and 1 are reserved by Dolphin OS. BAT registers 2 and 3 can be used by applications. BAT register 3 is temporarily used by the bootrom.

Configuration of DBAT and IBAT registers in Dolphin OS is as follows:

DBAT0: 80001FFF 00000002    Write-back cached main memory, 256MB block.
DBAT1: C0001FFF 0000002A    Cache-inhibited, guarded main memory and hardware registers, 256MB block.
DBAT2: 00000000 xxxxxxxx    Don't care, reserved.
DBAT3: FFF0001F FFF00001    Bootrom, 1MB block.
IBAT0: 80001FFF 00000002    Write-back cached main memory, 256MB block.
IBAT1: 00000000 xxxxxxxx    Don't care, reserved.
IBAT2: 00000000 xxxxxxxx    Don't care, reserved.
IBAT3: FFF0001F FFF00001    Bootrom, 1MB block.

Write-back means that writes go only to the cache and reach memory only after a cache "store" operation. Write-through means that writes go to main memory immediately (data in the cache always matches data in main memory). Cached reads are always served from the cache.

Note that DBAT1L = 0x0000002A has the WIMG bits I and G set (WIMG = 0101), so the 0xC0000000-0xCFFFFFFF block is cache-inhibited and guarded rather than write-through. Data accesses to hardware registers through this block therefore always go out on the bus.
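
The BAT values in the table can be unpacked with a few shifts and masks. This is a minimal sketch that assumes the standard 32-bit PowerPC BAT layout (BEPI/BL/Vs/Vp in the upper word, BRPN/WIMG/PP in the lower word); bat_decode and struct bat_info are hypothetical names. Under that layout the DBAT1 lower word 0x0000002A yields WIMG = 0101, i.e. the I and G bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bat_info {
    uint32_t effective_base;  /* BEPI: upper 15 bits of the upper word */
    uint32_t physical_base;   /* BRPN: upper 15 bits of the lower word */
    uint32_t block_bytes;     /* block size derived from BL */
    unsigned wimg;            /* 4-bit field: W=8, I=4, M=2, G=1 */
    unsigned pp;              /* protection: 1 = read only, 2 = read/write */
    int valid;                /* Vs or Vp set */
};

void bat_decode(uint32_t upper, uint32_t lower, struct bat_info *out)
{
    uint32_t bl;

    assert(out != NULL);
    bl = (upper >> 2) & 0x7FFu;               /* block length mask */
    out->effective_base = upper & 0xFFFE0000u;
    out->physical_base  = lower & 0xFFFE0000u;
    out->block_bytes    = (bl + 1u) << 17;    /* 128KB granularity */
    out->wimg           = (lower >> 3) & 0xFu;
    out->pp             = lower & 0x3u;
    out->valid          = (upper & 0x3u) != 0;
}

/* Usage example with the DBAT1 and DBAT3 rows of the table. */
void bat_example(void)
{
    struct bat_info b;

    bat_decode(0xC0001FFFu, 0x0000002Au, &b);
    assert(b.effective_base == 0xC0000000u && b.physical_base == 0u);
    assert(b.block_bytes == 0x10000000u);     /* 256MB */
    assert(b.wimg == 0x5u && b.pp == 2u);     /* I and G set, read/write */

    bat_decode(0xFFF0001Fu, 0xFFF00001u, &b);
    assert(b.block_bytes == 0x00100000u);     /* 1MB */
    assert(b.wimg == 0u && b.pp == 1u);       /* cached, read only */
}
```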

Here are the Dolphin OS (effective address) and GC (physical address) memory maps. Note that the bootrom's effective address mapping is useful only during hard reset.

Effective Address (Dolphin OS)

   80000000  24MB  Main Memory (RAM), write-back cached
   C0000000  24MB  Main Memory (RAM), uncached (cache-inhibited)
   C8000000   2MB  Embedded Framebuffer (EFB)
   CC000000        Command Processor (CP)
   CC001000        Pixel Engine (PE)
   CC002000        Video Interface (VI)
   CC003000        Peripheral Interface (PI)
   CC004000        Memory Interface (MI)
   CC005000        DSP and DMA Audio Interface (AID)
   CC006000        DVD Interface (DI)
   CC006400        Serial Interface (SI)
   CC006800        External Interface (EXI)
   CC006C00        Audio Streaming Interface (AIS)
   CC008000        PI FIFO (GX)
   FFF00000   1MB  Boot ROM (first megabyte), used during BS only.

Any other access first raises an ISI/DSI exception (if the address is not mapped by the MMU) and then a Flipper memory interface interrupt (if the physical address is missing or not allowed).

Physical Address (Flipper memory interface)

   00000000  24MB  Main Memory (RAM)
   08000000   2MB  Embedded Framebuffer (EFB)
   0C000000        Command Processor (CP)
   0C001000        Pixel Engine (PE)
   0C002000        Video Interface (VI)
   0C003000        Peripheral Interface (PI)
   0C004000        Memory Interface (MI)
   0C005000        DSP and DMA Audio Interface (AID)
   0C006000        DVD Interface (DI)
   0C006400        Serial Interface (SI)
   0C006800        External Interface (EXI)
   0C006C00        Audio Streaming Interface (AIS)
   0C008000        PI FIFO (GX)
   FFF00000   1MB  Boot ROM (first megabyte)

Other memory access will generate Flipper memory interface interrupt.
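
The relation between the two maps can be sketched as a small function. It only models the three BAT blocks described in this document (0x80000000, 0xC0000000 and the bootrom block); ea_to_physical is a hypothetical helper, not an OS call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Translates a Dolphin OS effective address to a physical address.
   Returns 1 on success, 0 if no BAT block covers the address
   (the real CPU would raise ISI/DSI in that case). */
int ea_to_physical(uint32_t ea, uint32_t *pa)
{
    uint32_t top = ea & 0xF0000000u;

    assert(pa != NULL);
    if (top == 0x80000000u || top == 0xC0000000u) {
        *pa = ea & 0x0FFFFFFFu;   /* 256MB blocks mapped to physical 0 */
        return 1;
    }
    if (ea >= 0xFFF00000u) {
        *pa = ea;                 /* bootrom block, identity mapped (BS only) */
        return 1;
    }
    return 0;
}

/* Usage example: EXI registers and the reset vector. */
void ea_example(void)
{
    uint32_t pa;

    assert(ea_to_physical(0xCC006800u, &pa) == 1 && pa == 0x0C006800u);
    assert(ea_to_physical(0x81300000u, &pa) == 1 && pa == 0x01300000u);
    assert(ea_to_physical(0xFFF00100u, &pa) == 1 && pa == 0xFFF00100u);
    assert(ea_to_physical(0x00000100u, &pa) == 0);
}
```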

:FFF001D4   lis         r4, 0x0000
:FFF001D8   addi        r4, r4, 2
:FFF001DC   lis         r3, 0x8000
:FFF001E0   addi        r3, r3, 8191
:FFF001E4   mtspr       DBAT0L, r4
:FFF001E8   mtspr       DBAT0U, r3
:FFF001EC   isync
:FFF001F0   mtspr       IBAT0L, r4
:FFF001F4   mtspr       IBAT0U, r3
:FFF001F8   isync
:FFF001FC   lis         r4, 0x0000
:FFF00200   addi        r4, r4, 42
:FFF00204   lis         r3, 0xC000
:FFF00208   addi        r3, r3, 8191
:FFF0020C   mtspr       DBAT1L, r4
:FFF00210   mtspr       DBAT1U, r3
:FFF00214   isync
:FFF00218   lis         r4, 0xFFF0
:FFF0021C   addi        r4, r4, 1
:FFF00220   lis         r3, 0xFFF0
:FFF00224   addi        r3, r3, 31
:FFF00228   mtspr       DBAT3L, r4
:FFF0022C   mtspr       DBAT3U, r3
:FFF00230   isync
:FFF00234   mtspr       IBAT3L, r4
:FFF00238   mtspr       IBAT3U, r3
:FFF0023C   isync

Enable instruction and data translation by setting MSR[IR] and MSR[DR] to 1.

:FFF00240   mfmsr       r4
:FFF00244   ori         r4, r4, 0x0030      ; Enable address translation.
:FFF00248   mtmsr       r4
:FFF0024C   isync

Write 0x0245248A to 0x3030 register. Meaning is unknown. Register is unknown.

:FFF00250   lis         r3, 0xCC00
:FFF00254   ori         r3, r3, 0x3000
:FFF00258   lis         r4, 0x0245
:FFF0025C   ori         r4, r4, 0x248A
:FFF00260   stw         r4, 48 (r3)         ; Write 0x0245248A to 0x3030 register.

Reset DVD, through PI reset register. Meaning of bits in reset register is still unknown.

:FFF00264   lwz         r4, 36 (r3)         ; Read PI reset register.
:FFF00268   ori         r4, r4, 0x0001
:FFF0026C   rlwinm      r4, r4, 0, 31, 28   ; Keep bit 31 set, clear bits 29-30.
:FFF00270   stw         r4, 36 (r3)         ; Write new value in reset register.
:FFF00274   mftbl       r5
:FFF00278   mftbl       r6
:FFF0027C   sub         r7, r6, r5
:FFF00280   cmplwi      r7, 4388
:FFF00284   blt+        0xFFF00278          ; Wait 4388 time base ticks (~108 us if TB runs at 40.5MHz)
:FFF00288   ori         r4, r4, 0x0003      ; Set bits 30 and 31.
:FFF0028C   stw         r4, 36 (r3)         ; Write new value in reset register.

Allow 32MHz EXI clock setting by CPU.

:FFF00290   lis         r14, 0xCC00
:FFF00294   ori         r14, r14, 0x6400
:FFF00298   li          r4, 0
:FFF0029C   stw         r4, 60 (r14)        ; SI EXICLK[LOCK] = 0

Probe EXI AD16. It is still unknown what kind of hardware AD16 represents. It seems to be used to store the bootrom "trace" step (this is the only kind of value seen written to AD16 so far). AD16 runs at 8MHz.

AD16 trace state (which values are written to AD16 and when):

BS:    0x01000000      ?
       0x02000000      ?
       0x03000000      ?
       0x04000000      Memory test passed.
       0x05xxxxxx      \
       0x06xxxxxx       | Memory test failed.
       0x07xxxxxx      /
IPL:   0x08000000      IPL and OSInit called.
       0x09000000      DVDInit.
       0x0A000000      CARDInit.
       0x0B000000      VIInit.
       0x0C000000      PADInit.

To probe AD16 we must read its EXI ID. The ID read back is placed in R20 and should equal 0x04120000 (held in R21). The value is checked in the AD16 write routine.

:FFF002A0   lis         sd2, 0xCC00
:FFF002A4   ori         sd2, sd2, 0x6800
:FFF002A8   lis         r22, 0x0000
:FFF002AC   ori         r22, r22, 0x00BA
:FFF002B0   li          r8, 1
:FFF002B4   li          r10, 0
:FFF002B8   lis         r21, 0x0412
:FFF002BC   ori         r21, r21, 0x0000
:FFF002C0   lis         r3, 0x0000
:FFF002C4   ori         r3, r3, 0x0000
:FFF002C8   lis         r7, 0x0000
:FFF002CC   ori         r7, r7, 0x0015
:FFF002D0   stw         r3, 56 (sd2)        ; EXI2 DATA = 0 (Get ID command)
:FFF002D4   stw         r22, 40 (sd2)       ; Select AD16, through EXI2 CSR.
:FFF002D8   lwz         r16, 40 (sd2)
:FFF002DC   stw         r7, 52 (sd2)        ; Write immediate 2 bytes from DATA.
:FFF002E0   lwz         r16, 52 (sd2)       ; \
:FFF002E4   and.        r16, r16, r8        ;  | Wait until transfer complete.
:FFF002E8   bgt+        0xFFF002E0          ; /
:FFF002EC   lis         r7, 0x0000
:FFF002F0   ori         r7, r7, 0x0031
:FFF002F4   stw         r7, 52 (sd2)        ; Read immediate 4 bytes to DATA (ID).
:FFF002F8   lwz         r16, 52 (sd2)       ; \
:FFF002FC   and.        r16, r16, r8        ;  | Wait until transfer complete.
:FFF00300   bgt+        0xFFF002F8          ; /
:FFF00304   stw         r10, 40 (sd2)       ; Deselect device.
:FFF00308   lwz         r16, 40 (sd2)       ; Read EXI2 CSR twice. Why? No idea.
:FFF0030C   lwz         r16, 40 (sd2)       ; Maybe its deselect attribute..
:FFF00310   lwz         r20, 56 (sd2)       ; r20 = DATA. It should contain ID.
:FFF00314   b           0xFFF00320
:FFF00318
:FFF0031C
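
The constants written to the EXI control registers in this listing (0x15, 0x31, 0x05, 0x35, 0x03, and the CSR values 0xBA and 0x150) can be decoded as sketched below. The field layout is the one commonly documented for the Flipper EXI and is an assumption here, as this document does not spell it out; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct exi_cr {
    int start;       /* bit 0: transfer start / busy */
    int dma;         /* bit 1: 1 = DMA, 0 = immediate */
    unsigned rw;     /* bits 2-3: 0 = read, 1 = write */
    unsigned bytes;  /* bits 4-5 plus one: immediate transfer length */
};

struct exi_cr exi_cr_decode(uint32_t cr)
{
    struct exi_cr d;

    d.start = (int)(cr & 1u);
    d.dma   = (int)((cr >> 1) & 1u);
    d.rw    = (cr >> 2) & 3u;
    d.bytes = ((cr >> 4) & 3u) + 1u;
    return d;
}

/* CSR bits 4-6 select the clock: 1MHz shifted left by the field value. */
unsigned exi_csr_clock_mhz(uint32_t csr) { return 1u << ((csr >> 4) & 7u); }

/* CSR bits 7-9 are the chip select lines (bit 0 of the result = device 0). */
unsigned exi_csr_device_mask(uint32_t csr) { return (csr >> 7) & 7u; }

/* Usage example with values from the listing. */
void exi_example(void)
{
    struct exi_cr d = exi_cr_decode(0x15u);   /* ID command */

    assert(d.start == 1 && d.dma == 0 && d.rw == 1u && d.bytes == 2u);
    d = exi_cr_decode(0x31u);                 /* ID read */
    assert(d.rw == 0u && d.bytes == 4u);
    assert(exi_csr_clock_mhz(0xBAu) == 8u);   /* AD16 select */
    assert(exi_csr_device_mask(0xBAu) == 1u); /* device 0 */
}
```

With the same decoding, the 0x0150 value used later for the bootrom selects device 1 at 32MHz, matching the comment in the IPL loader setup.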

Write a "trace step" value to AD16, but only when the probe succeeded (R20 = AD16 ID). The input value (trace step) must be in R15.

:FFF00320   b           0xFFF00344
:FFF00324   lis         r3, 0xA000
:FFF00328   ori         r3, r3, 0x0000
:FFF0032C   lis         r7, 0x0000
:FFF00330   ori         r7, r7, 0x0005
:FFF00334   stw         r3, 56 (sd2)        ; EXI2 DATA = 0xA0000000 (Write AD16 command)
:FFF00338   stw         r22, 40 (sd2)       ; Select AD16, through EXI2 CSR.
:FFF0033C   lwz         r16, 40 (sd2)
:FFF00340   b           0xFFF00348
:FFF00344   b           0xFFF00364
:FFF00348   stw         r7, 52 (sd2)        ; Write immediate 1 byte from DATA.
:FFF0034C   lwz         r16, 52 (sd2)       ; \
:FFF00350   and.        r16, r16, r8        ;  | Wait until transfer complete.
:FFF00354   bgt+        0xFFF0034C          ; /
:FFF00358   nop
:FFF0035C   nop
:FFF00360   b           0xFFF00368
:FFF00364   b           0xFFF00384
:FFF00368   lis         r7, 0x0000
:FFF0036C   ori         r7, r7, 0x0035
:FFF00370   stw         r15, 56 (sd2)       ; EXI2 DATA = trace step
:FFF00374   stw         r7, 52 (sd2)        ; Write immediate 4 bytes from DATA.
:FFF00378   lwz         r16, 52 (sd2)       ; \
:FFF0037C   and.        r16, r16, r8        ;  | Wait until transfer complete.
:FFF00380   b           0xFFF00388          ; /
:FFF00384   b           0xFFF003A0
:FFF00388   bgt+        0xFFF00378
:FFF0038C   stw         r10, 40 (sd2)       ; Deselect device.
:FFF00390   lwz         r16, 40 (sd2)
:FFF00394   lwz         r16, 40 (sd2)
:FFF00398   blr
:FFF0039C
:FFF003A0   b           0xFFF003B0
:FFF003A4   cmplw       r20, r21            ; If AD16 probe failed, then skip.
:FFF003A8   beq+        0xFFF00324
:FFF003AC   blr

Trace step 0x01 - Nothing ?

:FFF003B0   lis         r15, 0x0100         ; AD16 = 0x01000000
:FFF003B4   bl          0xFFF003A4
:FFF003B8   nop
:FFF003BC   nop
:FFF003C0   nop
:FFF003C4   nop
:FFF003C8   nop
:FFF003CC   nop
:FFF003D0   nop
:FFF003D4   b           0xFFF003E0
:FFF003D8
:FFF003DC

Trace step 0x02 - Nothing ?

:FFF003E0   nop
:FFF003E4   nop
:FFF003E8   nop
:FFF003EC   nop
:FFF003F0   nop
:FFF003F4   lis         r15, 0x0200         ; AD16 = 0x02000000
:FFF003F8   bl          0xFFF003A4
:FFF003FC   b           0xFFF00400

Memory self test with given pattern. Input parameters: R25 - base address, R26 - word pattern. Registers R17, 18, 19, 27, 28 and 29 are used to save failed address information and cleared before test.

Fail test information has following format:

R17     \
R18      | Boundary.
R19     /
R27     Number of fails on address with last digit 8,9,A,B,C,D,E,F.
R28     Number of fails on address with last digit 0,1,2,3,4,5,6,7.
R29     Holds last address where test is failed.

(Well, it's not important and only for addicted people :))

It is not a good idea to test all of memory with the same pattern. It is better to test memory at address X with value X (e.g. compare [0x80000000] with 0x80000000 etc).

Here is C version of this routine :

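typedef unsigned int u32;

u32 R17, R18, R19, R27, R28, R29;    // Fail information, cleared before test.

void SelfTest(u32 base, u32 pattern)
{
   u32 memsize = 24 * 1024 * 1024;
   u32 i, val, addr;

   volatile u32 *ptr = (volatile u32 *)base;

   // Fill memory by test pattern, 32 bytes per iteration.
   for(i=0; i<memsize/32; i++)
   {
       *ptr++ = pattern;
       *ptr++ = pattern;
       *ptr++ = pattern;
       *ptr++ = pattern;
       *ptr++ = pattern;
       *ptr++ = pattern;
       *ptr++ = pattern;
       *ptr++ = pattern;
   }

   ptr = (volatile u32 *)base;

   // Test memory, one word per iteration.
   for(i=0; i<memsize/4; i++)
   {
       addr = (u32)ptr;
       val = *ptr;

       if(val != pattern)
       {
           // All three masks start from (address >> 18), as in the disassembly.
           R17 |= 1u << ((addr >> 18) & 0x1f);
           R18 |= 1u << (((addr >> 18) - 32) & 0x1f);
           R19 |= 1u << (((addr >> 18) - 64) & 0x1f);

           if( (addr & 0xf) < 8 ) R28++;
           else R27++;

           if(R29 < addr) R29 = addr;
       }

       ptr++;
   }
}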
:FFF00400   b           0xFFF00424
:FFF00404   nop
:FFF00408   nop
:FFF0040C   nop
:FFF00410   nop
:FFF00414   nop
:FFF00418   mr          r23, r25
:FFF0041C   lis         r24, 0x0180         ; Main memory size (24MB)
:FFF00420   b           0xFFF00428
:FFF00424   b           0xFFF00444
:FFF00428   rlwinm      r24, r24, 27, 5, 31 ; Fill memory (by 32 byte portions).
:FFF0042C   mtctr       r24
:FFF00430   stw         r26, 0 (r23)
:FFF00434   stw         r26, 4 (r23)
:FFF00438   stw         r26, 8 (r23)
:FFF0043C   stw         r26, 12 (r23)
:FFF00440   b           0xFFF00448
:FFF00444   b           0xFFF00464
:FFF00448   stw         r26, 16 (r23)
:FFF0044C   stw         r26, 20 (r23)
:FFF00450   stw         r26, 24 (r23)
:FFF00454   stw         r26, 28 (r23)
:FFF00458   addi        r23, r23, 32
:FFF0045C   bdnz+       0xFFF00430
:FFF00460   b           0xFFF00468
:FFF00464   b           0xFFF00484
:FFF00468   mr          r23, r25
:FFF0046C   lis         r24, 0x0180
:FFF00470   rlwinm      r24, r24, 30, 2, 31
:FFF00474   mtctr       r24                 
:FFF00478   lwz         r15, 0 (r23)        ; Begin to test.
:FFF0047C   cmplw       r15, r26
:FFF00480   b           0xFFF00488
:FFF00484   b           0xFFF004A4
:FFF00488   beq-        0xFFF00514
:FFF0048C   rlwinm      r15, r23, 14, 18, 31
:FFF00490   andi.       r15, r15, 0x001F
:FFF00494   li          r16, 1
:FFF00498   slw         r16, r16, r15
:FFF0049C   or          r17, r17, r16
:FFF004A0   b           0xFFF004A8
:FFF004A4   b           0xFFF004C4
:FFF004A8   rlwinm      r15, r23, 14, 18, 31
:FFF004AC   subi        r15, r15, 32
:FFF004B0   andi.       r15, r15, 0x001F
:FFF004B4   li          r16, 1
:FFF004B8   slw         r16, r16, r15
:FFF004BC   or          r18, r18, r16
:FFF004C0   b           0xFFF004C8
:FFF004C4   b           0xFFF004E4
:FFF004C8   rlwinm      r15, r23, 14, 18, 31
:FFF004CC   subi        r15, r15, 64
:FFF004D0   andi.       r15, r15, 0x001F
:FFF004D4   li          r16, 1
:FFF004D8   slw         r16, r16, r15
:FFF004DC   or          r19, r19, r16
:FFF004E0   b           0xFFF004E8
:FFF004E4   b           0xFFF00504
:FFF004E8   rlwinm      r15, r23, 0, 28, 31
:FFF004EC   cmplwi      r15, 8
:FFF004F0   bge-        0xFFF004FC
:FFF004F4   addi        r28, r28, 1
:FFF004F8   b           0xFFF00514
:FFF004FC   addi        r27, r27, 1
:FFF00500   b           0xFFF00508
:FFF00504   b           0xFFF00520
:FFF00508   cmplw       r29, r23
:FFF0050C   bge-        0xFFF00514
:FFF00510   mr          r29, r23
:FFF00514   addi        r23, r23, 4
:FFF00518   bdnz+       0xFFF00478
:FFF0051C   blr
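
The suggestion made earlier (test memory at address X with value X) can be sketched as follows. This is not what BS does; it is an illustration that runs on an ordinary buffer, with base standing in for the address of the first word. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores in every word the address that word would have. */
void address_pattern_fill(uint32_t *mem, size_t words, uint32_t base)
{
    size_t i;

    assert(mem != NULL);
    for (i = 0; i < words; i++)
        mem[i] = base + (uint32_t)(i * 4);
}

/* Returns the number of words that do not hold their own address. */
size_t address_pattern_verify(const uint32_t *mem, size_t words, uint32_t base)
{
    size_t i, fails = 0;

    assert(mem != NULL);
    for (i = 0; i < words; i++)
        if (mem[i] != base + (uint32_t)(i * 4))
            fails++;
    return fails;
}

/* Usage example: a simulated address line fault that makes two words
   hold the same data is caught, which a constant pattern would miss. */
void address_pattern_example(void)
{
    uint32_t buf[64];

    address_pattern_fill(buf, 64, 0x80000000u);
    assert(address_pattern_verify(buf, 64, 0x80000000u) == 0);
    buf[5] = buf[4];
    assert(address_pattern_verify(buf, 64, 0x80000000u) == 1);
}
```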

Clear registers for memory test (see next). Trace step 0x03 - Nothing ?

:FFF00520   li          r17, 0
:FFF00524   li          r18, 0
:FFF00528   li          r19, 0
:FFF0052C   li          r27, 0
:FFF00530   li          r28, 0
:FFF00534   li          r29, 0
:FFF00538   lis         r15, 0x0300         ; AD16 = 0x03000000
:FFF0053C   bl          0xFFF003A4
:FFF00540   b           0xFFF00560
:FFF00544
:FFF00548
:FFF0054C
:FFF00550
:FFF00554
:FFF00558
:FFF0055C

Test memory with two patterns (0xAAAAAAAA and its complement 0x55555555). Results are saved in AD16:

0x04000000      - Test passed.
0x05xxxxxx      - Failed on address with last digit 0,1,2,3,4,5,6,7.
0x06xxxxxx      - Failed on address with last digit 8,9,A,B,C,D,E,F.
0x07xxxxxx      - Failed on address with last digit 0-F.

'x' will represent bits 6...29 of last address, where test failed.

:FFF00560   b           0xFFF00584
:FFF00564   lis         r25, 0x8000
:FFF00568   lis         r26, 0xAAAA
:FFF0056C   ori         r26, r26, 0xAAAA
:FFF00570   bl          0xFFF00404          ; Test memory with 0xAA pattern.
:FFF00574   not         r26, r26
:FFF00578   bl          0xFFF00404          ; Test memory with 0x55 pattern.
:FFF0057C   nop
:FFF00580   b           0xFFF00588
:FFF00584   b           0xFFF005A4
:FFF00588   lis         r15, 0x0400
:FFF0058C   mr.         r16, r27
:FFF00590   beq-        0xFFF00598
:FFF00594   oris        r15, r15, 0x0200
:FFF00598   mr.         r16, r28
:FFF0059C   beq-        0xFFF005AC
:FFF005A0   b           0xFFF005A8
:FFF005A4   b           0xFFF005C4
:FFF005A8   oris        r15, r15, 0x0100
:FFF005AC   rlwinm      r29, r29, 30, 2, 31
:FFF005B0   or          r15, r15, r29
:FFF005B4   bl          0xFFF003A4          ; Set AD16 value.
:FFF005B8   nop
:FFF005BC   nop
:FFF005C0   b           0xFFF005CC
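
The way the AD16 result word is assembled can be expressed in C. The sketch below is a literal transcription of the instructions at 0xFFF00588-0xFFF005B0 as listed here; the function name is hypothetical. Note that, as transcribed, the shifted address is ORed in without masking, so for addresses in the 0x80000000 region it also contributes bits above the low 24.

```c
#include <assert.h>
#include <stdint.h>

/* r27: fail count for addresses with last digit 8-F,
   r28: fail count for addresses with last digit 0-7,
   r29: last failed address. */
uint32_t ad16_memtest_result(uint32_t r27, uint32_t r28, uint32_t r29)
{
    uint32_t r15 = 0x04000000u;        /* lis r15, 0x0400 */

    if (r27 != 0)
        r15 |= 0x02000000u;            /* oris r15, r15, 0x0200 */
    if (r28 != 0)
        r15 |= 0x01000000u;            /* oris r15, r15, 0x0100 */
    return r15 | (r29 >> 2);           /* rlwinm r29, r29, 30, 2, 31 ; or */
}

/* Usage example. */
void ad16_memtest_example(void)
{
    assert(ad16_memtest_result(0, 0, 0) == 0x04000000u);          /* passed */
    assert(ad16_memtest_result(1, 1, 0) == 0x07000000u);          /* both halves */
    assert(ad16_memtest_result(0, 1, 0x80000010u) == 0x25000004u);
}
```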

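Halt execution if memory test failed. Note that the test itself (0xFFF00564) is entered only when the AD16 probe succeeded (R20 = R21); otherwise R27 and R28 stay zero and execution continues.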

:FFF005C4   cmplw       r20, r21
:FFF005C8   beq+        0xFFF00564
:FFF005CC   mr.         r16, r27            ; Bad address with last digit 8-F ?
:FFF005D0   bne+        0xFFF005CC
:FFF005D4   mr.         r16, r28            ; Bad address with last digit 0-7 ?
:FFF005D8   bne+        0xFFF005CC

Prepare GPR registers for IPL loading.

:FFF005DC   lis         sd2, 0xCC00         ; EXI registers base
:FFF005E0   ori         sd2, sd2, 0x6800
:FFF005E4   lis         r6, 0x0000          ; EXI0 CSR setup: device 1, 32MHz
:FFF005E8   ori         r6, r6, 0x0150
:FFF005EC   lis         r7, 0x0000
:FFF005F0   ori         r7, r7, 0x0035
:FFF005F4   li          r8, 1
:FFF005F8   lis         r9, 0x0000
:FFF005FC   ori         r9, r9, 0x0003
:FFF00600   li          r10, 0
:FFF00604   lis         r11, 0x0000         ; Max length of single transfer
:FFF00608   ori         r11, r11, 0x0400
:FFF0060C   lis         r12, 0x0001
:FFF00610   ori         r12, r12, 0x0000
:FFF00614   lis         r3, 0x0002          ; Bootrom starting offset (0x800)
:FFF00618   ori         r3, r3, 0x0000
:FFF0061C   lis         r4, 0x012F          ; Main memory starting address
:FFF00620   ori         r4, r4, 0xFFE0
:FFF00624   lis         sd1, 0x0017         ; Transfer length
:FFF00628   ori         sd1, sd1, 0x0000
:FFF0062C   b           0xFFF00640
:FFF00630
:FFF00634
:FFF00638
:FFF0063C
:FFF00640   b           0xFFF00664

Final step. Transfer IPL to main memory. EXI is used for the DMA transfer, since the bootrom is mapped as an EXI device. The maximum length of a single DMA transfer is 1024 bytes, so the whole IPL is loaded piece by piece (this is not a hardware limitation, but a BS specific).

Starting memory address: 0x012FFFE0. Starting bootrom offset: 0x800. Total length of transfer: 0x170000 bytes.

The bootrom scrambler decrypts data on the fly.

:FFF00644   cmpwi       sd1, 0              ; All bytes transferred ?
:FFF00648   beq-        0xFFF006D4
:FFF0064C   mr          r5, r11
:FFF00650   cmplw       sd1, r5
:FFF00654   bgt-        0xFFF0065C
:FFF00658   mr          r5, sd1
:FFF0065C   stw         r6, 0 (sd2)         ; Select bootrom, through EXI0 CSR.
:FFF00660   b           0xFFF00668
:FFF00664   b           0xFFF00684
:FFF00668   stw         r3, 16 (sd2)        ; EXI0 DATA - offset in bootrom + write command
:FFF0066C   lwz         r16, 0 (sd2)
:FFF00670   stw         r7, 12 (sd2)        ; Write immediate 4 bytes from DATA.
:FFF00674   lwz         r16, 12 (sd2)       ; \
:FFF00678   and.        r16, r16, r8        ;  | Wait until transfer complete.
:FFF0067C   bgt+        0xFFF00674          ; /
:FFF00680   b           0xFFF00688
:FFF00684   b           0xFFF006A4
:FFF00688   stw         r4, 4 (sd2)         ; EXI0 MAR - DMA memory address.
:FFF0068C   lwz         r4, 4 (sd2)
:FFF00690   stw         r5, 8 (sd2)         ; EXI0 LEN - DMA transfer length.
:FFF00694   lwz         r5, 8 (sd2)
:FFF00698   stw         r9, 12 (sd2)        ; Start EXI0 DMA transfer (bootrom to memory).
:FFF0069C   lwz         r16, 12 (sd2)
:FFF006A0   b           0xFFF006A8
:FFF006A4   b           0xFFF006C4          ; \
:FFF006A8   and.        r16, r16, r8        ;  | Wait until transfer complete.
:FFF006AC   bgt+        0xFFF0069C          ; /
:FFF006B0   stw         r10, 0 (sd2)        ; Deselect device.
:FFF006B4   lwz         r16, 0 (sd2)
:FFF006B8   lwz         r16, 0 (sd2)
:FFF006BC   add         r3, r3, r12
:FFF006C0   b           0xFFF006C8
:FFF006C4   b           0xFFF006E4
:FFF006C8   add         r4, r4, r11         ; Advance pointers.
:FFF006CC   sub         sd1, sd1, r5
:FFF006D0   b           0xFFF00644
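
The loop above can be modelled without any hardware access to see how many pieces are transferred and where the data ends up. This is a sketch: the constants are the register values loaded at 0xFFF005DC-0xFFF00628, the reading of r3 as "offset << 6" is inferred from the 0x800 / 0x00020000 pair, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ROM_START_CMD 0x00020000u  /* r3:  bootrom offset 0x800 << 6 */
#define ROM_CMD_STEP  0x00010000u  /* r12: 0x400 << 6 */
#define MEM_START     0x012FFFE0u  /* r4 */
#define CHUNK_MAX     0x00000400u  /* r11 */
#define TOTAL_LENGTH  0x00170000u  /* sd1 */

struct ipl_plan {
    uint32_t chunks;            /* number of DMA transfers */
    uint32_t last_rom_command;  /* command word of the final transfer */
    uint32_t end_address;       /* memory address after the final transfer */
};

struct ipl_plan ipl_plan_transfer(void)
{
    uint32_t cmd = ROM_START_CMD, mem = MEM_START, remaining = TOTAL_LENGTH;
    struct ipl_plan p = { 0, 0, 0 };

    while (remaining != 0) {
        uint32_t len = remaining > CHUNK_MAX ? CHUNK_MAX : remaining;

        /* BS would send cmd to the bootrom and DMA len bytes to mem here. */
        p.last_rom_command = cmd;
        cmd += ROM_CMD_STEP;
        mem += CHUNK_MAX;       /* BS adds r11, not the actual length */
        remaining -= len;
        p.chunks++;
    }
    p.end_address = mem;
    return p;
}

/* Usage example. */
void ipl_plan_example(void)
{
    struct ipl_plan p = ipl_plan_transfer();

    assert(p.chunks == 1472u);                 /* 0x170000 / 0x400 */
    assert(p.last_rom_command == 0x05C10000u);
    assert(p.end_address == 0x0146FFE0u);
}
```

The entrypoint 0x81300000 used afterwards lies 0x20 bytes past the start address 0x012FFFE0 (seen through the 0x80000000 mapping).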

Set link register to IPL entrypoint.

:FFF006D4   lis         r4, 0x8130
:FFF006D8   ori         r4, r4, 0x0000
:FFF006DC   mtlr        r4                  ; LR = 0x81300000
:FFF006E0   b           0xFFF006E8

Disable bootrom decryption logic and disallow 32MHz EXI clock setting by CPU.

:FFF006E4   b           0xFFF00704
:FFF006E8   lis         r6, 0x0000
:FFF006EC   ori         r6, r6, 0x2000
:FFF006F0   stw         r6, 0 (sd2)         ; Set ROMDIS bit in EXI0 CSR.
:FFF006F4   li          r4, 1
:FFF006F8   stw         r4, 60 (r14)        ; SI EXICLK[LOCK] = 1
:FFF006FC   lwz         r4, 60 (r14)
:FFF00700   b           0xFFF00708
:FFF00704   b           0xFFF00720

Clear OS pointer to DVD BI2 location. Jump to IPL entrypoint.

:FFF00708   lis         r4, 0x8000
:FFF0070C   li          r3, 0
:FFF00710   stw         r3, 0x00F4 (r4)
:FFF00714   blr                             ; !! IPL START TO EXECUTE !!

Why does BS jump around? The jumping is needed because of the way instructions are fetched. They must be fetched in exact linear order, otherwise the scrambling goes out of sync. So in order to do any loops, BS enables the icache and, to fill it, jumps to the first location in each icache line. That's why BS jumps in 0x20-byte steps.

:FFF00720   b           0xFFF00740          ; No-ops between branches.
:FFF00740   b           0xFFF00760
:FFF00760   b           0xFFF00780
:FFF00780   b           0xFFF007A0
:FFF007A0   b           0xFFF007C0
:FFF007C0   b           0xFFF007E0
:FFF007E0   b           0xFFF00644
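
The 0x20 step follows from the 32-byte instruction cache line. A minimal sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

#define ICACHE_LINE 32u   /* bytes per instruction cache line */

/* Address of the first byte of the cache line containing addr. */
uint32_t icache_line_base(uint32_t addr)
{
    return addr & ~(ICACHE_LINE - 1u);
}

/* Number of cache lines touched by the range first..last (inclusive). */
uint32_t icache_lines_spanned(uint32_t first, uint32_t last)
{
    assert(first <= last);
    return (icache_line_base(last) - icache_line_base(first)) / ICACHE_LINE + 1u;
}

/* Usage example: the branch chain 0xFFF00720..0xFFF007E0 touches
   7 lines, one branch per line. */
void icache_example(void)
{
    assert(icache_line_base(0xFFF00724u) == 0xFFF00720u);
    assert(icache_lines_spanned(0xFFF00720u, 0xFFF007E0u) == 7u);
}
```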

Since BS is very jumpy, here is brief run flow :

  • Init Flipper (ARAM, MI, reset DVD).
  • Init Gekko (enable cache, set Dolphin OS memory model, enable translation).
  • Probe AD16.
  • Test memory. Halt CPU, if test failed.
  • Load IPL into main memory and disable bootrom scrambler.
  • Jump to IPL's __start.

PART II. IPL - Initial Program Loader (Boot Stage 2)

IPL is written in C, and it uses an old development version of the default Nintendo SDK libraries and Dolphin OS. You can think of IPL as a DOL application inside the bootrom.

I will not reverse the old versions of library calls. The old version of __start differs from the new version by a missing OSInit call. At the end, it calls IPL's main, as always. (As a reminder, __start is the CRT init for GCC-like compilers.)

void main()
{
   ???();
   OSInit();

   AD16Init();
   AD16WriteReg(0x08000000);

   DVDInit();
   AD16WriteReg(0x09000000);

   CARDInit();
   AD16WriteReg(0x0A000000);

   ???();

   __VIInit(0);
   VIInit();
   AD16WriteReg(0x0B000000);

   ???();
   ???();
   ???();

   PADSetSpec(PAD_SPEC_5);
   PADInit();

   AD16WriteReg(0x0C000000);

   ???();

   OSHalt("BS2 ERROR >>> SHOULD NEVER REACH HERE");
}

PART III. Bootrom hack tutorial and hardware details

Questions

  • Q: Where is the bootrom chip placed? I can't see it..
  • Q: What length does the bootrom actually have?
  • Q: Who developed the bootrom chip and its data protection?
  • Q: What algorithm is used for bootrom encryption?
  • Q: Is it possible to read the bootrom by EXI DMA after reset (in my application)?
  • Q: Yes, I know it's not possible with EXI DMA, but how about physically mapped direct access from the 0xFFFxxxxx area?
  • Q: How do I dump the encrypted bootrom?
  • Q: Yes, the decryption algo is unknown, but I've seen some "IPL Replacements", how is that possible?
  • Q: Where can I get the XOR cipher key to decrypt my encrypted dump?

Appendix: Glossary of Terms and Acronyms

see Hardware Acronyms

ToDo

  • Get all bootrom versions and crosscheck against them.
  • Find more details of unknown registers.
  • Get more info about AD16. Trace steps 1, 2, 3.
  • Continue on IPL.
  • Finish bootrom hack tutorial.

Source

This information was found at: [1]