aboutsummaryrefslogtreecommitdiff
path: root/docs/research.md
diff options
context:
space:
mode:
Diffstat (limited to 'docs/research.md')
-rw-r--r--docs/research.md209
1 files changed, 5 insertions, 204 deletions
diff --git a/docs/research.md b/docs/research.md
index 73618d7..a3e86ff 100644
--- a/docs/research.md
+++ b/docs/research.md
@@ -33,6 +33,11 @@ this chip's features and limitations:
- tiles can be flipped using OAM
- no frame buffer
+Though this chip is documented very well from a programmer's perspective, we
+found very little documentation about any reverse-engineering of the chip's
+actual hardware. While our PPU provides mostly the same features as the NES's
+PPU, our design is entirely custom.
+
### Usage
The NES PPU has a lot of capabilities, so here's a quick run-down of how the
@@ -93,210 +98,6 @@ int main() {
setup();
while(1) loop();
}
-```
-
-## Custom PPU
-
-Here's a list of features our PPU should have:
-<!-- TODO: expand list with PPU spreadsheet -->
-
-- 320x240 @ 60Hz VGA output
-- single tilemap with room for 1024 tiles of 16x16 pixels
-- 8 colors per palette, with 4096 possible colors (12-bit color depth)
-- 512x448 background canvas with scrolling
-- NO background scrolling splits
-- 128 total sprites on screen (NO scanline sprite limit)
-- sprites are always drawn on top of the background layer
-- PPU control using DMA (dual-port asynchronous RAM)
-- tiles can be flipped using FAM or BAM
-- no frame buffer
-- vertical and horizontal sync output
-
-Notable differences:
-
-- NES nametable equivalent is called BAM (background attribute register)
-- NES OAM equivalent is called FAM (foreground attribute register)
-- 320x240 @ 60Hz output
-
- Since we're using VGA, we can't use custom resolutions without an
- upscaler/downscaler. This resolution was chosen because it's exactly half of
- the lowest standard VGA resolution 640x480.
-- No scanline sprite limit
-
- Unless not imposing any sprite limit makes the hardware implementation
- impossible, or much more difficult, this is a restriction that will likely
- lead to frustrating debugging sessions, so will not be replicated in our
- custom PPU.
-- Sprites are 16x16
-
- Most NES games already tile multiple 8x8 tiles together into "metatiles" to
- create the illusion of larger sprites. This was likely done to save on memory
- costs as RAM was expensive in the '80s, but since we're running on an FPGA
- cost is irrelevant.
-- Single 1024 sprite tilemap shared between foreground and background sprites
-
- The NES OAM registers contain a bit to select which tilemap to use (of two),
- which effectively expands each tile's index address by one byte. Instead of
- creating the illusion of two separate memory areas for tiles, having one
- large tilemap seems like a more sensible solution to indexed tiles.
-- 8 total palettes, with 8 colors each
-
- More colors is better. Increasing the total palette count is a very memory
- intensive operation, while increaing the palette color count is likely slower
- when looking up color values for each pixel on real hardware.
-- Sprites can be positioned paritally off-screen on all screen edges using only
- the offset bits in the FAM register
-
- The NES has a separate PPUMASK register to control special color effects, and
- to shift sprites off the left and top screen edges, as the sprite offsets
- count from 0. Our PPU's FAM sprite offset bits count from -15, so the sprite
- can shift past the top and left screen edges, as well as the standard bottom
- and right edges.
-- No status line register, only V-sync and H-sync outputs are supplied back to
- CPU
-
- The NES status line register contains some handy lines, such as a buggy
- status line for reaching the max sprite count per scanline, and a status line
- for detecting collisions between background and foreground sprites. Our PPU
- doesn't have a scanline limit, and all hitbox detection is done in software.
- Software hacks involving swapping tiles during a screen draw cycle can still
- be achieved by counting the V-sync and H-sync pulses using interrupts.
-- No background scrolling splits
-
- This feature allows only part of the background canvas to be scrolled, while
- another portion stays still. This was used to draw HUD elements on the
- background layer for displaying things like health bars or score counters.
- Since we are working with a higher foreground sprite limit, we'll use regular
- foreground sprites to display HUD elements.
-- Sprites are always drawn on top of the background layer
-
- Our game doesn't need this capability for any visual effects. Leaving this
- feature out will lead to a simpler hardware design
-
-### Hardware design schematics
-
-#### Top (level 1)
-
-![PPU top-level design](../assets/ppu-level-1.svg)
-
-Important notes:
-
-- The STM32 can reset the PPU. This line will also be connected to a physical
- button on the FPGA.
-- The STM32 uses direct memory access to control the PPU.
-- The PPU's native resolution is 320x240. It works in this resolution as if it
- is a valid VGA signal. The STM32 is also only aware of this resolution. This
- resolution is referred to as "tiny" resolution. Because VGA-compatible LCD's
- likely don't support this resolution due to low clock speed, a built-in
- pixel-perfect 2X upscaler is chained after the PPU's "tiny" output. This
- means that the display sees the resolution as 640x480, but the PPU and STM32
- only work in 320x240.
-- The STM32 receives the TVSYNC and THSYNC lines from the PPU. These are the
- VSYNC and HSYNC lines from the tiny VGA signal generator. These lines can be
- used to trigger interrupts for counting frames, and to make sure no
- read/write conflicts occur for protected memory regions in the PPU.
-- NVSYNC, NHSYNC and the RGB signals refer to the output of the native VGA
- signal generator.
-
-#### Level 2
-
-![PPU level 2 design (data flows from top to bottom)](../assets/ppu-level-2.svg)
-
-Important notes:
-
-- The pixel fetch logic is pipelined in 5 stages:
- 1. - (Foreground sprite info) calculate if foreground sprite exists at
- current pixel using FAM register
- - (Background sprite info) get background sprite info from BAM register
- 2. - (Sprite render) calculate pixel to read from TMM based on sprite info
- 3. - (Compositor) get pixel with 'highest' priority (pick first foreground
- sprite with non-transparent color at current pixel in order, fallback to
- background)
- - (Palette lookup) lookup palette color using palette register
- - (VGA signal generator) output real color to VGA signal generator
-- The pipeline stages with two clock cycles contain an address set and memory
- read step.
-- The pipeline takes 5 clock ticks in total. About 18 are available during each
- pixel. For optimal display compatibility, the output color signal should be
- stable before 50% of the pixel clock pulse width (9 clock ticks).
-- Since the "sprite info" and "sprite render" steps are fundamentally different
- for the foreground and background layer, these components will be combined
- into one for each layer respectively. They are separated in the above diagram
- for pipeline stage illustration.
-- The BAX, FAM, and PAL registers are implemented in the component that
- directly accesses them, but are exposed to the PPU RAM bus for writing.
-- Each foreground sprite render component holds its own sprite data copy from
- the RAM in it's own cache memory. The cache updates are fetched during the
- VBLANK time between each frame.
-
-#### Level 3
-
-This diagram has several flaws, but a significant amount of time has already
-been spent on these, so they are highlighted here instead of being fixed.
-
-![PPU level 3 design](../assets/ppu-level-3.svg)
-
-Flaws:
-
-- Pipeline stages 1-4 aren't properly connected in this diagram, see level 2
- notes for proper functionality
-- The global RESET input resets all PPU RAM, but isn't connected to all RAM
- ports
-- All DATA inputs on the same line as an ADDR output are connections to a
- memory component. Not all of these are connected in the diagram, though they
- should be.
-
-Important notes:
-
-- The background sprite and foreground sprite component internally share some
- components for coordinate transformations
-- The foreground sprite component is only shown once here, but is cloned for
- each foreground sprite the PPU allows.
-- The CIDX lines between the sprite and compositor components is shared by all
- sprite components, and is such tri-state. A single sprite component outputs a
- CIDX signal based on the \*EN signal from the compositor.
-- All DATA and ADDR lines are shared between all RAM ports. WEN inputs are
- controlled by the address decoder.
-
-### Registers
-
-|Address|Size (bytes)|Alias|Description|
-|-|-|-|-|
-|`0x00000`|`0x00000`|TMM |[tilemap memory][TMM]|
-|`0x00000`|`0x00000`|BAM |[background attribute memory][BAM]|
-|`0x00000`|`0x00000`|FAM |[foreground attribute memory][FAM]|
-|`0x00000`|`0x00000`|PAL |[palettes][PAL]|
-|`0x00000`|`0x00000`|BAX |[background auxiliary memory][BAX]|
-
-[TMM]: #tilemap-memory
-#### Tilemap memory
-
-- TODO: list format
-
-[BAM]: #background-attribute-memory
-#### Background attribute memory
-
-- TODO: list format
-
-[FAM]: #foreground-attribute-memory
-#### Foreground attribute memory
-
-- TODO: list format
-
-[PAL]: #palettes
-#### Palettes
-
-- TODO: list format
-
-[BAX]: #background-auxiliary-memory
-#### Background auxiliary memory
-
-- background scrolling
-
-[nesppuspecs]: https://www.copetti.org/writings/consoles/nes/
-[nesppudocs]: https://www.nesdev.org/wiki/PPU_programmer_reference
-[nesppupinout]: https://www.nesdev.org/wiki/PPU_pinout
-[custompputimings]: https://docs.google.com/spreadsheets/d/1MU6K4c4PtMR_JXIpc3I0ZJdLZNnoFO7G2P3olCz6LSc
# Generating audio signals