(This article may be found at http://www.combatsim.com/memb123/htm/2001/05/graphicwisd2)


Graphic Wisdom: 3D Video Cards Part II
By Len "Viking1" Hjalmarson

Article Type: Feature / Hardware
Article Date: May 11, 2001


KYRO II: Horse of a Different Color

KYRO II uses tile-based rendering to determine which parts of a scene must be rendered. At the root of the technology is Display List Processing, a procedure in which the entire scene is divided into tiles of 32x16 pixels. Since each tile is rendered independently, the Z-buffer data for the entire scene does not need to be stored in external memory; instead, an on-chip Z-buffer holds the data for one tile at a time. This effectively eliminates the need to access external memory for depth data, reducing the latency associated with such access.
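
To put rough numbers on the tiling idea, here is a minimal Python sketch that covers a screen with 32x16 tiles and compares the depth storage for one tile with that for the whole frame. The 1024x768 resolution and the 32-bit depth values are assumptions picked for illustration, not figures published for KYRO II.

```python
import math

TILE_W, TILE_H = 32, 16  # tile size quoted in the article

def tile_grid(screen_w, screen_h):
    """Return (columns, rows) of tiles needed to cover the screen."""
    return math.ceil(screen_w / TILE_W), math.ceil(screen_h / TILE_H)

def tile_of(x, y):
    """Return the (column, row) of the tile that contains pixel (x, y)."""
    return x // TILE_W, y // TILE_H

def depth_bytes(width, height, bytes_per_sample=4):
    """Depth storage for a region; 4 bytes per sample is an assumption."""
    return width * height * bytes_per_sample

cols, rows = tile_grid(1024, 768)
print(cols, rows, cols * rows)      # tiles across, tiles down, total tiles
print(depth_bytes(TILE_W, TILE_H))  # depth storage for a single tile
print(depth_bytes(1024, 768))       # depth storage for the full frame
```

Under these assumptions a single tile needs about 2KB of depth storage, against 3MB for the full frame, which is why the depth data for a tile can plausibly live on the chip.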

Using a display list, the KYRO II can utilize per-pixel hidden surface removal to analyze each tile before textures are applied. The chip possesses another buffer that allows the processor to blend textures on-chip, without having to make external accesses to the frame buffer.

In short, the hardware tests whether a polygon will be visible before it is actually drawn. This saves the hardware from drawing and outputting polygons that would end up hidden by other objects anyway. The increase in rendering efficiency is as high as 300%.
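
A toy Python model shows why resolving visibility first can save so much work. It counts how many fragments get textured when each one is depth-tested as it arrives, versus when only the final visible surface per pixel is textured. The function names and the sample data are hypothetical, and this is a sketch of the general idea rather than a description of how the KYRO II hardware is built.

```python
def immediate_textured(fragments_per_pixel):
    """Count textured fragments when each one is Z-tested as it arrives.

    fragments_per_pixel: one list of depths per pixel, in submission order
    (a smaller depth is closer to the viewer).
    """
    count = 0
    for depths in fragments_per_pixel:
        nearest = float("inf")
        for z in depths:
            if z < nearest:  # passes the Z-test, so it is textured and written
                nearest = z
                count += 1
    return count

def deferred_textured(fragments_per_pixel):
    """Count textured fragments when visibility is resolved first."""
    return sum(1 for depths in fragments_per_pixel if depths)

# Four pixels, each covered by three surfaces drawn back to front (worst case).
tile = [[0.9, 0.5, 0.1]] * 4
print(immediate_textured(tile))  # 12
print(deferred_textured(tile))   # 4
```

In this worst case the conventional approach textures three times as many fragments. Real scenes fall somewhere between, depending on how much overdraw they contain and the order in which polygons are submitted.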

KYRO II

KYRO II needs that increased efficiency, since the core clock runs at only 175MHz. Furthermore, the KYRO II’s memory controller is not DDR RAM-capable, so the boards ship with 175MHz SDRAM. Each of the two pixel pipelines on the KYRO II is equipped with a single texture unit, compared to the quad-pipeline/dual-texture unit design of the GeForce2 and 3. But the design can apply up to eight textures per pass without having to resend data across the memory bus during multi-texturing.
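
The raw numbers are easy to work out. The Python sketch below multiplies clock speed by pipelines and texture units to get theoretical fill rates. The KYRO II figures come from the paragraph above; the 200MHz clock used for GeForce2 is an assumption (the GTS core clock), since this article does not state one.

```python
def fill_rates(core_mhz, pipelines, tex_units_per_pipe):
    """Return theoretical (Mpixels/s, Mtexels/s) for a simple pipeline model."""
    mpixels = core_mhz * pipelines
    mtexels = mpixels * tex_units_per_pipe
    return mpixels, mtexels

kyro2 = fill_rates(175, 2, 1)     # clock, pipelines, texture units from the article
geforce2 = fill_rates(200, 4, 2)  # 200MHz is an assumed clock, not from the article
print(kyro2)     # (350, 350)
print(geforce2)  # (800, 1600)
```

On paper the KYRO II is far behind, so any parity in real games has to come from wasting less of that fill rate on hidden pixels.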

Incredibly, in-game tests are revealing output equivalent to GeForce2 Pro. The weakness of the current KYRO II generation is that it is DirectX 8 compatible, but not compliant, limiting future growth. On the other hand, with the current generation clocked so low, there is room for growth in power through clock speed alone. Driver issues have delayed release, but as a competitor for GeForce2, the new chip looks to be an excellent solution.


A Four Horse Chariot

My son, if two horses are good, four horses are better.
So what about GeForce3? How fast is it and what do the new features give you?

GeForce3's theoretical fill rate is lower than GeForce2's, but in practice it is more than 40% higher. More transistors and streamlined processes are making the difference. In Quake II HQ at 1024x768 and 32 bit color, GeForce3 is 20% faster than GeForce2. In current games the difference isn’t always noticeable, and a few even run slower.

GeForce3 supports a wide range of new features available in the latest DirectX 8 API. Not only does it sport the same integrated 'static' transform and lighting engine as GeForce2, but it also adds a programmable vertex processor and programmable texture operations. As a result, game developers are able to include new effects previously seen only in movies. Some of these effects are nothing short of spectacular.

It’s DX8 that is the key to these new features. Without the software interface, programmability in a graphics chip would be wasted. Here are descriptions of new features and performance enhancements of DX 8.0 relevant to NV20 (from the Microsoft website).

Programmable vertex processing language: Enables developers to write custom shaders for morphing and tweening animations, matrix palette skinning, user-defined lighting models, general environment mapping, procedural geometry operations, or any other developer-defined algorithm.

Programmable pixel processing language: Enables programmers to write custom hardware-accelerated shaders for general texture and color blending expressions, per-pixel lighting (bump mapping), per-pixel environment mapping for photo-realistic effects, or any other developer-defined algorithm.

Multisampling rendering support: Enables full-scene anti-aliasing and multisampling effects, such as motion blur and depth-of-field blur.

The addition of programmable shaders for vertex and pixel operations provides the framework for real-time programmable effects that rival movie quality. This programmability gives freedom to game developers by allowing them to implement whatever effect they see fit with the programmable pipeline. This means that every bit of hardware horsepower can be put to use to accomplish only the tasks specified by the developer. The power of the chip can constantly be maximized to accomplish pre-defined tasks.
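
To give a feel for what these small programs compute, here is a Python sketch of three operations from the feature list above: vertex tweening, per-pixel diffuse lighting, and resolving multisampled pixels for anti-aliasing. It is plain CPU code written for illustration, not DirectX 8 shader code, and every function name in it is hypothetical.

```python
import math

def tween(v0, v1, t):
    """Vertex tweening: blend a vertex between two keyframe positions."""
    return tuple(a + (b - a) * t for a, b in zip(v0, v1))

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def shade_pixel(base_color, normal, light_dir):
    """Per-pixel diffuse lighting: scale the color by N dot L, clamped at zero."""
    n = normalize(normal)
    l = normalize(light_dir)
    intensity = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(c * intensity for c in base_color)

def resolve(samples):
    """Multisample resolve: average several color samples into one pixel."""
    count = len(samples)
    return tuple(sum(s[i] for s in samples) / count for i in range(3))

print(tween((0, 0, 0), (2, 4, 6), 0.5))                    # halfway between keyframes
print(shade_pixel((1.0, 0.5, 0.25), (0, 0, 1), (0, 0, 2)))  # surface facing the light
print(resolve([(1.0, 1.0, 1.0)] * 2 + [(0.0, 0.0, 0.0)] * 2))  # edge pixel, half covered
```

On a programmable chip the developer supplies the equivalent of tween or shade_pixel, and the hardware runs it for every vertex or pixel instead of applying one fixed formula.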

Think of a powerful racing vehicle with specialized tires. The best snow tires in the world are on this vehicle, but suddenly it’s a hot and sunny day. If the tires were programmable, the tread would change to take advantage of the new conditions. Programmable graphics chips are sort of like those tires. The most powerful car in the world isn’t going to perform very well if the tires are wrong for the track conditions. Programmable shaders give the developer the power to create a custom graphics board for each game.

Other improvements to DirectX include support for texture compression and a new particle system. Smoke, clouds, and weather effects will be easier to produce, look better, and use less hardware horsepower. DirectSound has also been improved, and DirectPlay Voice has been added to DirectPlay, the component that handles network functions.

If you recall some of the key features of the 3dfx T-Buffer, some of the descriptions above may suddenly sound familiar. Custom shaders, FSAA, and multisampling effects like motion blur and depth-of-field blur were all key features espoused by the 3dfx T-Buffer.

The DirectX APIs are always written after careful consultation with game developers and hardware designers. DirectX 8 represents the evolution of graphical abilities on our PCs, and new hardware is designed to take advantage of new features. Games that are written specifically for DX 8 will attain more realistic visual effects and, on GeForce3, performance gains as high as forty percent over GeForce2 Ultra.

NVIDIA GeForce3

But what difference does that make with current titles? Not much. Eventually, games that fully exploit DirectX 8 features will run up to sixty percent faster.


Get Wisdom, and Get a New Image

If you are buying for the future and have to buy today, GeForce3 is the answer. But I think you are crazy; there simply isn't enough benefit for current sims to warrant the purchase.

If you are concerned about the future, but have GeForce2 technology already, wait for GeForce3 prices to fall or for the introduction of ATI’s Radeon II this fall.

ATI's Radeon II will be the next fully compatible DX8 board to hit the market. When it arrives (some sources say in late September or early October), it will have an immediate effect, driving GeForce3 prices further down. Better yet, it may actually surpass GeForce3 in raw horsepower.

If you need to upgrade that old 3Dfx or TNT2 board and your budget allows, GeForce2 Ultra and Pro are powerful and will soon be bargains. If your budget isn’t quite that flexible, consider the GeForce2 MX 400 or the new KYRO II.

Most benchmarks place KYRO II between GeForce2 Pro and the GeForce2 MX, making it an excellent option at a lower price. (This does not apply to the XT 32MB version, which runs at a lower clock rate.) Driver issues have delayed release, so purchase where you can return a board if you run into trouble.

My son, if your chariot is too slow and you have great treasure,
Do not wait . . .
Find a faster chariot today.

But if your treasure is rare, do not be foolish,
For why should you spend what you do not have,
And when your friends come to your tent, they sit on the hard floor?




