NVIDIA: Four Generations of GPU
by Len "Viking1" Hjalmarson
Article Type: Hardware
Article Date: February 13, 2002
Roughly two years ago NVIDIA introduced their GeForce line of video accelerators with integrated transformation and lighting. The gaming world has not been the same since.
The first of these boards sported either 16 or 32MB of onboard memory and, on 3DMark 2000 with a 1 GHz CPU, generated 4840 3DMarks at 1024x768 in 16-bit color.
GeForce2 GTS arrived not long after, and improved performance on the same system by at least 33 percent. The GeForce2 series topped out with the GeForce2 Ultra, with 64MB of main memory and running nearly 8000 3DMarks on a 1 GHz class CPU.
Since then the benchmark landscape has changed somewhat, with the introduction of a new synthetic standard in 3DMark 2001 and the Aquanox benchmark. In spite of that, however, my GeForce3 produced roughly the same frame rate in IL-2 Sturmovik as my GeForce2 Ultra did. The GeForce2 only showed its limitations when alpha effects were pushed to the max: enabling the highest level of clouds and shadows cost me frame rate on the GeForce2 in a way that was far less noticeable on the GeForce3.
GeForce3 arrived in 2001 with a great deal of fanfare, but some confusion on the part of simulation fans. As the marketing wheels spun around, virtual pilots were asking tough questions. What use are all these futuristic features when my current games and even ones in late stages of development won’t make use of them? Why should I pay the price premium instead of buying into the last generation?
The answer, of course, was buying for the future. If you are in a position to upgrade and you don’t want to upgrade again next year, it makes sense to invest in technology that will carry you further. The programmable pixel shaders and vertex shaders of GeForce3 and DirectX 8 fame do indeed offer developers greater flexibility than they have ever had in the past. Furthermore, antialiasing has become a standard rather than an extra—most gamers demand some form of antialiasing while playing their favorite games.
The advanced technology behind the revolution required almost a new video vocabulary: Lightspeed Memory Architecture, vertex processing algorithms, the nFiniteFX engine, and more. But all this fancy jargon did have some justification. The technology behind video revolutions is inevitably complex, even if the bottom line in frames per second seems simple enough from the pilot's chair.
GeForce3 was arguably revolutionary, in spite of the limitations of the games available at the time. But who wants to live there? Revolutions are best followed by a period of breathing room; it's good to let the software developers catch up. If every GPU were revolutionary, game developers would be in chaos.
GeForce4 Ti4600
GeForce4
It’s a good thing, therefore, that GeForce4 is not revolutionary but evolutionary. Improve the best features, improve memory bandwidth, increase the clock rate and fill rate, and you have GeForce4. With video acceleration, more is better, and we do want to give software developers some breathing space.

On February 6th in San Francisco, NVIDIA launched the latest technology in the form of the GeForce4 line of accelerators. Arriving in at least three forms, the GeForce4 will appear as the flagship Ti4600 with 128MB, the middle-of-the-line Ti4400 with 128MB, and the GeForce4 MX variant with 64MB.
The clock speed for the flagship Ti4600 product is now 300 MHz, up from the 240 of the previous GeForce3 Ti500. Memory speed has increased to 325 MHz from 240, while memory bandwidth is up more than 25 percent. The flagship product also doubles the raw memory of GeForce3 to 128MB.
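For the arithmetic-minded, here is a back-of-the-envelope sketch of where that bandwidth gain comes from, using the memory clocks quoted above. The 128-bit bus width and the two-transfers-per-clock DDR behavior are assumptions on my part, not figures from NVIDIA.

```python
# Rough memory-bandwidth arithmetic using the clocks quoted in the text.
# Assumed (not stated above): a 128-bit memory bus and DDR memory moving
# two transfers per clock on both boards.

BUS_WIDTH_BYTES = 128 // 8       # assumed 128-bit bus -> 16 bytes per transfer
TRANSFERS_PER_CLOCK = 2          # assumed DDR: two transfers per memory clock

def bandwidth_gb_per_s(mem_clock_mhz):
    """Peak memory bandwidth in GB/s for a given memory clock in MHz."""
    return mem_clock_mhz * 1e6 * TRANSFERS_PER_CLOCK * BUS_WIDTH_BYTES / 1e9

gf3_ti500  = bandwidth_gb_per_s(240)   # memory clock quoted above for GeForce3 Ti500
gf4_ti4600 = bandwidth_gb_per_s(325)   # memory clock quoted above for GeForce4 Ti4600

print(f"GeForce3 Ti500 : {gf3_ti500:.1f} GB/s")
print(f"GeForce4 Ti4600: {gf4_ti4600:.1f} GB/s")
print(f"Increase       : {100 * (gf4_ti4600 / gf3_ti500 - 1):.0f} percent")
```

Under those assumptions the numbers work out to roughly 7.7 GB/s versus 10.4 GB/s, an increase of about 35 percent, which comfortably clears the "more than 25 percent" figure.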
The physical changes are in memory technology, antialiasing, and the addition of a second vertex shader unit. Two vertex shader units will improve performance in games that use more complex lighting and alpha effects (like smoke, clouds, and transparency), as the sketch below suggests.
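To picture what a vertex shader unit actually does, here is a toy per-vertex transform-and-lighting routine in Python. It is purely illustrative and bears no relation to GeForce4 microcode, but it shows the kind of per-vertex work a second unit lets the chip run in parallel.

```python
# Toy per-vertex work: transform a vertex by a 4x4 matrix and compute a
# simple diffuse (Lambert) lighting term. Illustrative only; not NVIDIA's
# actual vertex shader instruction set.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transform(matrix, v):
    """Apply a 4x4 row-major matrix to a homogeneous vertex (x, y, z, w)."""
    return tuple(dot(row, v) for row in matrix)

def shade_vertex(vertex, normal, mvp, light_dir):
    position = transform(mvp, vertex)          # model-view-projection transform
    diffuse = max(0.0, dot(normal, light_dir)) # simple diffuse lighting term
    return position, diffuse

identity = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
print(shade_vertex((1.0, 2.0, 3.0, 1.0), (0.0, 1.0, 0.0), identity, (0.0, 1.0, 0.0)))
```

Every vertex in every frame needs at least this much work, so doubling the number of units that can do it pays off most in scenes with heavy geometry and lighting.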
Improvements in Z-buffer and occlusion technology make the actual rendering process much more efficient. The GeForce4 is more than 25 percent better at discarding textures that will be masked (occluded) in the final scene, which frees horsepower for rendering the parts of the scene the viewer can actually see. These efficiency gains are complemented by improved caching technology.
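The idea behind that occlusion efficiency can be sketched in a few lines of Python: test a fragment's depth before doing any expensive texture work, and skip it entirely if something closer has already covered that pixel. The sketch below is a software caricature of the principle, not the GeForce4 pipeline.

```python
# A software caricature of early depth rejection: check the depth buffer
# first and skip the expensive texture fetch for fragments already hidden.

WIDTH, HEIGHT = 4, 4
depth_buffer = [[float("inf")] * WIDTH for _ in range(HEIGHT)]
frame_buffer = [[(0, 0, 0)] * WIDTH for _ in range(HEIGHT)]
texture_fetches = 0                      # counts the "expensive" work actually done

def sample_texture(u, v):
    """Stand-in for a costly texture fetch."""
    global texture_fetches
    texture_fetches += 1
    return (int(u * 255), int(v * 255), 128)

def shade_fragment(x, y, depth, u, v):
    if depth >= depth_buffer[y][x]:      # early Z test: occluded, skip texturing
        return
    depth_buffer[y][x] = depth
    frame_buffer[y][x] = sample_texture(u, v)

# Draw a near quad first, then a far quad covering the same pixels.
for y in range(HEIGHT):
    for x in range(WIDTH):
        shade_fragment(x, y, depth=0.1, u=x / WIDTH, v=y / HEIGHT)
for y in range(HEIGHT):
    for x in range(WIDTH):
        shade_fragment(x, y, depth=0.9, u=x / WIDTH, v=y / HEIGHT)

# Only 16 of the 32 fragments fetch textures; the occluded ones are skipped.
print("texture fetches:", texture_fetches)
```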
Antialiasing has never quite lived up to its promise with NVIDIA. Where 3dfx seemed to get it right from the start, it has been a struggle on NVIDIA hardware. Typically the performance penalty has been 25 to 30 percent, and that was only with very basic antialiasing. 4X antialiasing, where scenes start to REALLY look smooth, has incurred a penalty of up to 40 percent, which is simply unacceptable in most simulations. Finally, there is some light at the end of the tunnel.
Accuview is a new antialiasing option that uses more samples per pixel for greater accuracy. Antialiasing as a whole has been streamlined and is much more efficient, incurring a much smaller performance penalty than on the GeForce3 series. In fact, Quincunx antialiasing, which is roughly equivalent to 4X, can now be performed at the same speed as 2X, and 2X itself has been improved and carries less penalty than on GeForce3. Good news!
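For readers who want a picture of why more samples per pixel cost fill rate, here is a toy supersampling sketch: each pixel's final color is the average of several subpixel samples taken across an edge. It is a generic illustration only, not the Accuview or Quincunx sample patterns.

```python
# Toy sample-averaging antialiasing: average several subpixel samples per
# pixel so edges blend instead of stair-stepping. Generic supersampling for
# illustration; not NVIDIA's hardware filter.

def edge_coverage(x, y):
    """Toy scene: 1.0 on one side of a diagonal edge, 0.0 on the other."""
    return 1.0 if y > x else 0.0

def antialiased_pixel(px, py, samples_per_axis=2):
    """Average samples_per_axis**2 subpixel samples (2 -> 4x antialiasing)."""
    n = samples_per_axis
    total = 0.0
    for sy in range(n):
        for sx in range(n):
            # place each sample at the centre of its subpixel cell
            total += edge_coverage(px + (sx + 0.5) / n, py + (sy + 0.5) / n)
    return total / (n * n)

print(antialiased_pixel(3, 3, samples_per_axis=1))  # aliased: a hard 0 or 1
print(antialiased_pixel(3, 3, samples_per_axis=2))  # 4x: an intermediate edge value
```

Every extra sample is extra work per pixel, which is exactly where the traditional antialiasing penalty comes from; the GeForce4's trick is doing that work more efficiently.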
For the Budget-Minded
GeForce4 MX requires some special treatment. This chip sports the same core as the GeForce4, minus the programmable pixel shaders. Why call it GeForce4 at all, then? Isn’t this misleading?

It certainly appears to be a marketing decision more than an accurate representation of the technology. But while some might argue this way, the core of the GeForce4 MX is indeed the same as the rest of the line, with the exception of the missing pixel shaders and a limited vertex shader. Performance-wise, the GeForce4 MX will be far beyond the GeForce2 MX. Video hardware based on this chip will run current games very well, while future games that take advantage of DirectX 8 features will suffer. For the budget-minded, the GeForce4 MX series will be attractive, and we will undoubtedly see the chip appearing as an integrated part on many mainboards.
This is almost revolutionary in itself, since it represents the first time that a near state-of-the-art video chip will be present on integrated mainboards. The cost savings to system builders are substantial, and those on a budget will also benefit. I personally build a few systems each year for friends, and most of them are not dedicated gamers as I am. For them, the integration of a mainstream video accelerator is a bonus, and the standardization it represents may also be a good thing.
Performance
How does GeForce4 Ti compare to GeForce3 Ti in current games? Comparing boards with the same amount of memory (the GeForce4 Ti4400 against the GeForce3 Ti), the performance improvement averages from 20 to 40 percent.

In 3DMark 2001 on a 1.66 GHz Athlon system, the Ti4600 scored 9800 while the Ti4400 scored 9350. On the same system the GeForce3 Ti500 scored 8040 and the original GeForce3 scored 7120. The best score managed by an overclocked GeForce3 was 7640, while an overclocked Ti4600 scored 10,240. When the CPU was overclocked as well, the Ti4600 managed an incredible 11,120.
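Taking those quoted scores at face value, a few lines of Python make the relative gains explicit.

```python
# Relative gains implied by the 3DMark 2001 scores quoted above
# (same 1.66 GHz Athlon system, stock clocks).

scores = {
    "GeForce3":        7120,
    "GeForce3 Ti500":  8040,
    "GeForce4 Ti4400": 9350,
    "GeForce4 Ti4600": 9800,
}

for new in ("GeForce4 Ti4400", "GeForce4 Ti4600"):
    for old in ("GeForce3", "GeForce3 Ti500"):
        gain = 100 * (scores[new] / scores[old] - 1)
        print(f"{new} vs {old}: +{gain:.0f} percent")
```

Against the original GeForce3 the new boards land in the 30 to 40 percent range; against the Ti500, the gap is closer to 15 to 20 percent.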
Equally amazing, the GeForce4 MX scored 5500. With memory bandwidth roughly the same as the full meal deal, this may be a budget solution worth watching. While it isn't as future-proof as the GeForce3 and GeForce4 proper, it will be an excellent solution for current games.
3DMark Comparison
For those familiar with the Serious Sam benchmark, running at 1024x768 and 32-bit color, the GeForce4 Ti4600 scored 108.2 FPS while the GeForce3 Ti500 scored 78.5 FPS, almost a 40 percent increase.
Antialiasing performance is much improved over the GeForce3 series. At the same resolution and color depth, GeForce3 ran at 35 FPS with Quincunx enabled, GeForce3 Ti ran at nearly 60 FPS, and GeForce4 Ti4600 ran at 77.5 FPS.
Toward the Future
While it is not revolutionary, GeForce4 is evolutionary and a strong performer. I’m curious that the part was introduced on the .15 micron process rather than .13, but my guess is that we will see a higher-clocked part on the .13 micron process by the summer.

But everyone knows that increasing the core clock by 10 or even 20 percent doesn't garner much of a performance increase, while increasing memory performance generally does. We aren't likely to see any increase in memory bandwidth on GeForce4 this year.
Curiously, there is no mention anywhere that GeForce4 is AGP 8X compatible, but it’s a safe guess that it is. Manufacturers generally don't mention compatibility with standards before they are formally supported. My guess is that we will see AGP 8X in the specifications once the new mainboards that support the faster standard show up.
All told, GeForce4 is a solid evolutionary product. Gamers who are early adopters will benefit from NVIDIA's frequent driver updates and the fairly mature state of its current drivers. Since NVIDIA was technically ready to ship GeForce4 parts last November, the company has had plenty of time to ensure reliable drivers.
Graphic Card Resources
Articles:
- 2001/03/13: GeForce3: Next Generation Video Acceleration
- 2001/05/07: Graphic Wisdom I: 3-D Video Cards
- 2001/05/11: Graphic Wisdom II: 3-D Video Cards
- 2001/09/24: LeadTek WinFast GeForce3TD Review
- 2001/10/02: Hercules 3D Prophet 4500 64 MB
- 2001/10/05: Creative SB Live! X-Gamer 5.1
- 2001/10/10: NVIDIA Detonator XP (4) Drivers
- 2001/12/06: OCZ Titan 3 GeForce3
- 2002/02/13: NVIDIA: Four Generations of GPU
- 2002/03/01: Inno3D Tornado GeForce4 MX440
- 2002/04/19: Leadtek WinFast A250 Ultra TD Review
- 2002/06/11: Graphic Wisdom III: 3-D Video Cards
- 2002/06/20: Graphic Wisdom IV: From GPUs to VPUs
- 2002/07/16: Gainward GeForce4 (Review)
- 2002/07/23: Matrox Parhelia (Review)
- 2002/08/09: ATI Radeon 9700 (Preview)
- Archived Video Card Articles
- Asus
- ATI Technologies
- Creative
- Elsa
- Guillemot
- Hercules
- Imagination Technologies
- InnoVision
- LeadTek
- NVIDIA
- OCZ Technology
- VisionTek
- Panorama Tech