How we test laptops (methodology 2018 to mid-2021)

Editor’s note: As of August 2021, this laptop-review test program has been replaced by an updated methodology described here. For your reference, we’ve retained this summary of the previous testing procedures, which applied to laptops tested from 2018 through mid-August 2021.


The laptop review process at PCMag.com continues core traditions that date back to the founding of PC Labs in 1984: we compare each system to others in its category based on price, features, design, and in-house performance tests.

To evaluate performance, we use a suite of software-based benchmark tests and real-world applications and games, carefully selected to reveal the strengths and weaknesses of the mix of PC components under test. This evaluation ranges from the CPU and the memory subsystem to the machine’s storage hardware and graphics chip.

(Photo: Zlata Ivleva)

In some cases, we use standardized tests from established benchmark developers. Where necessary, we have also created our own tests. We regularly evaluate new benchmark solutions as they hit the market and revise our testing process as needed to ensure that we can accurately gauge the impact of the latest technology.

Our laptop tests are divided into three broad test classes: productivity tests, graphics tests, and a battery life test. Here is a breakdown of each.


Productivity tests

PCMark 10

Our first task is to evaluate a laptop’s everyday productivity performance using UL’s PCMark 10 benchmark, which simulates real-world productivity and content-creation workflows. (UL, or Underwriters Laboratories, acquired Futuremark, the maker of the long-running PCMark and 3DMark benchmarks.)

We use PCMark 10 to evaluate overall system performance for office-oriented tasks such as word processing, spreadsheets, web browsing, and video conferencing. The test generates a proprietary numerical score; higher numbers are better, and the scores are meaningful mainly in comparison with one another.

We run the main test suite supplied with the software, not the Express or Extended version. Note that, all else being equal, a higher screen resolution will weigh on a system’s performance in PCMark 10. (The more pixels that need to be pushed, the more resources are required.)

PCMark 8 storage

We then evaluate the speed of the laptop’s main boot drive with another UL benchmark, PCMark 8. This test suite has a dedicated storage subtest that outputs a proprietary numerical value …

As with PCMark 10, higher numbers are better. The results of laptops with state-of-the-art solid state drives (SSDs) tend to be close in this test.

Cinebench R15

Next up is Maxon’s CPU-crunching Cinebench R15 test. We run this test at the All Cores setting. Derived from Maxon’s Cinema 4D modeling and rendering software, it is a pure CPU test. It is fully threaded to make use of all available processor cores and threads. Think of it as an all-out processor deadlift.

Cinebench loads the CPU, not the GPU, to render a complex image. The result is a proprietary score that indicates a PC’s suitability for processor-intensive workloads when used with fully threaded software.

Handbrake 1.1.1

Cinebench is often a good predictor of our Handbrake video-transcoding trial. This is another tough, heavily threaded workout that is highly CPU-dependent and scales well as you add cores and threads.

In this test, we put a stopwatch on test systems as they transcode a standard 12-minute clip of 4K video (Blender’s open-source demo short film Tears of Steel) into a 1080p MP4 file. For this timed test, we use the Fast 1080p30 preset in version 1.1.1 of the Handbrake app. Lower results (i.e., faster times) are better.
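
As a rough illustration of how a timed transcode like this could be automated, here is a minimal Python sketch. It is not the procedure PC Labs uses (the article describes a stopwatch on the Handbrake app); it assumes the separate HandBrakeCLI command-line tool is installed, and the helper names are hypothetical.

```python
import subprocess
import time


def build_handbrake_command(source, output, preset="Fast 1080p30"):
    """Build a HandBrakeCLI command line for a preset-based transcode."""
    return ["HandBrakeCLI", "-i", source, "-o", output, "--preset", preset]


def time_command(command):
    """Run a command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def format_minutes_seconds(seconds):
    """Format a duration as M:SS, the way a stopwatch reading is usually quoted."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


# Example usage (file names are hypothetical, and HandBrakeCLI must be on the PATH):
#   cmd = build_handbrake_command("tears_of_steel_4k.mov", "tears_of_steel_1080p.mp4")
#   print(format_minutes_seconds(time_command(cmd)))
```

Wall-clock time is measured here because that is what a stopwatch captures: the full duration of the job, not just CPU time.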

Adobe Photoshop CC photo editing test

Our final productivity test is a custom Adobe Photoshop image-editing benchmark. Using an early-2018 release of the Creative Cloud version of Photoshop, we apply a series of complex filters and effects (dust, watercolor, stained glass, mosaic tiles, extrusion, and several blurring effects) to a PCMag-standard JPEG image. (We use a script executed via an actions file that we created ourselves.) We time each operation and add up the total execution time at the end. As with Handbrake, shorter times are better here.
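
The "time each operation, then add up the total" approach can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the real test drives Photoshop through an actions file, whereas the workloads below are stand-ins and the function name is hypothetical.

```python
import time


def time_operations(operations):
    """Time each named callable in order; return (per-operation seconds, total)."""
    timings = {}
    for name, operation in operations.items():
        start = time.perf_counter()
        operation()
        timings[name] = time.perf_counter() - start
    return timings, sum(timings.values())


# Stand-in workloads; a real run would trigger Photoshop filters instead.
placeholder_operations = {
    "watercolor": lambda: sum(range(100_000)),
    "stained glass": lambda: sorted(range(100_000), reverse=True),
}

timings, total_seconds = time_operations(placeholder_operations)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
print(f"total: {total_seconds:.4f} s")
```

Keeping the per-operation times alongside the total makes it easier to see which filter dominates a slow result.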

The Photoshop test stresses the CPU, storage subsystem, and RAM, but it can also use most GPUs to speed up the application of filters. Systems with powerful graphics chips or cards can get a boost.


Graphics performance

Assessing graphics performance requires tests that are challenging for any system yet still yield meaningful comparisons across the field. We use some benchmarks that report proprietary scores and others that measure frames per second (fps), the rate at which the graphics hardware renders frames in a sequence, which translates to how smooth the scene looks in motion.

Synthetic tests: 3DMark and Superposition

The first graphics test is UL’s 3DMark. The 3DMark suite includes a variety of different subtests that measure relative graphics performance by rendering sequences of highly detailed, gaming-style 3D graphics. Many of these tests emphasize particles and lighting.

We run two different 3DMark subtests, Sky Diver and Fire Strike, which are suitable for different types of systems. Both are DirectX 11 benchmarks, but Sky Diver is good for laptops and mid-range PCs, while Fire Strike is more demanding and designed for high-end PCs to show their stuff. The results are proprietary scores.

There is another synthetic graphics test in our graphics mix, this time from Unigine. Like 3DMark, the Superposition test renders and pans through a detailed 3D scene and measures how the system copes. In this case, the rendering takes place in the company’s eponymous Unigine engine, which offers a different 3D workload scenario than 3DMark. This gives a second opinion on the graphics capabilities of the machine.

We present two Superposition results, run at the 720p Low and 1080p High presets. Results are reported in frames per second, with higher frame rates being better. For low-end systems, maintaining at least 30 fps is the realistic goal, while more powerful computers should ideally achieve at least 60 fps at the test resolution.
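
To make the frame-rate targets concrete, the following Python sketch computes an average frame rate from a frame count and a duration, then checks it against the 30 fps and 60 fps goals mentioned above. The benchmarks report fps themselves; the function names and the sample numbers here are purely illustrative.

```python
LOW_END_TARGET_FPS = 30
HIGH_END_TARGET_FPS = 60


def average_fps(frames_rendered, elapsed_seconds):
    """Average frame rate: frames rendered divided by the time taken."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return frames_rendered / elapsed_seconds


def meets_target(fps, high_end):
    """Check a result against the 30 fps (low-end) or 60 fps (high-end) goal."""
    target = HIGH_END_TARGET_FPS if high_end else LOW_END_TARGET_FPS
    return fps >= target


# Illustrative numbers only: 2,700 frames rendered over a 60-second run.
fps = average_fps(2700, 60)
print(f"{fps:.1f} fps")
print("meets low-end goal:", meets_target(fps, high_end=False))
print("meets high-end goal:", meets_target(fps, high_end=True))
```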

Real world gaming tests

The synthetic tests above are helpful for measuring overall suitability for 3D graphics, but it’s hard to beat full retail video games to gauge gaming performance. Far Cry 5 and Rise of the Tomb Raider are both modern high-fidelity titles with built-in benchmarks that illustrate how a system handles real video games in different settings.

Far Cry 5

These games are run at both the moderate and maximum graphics-quality presets in the benchmarking utility. (These presets are Normal and Ultra for Far Cry 5, Medium and Very High for Rise of the Tomb Raider.) We test at 1080p by default when possible and, if the laptop’s native screen resolution is higher or lower, also at the screen’s native resolution to assess performance for that particular system.
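
The combination of games, presets, and resolutions described above can be laid out as a small run matrix. The Python sketch below is one hypothetical way to enumerate it; it always includes 1080p and does not model the "when possible" caveat for panels that cannot display that resolution.

```python
from itertools import product

STANDARD_RESOLUTION = (1920, 1080)

# Moderate and maximum presets for each game, as named in the text above.
GAME_PRESETS = {
    "Far Cry 5": ("Normal", "Ultra"),
    "Rise of the Tomb Raider": ("Medium", "Very High"),
}


def resolutions_to_test(native_resolution):
    """Return 1080p, plus the native resolution when it differs from 1080p."""
    if native_resolution == STANDARD_RESOLUTION:
        return [STANDARD_RESOLUTION]
    return [STANDARD_RESOLUTION, native_resolution]


def build_run_matrix(native_resolution):
    """List every (game, preset, resolution) combination to benchmark."""
    runs = []
    for game, presets in GAME_PRESETS.items():
        resolutions = resolutions_to_test(native_resolution)
        for preset, resolution in product(presets, resolutions):
            runs.append((game, preset, resolution))
    return runs


# Example: a laptop with a 4K (3,840-by-2,160) panel.
for game, preset, (width, height) in build_run_matrix((3840, 2160)):
    print(f"{game} | {preset} | {width}x{height}")
```

A laptop with a native 1080p screen yields four runs; any other native resolution doubles that to eight.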

Rise of the Tomb Raider

These results are also reported in frames per second. Far Cry 5 is a DirectX 11-based game, while Rise of the Tomb Raider can be toggled into DirectX 12 mode, which is what we do for this benchmark.


Checking the battery life

Finally, we carry out a video playback-based battery rundown test, which works on all operating systems, to get a relative estimate of how long the laptop can last away from a power outlet. Our rundown test involves playing a looped 720p version of Tears of Steel (mentioned above), saved on the laptop’s storage drive in MP4 format. If the file does not fit on the system’s local storage, we run the video from an SD card or USB memory stick.

HP Spectre x360

(Photo: Zlata Ivleva)

Before we start the test, we turn on the system’s energy-saving mode, turn the screen brightness down to 50 percent, turn the audio volume up to 100 percent, and deactivate adaptive screen brightness. Wireless radios, keyboard backlighting, and all chassis lighting are turned off. Then we start the test and let it run for as long as a fully charged battery lasts, usually until the system goes to sleep when the battery capacity falls below 5 percent (as set by us under Critical Battery Action in the power options). If the laptop has more than one battery, we run a separate, second rundown test with both batteries installed.

Assessing battery performance is difficult because results can vary widely depending on the type of tasks being performed. (Heavy gaming on battery power, for one, leads to much shorter times.) But in general, we consider a system suitable for all-day computing if it lasts more than eight hours in our battery rundown test.
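
The eight-hour rule of thumb is simple to express in code. In this small Python sketch, the function names and the sample runtime are illustrative assumptions, not measured results.

```python
from datetime import timedelta

ALL_DAY_THRESHOLD = timedelta(hours=8)


def is_all_day(runtime):
    """True only when the rundown result exceeds eight hours."""
    return runtime > ALL_DAY_THRESHOLD


def format_runtime(runtime):
    """Format a runtime as H:MM, dropping any leftover seconds."""
    total_minutes = int(runtime.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}:{minutes:02d}"


# Illustrative runtime only, not a real test result.
sample_runtime = timedelta(hours=9, minutes=5)
print(format_runtime(sample_runtime), "all-day:", is_all_day(sample_runtime))
```

The comparison is strict, so a result of exactly eight hours does not qualify, matching the "more than eight hours" wording.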


Special cases: macOS, Chromebooks, mobile workstations

We don’t run all of the above tests on every laptop. We only run Far Cry 5 and Rise of the Tomb Raider on laptops that are specifically designed for gaming and are equipped with a dedicated graphics processor. And we don’t use PCMark, 3DMark, or Superposition to test Apple laptops because those tests don’t have a macOS version. To evaluate some specialized subgroups of laptops, such as workstations, Chromebooks, and ARM-based systems, we supplement or adjust our standard tests.

Chromebooks

For Chromebooks, for example, the battery rundown test is the only one of the above tests we run, as it is the only one compatible with Chrome OS. Instead of the 720p file, we use a different, lower-resolution source file (a DVD rip of the full The Lord of the Rings trilogy, looped), stored (if possible) on the Chromebook’s internal storage.

We then run Principled Technologies’ CrXPRT and WebXPRT benchmarks to make comparisons between Chromebooks. (Since WebXPRT is a web-based test that can be run on almost any PC, we also use it to test laptops with ARM processors.) These are one-click tests with no settings to customize, and they report proprietary results that are only meaningful relative to each other.

Workstation laptops

For workstation laptops, we carry out all of the above tests and add a few workstation-specific measures. These specialized tests include the multimedia rendering tool POV-Ray (for a ray-tracing simulation). We also run the SPECviewperf 13 suite, loading three “viewsets” for the Creo, Maya, and SolidWorks apps to measure how the workstation handles the manipulation of relevant files in these three groundbreaking workstation programs. The POV-Ray results are reported as the time to complete the test task, and the SPECviewperf results are reported in frames per second.
