DVD Shrink and dual core

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,918
Location
USA
This is kind of random, but I had no idea the simple/free program DVD Shrink would make use of dual CPUs. Encoding full-length DVDs only takes about 30 minutes with the deep scan and highest detail settings, whereas it used to take over an hour for me. When I'm encoding a DVD, both CPUs are maxed out and it is very fast. Has anyone else noticed any applications that have benefited from dual CPUs on a simple level like this?
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
I believe that DVDShrink has two (for the most part) independent threads. So, when each thread can have a processor core to itself, things go a bit faster.

Honestly, very little software out there will give this kind of magical gain right out of the box. It seems to me that a number of encoder programs gain from SMP support, however. I've used the SMP-enabled Lancer Vorbis codec, and it's lightning fast.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,918
Location
USA
Windows claims 21 threads are running for DVDShrink (while it is currently encoding), but I'm uncertain how many of those threads are used for the actual encoding part (perhaps two, like you suggested). What surprised me the most is exactly what you mentioned: very little software is SMP-enabled, so I didn't expect DVDShrink to actually support it.

Does the SMP support in the Lancer Vorbis codec mainly apply during the encoding of media, or does it also work during playback/decoding?

Going back to the threading aspect of SMP: do you know how a developer would typically apply such a benefit at a lower programming level? I'm trying to understand how DVDShrink would even know I had two CPUs at its disposal, never mind how it is benefiting from them. My first impression is that CPU 1 could process frame 1 while CPU 2 works on frame 2 at the same time...and then the results are pieced together as each CPU finishes its frame. Or perhaps it works at an even lower level by processing each line in the frame, one per CPU. I'm trying to understand how tasks can be run in parallel to improve performance, and what typical conventions are used to increase performance in an SMP environment.
 

Sol

Storage is cool
Joined
Feb 10, 2002
Messages
960
Location
Cardiff (Wales)
DVD Shrink quite possibly doesn't know how many cores/CPUs you have. It's just that when it's re-encoding, it's quite definitely doing two separate things: decoding an MPEG-2 video stream and then encoding another MPEG-2 stream. Since these tasks are done by two separate instances of the MPEG-2 codec, Windows will automatically run them on both cores/CPUs.

Or that's my take anyway...
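
To make that idea concrete, here is a minimal Python sketch of a two-stage pipeline: one thread decodes while another encodes, with a bounded queue between them. The `decode` and `encode` functions are hypothetical stand-ins, not anything taken from DVD Shrink. CPython's global interpreter lock also means this shows the structure of the approach rather than a real two-core speedup.

```python
import queue
import threading


def decode(chunk):
    # Hypothetical stand-in for decoding one piece of the source stream.
    return chunk.upper()


def encode(frame):
    # Hypothetical stand-in for re-encoding one decoded frame.
    return frame[::-1]


def transcode(chunks):
    """Run decode and encode as two threads joined by a bounded queue."""
    frames = queue.Queue(maxsize=8)  # bounded, so the decoder cannot run far ahead
    done = object()                  # sentinel marking the end of the stream
    output = []

    def decoder():
        for chunk in chunks:
            frames.put(decode(chunk))
        frames.put(done)

    def encoder():
        while True:
            frame = frames.get()
            if frame is done:
                break
            output.append(encode(frame))

    workers = [threading.Thread(target=decoder), threading.Thread(target=encoder)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return output


print(transcode(["ab", "cd", "ef"]))
```

Because the two stages only meet at the queue, the operating system is free to schedule each thread on its own core without the program ever asking how many cores exist.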
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
22,264
Location
I am omnipresent
Most video apps seem to be threaded applications. I don't know if that's because of the applications themselves or if some Windows libraries related to video processing use/require use of multiple threads.

Virtualdub, Nero Vision and Ulead Moviefactory are all dramatically faster with a second core, too.

Supposedly, AMD is working on some kind of optimizing binary interpreter which will allow multiple CPUs to be seen as a single processor (the opposite of threading) for executing single operations faster. That seems like it might be helpful, given that both Intel and AMD are looking at quad-core processors by the end of the year.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,918
Location
USA
I can't remember off the top of my head, but I think it was whatever the latest version they built before they had to stop (maybe 3.2.x).
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
Does the SMP support in the Lancer Vorbis codec mainly apply during the encoding of media, or does it also work during playback/decoding?

I can't really tell you much about Lancer other than that it's based on aoTuV and extremely fast. This is mostly because the codec's pages are in Japanese, and even the folks at HydrogenAudio don't seem to be able to glean a lot of technical information.

Considering how decoding of audio works, I'm not even sure that you could set it up to take advantage of SMP, let alone get much of a gain from it.
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
DVD Shrink quite possibly doesn't know how many cores/CPUs you have. It's just that when it's re-encoding, it's quite definitely doing two separate things: decoding an MPEG-2 video stream and then encoding another MPEG-2 stream. Since these tasks are done by two separate instances of the MPEG-2 codec, Windows will automatically run them on both cores/CPUs.

I'm pretty sure that's exactly the case. I don't think that the author did anything special; we all just lucked out.

There's usually some kind of gain from multithreaded applications, but the tendency of threads to be highly interdependent makes it difficult to get a massive advantage based solely upon the threads. Generally, the algorithms and code must be specifically aimed at multiprocessing to take real advantage of it. And that can be a real departure from the very linear programs that we have now.
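
One common way to put numbers on this is Amdahl's law: if only a fraction p of the work can run in parallel, n cores give a speedup of at most 1 / ((1 - p) + p / n). Here is a rough Python sketch. The fractions in the loop are illustrative assumptions, not measurements of any program in this thread.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative values only: how much two cores can help at best.
for fraction in (0.5, 0.9, 1.0):
    print(fraction, round(amdahl_speedup(fraction, 2), 3))
```

Only a program whose work is entirely parallel gets the full 2x from a second core; anything with a serial portion gets less.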
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
My first impression is that CPU 1 could process frame 1 while CPU 2 works on frame 2 at the same time...and then the results are pieced together as each CPU finishes its frame.

This wouldn't work with MPEG, as it uses key-framing; so, some frames depend upon previous ones. Because in live-action motion pictures, very little changes from frame to frame, a lot of space can be saved by only encoding the differences, much like joint stereo in MP3.
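
A toy illustration of that dependency, in Python. It stores the first frame in full and every later frame as a per-pixel difference from the one before it. Real MPEG-2 uses motion-compensated prediction within groups of pictures, so this is only a simplified sketch, and the function names are made up for the example.

```python
def delta_encode(frames):
    """Keep the first frame whole; store each later frame as a difference."""
    encoded = [list(frames[0])]  # the "key frame"
    for previous, current in zip(frames, frames[1:]):
        encoded.append([c - p for p, c in zip(previous, current)])
    return encoded


def delta_decode(encoded):
    """Rebuild frames in order; each one needs the frame before it."""
    frames = [list(encoded[0])]
    for diff in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], diff)])
    return frames


# Three tiny "frames" of three pixels each, changing very little between frames.
frames = [[10, 10, 10], [10, 11, 10], [10, 11, 12]]
encoded = delta_encode(frames)
print(encoded)
print(delta_decode(encoded) == frames)
```

The difference frames are mostly zeros, which is where the space saving comes from, but frame 3 cannot be rebuilt until frame 2 has been.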
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
Supposedly, AMD is working on some kind of optimizing binary interpreter which will allow multiple CPUs to be seen as a single processor (the opposite of threading) for executing single operations faster. That seems like it might be helpful, given that both Intel and AMD are looking at quad-core processors by the end of the year.

It's not clear to me how, at a conceptual level, this would work.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,918
Location
USA
DVDShrink 3.2.0.15 is also what I have.

Unrelated to DVDShrink, but still related to SMP: I tried the LAME codec last night that was built for SMP, using version "lame.3.97a.ms-mt.exe", and it wasn't significantly (if at all) faster than the non-SMP version. Maybe the same thing applies to variable-rate MP3s as it does to MPEG-2 video...the next frame depends on the previous?
 

Gilbo

Storage is cool
Joined
Aug 19, 2004
Messages
742
Location
Ottawa, ON
I think some faster processors can be disk-limited on MP3 encoding from *.wav now --the encoders have gotten so efficient and the CPUs have gotten faster. I don't know if that's the case for you.
 

Buck

Storage? I am Storage!
Joined
Feb 22, 2002
Messages
4,514
Location
Blurry.
Website
www.hlmcompany.com
DVDShrink 3.2.0.15 is also what I have.

Unrelated to DVDShrink, but still related to SMP: I tried the LAME codec last night that was built for SMP, using version "lame.3.97a.ms-mt.exe", and it wasn't significantly (if at all) faster than the non-SMP version. Maybe the same thing applies to variable-rate MP3s as it does to MPEG-2 video...the next frame depends on the previous?

Handy, do you recall what the CPU usage was at?
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,918
Location
USA
It was roughly between 50% and 70%, but closer to an average of 60% during most of the encoding. I encoded a 38-minute WAV file for extended testing. A good portion of the WAV file was silence because it was a bonus track at the end of a CD. I just wanted a long enough sample to see how much CPU it was utilizing. Unlike DVDShrink, the LAME mt (SMP-enabled) version never got to 99% usage (that I can remember). I can do a time comparison of both if you're curious about the numbers.
 

Buck

Storage? I am Storage!
Joined
Feb 22, 2002
Messages
4,514
Location
Blurry.
Website
www.hlmcompany.com
My interest in CPU utilization is to see what sub-system is the bottleneck. If CPU utilization is low, even on a dual core system, then some other sub-system is the culprit.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,918
Location
USA
Subject -
Mighty Mighty Bosstone (Ska-Core, the Devil and more)
Track #6 (Drugs and Kittens I'll Drink to That)
Length: 38:44:33

Tool:
extracted with Exact Audio Copy
Liteon DVD-ROM 16X




TEST 1:
LAME non-SMP
-b 320 -m s -h -V 0 -B 320
File size (on Disk)
391 MB (410,034,176 bytes)

C:\lame>lame -b 320 -m s -h -V 0 -B 320 "e:\Drugs And Kittens I'll Drink To That.wav" "e:\Drugs And Kittens I'll Drink To That.mp3"
LAME 3.97 32bits (http://www.mp3dev.org/)
CPU features: MMX (ASM used), 3DNow! (ASM used), SSE (ASM used), SSE2
Using polyphase lowpass filter, transition band: 20094 Hz - 20627 Hz
Encoding e:\Drugs And Kittens I'll Drink To That.wav
to e:\Drugs And Kittens I'll Drink To That.mp3
Encoding as 44.1 kHz 320 kbps stereo MPEG-1 Layer III (4.4x) qval=2
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
88984/88984 (100%)| 1:13/ 1:13| 1:13/ 1:13| 31.774x| 0:00
320 [88984]
kbps LR % long switch short %
320.0 100.0 99.5 0.3 0.2
Writing LAME Tag...done
ReplayGain: -0.7dB




TEST 2:
LAME SMP enabled
-b 320 -m s -h -V 0 -B 320
File size (on Disk)
391 MB (410,034,176 bytes)


C:\lame>lame.3.97a.ms-mt -b 320 -m s -h -V 0 -B 320 "e:\Drugs And Kittens I'll Drink To That.wav" "e:\Drugs And Kittens I'll Drink To That.mp3"
LAME version 3.97 MMX (alpha 2, Feb 21 2005 09:19:33) (http://www.mp3dev.org/)
warning: alpha versions should be used for testing only
CPU features: MMX (ASM used), 3DNow! (ASM used), SSE, SSE2
Using Multi-threaded encoder
Using polyphase lowpass filter, transition band: 20094 Hz - 20627 Hz
Encoding e:\Drugs And Kittens I'll Drink To That.wav
to e:\Drugs And Kittens I'll Drink To That.mp3
Encoding as 44.1 kHz 320 kbps stereo MPEG-1 Layer III (4.4x) qval=2
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
88981/88984 (100%)| 2:20/ 2:20| 2:20/ 2:20| 16.492x| 0:00
average: 320.0 kbps LR: 88984 (100.0%)

Writing LAME Tag...done
ReplayGain: -0.7dB

I have two CSV files that track the CPU usage and memory paging during each test. I used the Windows Performance Monitor to take a snapshot of each reading once per second and then saved it as a CSV file. You can open those in Excel and see the values for each second (and apply a chart if you want).
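
For a quick comparison of the two runs above, here is a small Python sketch that turns the REAL time column into seconds. The two times are copied from the LAME output in this post; `to_seconds` is just a helper name made up for the example.

```python
def to_seconds(stamp):
    """Convert a LAME-style "m:ss" time string into whole seconds."""
    minutes, seconds = stamp.strip().split(":")
    return int(minutes) * 60 + int(seconds)


single_threaded = to_seconds("1:13")  # TEST 1, LAME non-SMP
multi_threaded = to_seconds("2:20")   # TEST 2, lame.3.97a.ms-mt

ratio = multi_threaded / single_threaded
print(f"non-SMP: {single_threaded} s, SMP build: {multi_threaded} s, ratio: {ratio:.2f}")
```

By that measure the multi-threaded alpha build took nearly twice as long on this file, which matches the play/CPU column (31.774x versus 16.492x).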
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
LAME is a bad example.

First, to multithread LAME, you have to get rid of the bit reservoir. This leads to reduced quality output.

Second, the possible gain is far less than 1.5x the single-threaded version. All that this particular codec is doing is encoding two frames simultaneously.

I've never used this codec, but I recall reading that you had to pass the switch to turn off the bit reservoir before it actually ran in SMP mode. Otherwise, you'd just get general optimizations.
 