Why are two threads slower than one when encoding with Theora in an emulated Ubuntu?


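I compared single-threaded and multithreaded encoding, and in an emulated Ubuntu running under VirtualBox the multithreaded test is actually slower: 2 threads take longer than 1. I used the Theora encoder (`encoder_example`, as run below). My hardware is an Intel i7 Haswell with 2 cores, and I have configured VirtualBox for 2 CPUs. Why is the result not what I expected? I expected multithreaded encoding to be faster, but it is much slower.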

developer@developer-VirtualBox:~/theora-multithread/examples$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 69
Stepping:              1
CPU MHz:               0.000
BogoMIPS:              1687.55
L1d cache:             32K
L1d cache:             32K
L2d cache:             6144K
NUMA node0 CPU(s):     0,1
developer@developer-VirtualBox:~/theora-multithread/examples$ time ./encoder_example --number-of-threads 1 wavesound.wav tmp.yuv -o TEST-1-thread.ogg
File wavesound.wav is 16 bit 2 channel 44100 Hz RIFF WAV audio.
File tmp.yuv is 48x48 25.00 fps YUV12 video.
Number of Threads: 1
Compressing....
      0:46:32.08 audio: 66kbps video: 3kbps                 
done.


real    0m23.907s
user    0m12.319s
sys     0m1.623s
developer@developer-VirtualBox:~/theora-multithread/examples$ time ./encoder_example --number-of-threads 2 wavesound.wav tmp.yuv -o TEST-2-thread.ogg
File wavesound.wav is 16 bit 2 channel 44100 Hz RIFF WAV audio.
File tmp.yuv is 48x48 25.00 fps YUV12 video.
Number of Threads: 2
Compressing....
      0:46:32.08 audio: 66kbps video: 3kbps                 
done.


real    1m7.882s
user    0m22.370s
sys     0m33.304s
developer@developer-VirtualBox:~/theora-multithread/examples$ 
CPU-Z in the host OS (Windows 8.1) reports the following about the hardware:

Processor 1         ID = 0
    Number of cores     2 (max 8)
    Number of threads   4 (max 16)
    Name            Intel Core i3/i5/i7 4xxx
    Codename        Haswell ULT
    Specification       Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz
    Package (platform ID)   Socket 1168 BGA (0x6)
    CPUID           6.5.1
    Extended CPUID      6.45
    Core Stepping       C0
    Technology      22 nm
    TDP Limit       28 Watts
    Tjmax           100.0 °C
    Core Speed      798.4 MHz
    Multiplier x Bus Speed  8.0 x 99.8 MHz
    Stock frequency     2800 MHz
    Instructions sets   MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, EM64T, VT-x, AES, AVX, AVX2, FMA3
    L1 Data cache       2 x 32 KBytes, 8-way set associative, 64-byte line size
    L1 Instruction cache    2 x 32 KBytes, 8-way set associative, 64-byte line size
    L2 cache        2 x 256 KBytes, 8-way set associative, 64-byte line size
    L3 cache        4 MBytes, 16-way set associative, 64-byte line size
    FID/VID Control     yes
Test 2: When testing with a larger audio file (the video is just a dummy video created from a static PNG), the difference is not that large (?)

Test 2 (video only): Only when testing video alone can I reproduce a speedup from the number of threads:

developer@developer-VirtualBox:~/theora-multithread$ time ./examples/encoder_example -v 1 -a 1 --number-of-threads 1 stream.yuv > theora_testfile_1.ogg
File stream.yuv is 320x240 15.00 fps YUV12 video.
Number of Threads: 1
Compressing....
      0:00:07.60 audio: 0kbps video: 138kbps                 
done.


real    0m2.136s
user    0m1.920s
sys     0m0.083s
developer@developer-VirtualBox:~/theora-multithread$ time ./examples/encoder_example -v 1 -a 1 --number-of-threads 2 stream.yuv > theora_testfile_2.ogg
File stream.yuv is 320x240 15.00 fps YUV12 video.
Number of Threads: 2
Compressing....
      0:00:07.60 audio: 0kbps video: 139kbps                 
done.


real    0m2.043s
user    0m1.994s
sys     0m0.175s
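
To put the timings side by side, here is a small Python sketch that parses the `time` values copied from the runs in this question and derives the wall-clock speedup and the share of CPU time spent in the kernel (`sys`). The helper names (`parse_time`, `summarize`, `speedup`) are made up for this illustration. With these numbers the audio+video run comes out at about 0.35x (a slowdown) with 2 threads, while the video-only run comes out at about 1.05x; the `sys` share of CPU time rises from roughly 12% to roughly 60% in the audio+video case, which suggests most of the extra time is spent in the kernel rather than in the encoder.

```python
import re

def parse_time(text):
    """Convert a `time` value such as '1m7.882s' into seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", text).groups()
    return int(minutes) * 60 + float(seconds)

# (real, user, sys) values copied from the terminal output in this question.
runs = {
    "audio+video, 1 thread": ("0m23.907s", "0m12.319s", "0m1.623s"),
    "audio+video, 2 threads": ("1m7.882s", "0m22.370s", "0m33.304s"),
    "video only, 1 thread": ("0m2.136s", "0m1.920s", "0m0.083s"),
    "video only, 2 threads": ("0m2.043s", "0m1.994s", "0m0.175s"),
}

def summarize(real, user, sys_):
    """Return wall time, total CPU time, and the kernel share of CPU time."""
    real, user, sys_ = (parse_time(v) for v in (real, user, sys_))
    cpu = user + sys_
    return {"real": real, "cpu": cpu, "sys_share": sys_ / cpu}

summary = {name: summarize(*times) for name, times in runs.items()}

def speedup(baseline, candidate):
    """Wall-clock speedup of `candidate` relative to `baseline` (>1 is faster)."""
    return summary[baseline]["real"] / summary[candidate]["real"]

for name, s in summary.items():
    print(f"{name}: real={s['real']:.3f}s cpu={s['cpu']:.3f}s "
          f"sys share={s['sys_share']:.0%}")

print("audio+video speedup:",
      round(speedup("audio+video, 1 thread", "audio+video, 2 threads"), 2))
print("video-only speedup:",
      round(speedup("video only, 1 thread", "video only, 2 threads"), 2))
```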

If I had to guess, it is because the overhead of creating threads and context switching costs more than the work itself.

Keep in mind that kernel threads are much more expensive than user threads. Avoid kernel-level threads if you can.

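For better performance, try to run larger tasks concurrently, and avoid operations that trigger context switches (such as waiting for a resource or blocking).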
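
As a rough illustration of "larger tasks", the Python sketch below runs the same computation two ways: one task per item (many tiny hand-offs between threads) and one task per worker (a few large chunks). The names `work`, `chunked`, `total_fine_grained`, and `total_coarse_grained` are hypothetical and have nothing to do with the Theora encoder. Note that CPython threads do not speed up CPU-bound code because of the global interpreter lock, so this only shows how to restructure the work, not a measured gain.

```python
from concurrent.futures import ThreadPoolExecutor

def work(x):
    """Stand-in for a small unit of work (hypothetical)."""
    return x * x

def chunked(items, n_chunks):
    """Split items into at most n_chunks contiguous slices of near-equal size."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def process_chunk(chunk):
    """Handle a whole chunk inside one task, with no hand-off per item."""
    return sum(work(x) for x in chunk)

def total_fine_grained(items, workers=2):
    # One task per item: every item is queued and dispatched separately.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(work, items))

def total_coarse_grained(items, workers=2):
    # One task per worker: each thread gets a single large slice.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunked(items, workers)))

if __name__ == "__main__":
    data = list(range(1000))
    print(total_fine_grained(data) == total_coarse_grained(data))
```

Both functions return the same total; the coarse-grained version simply submits far fewer tasks, which is the point the paragraph above makes about avoiding needless switching.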


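Also, reuse thread resources. Creating a new thread for every task can hurt your application's performance. Pooling threads helps avoid the overhead of creating them.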
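
A minimal Python sketch of the pooling idea, assuming a batch of independent tasks: `run_with_new_threads` starts one fresh thread per task, while `run_with_pool` hands the same tasks to a fixed `ThreadPoolExecutor` and records how many distinct worker threads actually ran them. Both function names are invented for this example; the encoder itself is C code and manages its threads differently.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_with_new_threads(tasks):
    """Start one fresh thread per task; return results and threads created."""
    results = [None] * len(tasks)

    def runner(index, fn):
        results[index] = fn()

    threads = [threading.Thread(target=runner, args=(i, fn))
               for i, fn in enumerate(tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, len(threads)

def run_with_pool(tasks, workers=2):
    """Run tasks on a fixed pool; return results and distinct threads used."""
    used = set()
    lock = threading.Lock()

    def runner(fn):
        with lock:
            used.add(threading.current_thread().name)
        return fn()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(runner, tasks))
    return results, len(used)

if __name__ == "__main__":
    tasks = [lambda i=i: i * 2 for i in range(20)]
    _, created = run_with_new_threads(tasks)
    _, reused = run_with_pool(tasks, workers=2)
    print(f"threads created without a pool: {created}")
    print(f"worker threads used by the pool: {reused}")
```

With 20 tasks the first variant creates 20 threads, whereas the pool never uses more than `workers` threads, so the creation cost is paid at most that many times.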

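I can test with a larger audio file and/or start two jobs at the same time and see which finishes first, the single-threaded or the multithreaded one. The video file is only there because the input seems to require video. I can also test directly on a natively installed Ubuntu, if that would make a difference. The author of the code did report a speedup, and I would like to reproduce that gain if I can.

A larger audio file might help. It really depends on how the developer decided to manage threads and partition the tasks.

Thanks for the information. Now I can get a speedup, but only with video. I think the program is multithreaded only for video compression, and audio compression is probably the same either way.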