2 posts tagged 'ion cuda'

  1. 2012.06.02 CUDA devicequery - ION 330
  2. 2012.05.22 nvidia ion cuda core and h.264 library
Programming/openCL & CUDA  2012. 6. 2. 22:00
This is the deviceQuery output for the ION built into a Ripple Look ION330.
It has 2 MPs (multiprocessors) with 8 CUDA cores each, for 16 CUDA cores in total.

~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release$ ./deviceQuery
[deviceQuery] starting...

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Found 1 CUDA Capable device(s)

Device 0: "ION"
  CUDA Driver Version / Runtime Version          4.2 / 4.2
  CUDA Capability Major/Minor version number:    1.1
  Total amount of global memory:                 254 MBytes (266010624 bytes)
  ( 2) Multiprocessors x (  8) CUDA Cores/MP:    16 CUDA Cores
  GPU Clock rate:                                1100 MHz (1.10 GHz)
  Memory Clock rate:                             800 Mhz
  Memory Bus Width:                              64-bit
  Max Texture Dimension Size (x,y,z)             1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
  Max Layered Texture Size (dim) x layers        1D=(8192) x 512, 2D=(8192,8192) x 512
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  768
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             256 bytes
  Concurrent copy and execution:                 No with 0 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   No
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      No
  Device PCI Bus ID / PCI location ID:           3 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.2, CUDA Runtime Version = 4.2, NumDevs = 1, Device = ION
[deviceQuery] test results...
PASSED

> exiting in 3 seconds:
3...2...1...done! 
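
The 16-core total in the output above is just multiprocessors times cores per multiprocessor, and the per-MP count depends on the compute capability. A minimal Python sketch of that arithmetic follows. The table and the name `cuda_cores` are illustrative and not part of any NVIDIA tool, and only a few well-known generations are listed.

```python
# Cores per multiprocessor, keyed by compute capability (major, minor).
# Illustrative table covering only a few well-known generations.
CORES_PER_MP = {
    (1, 0): 8, (1, 1): 8, (1, 2): 8, (1, 3): 8,
    (2, 0): 32, (2, 1): 48,
}


def cuda_cores(multiprocessors, capability):
    """Return the total CUDA cores: multiprocessors x cores per MP."""
    return multiprocessors * CORES_PER_MP[capability]


if __name__ == "__main__":
    # ION from the deviceQuery output: 2 MPs, compute capability 1.1
    print(cuda_cores(2, (1, 1)))
```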

By the way, this is integrated graphics sharing main memory, so why do the bandwidth figures differ this much?
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release$ ./bandwidthTest
[bandwidthTest] starting...

./bandwidthTest Starting...

Running on...

 Device 0: ION
 Quick Mode

 Host to Device Bandwidth, 1 Device(s), Paged memory
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     887.0

 Device to Host Bandwidth, 1 Device(s), Paged memory
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     735.9

 Device to Device Bandwidth, 1 Device(s)
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     5345.2

[bandwidthTest] test results...
PASSED

> exiting in 3 seconds: 3...2...1...done! 
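
To put numbers on the gap, the sketch below computes how much faster the device-to-device copy is than the two host copies, and how long one 32 MiB host-to-device transfer takes at the measured rate. It assumes that bandwidthTest's "MB" means 2**20 bytes. The function names are made up for this example. One plausible reason for the gap, not verified here, is that a pageable-memory copy is still a real copy staged by the CPU even on an integrated GPU, while the device-to-device copy is done by the GPU itself.

```python
MIB = 1 << 20  # assumption: the tool's "MB" is 2**20 bytes

# Figures copied from the bandwidthTest output above, in MB/s.
MEASURED = {
    "host_to_device": 887.0,
    "device_to_host": 735.9,
    "device_to_device": 5345.2,
}
TRANSFER_BYTES = 33554432  # transfer size used in Quick Mode


def ratio_to_d2d(results):
    """How many times faster device-to-device is than each host copy."""
    d2d = results["device_to_device"]
    return {k: d2d / v for k, v in results.items() if k != "device_to_device"}


def transfer_ms(num_bytes, mb_per_s):
    """Time in milliseconds to move num_bytes at the given rate."""
    return num_bytes / (mb_per_s * MIB) * 1000.0


if __name__ == "__main__":
    for name, ratio in ratio_to_d2d(MEASURED).items():
        print(f"device_to_device is {ratio:.2f}x {name}")
    ms = transfer_ms(TRANSFER_BYTES, MEASURED["host_to_device"])
    print(f"32 MiB host-to-device takes about {ms:.1f} ms")
```

Since deviceQuery reports "Support host page-locked memory mapping: Yes", mapped (zero-copy) host memory may be worth trying on this device to avoid the copy altogether.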

Compared with the 8800GT, the compute throughput and fps in nbody feel clearly much lower.
(The 8800GT gets about 150 fps and 50 GFLOP/s.)
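
A back-of-the-envelope estimate of how large the gap should be on paper: peak throughput is roughly cores x shader clock x FLOPs per cycle. The ION figures (16 cores, 1.10 GHz) come from the deviceQuery output above. The 8800GT figures (112 cores, 1.5 GHz shader clock) are the card's published specs and are not from this post. Counting 2 FLOPs per core per cycle (one multiply-add) is an assumption, so treat the result as a rough ratio, not a benchmark.

```python
# Assumption: each core retires one multiply-add (2 FLOPs) per cycle.
FLOPS_PER_CORE_PER_CYCLE = 2


def peak_gflops(cores, shader_clock_ghz,
                flops_per_cycle=FLOPS_PER_CORE_PER_CYCLE):
    """Theoretical peak in GFLOP/s: cores x clock (GHz) x FLOPs per cycle."""
    return cores * shader_clock_ghz * flops_per_cycle


if __name__ == "__main__":
    ion = peak_gflops(16, 1.10)       # from deviceQuery above
    gt8800 = peak_gflops(112, 1.5)    # published 8800GT specs
    print(f"ION peak:    {ion:.1f} GFLOP/s")
    print(f"8800GT peak: {gt8800:.1f} GFLOP/s")
    print(f"ratio:       {gt8800 / ion:.1f}x")
```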

2010/11/02 - [Programming/openCL / CUDA] - CUDA sample run results + SLI


+ How to check the nvidia driver version on Linux
$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86 Kernel Module  295.40  Thu Apr  5 21:28:09 PDT 2012
GCC version:  gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

[Link: http://www.nvnews.net/vbulletin/showthread.php?t=127289]
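
If the version number is needed in a script, the same file can be parsed. A minimal Python sketch, with a regular expression based only on the sample line shown above; other driver releases may format the line differently, and the function names are made up for this example.

```python
import re


def nvidia_driver_version(text):
    """Extract the driver version from /proc/driver/nvidia/version contents.

    Returns None when the expected "Kernel Module <version>" text is absent.
    """
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def read_driver_version(path="/proc/driver/nvidia/version"):
    """Read and parse the proc file; None if it is missing or unreadable."""
    try:
        with open(path) as f:
            return nvidia_driver_version(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_driver_version())
```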


Posted by 구차니
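Programming/openCL & CUDA  2012. 5. 22. 20:59
The ION has 2 MPs built in, but
hmm, what was the minimum that the H.264 encoding/decoding libraries require again...

A scalar processor is a CUDA core, and each MP has 8 of them, so it takes 4 MPs to reach 32 scalar processors.
H.264 encoding therefore looks impossible on the ION.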

MPEG-2/VC-1 support 
 Decode Acceleration for G8x, G9x (Requires Compute 1.1 or higher) 
 Full Bitstream Decode for MCP79, MCP89, G98, GT2xx, GF1xx 
MPEG-2 CUDA accelerated decode with a GPUs with 8+ SMs (64 CUDA cores).  (Windows) 
 Supports HD (1080i/p) playback including Bluray content 
 R185+ (Windows), R260+ (Linux) 
 
H.264/AVCHD support 
 Baseline, Main, and High Profile, up to Level 4.1  
 Full Bitstream Decoding in hardware including HD (1080i/p) Bluray content 
 Supports B-Frames, bitrates up to 45 mbps 
 Available on NVIDIA GPUs:  G8x, G9x, MCP79, MCP89, G98, GT2xx, GF1xx 
 R185+ (Windows), R260+ (Linux) 

[Source: CUDA_VideoDecoder_Library.pdf]

 Supported on all CUDA-enabled GPUs with 32 scalar processor cores or more 
[Source: CUDA_VideoEncoder_Library.pdf]
[Link: http://www.vpac.org/files/GPU-Slides/05.CudaOptimization.pdf]
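
The reasoning above can be written out as a small check. The 32-core threshold is taken from the encoder library quote, and 8 cores per MP applies to compute capability 1.x devices such as the ION. The function names are made up for this sketch and are not part of the CUDA video libraries.

```python
ENCODER_MIN_CORES = 32  # from the CUDA_VideoEncoder_Library.pdf quote
CORES_PER_MP = 8        # compute capability 1.x


def can_use_cuda_encoder(mps, cores_per_mp=CORES_PER_MP):
    """True if the GPU meets the encoder's scalar-processor minimum."""
    return mps * cores_per_mp >= ENCODER_MIN_CORES


def min_mps_for_encoder(cores_per_mp=CORES_PER_MP):
    """Smallest MP count that reaches the minimum (ceiling division)."""
    return -(-ENCODER_MIN_CORES // cores_per_mp)


if __name__ == "__main__":
    print("ION (2 MPs) can encode:", can_use_cuda_encoder(2))
    print("MPs needed at 8 cores/MP:", min_mps_for_encoder())
```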

Device 0: "ION"
  CUDA Driver Version:                           2.30
  CUDA Runtime Version:                          2.30
  CUDA Capability Major revision number:         1
  CUDA Capability Minor revision number:         1
  Total amount of global memory:                 268435456 bytes
  Number of multiprocessors:                     2
  Number of cores:                               16
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.10 GHz
  Concurrent copy and execution:                 No
  Run time limit on kernels:                     No
  Integrated:                                    Yes
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host threads can use this devi
 
[Link: http://forums.nvidia.com/index.php?showtopic=100288]

[Link: http://www.nvidia.com/object/picoatom_specifications.html]
[Link: http://en.wikipedia.org/wiki/Nvidia_Ion]

Posted by 구차니