C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\bin\win32\Release>deviceQuery.exe
[deviceQuery.exe] starting...

deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Found 1 CUDA Capable device(s)

Device 0: "GeForce 8800 GT"
  CUDA Driver Version / Runtime Version          4.2 / 4.2
  CUDA Capability Major/Minor version number:    1.1
  Total amount of global memory:                 512 MBytes (536870912 bytes)
  (14) Multiprocessors x (  8) CUDA Cores/MP:    112 CUDA Cores
  GPU Clock rate:                                1500 MHz (1.50 GHz)
  Memory Clock rate:                             900 Mhz
  Memory Bus Width:                              256-bit
  Max Texture Dimension Size (x,y,z)             1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
  Max Layered Texture Size (dim) x layers        1D=(8192) x 512, 2D=(8192,8192) x 512
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  768
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             256 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   No
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      No
  Device PCI Bus ID / PCI location ID:           2 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.2, CUDA Runtime Version = 4.2, NumDevs = 1, Device = GeForce 8800 GT
[deviceQuery.exe] test results...

> exiting in 3 seconds: 3...2...1...done! 

If the maximum number of threads per MP is 768
and each MP has 8 CUDA cores,
does that mean each CUDA core holds 96 threads?
Or maybe it is 32 (warp size) x 3, in which case can a single core handle 3 warps?

Hmm.. I'm not sure. T_T
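
One way to read these numbers, offered as a rough sketch rather than a definitive answer: resident threads are counted per multiprocessor and scheduled in warps, so they are not pinned to individual CUDA cores. Under that reading, 768 / 32 = 24 warps can be resident on one MP, and 768 / 8 = 96 is only a ratio, not a per-core thread count. The variable names below are my own, the values are copied from the deviceQuery output above, and the "passes per warp" figure assumes the 8 cores process one 32-thread warp in groups of 8.

```python
# Values copied from the deviceQuery output above (GeForce 8800 GT).
multiprocessors = 14        # "(14) Multiprocessors"
cores_per_mp = 8            # "(  8) CUDA Cores/MP"
warp_size = 32              # "Warp size"
max_threads_per_mp = 768    # "Maximum number of threads per multiprocessor"

# Total CUDA cores on the device.
total_cores = multiprocessors * cores_per_mp

# Threads are scheduled in warps, so the per-MP limit is better
# expressed as a number of resident warps.
resident_warps_per_mp = max_threads_per_mp // warp_size

# The 768 / 8 ratio from the question: resident threads per core.
# This is only arithmetic; threads are not bound to a specific core.
threads_per_core_ratio = max_threads_per_mp // cores_per_mp

# Assumption: 8 cores handle a 32-thread warp 8 threads at a time.
passes_per_warp = warp_size // cores_per_mp

print("total CUDA cores:", total_cores)
print("resident warps per MP:", resident_warps_per_mp)
print("resident threads per core (ratio only):", threads_per_core_ratio)
print("passes per warp on 8 cores:", passes_per_warp)
```

So the "32 x 3" guess matches 96 arithmetically, but the more useful figure is probably the 24 resident warps per MP that the scheduler switches between.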

Posted by 구차니
