CUDA Toolkit 3.2 (November 2010)
Last Updated: 12/22/2010
Release Highlights
New and Improved CUDA Libraries
[Link: http://developer.nvidia.com/object/cuda_3_2_downloads.html]
| A multithreaded program is partitioned into blocks of threads that execute independently from each other, so that a GPU with more cores will automatically execute the program in less time than a GPU with fewer cores. => In other words, the program is split into blocks of threads that run independently of one another, which is why a GPU with more cores can finish it in less time than a GPU with fewer cores. |
| MatAdd<<<numBlocks, threadsPerBlock>>>(A, B, C); |
| dim3 threadsPerBlock(16, 16);
dim3 numBlocks(N / threadsPerBlock.x, N / threadsPerBlock.y);
MatAdd<<<numBlocks, threadsPerBlock>>>(A, B, C); |
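The launch lines above only show the call site, so here is a minimal sketch of a complete program around them. It is an illustration, not the code from the NVIDIA guide: the flat `float*` layout, the extra `n` parameter, the bounds check, and the `runMatAdd` helper are all assumptions added for this example. The grid size is rounded up so that `n` does not have to be a multiple of 16, which the `N / threadsPerBlock.x` form quietly requires.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread adds one element. The bounds check lets n be any size,
// because the rounded-up grid may contain threads outside the matrix.
__global__ void MatAdd(const float* A, const float* B, float* C, int n)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < n && col < n) {
        int i = row * n + col;
        C[i] = A[i] + B[i];
    }
}

// Hypothetical helper: runs MatAdd on n x n inputs and checks the result on the host.
bool runMatAdd(int n)
{
    size_t count = (size_t)n * n;
    size_t bytes = count * sizeof(float);
    std::vector<float> hA(count), hB(count), hC(count, 0.0f);
    for (size_t i = 0; i < count; ++i) {
        hA[i] = (float)(i % 100);
        hB[i] = 2.0f * (float)(i % 100);
    }

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess &&
              cudaMalloc(&dB, bytes) == cudaSuccess &&
              cudaMalloc(&dC, bytes) == cudaSuccess;
    if (ok) {
        cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

        // 16 x 16 = 256 threads per block; grid rounded up to cover all n x n elements.
        dim3 threadsPerBlock(16, 16);
        dim3 numBlocks((n + threadsPerBlock.x - 1) / threadsPerBlock.x,
                       (n + threadsPerBlock.y - 1) / threadsPerBlock.y);
        MatAdd<<<numBlocks, threadsPerBlock>>>(dA, dB, dC, n);

        // The copy back waits for the kernel and surfaces any launch error.
        ok = cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
        for (size_t i = 0; ok && i < count; ++i)
            ok = (hC[i] == hA[i] + hB[i]);
    }
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return ok;
}

int main()
{
    printf("MatAdd 64x64:   %s\n", runMatAdd(64) ? "ok" : "FAILED");
    printf("MatAdd 100x100: %s\n", runMatAdd(100) ? "ok" : "FAILED");
    return 0;
}
```

Because the blocks do not depend on each other, the same binary can run unchanged on a GPU with few or many multiprocessors; only the number of blocks executing at once differs.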
| B.15 Execution Configuration
Any call to a __global__ function must specify the execution configuration for that call. The execution configuration defines the dimension of the grid and blocks that will be used to execute the function on the device, as well as the associated stream (see Section 3.3.10.1 for a description of streams).
When using the driver API, the execution configuration is specified through a series of driver function calls as detailed in Section 3.3.3.
When using the runtime API (Section 3.2), the execution configuration is specified by inserting an expression of the form <<< Dg, Db, Ns, S >>> between the function name and the parenthesized argument list, where:
Dg is of type dim3 (see Section B.3.2) and specifies the dimension and size of the grid, such that Dg.x * Dg.y equals the number of blocks being launched; Dg.z must be equal to 1;
Db is of type dim3 (see Section B.3.2) and specifies the dimension and size of each block, such that Db.x * Db.y * Db.z equals the number of threads per block;
Ns is of type size_t and specifies the number of bytes in shared memory that is dynamically allocated per block for this call in addition to the statically allocated memory; this dynamically allocated memory is used by any of the variables declared as an external array as mentioned in Section B.2.3; Ns is an optional argument which defaults to 0;
S is of type cudaStream_t and specifies the associated stream; S is an optional argument which defaults to 0.
As an example, a function declared as
__global__ void Func(float* parameter);
must be called like this:
Func<<< Dg, Db, Ns >>>(parameter);
The arguments to the execution configuration are evaluated before the actual function arguments and, like the function arguments, are currently passed via shared memory to the device.
The function call will fail if Dg or Db are greater than the maximum sizes allowed for the device as specified in Appendix G, or if Ns is greater than the maximum amount of shared memory available on the device, minus the amount of shared memory required for static allocation, function arguments (for devices of compute capability 1.x), and execution configuration. |
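To make the four launch parameters concrete, the following sketch uses all of them with the runtime API. The kernel `ReverseChunks` and the helper `runReverse` are hypothetical names invented for this illustration. The kernel reads its `extern __shared__` array, whose size comes from `Ns`, and the launch is queued on a non-default stream `S`. It assumes the array length is exactly blocks × threads, so the kernel has no bounds check.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Reverses each block-sized chunk of data in place.
// The size of tile[] is not known at compile time; it is set by Ns at launch.
__global__ void ReverseChunks(int* data)
{
    extern __shared__ int tile[];
    int t = threadIdx.x;
    int base = blockIdx.x * blockDim.x;
    tile[t] = data[base + t];
    __syncthreads();                       // whole chunk is staged before anyone reads it
    data[base + t] = tile[blockDim.x - 1 - t];
}

// Hypothetical helper: launches with <<<Dg, Db, Ns, S>>> and verifies on the host.
bool runReverse(int blocks, int threads)
{
    int n = blocks * threads;
    size_t bytes = (size_t)n * sizeof(int);
    std::vector<int> h(n);
    for (int i = 0; i < n; ++i) h[i] = i;

    cudaStream_t S;
    if (cudaStreamCreate(&S) != cudaSuccess) return false;

    int* d = nullptr;
    bool ok = cudaMalloc(&d, bytes) == cudaSuccess;
    if (ok) {
        cudaMemcpyAsync(d, h.data(), bytes, cudaMemcpyHostToDevice, S);

        dim3 Dg(blocks);                        // grid: number of blocks
        dim3 Db(threads);                       // block: threads per block
        size_t Ns = threads * sizeof(int);      // dynamic shared memory per block, in bytes
        ReverseChunks<<<Dg, Db, Ns, S>>>(d);

        // An invalid configuration (too many threads, too much shared memory)
        // is reported here rather than at the launch statement itself.
        ok = cudaGetLastError() == cudaSuccess;

        cudaMemcpyAsync(h.data(), d, bytes, cudaMemcpyDeviceToHost, S);
        ok = ok && cudaStreamSynchronize(S) == cudaSuccess;

        for (int b = 0; ok && b < blocks; ++b)
            for (int t = 0; ok && t < threads; ++t)
                ok = (h[b * threads + t] == b * threads + (threads - 1 - t));
    }
    cudaFree(d);
    cudaStreamDestroy(S);
    return ok;
}

int main()
{
    printf("4 blocks x 128 threads: %s\n", runReverse(4, 128) ? "ok" : "FAILED");
    return 0;
}
```

Omitting `Ns` and `S` makes both default to 0, which is why the shorter `<<<numBlocks, threadsPerBlock>>>` form works for kernels that use no dynamically sized shared memory.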
| $ sudo vi /etc/ld.so.conf.d/libcuda.conf
/usr/local/cuda/lib
$ sudo ldconfig
$ sudo apt-get install libglut3
$ sudo ln -s /usr/lib/libglut.so.3 /usr/lib/libglut.so
$ sudo ln -s /usr/lib/libGLU.so.1 /usr/lib/libGLU.so
$ sudo ln -s /usr/lib/libX11.so.6 /usr/lib/libX11.so
$ sudo ln -s /usr/lib/libXi.so.6 /usr/lib/libXi.so
$ sudo ln -s /usr/lib/libXmu.so.6 /usr/lib/libXmu.so |
~/NVIDIA_GPU_Computing_SDK/C/src/deviceQuery$ make
/usr/bin/ld: cannot find -lcutil_i386
collect2: ld returned 1 exit status
make: *** [../../bin/linux/release/deviceQuery] Error 1 |
| NVIDIA Cuda
Before running the Makefile, you will need to install gcc 4.3 and g++ 4.3. This is because the NVIDIA Cuda SDK 3.0 has not yet worked with gcc 4.0 and g++ 4.0. There should be no issue compiling cuda files with gcc 4.3 and g++ 4.3 on newer NVIDIA Cuda SDK versions. For a successful compilation, please follow these steps:
...
3) Create a directory and create symlinks to gcc-4.3/g++-4.3
$ mkdir mygcc
$ cd mygcc
$ ln -s $(which g++-4.3) g++
$ ln -s $(which gcc-4.3) gcc
[Link: http://boinc.berkeley.edu/trac/wiki/GPUApp] |
| $ gcc --version
gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ g++ --version
g++ (Ubuntu 4.4.3-4ubuntu5) 4.4.3
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ whereis g++
g++: /usr/bin/g++ /usr/share/man/man1/g++.1.gz
$ whereis gcc
gcc: /usr/bin/gcc /usr/lib/gcc /usr/share/man/man1/gcc.1.gz |
| libcudart.so.3: cannot open shared object file: No such file or directory |
| ~/NVIDIA_GPU_Computing_SDK/C/lib$ ll
total 224
drwxr-xr-x 2 minimonk minimonk 4096 2010-12-05 23:58 ./
drwxr-xr-x 9 minimonk minimonk 4096 2010-12-05 23:58 ../
-rw-r--r-- 1 minimonk minimonk 142978 2010-12-05 23:58 libcutil_i386.a
-rw-r--r-- 1 minimonk minimonk 30512 2010-12-05 23:58 libparamgl_i386.a
-rw-r--r-- 1 minimonk minimonk 43034 2010-12-05 23:58 librendercheckgl_i386.a |
| ~/NVIDIA_GPU_Computing_SDK/C/src/marchingCubes$ make
/usr/bin/ld: cannot find -lGLU
collect2: ld returned 1 exit status
make: *** [../../bin/linux/release/marchingCubes] Error 1
$ sudo find / -name "*GLU*"
/usr/lib/libGLU.so.1
/usr/lib/libGLU.so.1.3.070701
$ sudo ln -s /usr/lib/libGLU.so.1 /usr/lib/libGLU.so
~/NVIDIA_GPU_Computing_SDK/C/src/marchingCubes$ make
/usr/bin/ld: cannot find -lX11
collect2: ld returned 1 exit status
make[1]: *** [../../bin/linux/release/marchingCubes] Error 1
$ sudo ln -s /usr/lib/libX11.so.6 /usr/lib/libX11.so
$ sudo ln -s /usr/lib/libXi.so.6 /usr/lib/libXi.so
$ sudo ln -s /usr/lib/libXmu.so.6 /usr/lib/libXmu.so
/usr/bin/ld: cannot find -lglut
collect2: ld returned 1 exit status
make[1]: *** [../../bin/linux/release/marchingCubes] Error 1
$ sudo apt-get install libglut3
$ sudo ln -s /usr/lib/libglut.so.3 /usr/lib/libglut.so |
Version history
Prior to Visual Studio Version 4.0, there were Visual Basic 3, Visual C++, Visual FoxPro and Source Safe as separate products.
|
| Purpose of nvcc
This compilation trajectory involves several splitting, compilation, preprocessing, and merging steps for each CUDA source file, and several of these steps are subtly different for different modes of CUDA compilation (such as compilation for device emulation, or the generation of device code repositories). It is the purpose of the CUDA compiler driver nvcc to hide the intricate details of CUDA compilation from developers.
Additionally, instead of being a specific CUDA compilation driver, nvcc mimics the behavior of the GNU compiler gcc: it accepts a range of conventional compiler options, such as for defining macros and include/library paths, and for steering the compilation process.
All non-CUDA compilation steps are forwarded to a general purpose C compiler that is supported by nvcc, and on Windows platforms, where this compiler is an instance of the Microsoft Visual Studio compiler, nvcc will translate its options into appropriate ‘cl’ command syntax. This extended behavior plus ‘cl’ option translation is intended for support of portable application build and make scripts across Linux and Windows platforms. |
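As a small illustration of the gcc-style options described above, the sketch below lets a macro be overridden from the nvcc command line with `-D`, exactly as one would with gcc. The file name `scale.cu`, the `SCALE` macro, the kernel, and the `runScale` helper are all made up for this example; only `-D` and `-o` are real nvcc options.

```cuda
// scale.cu (assumed file name)
// Possible build lines, using options nvcc shares with gcc:
//   nvcc -o scale scale.cu              uses the default SCALE below
//   nvcc -DSCALE=3 -o scale scale.cu    overrides SCALE from the command line
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#ifndef SCALE
#define SCALE 2   // default when -DSCALE=... is not given
#endif

// The macro is visible in device code as well as host code.
__global__ void Scale(int* v, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= SCALE;
}

// Hypothetical helper: scales 0..n-1 on the device and checks the result on the host.
bool runScale(int n)
{
    size_t bytes = (size_t)n * sizeof(int);
    std::vector<int> h(n);
    for (int i = 0; i < n; ++i) h[i] = i;

    int* d = nullptr;
    bool ok = cudaMalloc(&d, bytes) == cudaSuccess;
    if (ok) {
        cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);
        int threads = 256;
        int blocks = (n + threads - 1) / threads;   // round up to cover all n elements
        Scale<<<blocks, threads>>>(d, n);
        ok = cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
        for (int i = 0; ok && i < n; ++i)
            ok = (h[i] == i * SCALE);
    }
    cudaFree(d);
    return ok;
}

int main()
{
    printf("SCALE=%d: %s\n", SCALE, runScale(1000) ? "ok" : "FAILED");
    return 0;
}
```

The same command line is meant to work on Windows, where, per the quoted text, nvcc translates the options into `cl` syntax for the Visual Studio compiler.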
| C:\> ConstantBandwidth.exe
Error: clCreateContextFromType failed. Error code : CL_DEVICE_NOT_FOUND |