Building with nvcc on Jetson Nano
구차니
2026. 4. 6. 21:16
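Hmm.. with a .cpp extension it strangely doesn't build. (nvcc picks the language from the file extension, so tt.cpp is handed to the host compiler as plain C++, where threadIdx, blockIdx and the <<<...>>> launch syntax don't exist. Renaming the file to tt.cu, or passing -x cu, makes nvcc treat it as CUDA.)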
| $ nvcc tt.cpp |
| tt.cpp: In function ‘void kernel_test(int*, int*, int*)’: |
| tt.cpp:14:12: error: ‘threadIdx’ was not declared in this scope |
| int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
| ^~~~~~~~~ |
| tt.cpp:14:12: note: suggested alternative: ‘pthread_t’ |
| int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
| ^~~~~~~~~ |
| pthread_t |
| tt.cpp:14:25: error: ‘blockIdx’ was not declared in this scope |
| int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
| ^~~~~~~~ |
| tt.cpp:14:25: note: suggested alternative: ‘clock’ |
| int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
| ^~~~~~~~ |
| clock |
| tt.cpp:14:38: error: ‘blockDim’ was not declared in this scope |
| int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
| ^~~~~~~~ |
| tt.cpp:14:38: note: suggested alternative: ‘clock’ |
| int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
| ^~~~~~~~ |
| clock |
| tt.cpp:14:52: error: ‘gridDim’ was not declared in this scope |
| int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
| ^~~~~~~ |
| tt.cpp: At global scope: |
| tt.cpp:18:11: error: ‘::main’ must return ‘int’ |
| void main() |
| ^ |
| tt.cpp: In function ‘int main()’: |
| tt.cpp:61:15: error: expected primary-expression before ‘<’ token |
| kernel_test<<<block,thread>>>(dev_a,dev_b,dev_c); |
| ^ |
| tt.cpp:61:30: error: expected primary-expression before ‘>’ token |
| kernel_test<<<block,thread>>>(dev_a,dev_b,dev_c); |
| ^ |
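Hmm.. so with CUDA, main has to return int. (Strictly speaking this is a C++ rule enforced by g++, the host compiler, rather than something CUDA-specific; the same error already appears in the tt.cpp build above.)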
| $ nvcc tt.cu |
| tt.cu(18): warning: return type of function "main" must be "int" |
| tt.cu(18): warning: return type of function "main" must be "int" |
| tt.cu:18:11: error: ‘::main’ must return ‘int’ |
| void main() |
| ^ |
[Link : https://mangkyu.tistory.com/84]
Single core
| $ ./a.out |
| cpu Time : 0.206937 |
| gpu Time : 0.000106 |
Huh..? Why is the multi-core run even slower?!?!
| $ nvcc -Xcompiler -fopenmp tt.cu -o a.out.mp |
| jetson@nano-4gb-jp451:~$ ./a.out.mp |
| cpu Time : 0.231175 |
| gpu Time : 0.000088 |
[Link : https://forums.developer.nvidia.com/t/how-use-openmp-in-cu-file/2918/10]
2014.01.17 - [Programming/openCL & CUDA] - cuda + openmp example