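Hmm.. building it as a .cpp file strangely doesn't work. nvcc picks the language from the file extension, so tt.cpp goes to the host compiler as plain C++, where the CUDA built-ins (threadIdx, blockIdx, ...) and the <<<...>>> launch syntax are unknown. Renaming the file to .cu (or passing -x cu) takes care of those errors.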
| $ nvcc tt.cpp |
| tt.cpp: In function ‘void kernel_test(int*, int*, int*)’: |
| tt.cpp:14:12: error: ‘threadIdx’ was not declared in this scope |
|  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
|            ^~~~~~~~~ |
| tt.cpp:14:12: note: suggested alternative: ‘pthread_t’ |
|  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
|            ^~~~~~~~~ |
|            pthread_t |
| tt.cpp:14:25: error: ‘blockIdx’ was not declared in this scope |
|  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
|                         ^~~~~~~~ |
| tt.cpp:14:25: note: suggested alternative: ‘clock’ |
|  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
|                         ^~~~~~~~ |
|                         clock |
| tt.cpp:14:38: error: ‘blockDim’ was not declared in this scope |
|  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
|                                      ^~~~~~~~ |
| tt.cpp:14:38: note: suggested alternative: ‘clock’ |
|  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
|                                      ^~~~~~~~ |
|                                      clock |
| tt.cpp:14:52: error: ‘gridDim’ was not declared in this scope |
|  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y); |
|                                                    ^~~~~~~ |
| tt.cpp: At global scope: |
| tt.cpp:18:11: error: ‘::main’ must return ‘int’ |
| void main() |
|           ^ |
| tt.cpp: In function ‘int main()’: |
| tt.cpp:61:15: error: expected primary-expression before ‘<’ token |
|  kernel_test<<<block,thread>>>(dev_a,dev_b,dev_c); |
|               ^ |
| tt.cpp:61:30: error: expected primary-expression before ‘>’ token |
|  kernel_test<<<block,thread>>>(dev_a,dev_b,dev_c); |
|                              ^ |
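The expression on line 14 that the compiler trips over is a 2D-grid-to-1D index. As a rough host-side sketch of what it computes, the snippet below repeats the same arithmetic in plain C++ so it can be checked without a GPU. `Dim3`, `flat_index`, and `count_hits` are hypothetical names made up for this illustration and are not part of tt.cu or the CUDA API; the real kernel reads `threadIdx`, `blockIdx`, `blockDim`, and `gridDim` directly.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the x/y parts of CUDA's built-in index variables.
struct Dim3 { unsigned x, y; };

// Same arithmetic as line 14 of tt.cpp, with the built-ins passed in explicitly:
// column within the whole grid + grid width in threads * row within the whole grid.
inline unsigned flat_index(Dim3 tid, Dim3 bid, Dim3 bdim, Dim3 gdim) {
    unsigned col   = tid.x + bid.x * bdim.x;
    unsigned row   = bid.y * bdim.y + tid.y;
    unsigned width = gdim.x * bdim.x;
    return col + width * row;
}

// Walks every (block, thread) pair of a launch on the host and counts
// how many times each flat index is produced.
std::vector<int> count_hits(Dim3 gdim, Dim3 bdim) {
    const unsigned total = gdim.x * bdim.x * gdim.y * bdim.y;
    std::vector<int> hits(total, 0);
    for (unsigned by = 0; by < gdim.y; ++by)
        for (unsigned bx = 0; bx < gdim.x; ++bx)
            for (unsigned ty = 0; ty < bdim.y; ++ty)
                for (unsigned tx = 0; tx < bdim.x; ++tx) {
                    const unsigned idx = flat_index({tx, ty}, {bx, by}, bdim, gdim);
                    assert(idx < total);  // the index never leaves the array
                    ++hits[idx];
                }
    return hits;
}
```

For a made-up launch of 2x3 blocks with 4x2 threads each, `count_hits({2, 3}, {4, 2})` returns 48 counters that are all 1, i.e. every thread gets its own element.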
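Hmm.. so main has to return int in CUDA as well. That is a standard C++ rule: nvcc's own front end only warns about it, but the host g++ turns it into an error.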
| $ nvcc tt.cu |
| tt.cu(18): warning: return type of function "main" must be "int" |
| tt.cu(18): warning: return type of function "main" must be "int" |
| tt.cu:18:11: error: ‘::main’ must return ‘int’ |
| void main() |
|           ^ |
[Link: https://mangkyu.tistory.com/84]
Single core
| $ ./a.out |
| cpu Time : 0.206937 |
| gpu Time : 0.000106 |
Huh..? Why is the multicore run slower?!?!
| $ nvcc -Xcompiler -fopenmp tt.cu -o a.out.mp |
| jetson@nano-4gb-jp451:~$ ./a.out.mp |
| cpu Time : 0.231175 |
| gpu Time : 0.000088 |
[Link: https://forums.developer.nvidia.com/t/how-use-openmp-in-cu-file/2918/10]
2014.01.17 - [Programming/openCL & CUDA] - cuda + openmp example
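One possible reason the OpenMP build shows a larger "cpu Time": the post does not show how tt.cu measures it, but if it uses `clock()`, that function reports CPU time summed over all threads of the process on Linux, not elapsed wall-clock time, so a parallel loop can look no faster or even slower. Thread start-up cost on a loop this short could also play a part. The sketch below is only an assumption about what such a measurement might look like; `Timing`, `add_arrays`, and `time_add` are hypothetical names, and the loop body is a guess at a simple element-wise add. It records both clocks so the two can be compared.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <vector>

// Hypothetical result type: the same region measured with two different clocks.
struct Timing {
    double cpu_seconds;   // std::clock(): CPU time of the whole process
    double wall_seconds;  // steady_clock: elapsed real time
};

// Element-wise add. The pragma is only active when built with OpenMP
// (e.g. nvcc -Xcompiler -fopenmp, or g++ -fopenmp); otherwise the loop is serial.
void add_arrays(const std::vector<int>& a, const std::vector<int>& b,
                std::vector<int>& c) {
    assert(a.size() == c.size() && b.size() == c.size());
    const long n = static_cast<long>(c.size());
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// Times one call to add_arrays with both clocks.
Timing time_add(const std::vector<int>& a, const std::vector<int>& b,
                std::vector<int>& c) {
    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();
    add_arrays(a, b, c);
    const auto wall_end = std::chrono::steady_clock::now();
    const std::clock_t cpu_end = std::clock();
    return { static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC,
             std::chrono::duration<double>(wall_end - wall_start).count() };
}
```

If the wall-clock number drops with OpenMP enabled while the `clock()` number stays flat or grows, the slowdown in the output above would be a measurement artifact rather than a real one; `omp_get_wtime()` is the usual OpenMP alternative for wall-clock timing.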