embeded/jetson · 2026. 4. 6. 21:16

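Hmm.. as a .cpp file it strangely doesn't build. nvcc decides how to treat a file from its suffix, so tt.cpp goes straight to the host C++ compiler, which knows neither the CUDA built-ins (threadIdx, blockIdx, ...) nor the <<<...>>> launch syntax. Renaming the file to .cu, or passing -x cu, makes nvcc compile it as CUDA.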

$ nvcc tt.cpp
tt.cpp: In function ‘void kernel_test(int*, int*, int*)’:
tt.cpp:14:12: error: ‘threadIdx’ was not declared in this scope
  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y);
            ^~~~~~~~~
tt.cpp:14:12: note: suggested alternative: ‘pthread_t’
  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y);
            ^~~~~~~~~
            pthread_t
tt.cpp:14:25: error: ‘blockIdx’ was not declared in this scope
  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y);
                         ^~~~~~~~
tt.cpp:14:25: note: suggested alternative: ‘clock’
  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y);
                         ^~~~~~~~
                         clock
tt.cpp:14:38: error: ‘blockDim’ was not declared in this scope
  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y);
                                      ^~~~~~~~
tt.cpp:14:38: note: suggested alternative: ‘clock’
  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y);
                                      ^~~~~~~~
                                      clock
tt.cpp:14:52: error: ‘gridDim’ was not declared in this scope
  int idx = threadIdx.x +blockIdx.x * blockDim.x + (gridDim.x * blockDim.x) * (blockIdx.y * blockDim.y + threadIdx.y);
                                                    ^~~~~~~
tt.cpp: At global scope:
tt.cpp:18:11: error: ‘::main’ must return ‘int’
 void main()
           ^
tt.cpp: In function ‘int main()’:
tt.cpp:61:15: error: expected primary-expression before ‘<’ token
  kernel_test<<<block,thread>>>(dev_a,dev_b,dev_c);
               ^
tt.cpp:61:30: error: expected primary-expression before ‘>’ token
  kernel_test<<<block,thread>>>(dev_a,dev_b,dev_c);
                              ^
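
For reference, the expression on line 14 of tt.cpp flattens a 2D grid of 2D blocks into one array offset: global x plus the grid width times global y. Below is a minimal host-side sketch of that arithmetic in plain C++, so it builds without nvcc. `Dim2`, `flat_index` and `index_hits` are hypothetical names made up for this illustration, and the CUDA built-ins are passed in as ordinary parameters.

```cpp
#include <cassert>
#include <vector>

// Host-side stand-in for the x/y fields of CUDA's built-in index types (hypothetical).
struct Dim2 {
    int x;
    int y;
};

// Same arithmetic as line 14 of tt.cpp, with the built-ins passed in explicitly.
int flat_index(Dim2 thread_idx, Dim2 block_idx, Dim2 block_dim, Dim2 grid_dim) {
    int width = grid_dim.x * block_dim.x;                 // threads per row of the whole grid
    int col = thread_idx.x + block_idx.x * block_dim.x;   // global x coordinate
    int row = block_idx.y * block_dim.y + thread_idx.y;   // global y coordinate
    return col + width * row;
}

// Visits every (block, thread) pair of a launch and counts how often each index is produced.
std::vector<int> index_hits(Dim2 grid_dim, Dim2 block_dim) {
    assert(grid_dim.x > 0 && grid_dim.y > 0 && block_dim.x > 0 && block_dim.y > 0);
    int total = grid_dim.x * grid_dim.y * block_dim.x * block_dim.y;
    std::vector<int> hits(total, 0);
    for (int by = 0; by < grid_dim.y; ++by) {
        for (int bx = 0; bx < grid_dim.x; ++bx) {
            for (int ty = 0; ty < block_dim.y; ++ty) {
                for (int tx = 0; tx < block_dim.x; ++tx) {
                    int idx = flat_index({tx, ty}, {bx, by}, block_dim, grid_dim);
                    assert(idx >= 0 && idx < total);  // never outside the array
                    ++hits[idx];
                }
            }
        }
    }
    return hits;
}
```

Calling `index_hits` with any positive grid and block size should return a vector where every entry is 1, meaning each array element is touched by exactly one thread.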

 

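Hmm.. so main has to return int. That is a standard C++ rule rather than a CUDA one; the host compiler is what rejects void main().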

$ nvcc tt.cu
tt.cu(18): warning: return type of function "main" must be "int"

tt.cu(18): warning: return type of function "main" must be "int"

tt.cu:18:11: error: ‘::main’ must return ‘int’
 void main()
           ^

[Link : https://mangkyu.tistory.com/84]

 

Single core

$ ./a.out
cpu Time : 0.206937
gpu Time : 0.000106

 

Huh..? Why is running it on multiple cores slower?!?!

$ nvcc  -Xcompiler -fopenmp tt.cu -o a.out.mp
jetson@nano-4gb-jp451:~$ ./a.out.mp
cpu Time : 0.231175
gpu Time : 0.000088

[Link : https://forums.developer.nvidia.com/t/how-use-openmp-in-cu-file/2918/10]
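
The source of tt.cu is not shown here, so the real cause of the slower multi-core number is unknown. One thing worth ruling out, assuming the CPU time is taken with `clock()`, is the timer itself: on Linux `clock()` reports processor time summed over all threads of the process, so it does not shrink when the work is spread across cores. The sketch below times the same loop both ways. `Timing` and `time_vector_add` are hypothetical names, and the loop is only a stand-in for whatever tt.cu does on the CPU.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <ctime>
#include <vector>

struct Timing {
    double cpu_seconds;   // std::clock(): processor time used by the whole process
    double wall_seconds;  // steady_clock: elapsed real time
    long long sum;        // checksum of the result vector
};

Timing time_vector_add(std::size_t n) {
    std::vector<int> a(n, 1), b(n, 2), c(n, 0);
    const long long count = static_cast<long long>(n);

    std::clock_t cpu_start = std::clock();
    auto wall_start = std::chrono::steady_clock::now();

    // Runs in parallel only when built with -fopenmp; otherwise the pragma is ignored.
    #pragma omp parallel for
    for (long long i = 0; i < count; ++i) {
        c[i] = a[i] + b[i];
    }

    auto wall_end = std::chrono::steady_clock::now();
    std::clock_t cpu_end = std::clock();

    long long sum = 0;
    for (int v : c) {
        sum += v;
    }
    assert(sum == 3 * count);  // every element must be 1 + 2

    double cpu = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    double wall = std::chrono::duration<double>(wall_end - wall_start).count();
    return Timing{cpu, wall, sum};
}
```

If the wall-clock figure drops with -fopenmp while the `clock()` figure stays flat or grows, the slowdown may just be a measurement artifact. If both grow, thread start-up cost on a short, memory-bound loop is a more likely suspect.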

 

2014.01.17 - [Programming/openCL & CUDA] - cuda + openmp application example

 

 

Posted by 구차니