프로그램 사용/gcc2022. 6. 2. 14:47

아래 에러들은 SIMD 명령으로 변환하는데 실패한 녀석들인것 같은데

아래와 같은 유형들이 에러로 발생했다.

 

반복문이 중첩되거나, 반복문 내에서 조건문이 있으면 안되는 것 같고

tt.c:180:3: note: ===== analyze_loop_nest =====
tt.c:180:3: note: === vect_analyze_loop_form ===
tt.c:180:3: note: not vectorized: control flow in loop.
tt.c:180:3: note: bad loop form.


tt.c:61:3: note: ===== analyze_loop_nest =====
tt.c:61:3: note: === vect_analyze_loop_form ===
tt.c:61:3: note: not vectorized: multiple nested loops.
tt.c:61:3: note: bad loop form.

 

아래부터는 어떤 에러인지 감이 안오는 녀석들..

지원하지 않는 패턴

tt.c:83:7: note: Unsupported pattern.
tt.c:83:7: note: not vectorized: unsupported use in stmt.
tt.c:83:7: note: unexpected pattern.

 

지원되지 않는 데이터 타입. 코드를 보니 for문의 비교문에

함수 포인터를 통한 참조(->) 로 보려고 할때는 타입을 추적 못하는 듯?

tt.c:107:5: note: not vectorized: unsupported data-type
tt.c:107:5: note: can't determine vectorization factor.

 

no grouped store가 어떤건지 모르겠다.

val = data[];

out = data / 255;

이런식으로 단순화 가능한 코드인데 배열과 포인터로 배열 인자가 선형으로 분석될수 없기 때문에 그런걸지도?

tt.c:106:3: note: not vectorized: no grouped stores in basic block.
tt.c:106:3: note: ===vect_slp_analyze_bb===
tt.c:106:3: note: ===vect_slp_analyze_bb===
tt.c:108:32: note: === vect_analyze_data_refs ===
tt.c:108:32: note: not vectorized: not enough data-refs in basic block.

 

모르겠고..

tt.c:228:3: note: not vectorized: data ref analysis failed _47 = *_46;
tt.c:228:3: note: bad data references.

 

모르겠다!!!

tt.c:238:5: note: not vectorized: not suitable for gather load _47 = *_46;
tt.c:238:5: note: bad data references.

 

 

아무튼 AVX로도 변환이 안되는데 .. NEON으로 최적화 될만한 코드는 더더욱 아닐 것 같네.

'프로그램 사용 > gcc' 카테고리의 다른 글

gcc / 문자열 선언  (0) 2022.03.17
static link  (0) 2022.02.07
구조체 타입과 변수명은 구분된다?  (0) 2021.11.18
gcc unsigned to signed upcast 테스트  (0) 2021.07.08
gcc vectorized loop  (0) 2021.06.30
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2022. 3. 17. 12:05

weston 소스를 보는데 희한한(?) 문자열 선언이 보여서 확인

static const char * const connector_type_names[] = {
[DRM_MODE_CONNECTOR_Unknown]     = "Unknown",
[DRM_MODE_CONNECTOR_VGA]         = "VGA",
[DRM_MODE_CONNECTOR_DVII]        = "DVI-I",
[DRM_MODE_CONNECTOR_DVID]        = "DVI-D",
[DRM_MODE_CONNECTOR_DVIA]        = "DVI-A",
[DRM_MODE_CONNECTOR_Composite]   = "Composite",
[DRM_MODE_CONNECTOR_SVIDEO]      = "SVIDEO",
[DRM_MODE_CONNECTOR_LVDS]        = "LVDS",
[DRM_MODE_CONNECTOR_Component]   = "Component",
[DRM_MODE_CONNECTOR_9PinDIN]     = "DIN",
[DRM_MODE_CONNECTOR_DisplayPort] = "DP",
[DRM_MODE_CONNECTOR_HDMIA]       = "HDMI-A",
[DRM_MODE_CONNECTOR_HDMIB]       = "HDMI-B",
[DRM_MODE_CONNECTOR_TV]          = "TV",
[DRM_MODE_CONNECTOR_eDP]         = "eDP",
[DRM_MODE_CONNECTOR_VIRTUAL]     = "Virtual",
[DRM_MODE_CONNECTOR_DSI]         = "DSI",
[DRM_MODE_CONNECTOR_DPI]         = "DPI",
};

 

느낌은 알겠는데.. 도대체 어디서 정의된 문법이냐...

$ cat str.c
#include <stdio.h>

static const char * const connector_type_names[] = {
        [0]     = "Unknown",
        [1]         = "VGA",
        [2]        = "DVI-I",
        [3]        = "DVI-D",
        [4]        = "DVI-A",
        [5]   = "Composite",
        [6]      = "SVIDEO",
        [7]        = "LVDS",
        [8]   = "Component",
        [9]     = "DIN",
        [10] = "DP",
        [11]       = "HDMI-A",
        [12]       = "HDMI-B",
        [13]          = "TV",
        [14]         = "eDP",
        [15]     = "Virtual",
        [16]         = "DSI",
        [17]         = "DPI",
};

void main()
{
        for(int i = 0; i < 10; i++)
                printf("%s\n",connector_type_names[i]);
}

$ gcc str.c
$ ./a.out
Unknown
VGA
DVI-I
DVI-D
DVI-A
Composite
SVIDEO
LVDS
Component
DIN

 

 

$ cat str.c
#include <stdio.h>

static const char * const connector_type_names[] = {
        [2]     = "Unknown",
        [1]         = "VGA",
        [0]        = "DVI-I",
        [3]        = "DVI-D",
        [4]        = "DVI-A",
        [5]   = "Composite",
        [6]      = "SVIDEO",
        [7]        = "LVDS",
        [8]   = "Component",
        [9]     = "DIN",
        [10] = "DP",
        [11]       = "HDMI-A",
        [12]       = "HDMI-B",
        [13]          = "TV",
        [14]         = "eDP",
        [15]     = "Virtual",
        [16]         = "DSI",
        [17]         = "DPI",
};

void main()
{
        for(int i = 0; i < 10; i++)
                printf("%s\n",connector_type_names[i]);
}

$ gcc str.c
$ ./a.out
DVI-I
VGA
Unknown
DVI-D
DVI-A
Composite
SVIDEO
LVDS
Component
DIN

 

 

+

ISO C99, GNU C90 에서 지원하는 듯.

In ISO C99 you can give the elements in any order, specifying the array indices or structure field names they apply to, and GNU C allows this as an extension in C90 mode as well. This extension is not implemented in GNU C++.
To specify an array index, write ‘[index] =’ before the element value. For example,

int a[6] = { [4] = 29, [2] = 15 };

[링크 : https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html]

 

$ gcc -std=c89 str.c
str.c: In function ‘main’:
str.c:26:2: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
  for(int i = 0; i < 10; i++)
  ^~~
str.c:26:2: note: use option -std=c99, -std=gnu99, -std=c11 or -std=gnu11 to compile your code

$ gcc -std=c90 str.c
str.c: In function ‘main’:
str.c:26:2: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
  for(int i = 0; i < 10; i++)
  ^~~
str.c:26:2: note: use option -std=c99, -std=gnu99, -std=c11 or -std=gnu11 to compile your code

$ gcc -std=c99 str.c

'프로그램 사용 > gcc' 카테고리의 다른 글

gcc vectorization 실패  (0) 2022.06.02
static link  (0) 2022.02.07
구조체 타입과 변수명은 구분된다?  (0) 2021.11.18
gcc unsigned to signed upcast 테스트  (0) 2021.07.08
gcc vectorized loop  (0) 2021.06.30
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2022. 2. 7. 16:07

전부 정적으로 묶고 일부만 동적으로 묶는거나

동적으로 묶을건 냅두고 필요한(나머지) 것을 정적으로 묶는거나 그게 그건가?

 

-static-libgcc 한다고 해서 glibc 버전 안 맞다고 발생하는 에러를 해결할 순 없다

[링크 : https://stackoverflow.com/questions/26304531]

 

[링크 : https://kldp.org/node/136157]

  [링크 : https://enst.tistory.com/entry/liblibcso6-version-GLIBC27-not-found]

 

 

[링크 : https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html]

  [링크 : https://www.codeproject.com/Questions/1241890/How-to-link-to-libc-statically]

 

-lm 버전 문제 있으면 걍 해당 라이브러리를 static link 해버리면 되니까!

$ g++ -std=c++11 -o classify classify.cc -I/home/pi/work/coral/libedgetpu/tflite/public -I/home/pi/work/coral/tensorflow -I/home/pi/work/coral/tensorflow/tensorflow/lite/tools/make/downloads/flatbuffers/include -L/home/pi/work/coral/tensorflow/tensorflow/lite/tools/make/gen/rpi_armv7l/lib -L/home/pi/work/coral/pycoral/libedgetpu_bin/throttled/armv7a -ltensorflow-lite -static-libgcc -l:libedgetpu.so.1.0 -lpthread -ldl /usr/lib/arm-linux-gnueabihf/libm.a

[링크 : http://www.iamroot.org/xe/index.php?mid=Programming&document_srl=13406]

'프로그램 사용 > gcc' 카테고리의 다른 글

gcc vectorization 실패  (0) 2022.06.02
gcc / 문자열 선언  (0) 2022.03.17
구조체 타입과 변수명은 구분된다?  (0) 2021.11.18
gcc unsigned to signed upcast 테스트  (0) 2021.07.08
gcc vectorized loop  (0) 2021.06.30
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2021. 11. 18. 12:46

gcc에서 해보는데 헐.. 이게 되네?

변수명과 구조체 타입명은 다른 영역으로 구분되나?

identifier로 동일하게 취급될줄 알았는데 아니라니?

 

$ cat t.c
#include <stdio.h>

struct help { int a; int b; int c;};

struct help a;
int help = 1;

void main()
{
        a.a = 5;
        a.b = 7;
        a.c = 9;
        printf("%d %d %d %d\n",help, a.a, a.b, a.c);
}

 

$ gcc t.c
$ ./a.out
1 5 7 9

 

[링크 : https://cpp.hotexamples.com/site/file?hash=0x3b8107de9c96fe7f1b348af2ed3b6ff15fb2cb740a65ea0a91d33bcab75b25ae&fullName=wheatley-master/wlegl_handle.c&project=jekstrand/wheatley]

 

다만, 함수와 변수명의 식별자 영역은 동일한지 에러가 발생한다.

$ cat t.c
#include <stdio.h>

void help()
{
        printf("%s\n",__func__);
}

struct help { int a; int b; int c;};

struct help a;
int help = 1;

void main()
{
        a.a = 5;
        a.b = 7;
        a.c = 9;
        printf("%d %d %d %d\n",help, a.a, a.b, a.c);
        help();
}

 

$ gcc t.c
t.c:11:5: error: ‘help’ redeclared as different kind of symbol
 int help = 1;
     ^~~~
t.c:3:6: note: previous definition of ‘help’ was here
 void help()
      ^~~~
t.c: In function ‘main’:
t.c:19:2: error: called object ‘help’ is not a function or function pointer
  help();
  ^~~~
t.c:11:5: note: declared here
 int help = 1;
     ^~~~

'프로그램 사용 > gcc' 카테고리의 다른 글

gcc / 문자열 선언  (0) 2022.03.17
static link  (0) 2022.02.07
gcc unsigned to signed upcast 테스트  (0) 2021.07.08
gcc vectorized loop  (0) 2021.06.30
gcc unsigned to signed cast  (0) 2021.06.22
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2021. 7. 8. 10:40

귀찮으니 날로먹는 코딩으로 테스트

$ cat 1.c
#include <stdio.h>

void main()
{
        unsigned char a = 0xFF;
        char b =a;
        short c = a;
        short d = (char)a;
        short e = (int)a;

        int f = a;
        int g = (int)a;
        int h = (int)(char)a;

        printf("a %d\n",a);
        printf("a %d\n",(int)a);
        printf("a %d\n",(char)a);
        printf("b %d\n",b);
        printf("c %d\n",c);
        printf("d %d\n",d);
        printf("e %d\n",e);
        printf("f %d\n",f);
        printf("g %d\n",g);
        printf("h %d\n",h);
}

 

컴파일러 버전과 아키텍쳐, 그리고 결과인데... 머냐..?!?!

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.3.0-17ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
$ arm-linux-gnueabihf-gcc -v
Using built-in specs.
COLLECT_GCC=arm-linux-gnueabihf-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc-cross/arm-linux-gnueabihf/7/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-multiarch --enable-multilib --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --disable-werror --enable-multilib --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=arm-linux-gnueabihf --program-prefix=arm-linux-gnueabihf- --includedir=/usr/arm-linux-gnueabihf/include
Thread model: posix
gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
$ ./1
a 255
a 255
a -1
b -1
c 255
d -1
e 255
f 255
g 255
h -1
# /mnt/1
a 255
a 255
a 255
b 255
c 255
d 255
e 255
f 255
g 255

 

라즈베리 파이 4에서 시도. arm 아키텍쳐용 컴파일러의 특성인가?

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/8/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 8.3.0-6' --with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-8 --program-prefix=aarch64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --disable-libphobos --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu
Thread model: posix
gcc version 8.3.0 (Debian 8.3.0-6)
$ ./1
a 255
a 255
a 255
b 255
c 255
d 255
e 255
f 255
g 255
h 255

 

+

두개중에 하나만 주면 되는진 모르겠지만, 둘 다 주거나 -fsigned-char 만 주어도 결과가 -1로 나오긴 한다.

원문으로 보니 --signed-chars는 RVCT 컴파일러를 위한 옵션인 듯.

The ANSI C standard specifies a range for both signed (at least -127 to +127) and unsigned (at least 0 to 255) chars. Simple chars are not specifically defined and it is compiler dependent whether they are signed or unsigned. Although the ARM architecture has the LDRSB instruction, that loads a signed byte into a 32-bit register with sign extension, the earliest versions of the architecture did not. It made sense at the time for the compiler to treat simple chars as unsigned, whereas on the x86 simple chars are, by default, treated as signed.
One workaround for users of GCC is to use the -fsigned-char command line switch or --signed-chars for RVCT, that forces all chars to become signed, but a better practice is to write portable code by declaring char variables appropriately. Unsigned char must be used for accessing memory as a block of bytes or for small unsigned integers. Signed char must be used for small signed integers and simple char must be used only for ASCII characters and strings. In fact, on an ARM core, it is usually better to use ints rather than chars, even for small values, for performance reasons. You can read more on this in Optimizing Code to Run on ARM Processors.

[링크 : https://developer.arm.com/.../Miscellaneous-C-porting-issues/unsigned-char-and-signed-char]

 

LDRSB (Thumb*) - Load Register Signed Byte

[링크 : http://qcd.phys.cmu.edu/QCDcluster/intel/vtune/reference/LDRSB_(Thumb).htm]

 

LDRB - Load Register Byte

[링크 : http://qcd.phys.cmu.edu/QCDcluster/intel/vtune/reference/INST_LDRB.htm]

 

char -> signed char: -fsigned-char == -fno-unsigned-char
char -> unsigned char: -funsigned-char == -fno-signed-char

[링크 : https://jooojub.github.io/gcc-options-fsigned-char/]

 

-fsigned-char
Let the type char be signed, like signed char.
Note that this is equivalent to -fno-unsigned-char, which is the negative form of -funsigned-char. Likewise, the option -fno-signed-char is equivalent to -funsigned-char.

-funsigned-char
Let the type char be unsigned, like unsigned char.
Each kind of machine has a default for what char should be. It is either like unsigned char by default or like signed char by default.
Ideally, a portable program should always use signed char or unsigned char when it depends on the signedness of an object. But many programs have been written to use plain char and expect it to be signed, or expect it to be unsigned, depending on the machines they were written for. This option, and its inverse, let you make such a program work with the opposite default.
The type char is always a distinct type from each of signed char or unsigned char, even though its behavior is always just like one of those two.

[링크 : https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html]

 

+

2021.07.09

으잉? singed char로 하면 되긴 한다. char가 signed 아니었어?!

'프로그램 사용 > gcc' 카테고리의 다른 글

static link  (0) 2022.02.07
구조체 타입과 변수명은 구분된다?  (0) 2021.11.18
gcc vectorized loop  (0) 2021.06.30
gcc unsigned to signed cast  (0) 2021.06.22
gcc %p (nil)  (0) 2021.05.07
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2021. 6. 30. 11:43

-O3 하면 자동으로 -ftree-vectorize가 추가되었다고.

아무튼 연산만 하고 출력을 안하니 사용하지 않는 코드로 해서 vadd가 안나와서 한참을 헤맸네..

 

$ g++ -O3 -mavx autovector.cpp -fopt-info-vec-all
autovector.cpp:22:22: missed: couldn't vectorize loop
autovector.cpp:25:19: missed: not vectorized: complicated access pattern.
autovector.cpp:23:21: missed: couldn't vectorize loop
autovector.cpp:25:14: missed: not vectorized: complicated access pattern.
autovector.cpp:16:23: optimized: loop vectorized using 32 byte vectors
autovector.cpp:10:5: note: vectorized 1 loops in function.
autovector.cpp:15:43: missed: statement clobbers memory: now = std::chrono::_V2::system_clock::now ();
autovector.cpp:27:77: missed: statement clobbers memory: D.189348 = std::chrono::_V2::system_clock::now ();
autovector.cpp:28:2: missed: statement clobbers memory: __assert_fail ("result[2] == ( 2.0f + 0.1335f)+( 1.50f*2.0f + 0.9383f)-(0.33f*2.0f+0.1172f)+3*(float)(noTests-1)", "autovector.cpp", 28, "int main()");
/usr/include/c++/9/ostream:570:18: missed: statement clobbers memory: std::__ostream_insert<char, std::char_traits<char> > (&cout, "CG> message -channel \"exercise results\" Time used: ", 51);
/usr/include/c++/9/ostream:221:29: missed: statement clobbers memory: _46 = std::basic_ostream<char>::_M_insert<double> (&cout, _42);
/usr/include/c++/9/ostream:570:18: missed: statement clobbers memory: std::__ostream_insert<char, std::char_traits<char> > (_46, "s, N * noTests=", 15);
autovector.cpp:29:112: missed: statement clobbers memory: _35 = std::basic_ostream<char>::operator<< (_46, 2000000000);
/usr/include/c++/9/ostream:113:13: missed: statement clobbers memory: std::endl<char, std::char_traits<char> > (_35);
/usr/include/c++/9/iostream:74:25: missed: statement clobbers memory: std::ios_base::Init::Init (&__ioinit);
/usr/include/c++/9/iostream:74:25: missed: statement clobbers memory: __cxa_atexit (__dt_comp , &__ioinit, &__dso_handle);
$ gcc -mcpu=native -march=native -Q --help=target
The following options are target specific:
  -mabi=                                aapcs-linux
  -mabort-on-noreturn                   [disabled]
  -mandroid                             [disabled]
  -mapcs                                [disabled]
  -mapcs-frame                          [disabled]
  -mapcs-reentrant                      [disabled]
  -mapcs-stack-check                    [disabled]
  -march=                               armv7ve+vfpv3-d16
  -marm                                 [enabled]
  -masm-syntax-unified                  [disabled]
  -mbe32                                [enabled]
  -mbe8                                 [disabled]
  -mbig-endian                          [disabled]
  -mbionic                              [disabled]
  -mbranch-cost=                        -1
  -mcallee-super-interworking           [disabled]
  -mcaller-super-interworking           [disabled]
  -mcmse                                [disabled]
  -mcpu=                                cortex-a7
  -mfix-cortex-m3-ldrd                  [disabled]
  -mflip-thumb                          [disabled]
  -mfloat-abi=                          hard
  -mfp16-format=                        none
  -mfpu=                                vfp
  -mglibc                               [enabled]
  -mhard-float
  -mlittle-endian                       [enabled]
  -mlong-calls                          [disabled]
  -mmusl                                [disabled]
  -mneon-for-64bits                     [disabled]
  -mpic-data-is-text-relative           [enabled]
  -mpic-register=
  -mpoke-function-name                  [disabled]
  -mprint-tune-info                     [disabled]
  -mpure-code                           [disabled]
  -mrestrict-it                         [disabled]
  -msched-prolog                        [enabled]
  -msingle-pic-base                     [disabled]
  -mslow-flash-data                     [disabled]
  -msoft-float
  -mstructure-size-boundary=            8
  -mthumb                               [disabled]
  -mthumb-interwork                     [disabled]
  -mtls-dialect=                        gnu
  -mtp=                                 cp15
  -mtpcs-frame                          [disabled]
  -mtpcs-leaf-frame                     [disabled]
  -mtune=
  -muclibc                              [disabled]
  -munaligned-access                    [enabled]
  -mvectorize-with-neon-double          [disabled]
  -mvectorize-with-neon-quad            [enabled]
  -mword-relocations                    [disabled]

  Known ARM ABIs (for use with the -mabi= option):
    aapcs aapcs-linux apcs-gnu atpcs iwmmxt

  Known __fp16 formats (for use with the -mfp16-format= option):
    alternative ieee none

  Known ARM FPUs (for use with the -mfpu= option):
    auto crypto-neon-fp-armv8 fp-armv8 fpv4-sp-d16 fpv5-d16 fpv5-sp-d16 neon neon-fp-armv8 neon-fp16 neon-vfpv3 neon-vfpv4 vfp vfp3 vfpv2 vfpv3 vfpv3-d16 vfpv3-d16-fp16 vfpv3-fp16 vfpv3xd
    vfpv3xd-fp16 vfpv4 vfpv4-d16

  Valid arguments to -mtp=:
    auto cp15 soft

  Known floating-point ABIs (for use with the -mfloat-abi= option):
    hard soft softfp

  TLS dialect to use:
    gnu gnu2


[링크 : https://www.raspberrypi.org/forums/viewtopic.php?t=155461]

[링크 : https://www.codingame.com/playgrounds/283/sse-avx-vectorization/autovectorization]

 

+

$ cat neon.c
#include <stdio.h>

void main()
{
        int a[256];
        int b[256];
        int c[256];

        int i;
        for(i = 0; i < 256; i++)
        {
                a[i] = b[i] + c[i];
        }

        printf("%d %d %d\n", a[0], b[0], c[0]);
}

 

$ gcc -O3 neon.c -mfpu=neon
$ objdump -d a.out  | grep v
   10320:       e1a01000        mov     r1, r0
   10328:       e1a0300d        mov     r3, sp
   1032c:       f4610add        vld1.64 {d16-d17}, [r1 :64]!
   10330:       f4622add        vld1.64 {d18-d19}, [r2 :64]!
   10334:       f26008e2        vadd.i32        q8, q8, q9
   10338:       f4430add        vst1.64 {d16-d17}, [r3 :64]!
   10360:       e3a0b000        mov     fp, #0
   10364:       e3a0e000        mov     lr, #0
   1036c:       e1a0200d        mov     r2, sp
   1043c:       e3a03001        mov     r3, #1
   10454:       e1a07000        mov     r7, r0
   1046c:       e1a08001        mov     r8, r1
   10470:       e1a09002        mov     r9, r2
   10480:       e3a04000        mov     r4, #0
   1048c:       e1a02009        mov     r2, r9
   10490:       e1a01008        mov     r1, r8
   10494:       e1a00007        mov     r0, r7

 

$ gcc neon.c -mfpu=neon
$ objdump -d a.out  | grep v
   10318:       e3a0b000        mov     fp, #0
   1031c:       e3a0e000        mov     lr, #0
   10324:       e1a0200d        mov     r2, sp
   103f4:       e3a03001        mov     r3, #1
   10418:       e3a03000        mov     r3, #0
   10490:       e1a00000        nop                     ; (mov r0, r0)
   104a4:       e1a07000        mov     r7, r0
   104bc:       e1a08001        mov     r8, r1
   104c0:       e1a09002        mov     r9, r2
   104d0:       e3a04000        mov     r4, #0
   104dc:       e1a02009        mov     r2, r9
   104e0:       e1a01008        mov     r1, r8
   104e4:       e1a00007        mov     r0, r7

 

-fopt-info-vec-all 추가. -all 때문인지 어마어마하게 나오네

-fopt-info-vec 으로만 하니 깔끔하게 vectorized 라고 뜬다.

$ gcc neon.c -mfpu=neon -fopt-info-vec -O3
neon.c:10:2: note: loop vectorized

'프로그램 사용 > gcc' 카테고리의 다른 글

구조체 타입과 변수명은 구분된다?  (0) 2021.11.18
gcc unsigned to signed upcast 테스트  (0) 2021.07.08
gcc unsigned to signed cast  (0) 2021.06.22
gcc %p (nil)  (0) 2021.05.07
gcc -D 옵션 인자를 printf로 출력하기  (0) 2021.04.08
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2021. 6. 22. 19:15

예전에는 문제없이 unsigned char에 대한 int 형으로의 캐스팅이 문제없이 되었던 것 같은데 안되서

C99나 C11의 영향인지 조금 더 찾아보는 중.

 

[링크 : https://gcc.gnu.org/onlinedocs/gcc/Characters-implementation.html#Characters-implementationl]

[링크 : https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html]

[링크 : https://gcc.gnu.org/wiki/NewWconversion]

 

'프로그램 사용 > gcc' 카테고리의 다른 글

gcc unsigned to signed upcast 테스트  (0) 2021.07.08
gcc vectorized loop  (0) 2021.06.30
gcc %p (nil)  (0) 2021.05.07
gcc -D 옵션 인자를 printf로 출력하기  (0) 2021.04.08
Auto-vectorization in GCC  (0) 2021.03.25
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2021. 5. 7. 17:30

%p로 출력하니 0일 경우 (nil)로 출력되네?

 

[링크 : https://stackoverflow.com/questions/4744650/nil-pointer-in-c-c]

'프로그램 사용 > gcc' 카테고리의 다른 글

gcc vectorized loop  (0) 2021.06.30
gcc unsigned to signed cast  (0) 2021.06.22
gcc -D 옵션 인자를 printf로 출력하기  (0) 2021.04.08
Auto-vectorization in GCC  (0) 2021.03.25
gcc -march 옵션  (0) 2021.01.24
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2021. 4. 8. 12:02

와.. 먼가 쓸데없는 뻘짓 ㅋㅋㅋ

 

makefile 에서 \" \" 로 감싸주니 의외로 간단하게 해결

#CROSS_PREFIX	= arm-buildroot-linux-gnueabihf
CFLAGS			= $(INCLUDEDIRS) -o -W -Wall -O2 -DCROSS_PREFIX=\"$(CROSS_PREFIX)\"

 

테스트 코드

void main()
{
    printf("%s\n", CROSS_PREFIX);
}

 

'프로그램 사용 > gcc' 카테고리의 다른 글

gcc unsigned to signed cast  (0) 2021.06.22
gcc %p (nil)  (0) 2021.05.07
Auto-vectorization in GCC  (0) 2021.03.25
gcc -march 옵션  (0) 2021.01.24
g++ 은 정적 빌드가 안되나?  (0) 2021.01.19
Posted by 구차니

댓글을 달아 주세요

프로그램 사용/gcc2021. 3. 25. 16:53

어떤게 변환되는지도 확인이 가능한 듯.

-ftree-vectorizer-verbose=[n] controls vectorization reports, with n ranging from 0 (no information reported) to 6 (all information reported).

 

25가지 예가 있는데 for a = b + c 같은건 된다고 한다.

Example 12: Induction:
Example 13: Outer-loop:
Example 14: Double reduction:
Example 15: Condition in nested loop:
Example 16: Load permutation in loop-aware SLP:
Example 17: Basic block SLP:
Example 18: Simple reduction in SLP:
Example 19: Reduction chain in SLP:
Example 20: Basic block SLP with multiple types, loads with different offsets, misaligned load, and not-affine Example 21: Backward access:
Example 22: Alignment hints:
Example 23: Widening shift:
Example 24: Condition with mixed types:
Example 25: Loop with bool:

[링크 : https://gcc.gnu.org/projects/tree-ssa/vectorization.html#using]

'프로그램 사용 > gcc' 카테고리의 다른 글

gcc %p (nil)  (0) 2021.05.07
gcc -D 옵션 인자를 printf로 출력하기  (0) 2021.04.08
gcc -march 옵션  (0) 2021.01.24
g++ 은 정적 빌드가 안되나?  (0) 2021.01.19
gcc offloading support  (0) 2020.11.24
Posted by 구차니

댓글을 달아 주세요