How to view the history together with the list of changed files
This time I'm going to try it, fully prepared to lose my work(!) -_ㅠ
git stash [push]
git stash list
git stash apply | pop
SYNOPSIS
git stash list [<options>]
git stash show [<stash>]
git stash drop [-q|--quiet] [<stash>]
git stash ( pop | apply ) [--index] [-q|--quiet] [<stash>]
git stash branch <branchname> [<stash>]
git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
           [-u|--include-untracked] [-a|--all] [-m|--message <message>]
           [--] [<pathspec>...]]
git stash clear
git stash create [<message>]
git stash store [-m|--message <message>] [-q|--quiet] <commit>
[Link: https://gmlwjd9405.github.io/2018/05/18/git-stash.html]
After tracing the variables, it looks like they all end up being the same thing?
int output = interpreter->outputs()[0];
TfLiteIntArray* output_dims = interpreter->tensor(output)->dims;
// assume output dims to be something like (1, 1, ... ,size)
auto output_size = output_dims->data[output_dims->size - 1];
const float* detection_locations = interpreter->tensor(interpreter->outputs()[0])->data.f;
const float* detection_classes = interpreter->tensor(interpreter->outputs()[1])->data.f;
const float* detection_scores = interpreter->tensor(interpreter->outputs()[2])->data.f;
const int num_detections = *interpreter->tensor(interpreter->outputs()[3])->data.f;
//there are ALWAYS 10 detections no matter how many objects are detectable
//cout << "number of detections: " << num_detections << "\n";
const float confidence_threshold = 0.5;
for(int i = 0; i < num_detections; i++){
    if(detection_scores[i] > confidence_threshold){
        int det_index = (int)detection_classes[i]+1;
        float y1=detection_locations[4*i  ]*cam_height;
        float x1=detection_locations[4*i+1]*cam_width;
        float y2=detection_locations[4*i+2]*cam_height;
        float x2=detection_locations[4*i+3]*cam_width;
        Rect rec((int)x1, (int)y1, (int)(x2 - x1), (int)(y2 - y1));
        rectangle(src,rec, Scalar(0, 0, 255), 1, 8, 0);
        putText(src, format("%s", Labels[det_index].c_str()), Point(x1, y1-5) ,FONT_HERSHEY_SIMPLEX,0.5, Scalar(0, 0, 255), 1, 8, 0);
    }
}
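The box arithmetic in the loop above can be pulled out into a small helper. This is only a sketch: `PixelBox` and `ToPixelBox` are hypothetical names that do not exist in the original code, and the layout `[ymin, xmin, ymax, xmax]` with values normalized to 0..1 is an assumption read off the indexing `4*i .. 4*i+3` used in the loop.

```cpp
#include <cassert>

// Pixel-space rectangle (top-left corner plus size), similar in spirit to cv::Rect.
// Hypothetical helper type, not part of TensorFlow Lite or OpenCV.
struct PixelBox {
    int x;
    int y;
    int width;
    int height;
};

// Converts detection i from normalized [ymin, xmin, ymax, xmax] values
// (layout assumed from the loop above) into a pixel rectangle.
PixelBox ToPixelBox(const float* locations, int i, int frame_width, int frame_height) {
    assert(locations != nullptr && i >= 0);
    const float y1 = locations[4 * i]     * frame_height;
    const float x1 = locations[4 * i + 1] * frame_width;
    const float y2 = locations[4 * i + 2] * frame_height;
    const float x2 = locations[4 * i + 3] * frame_width;
    // Same truncation as the (int) casts in the original loop.
    return PixelBox{static_cast<int>(x1), static_cast<int>(y1),
                    static_cast<int>(x2 - x1), static_cast<int>(y2 - y1)};
}
```

With a 640x480 frame, a detection of `{0.125f, 0.25f, 0.75f, 0.5f}` would map to a box at (160, 60) that is 160 wide and 300 high.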
typedef struct {
  int size;
#if !defined(__clang__) && defined(__GNUC__) && __GNUC__ == 6 && \
    __GNUC_MINOR__ >= 1
  int data[0];
#else
  int data[];
#endif
} TfLiteIntArray;

typedef union {
  int* i32;
  int64_t* i64;
  float* f;
  char* raw;
  const char* raw_const;
  uint8_t* uint8;
  bool* b;
  int16_t* i16;
  TfLiteComplex64* c64;
  int8_t* int8;
} TfLitePtrUnion;

typedef struct {
  TfLiteType type;
  TfLitePtrUnion data;
  TfLiteIntArray* dims;
  TfLiteQuantizationParams params;
  TfLiteAllocationType allocation_type;
  size_t bytes;
  const void* allocation;
  const char* name;
  TfLiteDelegate* delegate;
  TfLiteBufferHandle buffer_handle;
  bool data_is_stale;
  bool is_variable;
  TfLiteQuantization quantization;
} TfLiteTensor;
[Link: https://android.googlesource.com/platform/external/tensorflow/.../tensorflow/lite/c/c_api_internal.h]
In the current source these definitions seem to have moved to common.h.
[Link: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/c/common.h]
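Installed the distcc package and tried building tensorflow lite.
It originally took about 30 minutes (on an rpi 3b, 4 cores), so how much will that shrink?
(My gut feeling is that, with SD storage, disk IO might actually make it slower..)
It didn't seem to be connecting, so I read the other documents more carefully and found I hadn't configured it properly!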
distcc[946] (dcc_build_somewhere) Warning: failed to distribute, running locally instead
distcc[946] (dcc_parse_hosts) Warning: /home/pi/.distcc/zeroconf/hosts contained no hosts; can't distribute work
distcc[946] (dcc_zeroconf_add_hosts) CRITICAL! failed to parse host file.
Edit ALLOWEDNETS and LISTENER in /etc/default/distcc, run service distcc restart, and that's it!
$ cat /etc/default/distcc
# Defaults for distcc initscript
# sourced by /etc/init.d/distcc
#
# should distcc be started on boot?
#
STARTDISTCC="true"
#STARTDISTCC="false"
#
# Which networks/hosts should be allowed to connect to the daemon?
# You can list multiple hosts/networks separated by spaces.
# Networks have to be in CIDR notation, e.g. 192.168.1.0/24
# Hosts are represented by a single IP address
#
# ALLOWEDNETS="127.0.0.1"
ALLOWEDNETS="127.0.0.1 192.168.0.0/16"
#
# Which interface should distccd listen on?
# You can specify a single interface, identified by it's IP address, here.
#
# LISTENER="127.0.0.1"
LISTENER=""
#
# You can specify a (positive) nice level for the distcc process here
#
# NICE="10"
NICE="10"
#
# You can specify a maximum number of jobs, the server will accept concurrently
#
# JOBS=""
JOBS=""
#
# Enable Zeroconf support?
# If enabled, distccd will register via mDNS/DNS-SD.
# It can then automatically be found by zeroconf enabled distcc clients
# without the need of a manually configured host list.
#
ZEROCONF="true"
#ZEROCONF="false"
The key point is putting CC=/usr/lib/distcc/gcc into MAKEFLAGS, though..
tensorflow/tensorflow/lite/tools/make $ cat ./build_rpi_lib.sh
#!/bin/bash
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
set -x
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TENSORFLOW_DIR="${SCRIPT_DIR}/../../../.."
FREE_MEM="$(free -m | awk '/^Mem/ {print $2}')"
# Use "-j 4" only memory is larger than 2GB
if [[ "FREE_MEM" -gt "2000" ]]; then
NO_JOB=4
else
NO_JOB=1
fi
export MAKEFLAGS="CXX=/usr/lib/distcc/g++ CC=/usr/lib/distcc/gcc"
make -j 8 TARGET=rpi -C "${TENSORFLOW_DIR}" -f tensorflow/lite/tools/make/Makefile $@
#make -j ${NO_JOB} CC=/usr/lib/distcc/gcc TARGET=rpi -C "${TENSORFLOW_DIR}" -f tensorflow/lite/tools/make/Makefile $@
Put the names of the nodes to use in /etc/distcc/hosts. If the local machine itself is not listed,
distcc builds only on the slave nodes.
# As described in the distcc manpage, this file can be used for a global
# list of available distcc hosts.
#
# The list from this file will only be used, if neither the
# environment variable DISTCC_HOSTS, nor the file $HOME/.distcc/hosts
# contains a valid list of hosts.
#
# Add a list of hostnames in one line, seperated by spaces, here.
#
tf2 tf3 +zeroconf
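The following shows up now and then, but if I just ignore it the build still eats CPU on the slave node(?) side, perhaps because the nodes get attached through zeroconf.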
distcc[1323] (dcc_build_somewhere) Warning: failed to distribute, running locally instead
distcc[1332] (dcc_build_somewhere) Warning: failed to distribute, running locally instead
[Link: http://openframeworks.cc/ko/setup/raspberrypi/raspberry-pi-distcc-guide/]
[Link: http://jtanx.github.io/2019/04/19/rpi-distcc-node/]
+
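Looking at /var/log/distcc.log,
COMPILE_OK appears when a job goes through normally,
but at some point "Client fd disconnected" suddenly appears and the build stalls.
Judging by values around time:305000ms, it seems to sit in a roughly 5-minute timeout each time,
so the result is actually worse than not using distcc at all..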
distccd[14090] (dcc_job_summary) client: 192.168.52.209:40940 COMPILE_OK exit:0 sig:0 core:0 ret:0 time:16693ms g++ tensorflow/lite/kernels/cpu_backend_gemm_eigen.cc
distccd[14091] (dcc_collect_child) ERROR: Client fd disconnected, killing job
distccd[14091] (dcc_writex) ERROR: failed to write: Broken pipe
distccd[14091] (dcc_job_summary) client: 192.168.52.209:40932 CLI_DISCONN exit:107 sig:0 core:0 ret:107 time:307172ms
Anyway, when it dies with errors like the ones above, IO on the individual node goes wild like this.
--total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw
  5   2  10  83   0| 928k 4048k|1063B  252B|  68k 2040k|1830  3320
  0   3  27  69   0|7840M   27M|2919k   73k|1512k   11M| 245k  402k missed 238 ticks
  2   1   0  97   0| 176k    0 |   0     0 |8192B    0 |  19    23  missed 2 ticks
+
Should I try adding cpp,lzo?
[Link: https://wiki.gentoo.org/wiki/Distcc/ko]
+
export MAKEFLAGS="CXX=/usr/lib/distcc/g++ CC=/usr/lib/distcc/gcc"
#export MAKEFLAGS="CXX=/usr/bin/distcc-pump CC=/usr/bin/distcc-pump"
make -j 8 TARGET=rpi -C "${TENSORFLOW_DIR}" -f tensorflow/lite/tools/make/Makefile $@
#make -j ${NO_JOB} CC=/usr/lib/distcc/gcc TARGET=rpi -C "${TENSORFLOW_DIR}" -f tensorflow/lite/tools/make/Makefile $@
It runs, but just like without pump, IO spikes and it dies in the same way.
$ distcc-pump ./build_rpi_lib.sh
+
So distccmon-text has to be run on the server node, not on a slave node..
[Link: http://www.tensorflow.org/lite/api_docs/python/tf/lite/Optimize]
[Link: http://www.tensorflow.org/lite/guide/ops_select]
[Link: http://medium.com/sclable/model-quantization-using-tensorflow-lite-2fe6a171a90d]
[Link: http://www.tensorflow.org/lite/performance/quantization_spec]
[Link: http://www.tensorflow.org/api_docs/python/tf/lite/TFLiteConverter]
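While digging through tensorflow models I had come across the term lstm before
but skipped over it out of laziness; it turned up in a search again this time, so I looked into it.
It appears to be a technique(?) used in RNNs (recurrent neural networks) that serves to reinforce context.
[Link: http://euzl.github.io/hackday_1/]
[Link: https://en.wikipedia.org/wiki/Long_short-term_memory]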
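When quantizing to uint8 during tflite conversion,
I expected the quantization range to differ a little from run to run, since the range should be fed from random data,
but it always came out as the same 0.003921568859368563 * q. Searching for that number,
it turns out to be what you get when the 0~255 range is normalized to float (that is, 1/255)..
0.00392 * 255 = 0.9996, so it does check out?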
quantization of input tensor will be close to (0.003921568859368563, 0). mean is the integer value from 0 to 255 that maps to floating point 0.0f. std_dev is 255 / (float_max - float_min). This will fix one possible problem
[Link: https://stackoverflow.com/questions/54830869/]
[Link: https://github.com/majidghafouri/Object-Recognition-tf-lite/issues/1]
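A minimal sketch of where that number comes from, assuming the pair (0.003921568859368563, 0) in the quote is (scale, zero_point) and that values are recovered with the usual affine rule real = scale * (q - zero_point). `Uint8Scale` and `Dequantize` are hypothetical helper names, not TensorFlow Lite API.

```cpp
#include <cassert>
#include <cstdint>

// 1/255 rounded to float32. Printed with full precision, this is the
// 0.003921568859368563 that keeps showing up in the converted model.
inline float Uint8Scale() {
    return static_cast<float>(1.0 / 255.0);
}

// Affine dequantization (assumed rule): real = scale * (q - zero_point).
inline float Dequantize(std::uint8_t q, float scale, int zero_point) {
    assert(scale > 0.0f);
    return scale * static_cast<float>(static_cast<int>(q) - zero_point);
}
```

With zero_point 0, q = 0 maps to 0.0 and q = 255 maps to roughly 1.0, so the constant scale simply reflects an input range of 0..1 rather than anything learned from the data.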
+
output_format: Output file format. Currently must be {TFLITE, GRAPHVIZ_DOT}. (default TFLITE)
quantized_input_stats: Dict of strings representing input tensor names mapped to tuple of floats representing the mean and standard deviation of the training data (e.g., {"foo" : (0., 1.)}). Only need if inference_input_type is QUANTIZED_UINT8. real_input_value = (quantized_input_value - mean_value) / std_dev_value. (default {})
default_ranges_stats: Tuple of integers representing (min, max) range values for all arrays without a specified range. Intended for experimenting with quantization via "dummy quantization". (default None)
post_training_quantize: Boolean indicating whether to quantize the weights of the converted float model. Model size will be reduced and there will be latency improvements (at the cost of accuracy). (default False)
[Link: http://man.hubwiz.com/.../python/tf/lite/TFLiteConverter.html]
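The quantized_input_stats formula above can be tried out numerically. The sketch below is an illustration only: `InputStats`, `StatsFromRange` and `RealInputValue` are hypothetical names, std_dev = 255 / (float_max - float_min) is taken from the quoted answer, and mean = -float_min * std_dev is my own derivation from the statement that mean is the uint8 value mapping to 0.0f.

```cpp
#include <cassert>

// (mean, std_dev) pair as passed per input tensor in quantized_input_stats.
struct InputStats {
    float mean;
    float std_dev;
};

// Derives stats for a uint8 input covering [float_min, float_max].
// Assumption: q = 0 maps to float_min and q = 255 maps to float_max.
InputStats StatsFromRange(float float_min, float float_max) {
    assert(float_max > float_min);
    const float std_dev = 255.0f / (float_max - float_min);
    const float mean = -float_min * std_dev;  // uint8 value that maps to 0.0f
    return InputStats{mean, std_dev};
}

// real_input_value = (quantized_input_value - mean_value) / std_dev_value
float RealInputValue(int quantized, const InputStats& stats) {
    return (static_cast<float>(quantized) - stats.mean) / stats.std_dev;
}
```

For a 0..1 input this gives (mean, std_dev) = (0, 255), which matches the 1/255 scale seen earlier; for a -1..1 input it gives (127.5, 127.5).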
TOCO (TensorFlow Lite Optimizing Converter)
[Link: https://junimnjw.github.io/%EA%B0%9C%EB%B0%9C/2019/08/09/tensorflow-lite-2.html]