구차니의 잡동사니 모음

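I asked Claude about it, and the answer was that the OpenAI API is just a wrapper around this.

In the end, each message is wrapped in markers that the neural network can recognize as special tokens, and the result is embedded and fed in as vectors like any other input.

Apparently nothing (?) spectacular happens on the way in. The `im` prefix is said to be short for "imaginary monologue".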
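A rough Python sketch of that idea, just to make it concrete. The vocabulary, the byte fallback, and the `embed` function are all made up for illustration (hypothetical names, not any real tokenizer); the only point is that a special marker becomes a single id, and every id then becomes a vector through a plain table lookup.

```python
import re

# Hypothetical toy vocabulary: each special marker gets exactly one id.
SPECIAL = {"<|im_start|>": 0, "<|im_end|>": 1}
SPECIAL_RE = re.compile("(" + "|".join(re.escape(s) for s in SPECIAL) + ")")


def toy_tokenize(text):
    """Special markers map to one id; other text falls back to byte ids (offset by 2)."""
    ids = []
    for piece in SPECIAL_RE.split(text):
        if not piece:
            continue
        if piece in SPECIAL:
            ids.append(SPECIAL[piece])
        else:
            ids.extend(2 + b for b in piece.encode("utf-8"))
    return ids


def embed(ids, dim=4):
    """Stand-in for an embedding table: each id deterministically maps to a vector."""
    return [[((i + 1) * (k + 1)) % 7 / 7.0 for k in range(dim)] for i in ids]


ids = toy_tokenize("<|im_start|>user\nHi<|im_end|>")
vectors = embed(ids)
print(ids[0], ids[-1], len(ids), len(vectors[0]))
```

The markers at both ends come out as one id each, while the text in between is split into many ids, and all of them go through the same lookup.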

Comparison with other models

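| Model | Block start | Block end | Approach |
| --- | --- | --- | --- |
| ChatML (Qwen, Hermes) | <\|im_start\|>role | <\|im_end\|> | General-purpose role insertion |
| LLaMA-3 | <\|start_header_id\|>role<\|end_header_id\|> | <\|eot_id\|> | Separate header |
| LLaMA-2 | [INST] | [/INST] | No role distinction |
| Mistral | [INST] | [/INST] | No role distinction |
| DeepSeek-R1 | <\|User\|> / <\|Assistant\|> | <\|end▁of▁sentence\|> | Fixed token per role |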

[Link: https://claude.ai/share/0bfc9c76-e5a8-46d2-8346-4f84170625db]
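To see how little differs between these formats, here is a small Python sketch that wraps one message using the ChatML and LLaMA-3 marker pairs from the comparison above. The `FORMATS` table and `wrap` helper are names I made up, and the newline placement after each start marker is my assumption rather than something the comparison states.

```python
# Marker pairs copied from the comparison; "{role}" is filled in per message.
# Newline placement after the start marker is an assumption of this sketch.
FORMATS = {
    "chatml": ("<|im_start|>{role}\n", "<|im_end|>\n"),
    "llama3": ("<|start_header_id|>{role}<|end_header_id|>\n\n", "<|eot_id|>"),
}


def wrap(fmt, role, content):
    """Surround one message with the start/end markers of the chosen format."""
    start, end = FORMATS[fmt]
    return start.format(role=role) + content + end


print(repr(wrap("chatml", "user", "Hi")))
print(repr(wrap("llama3", "user", "Hi")))
```

Either way it is the same operation: a role-bearing start marker, the content, and an end marker.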

#define CHATML_TEMPLATE_SRC                                                               \
    "{%- for message in messages -%}\n"                                                   \
    "  {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>\n' -}}\n" \
    "{%- endfor -%}\n"                                                                    \
    "{%- if add_generation_prompt -%}\n"                                                  \
    "  {{- '<|im_start|>assistant\n' -}}\n"                                               \
    "{%- endif -%}"

    const std::string TOOL_CALL_START = "<|tool_call_start|>";
    const std::string TOOL_CALL_END   = "<|tool_call_end|>";
    const std::string THINK_START     = "<think>";
    const std::string THINK_END       = "</think>";
    const std::string GEN_PROMPT      = "<|im_start|>assistant\n";

[Link: https://github.com/ggml-org/llama.cpp/blob/master/common/chat.cpp#L567]
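The `CHATML_TEMPLATE_SRC` template above can be approximated in plain Python as below. This is my own rewrite of what the Jinja loop appears to do, not code from llama.cpp, and `render_chatml` is a hypothetical name.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Mimic the Jinja loop: wrap every message, then optionally open an assistant turn."""
    parts = []
    for message in messages:
        parts.append(
            "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>\n"
        )
    if add_generation_prompt:
        # Same string as GEN_PROMPT in the snippet above.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello"},
]
prompt = render_chatml(messages)
print(prompt)
```

The trailing `<|im_start|>assistant\n` is left open on purpose so that the model continues from there.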

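Hmm.. it looks like gemma4-e4b never learned `<|im_start|>` as a special token? With or without `--no-parse-special`, it is split into ordinary pieces in exactly the same way.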

/mnt/Downloads/model/gemma4-e4b$ ../../llama-b8925/llama-tokenize -m gemma-4-E4B-it-Q4_K_M.gguf -p "<|im_start|>system" --log-disable
     2 -> '<bos>'
236820 -> '<'
236909 -> '|'
   548 -> 'im'
236779 -> '_'
  3041 -> 'start'
111038 -> '|>'
  9731 -> 'system'

/mnt/Downloads/model/gemma4-e4b$ ../../llama-b8925/llama-tokenize -m gemma-4-E4B-it-Q4_K_M.gguf -p "<|im_start|>system" --log-disable  --no-parse-special
     2 -> '<bos>'
236820 -> '<'
236909 -> '|'
   548 -> 'im'
236779 -> '_'
  3041 -> 'start'
111038 -> '|>'
  9731 -> 'system'

/mnt/Downloads/model/gemma4-e4b$ ../../llama-b8925/llama-tokenize -m gemma-4-E4B-it-Q4_K_M.gguf -p "<|im_start|>system" --log-disable  --no-parse-special --ids
[2, 236820, 236909, 548, 236779, 3041, 111038, 9731]
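The same check can be done mechanically on the `--ids` output. A small Python sketch, assuming the output is a JSON-style list like the one above; `marker_is_single_token` is a hypothetical helper, and the BOS id (2) and the id for `system` (9731) are taken from the tokenizer dump above.

```python
import json


def marker_is_single_token(ids_line, bos_id, trailing_ids):
    """Strip BOS and the known trailing text ids, then see what is left for the marker."""
    ids = json.loads(ids_line)
    if ids and ids[0] == bos_id:
        ids = ids[1:]
    if trailing_ids and ids[-len(trailing_ids):] == trailing_ids:
        ids = ids[: -len(trailing_ids)]
    return len(ids) == 1, ids


line = "[2, 236820, 236909, 548, 236779, 3041, 111038, 9731]"
single, marker_ids = marker_is_single_token(line, bos_id=2, trailing_ids=[9731])
print(single, marker_ids)
```

Six ids remain for `<|im_start|>`, so for this model it is ordinary text rather than one special token.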

Special tokens I spotted while the model was loading..?

For now, they are reported as control tokens.

$ ./llama-b8925/llama-cli -m model/gemma4-e4b/gemma-4-E4B-it-Q4_K_M.gguf  --verbose
load: control token: 258884 '<|video|>' is not marked as EOG
load: control token: 255999 '<|image>' is not marked as EOG
load: control token: 258882 '<image|>' is not marked as EOG
load: control token: 258883 '<audio|>' is not marked as EOG
load: control token:     98 '<|think|>' is not marked as EOG
load: control token:    105 '<|turn>' is not marked as EOG
load: control token: 258880 '<|image|>' is not marked as EOG
load: control token:      2 '<bos>' is not marked as EOG
load: control-looking token:    212 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token:      0 '<pad>' is not marked as EOG
load: control-looking token:     50 '<|tool_response>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token:     46 '<|tool>' is not marked as EOG
load: control token:     47 '<tool|>' is not marked as EOG
load: control token: 256000 '<|audio>' is not marked as EOG
load: control token:      3 '<unk>' is not marked as EOG
load: control token: 258881 '<|audio|>' is not marked as EOG
load: control-looking token:      1 '<eos>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token:      4 '<mask>' is not marked as EOG
load: printing all EOG tokens:
load:   - 1 ('<eos>')
load:   - 50 ('<|tool_response>')
load:   - 106 ('<turn|>')
load:   - 212 ('</s>')
load: special_eog_ids contains '<|tool_response>', removing '</s>' token from EOG list

print_info: BOS token             = 2 '<bos>'
print_info: EOS token             = 106 '<turn|>'
print_info: UNK token             = 3 '<unk>'
print_info: PAD token             = 0 '<pad>'
print_info: MASK token            = 4 '<mask>'
print_info: LF token              = 107 '
'
print_info: EOG token             = 1 '<eos>'
print_info: EOG token             = 50 '<|tool_response>'
print_info: EOG token             = 106 '<turn|>'
print_info: max token length      = 93

init: chat template, example_format: '<|turn>system
<|think|>
You are a helpful assistant<turn|>
<|turn>user
Hello<turn|>
<|turn>model
Hi there<turn|>
<|turn>user
How are you?<turn|>
<|turn>model
'
srv          init: init: chat template, thinking = 1
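Going only by the `example_format` printed above, the Gemma turn layout can be reproduced with a few lines of Python. This is inferred from that one logged example and is not the model's actual chat template; `render_gemma_turns` is a hypothetical name, and treating `<|think|>` as a prefix on the system turn when thinking is enabled is my assumption.

```python
def render_gemma_turns(messages, thinking=True, add_generation_prompt=True):
    """Rebuild the logged layout: <|turn>role, newline, content, <turn|>, newline."""
    out = []
    for m in messages:
        # The log shows assistant turns labelled "model".
        role = "model" if m["role"] == "assistant" else m["role"]
        body = m["content"]
        if role == "system" and thinking:
            body = "<|think|>\n" + body
        out.append(f"<|turn>{role}\n{body}<turn|>\n")
    if add_generation_prompt:
        out.append("<|turn>model\n")
    return "".join(out)


messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there"},
    {"role": "user", "content": "How are you?"},
]
rendered = render_gemma_turns(messages)
print(rendered)
```

Structurally it is the same thing as ChatML, just with `<|turn>` / `<turn|>` in place of `<|im_start|>` / `<|im_end|>`.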

Posted by 구차니
