[Link : https://velog.io/@euisuk-chung/%EC%A0%95%EB%A6%AC-OpenAI-API-Document]
[Link : https://developers.openai.com/api/reference/resources/responses]
[Link : https://developers.openai.com/api/docs/guides/text]
[Link : https://model-spec.openai.com/2025-02-12.html#chain_of_command]
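I asked Claude about this, and it said the OpenAI API is just a wrapper.
In the end, each message is wrapped in markers the network recognizes as special tokens, then embedded and fed in as vectors.
Apparently nothing particularly fancy(?) happens on the way in. According to Claude, the im prefix is short for "imaginary monologue".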
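To make the "wrap, embed, feed in as vectors" description concrete, here is a minimal toy sketch. The vocabulary, the character-level fallback, and the 4-dimensional random embedding table are all hypothetical stand-ins for illustration; a real model uses a learned tokenizer and learned embeddings.

```python
import random

# Hypothetical toy vocabulary: the two ChatML markers get their own IDs,
# everything else falls back to one ID per character.
SPECIAL = {"<|im_start|>": 0, "<|im_end|>": 1}


def wrap(role, content):
    """Wrap one message in ChatML-style markers."""
    return f"<|im_start|>{role}\n{content}<|im_end|>\n"


def build_vocab(text):
    """Assign IDs to every character in text, after the special tokens."""
    vocab = dict(SPECIAL)
    for ch in sorted(set(text)):
        vocab.setdefault(ch, len(vocab))
    return vocab


def tokenize(text, vocab):
    """Match special markers as single tokens, otherwise go char by char."""
    ids, i = [], 0
    while i < len(text):
        for tok in SPECIAL:
            if text.startswith(tok, i):
                ids.append(vocab[tok])
                i += len(tok)
                break
        else:
            ids.append(vocab[text[i]])
            i += 1
    return ids


prompt = wrap("user", "Hi")
vocab = build_vocab(prompt)
ids = tokenize(prompt, vocab)

# Toy embedding table: one random 4-dim vector per vocabulary entry.
rng = random.Random(0)
table = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(len(vocab))]

# "Embedding" is a plain row lookup; the special token is just row 0.
vectors = [table[i] for i in ids]
print(len(ids))  # 10
```

The marker takes the same path as any other token: one ID, one row of the table.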
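Comparison with other models
| Model | Block start | Block end | Approach |
|---|---|---|---|
| ChatML (Qwen, Hermes) | <|im_start|>role | <|im_end|> | generic role insertion |
| LLaMA-3 | <|start_header_id|>role<|end_header_id|> | <|eot_id|> | separate header |
| LLaMA-2 | [INST] | [/INST] | no role names; wraps the user instruction only |
| Mistral | [INST] | [/INST] | no role names; wraps the user instruction only |
| DeepSeek-R1 | <|User|> / <|Assistant|> | <|end▁of▁sentence|> | fixed token per role |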
[Link : https://claude.ai/share/0bfc9c76-e5a8-46d2-8346-4f84170625db]
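The first two rows of the comparison can be sketched as one data-driven renderer. This is an illustration, not any library's API: the `TEMPLATES` dict and `render` function are hypothetical names, the whitespace follows the commonly published ChatML and LLaMA-3 formats, and the leading `<|begin_of_text|>` that LLaMA-3 expects is left out on the assumption that the tokenizer adds BOS.

```python
# (marker before role, text after role, end-of-turn marker)
TEMPLATES = {
    "chatml": ("<|im_start|>", "\n", "<|im_end|>\n"),
    "llama3": ("<|start_header_id|>", "<|end_header_id|>\n\n", "<|eot_id|>"),
}


def render(style, messages):
    """Render messages in the given style and open an assistant turn."""
    role_open, role_close, turn_end = TEMPLATES[style]
    body = "".join(
        f"{role_open}{m['role']}{role_close}{m['content']}{turn_end}"
        for m in messages
    )
    # Generation prompt: start the assistant block and leave it unterminated.
    return body + f"{role_open}assistant{role_close}"


messages = [{"role": "user", "content": "Hello"}]
print(render("chatml", messages))
print(render("llama3", messages))
```

Both styles do the same job; only the delimiter strings around the role name and the end of the turn differ.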
    #define CHATML_TEMPLATE_SRC \
        "{%- for message in messages -%}\n" \
        " {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>\n' -}}\n" \
        "{%- endfor -%}\n" \
        "{%- if add_generation_prompt -%}\n" \
        " {{- '<|im_start|>assistant\n' -}}\n" \
        "{%- endif -%}"

    const std::string TOOL_CALL_START = "<|tool_call_start|>";
    const std::string TOOL_CALL_END = "<|tool_call_end|>";
    const std::string THINK_START = "<think>";
    const std::string THINK_END = "</think>";
    const std::string GEN_PROMPT = "<|im_start|>assistant\n";
[Link : https://github.com/ggml-org/llama.cpp/blob/master/common/chat.cpp#L567]
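Hmm.. it looks like gemma4-e4b never learned im_start as a special token? The tokenizer splits it into ordinary pieces, with or without --no-parse-special.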
    /mnt/Downloads/model/gemma4-e4b$ ../../llama-b8925/llama-tokenize -m gemma-4-E4B-it-Q4_K_M.gguf -p "<|im_start|>system" --log-disable
         2 -> '<bos>'
    236820 -> '<'
    236909 -> '|'
       548 -> 'im'
    236779 -> '_'
      3041 -> 'start'
    111038 -> '|>'
      9731 -> 'system'

    /mnt/Downloads/model/gemma4-e4b$ ../../llama-b8925/llama-tokenize -m gemma-4-E4B-it-Q4_K_M.gguf -p "<|im_start|>system" --log-disable --no-parse-special
         2 -> '<bos>'
    236820 -> '<'
    236909 -> '|'
       548 -> 'im'
    236779 -> '_'
      3041 -> 'start'
    111038 -> '|>'
      9731 -> 'system'

    /mnt/Downloads/model/gemma4-e4b$ ../../llama-b8925/llama-tokenize -m gemma-4-E4B-it-Q4_K_M.gguf -p "<|im_start|>system" --log-disable --no-parse-special --ids
    [2, 236820, 236909, 548, 236779, 3041, 111038, 9731]
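The manual check above (does the marker come back as one ID or several?) can be scripted against the `--ids` output. A small sketch; `count_marker_tokens` is a hypothetical helper, and it assumes that BOS is ID 2 and that the text after the marker ("system") is exactly one token, both of which match the output above.

```python
import ast


def count_marker_tokens(ids_output, bos_id=2, trailing=1):
    """Count how many IDs the marker itself used.

    ids_output: the list printed by `llama-tokenize --ids`, as a string.
    trailing:   number of IDs that belong to the text after the marker
                (assumed to be 1 here, for 'system').
    """
    ids = ast.literal_eval(ids_output.strip())
    if ids and ids[0] == bos_id:
        ids = ids[1:]  # drop the automatically added BOS
    return len(ids) - trailing


n = count_marker_tokens("[2, 236820, 236909, 548, 236779, 3041, 111038, 9731]")
print(n)  # 6
```

A marker that is in the vocabulary as a special token would count as 1 when special parsing is enabled; 6 means `<|im_start|>` is just ordinary text to this tokenizer.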
+
Special tokens(?) I noticed while the model was loading.
For now, the log reports them as control tokens.
    $ ./llama-b8925/llama-cli -m model/gemma4-e4b/gemma-4-E4B-it-Q4_K_M.gguf --verbose
    load: control token: 258884 '<|video|>' is not marked as EOG
    load: control token: 255999 '<|image>' is not marked as EOG
    load: control token: 258882 '<image|>' is not marked as EOG
    load: control token: 258883 '<audio|>' is not marked as EOG
    load: control token: 98 '<|think|>' is not marked as EOG
    load: control token: 105 '<|turn>' is not marked as EOG
    load: control token: 258880 '<|image|>' is not marked as EOG
    load: control token: 2 '<bos>' is not marked as EOG
    load: control-looking token: 212 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
    load: control token: 0 '<pad>' is not marked as EOG
    load: control-looking token: 50 '<|tool_response>' was not control-type; this is probably a bug in the model. its type will be overridden
    load: control token: 46 '<|tool>' is not marked as EOG
    load: control token: 47 '<tool|>' is not marked as EOG
    load: control token: 256000 '<|audio>' is not marked as EOG
    load: control token: 3 '<unk>' is not marked as EOG
    load: control token: 258881 '<|audio|>' is not marked as EOG
    load: control-looking token: 1 '<eos>' was not control-type; this is probably a bug in the model. its type will be overridden
    load: control token: 4 '<mask>' is not marked as EOG

    load: printing all EOG tokens:
    load: - 1 ('<eos>')
    load: - 50 ('<|tool_response>')
    load: - 106 ('<turn|>')
    load: - 212 ('</s>')
    load: special_eog_ids contains '<|tool_response>', removing '</s>' token from EOG list
    print_info: BOS token = 2 '<bos>'
    print_info: EOS token = 106 '<turn|>'
    print_info: UNK token = 3 '<unk>'
    print_info: PAD token = 0 '<pad>'
    print_info: MASK token = 4 '<mask>'
    print_info: LF token = 107 ' '
    print_info: EOG token = 1 '<eos>'
    print_info: EOG token = 50 '<|tool_response>'
    print_info: EOG token = 106 '<turn|>'
    print_info: max token length = 93
    init: chat template, example_format: '<|turn>system <|think|> You are a helpful assistant<turn|> <|turn>user Hello<turn|> <|turn>model Hi there<turn|> <|turn>user How are you?<turn|> <|turn>model '
    srv init: init: chat template, thinking = 1
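To collect these tokens without reading the log by eye, the lines can be pulled out with two regular expressions. This is a sketch written against the line shapes seen in the log above, not a stable llama.cpp output format; `parse_load_log` and both pattern names are hypothetical.

```python
import re

# Matches both "control token:" and "control-looking token:" lines.
CONTROL_RE = re.compile(r"load: control(?:-looking)? token:\s*(\d+) '([^']*)'")
# Matches the final "print_info: EOG token = N '...'" lines.
EOG_RE = re.compile(r"print_info: EOG token\s*=\s*(\d+) '([^']*)'")


def parse_load_log(log_text):
    """Return ({id: control_token}, {id: eog_token}) found in the log."""
    control = {int(i): tok for i, tok in CONTROL_RE.findall(log_text)}
    eog = {int(i): tok for i, tok in EOG_RE.findall(log_text)}
    return control, eog


SAMPLE = """\
load: control token: 98 '<|think|>' is not marked as EOG
load: control token: 105 '<|turn>' is not marked as EOG
load: control-looking token: 1 '<eos>' was not control-type; this is probably a bug in the model. its type will be overridden
print_info: EOG token = 1 '<eos>'
print_info: EOG token = 106 '<turn|>'
"""

control, eog = parse_load_log(SAMPLE)
print(control)  # {98: '<|think|>', 105: '<|turn>', 1: '<eos>'}
print(eog)      # {1: '<eos>', 106: '<turn|>'}
```

In this log, `<|turn>` and `<turn|>` play the role that `<|im_start|>` and `<|im_end|>` play in ChatML, as the example_format line shows.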
