DB : postgres, VectorDB : pgvector
embedding
[Link : https://github.com/gulcin/pgvector-rag-app]
[Link : https://edbkorea.com/blog/postgres-및-pgvector가-포함된-rag-앱/]

    import PyPDF2
    import torch
    from transformers import pipeline


    def generate_embeddings(tokenizer, model, device, text):
        # Tokenize the text, truncating to at most 512 tokens
        inputs = tokenizer(
            text, return_tensors="pt", truncation=True, max_length=512
        ).to(device)
        with torch.no_grad():
            outputs = model(**inputs, output_hidden_states=True)
        # Mean-pool the last hidden layer over the token dimension
        return text, outputs.hidden_states[-1].mean(dim=1).tolist()


    def read_pdf_file(pdf_path):
        pdf_document = PyPDF2.PdfReader(pdf_path)

        # Collect the extracted text of every page, one entry per line
        lines = []
        for page_number in range(len(pdf_document.pages)):
            page = pdf_document.pages[page_number]
            text = page.extract_text()
            lines.extend(text.splitlines())

        return lines

[Link : https://github.com/gulcin/pgvector-rag-app/blob/master/embedding.py]
[Link : https://github.com/gulcin/pgvector-rag-app/blob/master/commands/import_data.py]
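As a rough sketch of the step that follows embedding generation, the snippet below prepares (text, embedding) pairs for insertion into a pgvector column. The table name `embeddings`, its columns, and the helper names are assumptions made for illustration, not taken from the linked repository. It relies only on the fact that pgvector accepts a vector written as text, such as `[0.1,0.2]`.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical table layout; the linked repository may use different names.
CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS embeddings (
    id BIGSERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector({dim})
);
"""

# Parameterized insert in the %s placeholder style used by psycopg.
INSERT_SQL = "INSERT INTO embeddings (content, embedding) VALUES (%s, %s::vector)"


def to_pgvector_literal(vector: Sequence[float]) -> str:
    """Format a flat sequence of numbers as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


def build_rows(pairs: Iterable[Tuple[str, Sequence]]) -> List[Tuple[str, str]]:
    """Turn (text, embedding) pairs into parameter tuples for INSERT_SQL.

    generate_embeddings returns a nested list of shape [1, hidden_size]
    because the batch dimension is kept, so a nested list is unwrapped here.
    """
    rows = []
    for text, emb in pairs:
        if emb and isinstance(emb[0], (list, tuple)):
            emb = emb[0]
        rows.append((text, to_pgvector_literal(emb)))
    return rows
```

With a psycopg cursor, the rows could then be written with something like `cursor.executemany(INSERT_SQL, rows)`. The `{dim}` placeholder in `CREATE_TABLE_SQL` has to match the hidden size of whichever model produced the embeddings.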
LangChain, KURE (embedding vector generation)
FastAPI, Streamlit
[Link : https://lsjsj92.tistory.com/686]
[Link : https://huggingface.co/nlpai-lab/KURE-v1]
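Once an embedding model such as KURE has turned the query and the documents into vectors, retrieval comes down to ranking documents by vector similarity. The following is a minimal pure-Python sketch of that ranking using cosine similarity. The function names and toy vectors are illustrative assumptions, and a real setup would normally delegate this to the vector store rather than compute it by hand.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    docs: Sequence[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k (score, text) pairs most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for real model embeddings.
    docs = [("a", [0.0, 1.0]), ("b", [1.0, 0.0]), ("c", [1.0, 1.0])]
    for score, text in top_k([1.0, 0.0], docs, k=2):
        print(f"{text}: {score:.3f}")
```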
ChromaDB, langchain
DB : postgres, VectorDB : pgvector
[Link : https://velog.io/@judy_choi/PGVector-와-프롬프트를-이용한-RAG-고도화]
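To make the pgvector-plus-prompt combination concrete, here is a small sketch of the two pieces involved: a nearest-neighbour query using pgvector's cosine distance operator `<=>`, and a prompt assembled from the retrieved chunks. The table name `embeddings`, its columns, the prompt wording, and `build_prompt` are assumptions for illustration and are not taken from the linked post.

```python
from typing import Sequence

# Hypothetical table and column names. "<=>" is pgvector's cosine distance
# operator, so a smaller distance means a more similar chunk.
SEARCH_SQL = (
    "SELECT content, embedding <=> %s::vector AS distance "
    "FROM embeddings ORDER BY distance LIMIT %s"
)


def build_prompt(question: str, contexts: Sequence[str]) -> str:
    """Assemble a grounded prompt from retrieved chunks and the user question."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(contexts, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    # Chunks that a query like SEARCH_SQL might have returned.
    chunks = ["pgvector adds a vector type to PostgreSQL.", "RAG retrieves context before generation."]
    print(build_prompt("What does pgvector add?", chunks))
```

The query vector passed to `SEARCH_SQL` would be the text form of the question's embedding, produced by the same model that embedded the stored chunks.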
