RAG-related libraries and approaches (survey in progress)
구차니
2026. 5. 12. 21:53
DB: postgres, VectorDB: pgvector
embedding
[Link: https://github.com/gulcin/pgvector-rag-app]
[Link: https://edbkorea.com/blog/postgres-및-pgvector가-포함된-rag-앱/]
    import PyPDF2
    import torch
    from transformers import pipeline

    def generate_embeddings(tokenizer, model, device, text):
        inputs = tokenizer(
            text, return_tensors="pt", truncation=True, max_length=512
        ).to(device)
        with torch.no_grad():
            outputs = model(**inputs, output_hidden_states=True)
        return text, outputs.hidden_states[-1].mean(dim=1).tolist()

    def read_pdf_file(pdf_path):
        pdf_document = PyPDF2.PdfReader(pdf_path)
        lines = []
        for page_number in range(len(pdf_document.pages)):
            page = pdf_document.pages[page_number]
            text = page.extract_text()
            lines.extend(text.splitlines())
        return lines
[Link: https://github.com/gulcin/pgvector-rag-app/blob/master/embedding.py]
[Link: https://github.com/gulcin/pgvector-rag-app/blob/master/commands/import_data.py]
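As a rough sketch of what comes after embedding generation (this is not taken from import_data.py), the vectors could be stored in a pgvector column and searched by cosine distance. The table name `chunks`, its column names, the 768 dimension, and the helper function names are all assumptions for illustration. `conn` is assumed to be a psycopg2-style DB-API connection.

```python
from typing import List, Sequence, Tuple

# Assumption: the dimension must match the embedding model actually used.
EMBEDDING_DIM = 768

# Hypothetical schema; table and column names are illustrative only.
SCHEMA_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector({EMBEDDING_DIM})
);
"""

INSERT_SQL = "INSERT INTO chunks (content, embedding) VALUES (%s, %s::vector)"

# <=> is pgvector's cosine distance operator (smaller means more similar).
SEARCH_SQL = """
SELECT content, embedding <=> %s::vector AS distance
FROM chunks
ORDER BY distance
LIMIT %s
"""


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format floats in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def init_schema(conn) -> None:
    """Create the extension and the table if they do not exist yet."""
    with conn.cursor() as cur:
        cur.execute(SCHEMA_SQL)
    conn.commit()


def store_chunk(conn, content: str, embedding: Sequence[float]) -> None:
    """Insert one text chunk together with its embedding."""
    with conn.cursor() as cur:
        cur.execute(INSERT_SQL, (content, to_vector_literal(embedding)))
    conn.commit()


def search(conn, query_embedding: Sequence[float], k: int = 5) -> List[Tuple]:
    """Return the k chunks closest to the query by cosine distance."""
    with conn.cursor() as cur:
        cur.execute(SEARCH_SQL, (to_vector_literal(query_embedding), k))
        return cur.fetchall()
```

Note that `generate_embeddings` above returns a nested list of shape (1, hidden_size), so the first row (`embedding[0]`) is what would be passed to `store_chunk` in this sketch.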
langchain, KURE (embedding vector generation)
FastAPI, Streamlit
[Link: https://lsjsj92.tistory.com/686]
[Link: https://huggingface.co/nlpai-lab/KURE-v1]
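A minimal sketch of generating embeddings with KURE-v1 and ranking documents by cosine similarity. It assumes the `sentence-transformers` package is installed and that the model loads through `SentenceTransformer`. The helper names (`embed_texts`, `rank_documents`) are hypothetical, and the ranking part is plain Python so it can be tried without downloading the model.

```python
import math
from typing import List, Sequence, Tuple


def embed_texts(texts: List[str], model_name: str = "nlpai-lab/KURE-v1") -> List[List[float]]:
    """Encode texts into embedding vectors (assumes sentence-transformers)."""
    # Imported lazily so the rest of this sketch runs without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(texts).tolist()


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In a real pipeline the query and the documents would both go through `embed_texts`, and a vector store would take over the ranking once the corpus gets large.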
ChromaDB, langchain
DB: postgres, VectorDB: pgvector
[Link: https://velog.io/@judy_choi/PGVector-와-프롬프트를-이용한-RAG-고도화]