Pinecone Vector DB Setup Guide

How to set up Pinecone, a vector database that is a core building block for RAG systems.

Cost

Free

Starter free tier available
Up to 100,000 vectors

Requirements

An email address

Steps

01. Sign up for Pinecone

https://www.pinecone.io/

02. Create an account

Sign up with Google, GitHub, or email

03. Get an API key

Left menu → API Keys → Copy

04. Create an index

Create Index → set the name and Dimensions

Dimensions must match the embedding model you use

05. Done
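Step 04 can also be done from code instead of the console. The following is a minimal sketch, not an official procedure: ensure_index and EMBEDDING_DIMENSIONS are hypothetical names, and the dimension values are the ones listed for the OpenAI models later in this guide. The deployment spec is passed in by the caller so the sketch makes no assumption about cloud or region.

```python
# Dimensions of the OpenAI embedding models mentioned in this guide.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def ensure_index(pc, name, model, spec, metric="cosine"):
    """Create the index `name` if it does not exist yet.

    `pc` is a pinecone.Pinecone client and `spec` is the deployment spec
    (for example a pinecone.ServerlessSpec). Returns True if an index was
    created and False if it was already there.
    """
    dimension = EMBEDDING_DIMENSIONS[model]  # KeyError for models not listed above
    if name in pc.list_indexes().names():
        return False
    pc.create_index(name=name, dimension=dimension, metric=metric, spec=spec)
    return True
```

A typical call might look like ensure_index(pc, "your-index-name", "text-embedding-3-small", ServerlessSpec(cloud="aws", region="us-east-1")), where ServerlessSpec is imported from pinecone and the cloud and region are placeholders to replace with your own.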

Usage

Install:

pip install pinecone

(The package was previously published as pinecone-client.)

Basic usage:

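from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
index = pc.Index("your-index-name")

# Upsert vectors. [0.1, 0.2, ...] stands for a full embedding whose length
# equals the Dimensions of the index.
index.upsert(vectors=[
    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"text": "..."}}
])

# Query for the 3 most similar vectors; include_metadata=True returns the
# stored metadata (such as "text") with each match.
results = index.query(vector=[0.1, 0.2, ...], top_k=3, include_metadata=True)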
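For more than a handful of vectors, upserts are usually sent in batches rather than as one large request. A minimal sketch follows; batched, upsert_in_batches, and the batch size of 100 are illustrative assumptions rather than values from this guide, and index is the object returned by pc.Index(...).

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def upsert_in_batches(index, vectors, batch_size=100):
    """Upsert `vectors` in chunks; returns the number of vectors sent."""
    sent = 0
    for batch in batched(vectors, batch_size):
        index.upsert(vectors=batch)  # one request per chunk
        sent += len(batch)
    return sent
```

Because the helper only calls index.upsert(vectors=...), it can be exercised with a stand-in object before pointing it at a real index.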


The rest of this guide covers frequently asked questions and common sticking points in detail.

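Q&A - Basic concepts

What is a vector DB?

A database that stores text or images as numeric vectors (embeddings) and searches them by similarity. Similarity search is fast, which makes it a core component of RAG.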
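To make "similarity search" concrete, the sketch below does by brute force what a vector DB does at scale: it scores every stored vector against the query by cosine similarity and returns the best matches. It is an illustration only. Production vector DBs typically use approximate nearest-neighbor indexes instead of this linear scan, and the function names and toy 3-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query, stored, k=3):
    """Return the k (id, score) pairs most similar to `query`, best first."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: "cat" and "dog" point in nearly the same direction, "car" does not.
stored = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
print(top_k([1.0, 0.0, 0.0], stored, k=2))
```

With this toy data the query ranks "cat" first and "dog" second, since both point mostly along the first axis.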

What are Dimensions?

The number of dimensions of each vector. It must match the embedding model you use.

OpenAI text-embedding-3-small: 1536
OpenAI text-embedding-3-large: 3072

Serverless vs. Pod-based

Serverless: pay-as-you-go, scales automatically, free tier available (recommended)
Pod-based: fixed instances, predictable cost

Q&A - Free tier

Free tier limits

1 project (index)
Up to 100,000 vectors
Serverless only

What if I exceed the limits?

Upgrade to a paid plan, or delete old data.
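The current vector count can be checked from code before the limit is reached. This is a sketch under assumptions: remaining_capacity and delete_in_batches are hypothetical helpers, the default limit of 100,000 is the free-tier figure quoted above, the batch size of 1000 is an assumed value, and deciding which IDs count as "old" is left to the caller because this guide does not define it.

```python
def remaining_capacity(index, limit=100_000):
    """Number of vectors that can still be added before `limit` is reached."""
    stats = index.describe_index_stats()
    return max(limit - stats.total_vector_count, 0)


def delete_in_batches(index, ids, batch_size=1000):
    """Delete the given vector IDs in chunks; returns how many IDs were sent."""
    ids = list(ids)
    for start in range(0, len(ids), batch_size):
        index.delete(ids=ids[start:start + batch_size])
    return len(ids)
```

Calling remaining_capacity(index) before a large upsert gives an early warning instead of a failed request partway through.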

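Q&A - Troubleshooting

"Dimension mismatch" error

The Dimensions set when the index was created do not match the dimension of the vectors being upserted. Either embed with the model the index was sized for, or create a new index with the correct Dimensions.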
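One way to catch this earlier is to check vector lengths locally before calling upsert. A minimal sketch; check_dimensions is a hypothetical helper, and the expected value is whatever Dimensions was set to when the index was created.

```python
def check_dimensions(vectors, expected):
    """Raise ValueError naming every vector whose length differs from `expected`.

    `vectors` uses the same dict format as index.upsert: {"id": ..., "values": [...]}.
    Returns the number of vectors checked when all of them match.
    """
    bad = [
        (v["id"], len(v["values"]))
        for v in vectors
        if len(v["values"]) != expected
    ]
    if bad:
        details = ", ".join(f"{vec_id} has {n}" for vec_id, n in bad)
        raise ValueError(f"expected dimension {expected}, but {details}")
    return len(vectors)
```

Running this on each batch right before index.upsert turns a remote API error into a local message that names the offending IDs.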

Old code no longer works

The SDK changed substantially in 2024. pinecone.init() was removed.

New: Pinecone(api_key="...")
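A before/after sketch of the change. The old call appears only in comments; make_client is a hypothetical wrapper, and reading the key from a PINECONE_API_KEY environment variable is a convention assumed here rather than something this guide prescribes.

```python
import os

# Old style (before the 2024 SDK change), no longer works:
#   import pinecone
#   pinecone.init(api_key="...", environment="...")
#   index = pinecone.Index("your-index-name")


def make_client(api_key=None):
    """Build a client the new way: an explicit Pinecone object."""
    api_key = api_key or os.environ.get("PINECONE_API_KEY")
    if not api_key:
        raise RuntimeError("set PINECONE_API_KEY or pass api_key")
    # Imported lazily so the key check above runs even without the SDK installed.
    from pinecone import Pinecone

    return Pinecone(api_key=api_key)
```

The index handle then comes from the client object, as in index = make_client().Index("your-index-name"), rather than from module-level state.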

The environment parameter

Not needed for Serverless (removed). Used only for Pod-based indexes.

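LangChain integration

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

# Both classes read their keys from the environment:
# PINECONE_API_KEY and OPENAI_API_KEY.
# The embedding model's output dimension must match the index's Dimensions.
vectorstore = PineconeVectorStore(
    index_name="your-index",
    embedding=OpenAIEmbeddings()
)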
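Once the store exists, documents can be retrieved through LangChain's generic vector-store methods. This is a sketch that assumes vectorstore is the object created above; build_context is a hypothetical helper, and similarity_search is the standard LangChain VectorStore method it relies on.

```python
def build_context(vectorstore, question, k=3):
    """Retrieve the k most similar chunks and join them into one context string."""
    docs = vectorstore.similarity_search(question, k=k)
    # Each result is a LangChain Document; the text lives in page_content.
    return "\n\n".join(doc.page_content for doc in docs)
```

Texts would first be added with vectorstore.add_texts([...]); the string returned by build_context is what a RAG pipeline would place in the LLM prompt next to the question.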

Alternative services

Chroma: local, free, supports persistence
Weaviate: open source, self-hostable
Qdrant: open source, high performance
FAISS: from Meta, in-memory, lightweight


Tags: #Pinecone #VectorDB #RAG #Embedding #LLM