Gem is a conversational AI shopping agent that replaces the traditional search-and-scroll experience on Amazon with a natural dialogue. Users describe what they're looking for in their own words, and Gem asks clarifying follow-up questions, remembers their preferences across sessions, and presents curated product results inside the chat.
I designed and built Gem end-to-end as an exploration of agentic commerce, then transferred the full source code and intellectual property to a client who is using it for prospective customer demonstrations and product research.

What Gem Does
- Conversational Search: Users describe their needs in plain language instead of constructing keyword queries. Gem interprets intent, asks clarifying questions about budget and use case, and refines the search through dialogue.
- Visual Search: Users can upload an image and Gem finds visually similar products on Amazon.
- Advanced Filtering: Filter by price range, category, ratings, best-sellers, today's deals, and active discounts.
- Semantic Reranking: Raw product results from Amazon are passed through Cohere's rerank-v3.5 model, which reorders them by true semantic relevance to the user's query rather than keyword overlap alone.
- Persistent Personalization: Gem builds a long-term memory of each user's preferences, lifestyle context, and shopping patterns. Brand affinities, recurring gift occasions, and decision-making style are remembered across sessions; one-time queries and ephemeral preferences are not.

Architecture Highlights
A few design decisions worth calling out:
Two-tier memory model. Short-term context (the last few conversation turns) lives in Redis for fast recall. Long-term memory is curated by a dedicated memory manager LLM that decides what is worth persisting based on an explicit policy. The result is an agent that remembers the user as a person without polluting its memory with one-off queries.
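As a rough illustration of the two tiers, here is a minimal in-memory sketch. It makes several assumptions: the class and field names are hypothetical, a bounded deque stands in for Redis, and a simple rule on the fact's category stands in for the memory manager LLM and its policy. It only shows the shape of the idea, not Gem's actual implementation.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical categories a persistence policy might treat as durable.
DURABLE_KINDS = {"brand_affinity", "gift_occasion", "decision_style"}


@dataclass
class Fact:
    kind: str  # e.g. "brand_affinity" or "one_time_query" (assumed labels)
    text: str


class TwoTierMemory:
    def __init__(self, max_turns=6):
        # Short-term tier: bounded window, oldest turns drop off automatically.
        # (Stand-in for Redis in this sketch.)
        self.short_term = deque(maxlen=max_turns)
        # Long-term tier: only facts that pass the policy are kept.
        self.long_term = []

    def add_turn(self, role, content):
        self.short_term.append((role, content))

    def consider(self, fact):
        """Persist a candidate fact only if the (rule-based, assumed) policy allows it."""
        if fact.kind in DURABLE_KINDS and fact not in self.long_term:
            self.long_term.append(fact)
            return True
        return False
```

In this sketch, a call such as `consider(Fact("one_time_query", ...))` is rejected while a brand affinity is stored, which mirrors the split between durable preferences and one-off queries described above.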
Tool-driven product retrieval with semantic reranking. Instead of relying purely on keyword search or a vector database, Gem combines live Amazon results with semantic reranking. This gives the agent fresh inventory data while still ranking by what the user actually meant.
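The retrieve-then-rerank step can be sketched as follows. This is a simplified illustration under stated assumptions: `Product`, `rerank`, and `overlap_score` are hypothetical names, and the scorer is injected so the example runs offline. In Gem the scoring role is played by Cohere's rerank-v3.5; the toy scorer below is plain word overlap, is not semantic, and exists only to make the sketch runnable.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Product:
    title: str
    price: float


# (query, documents) -> one relevance score per document, higher is better.
ScoreFn = Callable[[str, Sequence[str]], Sequence[float]]


def rerank(query: str, products: Sequence[Product],
           score_fn: ScoreFn, top_n: int = 5) -> list[Product]:
    """Reorder freshly retrieved products by the scorer's relevance to the query."""
    docs = [p.title for p in products]
    scores = score_fn(query, docs)
    # sorted() is stable, so ties keep their original retrieval order.
    order = sorted(range(len(products)), key=lambda i: scores[i], reverse=True)
    return [products[i] for i in order[:top_n]]


def overlap_score(query: str, docs: Sequence[str]) -> list[float]:
    """Toy stand-in scorer: fraction of query words that appear in each document."""
    words = set(query.lower().split())
    return [len(words & set(d.lower().split())) / max(len(words), 1) for d in docs]
```

Keeping the scorer behind a plain callable means the live search results and the ranking model stay decoupled: a hosted reranker could be swapped in by wrapping its client in a function with the same shape.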
Streaming-first frontend. The chat interface is built around Server-Sent Events (SSE) streaming from the first event onward. Product cards, text responses, and tool execution feedback all flow through the same event pipeline, which keeps the UI responsive even during long agent runs.
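A single event pipeline of this kind might look like the sketch below. The event names ("tool", "text", "product", "done") and payload fields are assumptions made for illustration, not Gem's actual schema. The wire format itself (an `event:` line, a `data:` line, and a blank line) follows the standard Server-Sent Events framing.

```python
import json
from typing import Iterable, Iterator


def sse_event(event: str, payload: dict) -> str:
    """Encode one Server-Sent Event: event name, one JSON data line, blank line."""
    # json.dumps escapes newlines inside strings, so the data stays on one line.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_agent_run(steps: Iterable[tuple[str, dict]]) -> Iterator[str]:
    """Yield every kind of agent output through the same framing, then a final marker."""
    for kind, payload in steps:
        yield sse_event(kind, payload)
    # Hypothetical terminal event so the client knows the run has finished.
    yield sse_event("done", {})
```

Because text deltas, product cards, and tool status all share one framing, a client would need only one parser and could dispatch on the event name.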


