luplo.core.glossary

CRUD + query expansion for the glossary tables.

The glossary is luplo’s strict-first terminology layer. Terms are extracted from items, normalised, and grouped. Approved groups power the search pipeline’s query expansion — e.g. "vendor" expands to (vendor | shop | NPC벤더).

Three tables: glossary_groups, glossary_terms, glossary_rejections. No aggressive clustering — strict LLM matching only, with a human curation queue for anything uncertain.

Functions

create_glossary_group(→ luplo.core.models.GlossaryGroup)

Create a glossary group (a semantic unit with one canonical term).

get_glossary_group(...)

Fetch a glossary group by ID or hex prefix (≥8 chars).

list_glossary_groups(...)

List glossary groups, optionally filtering to those needing review.

create_glossary_term(→ luplo.core.models.GlossaryTerm)

Create a glossary term (a surface form belonging to a group).

list_pending_terms(→ list[luplo.core.models.GlossaryTerm])

List terms awaiting human curation.

approve_term(→ luplo.core.models.GlossaryTerm | None)

Approve a pending term into a group.

reject_term(→ luplo.core.models.GlossaryRejection | None)

Reject a term — the system will never re-propose this match.

merge_groups(→ luplo.core.models.GlossaryGroup | None)

Merge source group into target — move all terms, delete source.

split_term(→ luplo.core.models.GlossaryGroup | None)

Split a term out of its group into a new group.

create_glossary_group_with_canonical(...)

Create a glossary group AND its canonical surface term in one shot.

add_term_to_group(→ luplo.core.models.GlossaryTerm)

Add a new surface term to an existing group.

delete_glossary_term(→ bool)

Permanently remove a glossary term.

fetch_glossary_map(→ dict[str, list[str]])

Look up glossary aliases for words in a single project.

expand_query(→ str)

Expand a search query using the glossary (legacy, plain-word only).

Module Contents

async luplo.core.glossary.create_glossary_group(conn: psycopg.AsyncConnection[Any], *, project_id: str, canonical: str, definition: str | None = None, scope: str = 'project', scope_id: str | None = None, created_by: str | None = None, id: str | None = None) → luplo.core.models.GlossaryGroup

Create a glossary group (a semantic unit with one canonical term).

Parameters:
  • conn – Async psycopg connection.

  • project_id – Owning project.

  • canonical – The canonical surface form for this concept.

  • definition – Optional one-line definition.

  • scope – Scope level — "project" (default) or "system".

  • scope_id – System ID when scope is "system".

  • created_by – Actor who created this group.

  • id – Optional ID override.

Returns:

The new GlossaryGroup.

async luplo.core.glossary.get_glossary_group(conn: psycopg.AsyncConnection[Any], group_id: str, *, project_id: str | None = None) → luplo.core.models.GlossaryGroup | None

Fetch a glossary group by ID or hex prefix (≥8 chars).

Returns None when nothing matches; raises AmbiguousIdError when a prefix matches multiple groups. Pass project_id to scope prefix lookups.
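The lookup rules above can be modelled in memory. This is only a sketch, not the library's implementation: resolve_group_id is a hypothetical helper, AmbiguousIdError is redefined locally as a stand-in, and returning None for prefixes shorter than 8 characters is an assumption (the page does not say what happens in that case).

```python
from typing import List, Optional


class AmbiguousIdError(Exception):
    """Local stand-in for the error raised on ambiguous prefixes."""


def resolve_group_id(candidate: str, known_ids: List[str]) -> Optional[str]:
    """Resolve a full ID, or a prefix of at least 8 characters, to one ID."""
    if candidate in known_ids:
        return candidate  # an exact match always wins
    if len(candidate) < 8:
        return None  # assumption: short prefixes are simply not matched
    matches = [gid for gid in known_ids if gid.startswith(candidate)]
    if len(matches) > 1:
        raise AmbiguousIdError(f"{candidate!r} matches {len(matches)} groups")
    return matches[0] if matches else None
```

In the real function, passing project_id narrows the candidate set, which makes an ambiguous prefix less likely.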

async luplo.core.glossary.list_glossary_groups(conn: psycopg.AsyncConnection[Any], project_id: str, *, needs_review: bool = False, limit: int = 100, offset: int = 0) → list[luplo.core.models.GlossaryGroup]

List glossary groups, optionally filtering to those needing review.

needs_review=True returns groups that have pending terms.

async luplo.core.glossary.create_glossary_term(conn: psycopg.AsyncConnection[Any], *, group_id: str | None, surface: str, normalized: str, is_protected: bool = False, status: str = 'pending', source_item_id: str | None = None, context_snippet: str | None = None, id: str | None = None) → luplo.core.models.GlossaryTerm

Create a glossary term (a surface form belonging to a group).

async luplo.core.glossary.list_pending_terms(conn: psycopg.AsyncConnection[Any], project_id: str, *, limit: int = 50) → list[luplo.core.models.GlossaryTerm]

List terms awaiting human curation.

Includes both grouped pending terms (via group → project) and orphan pending terms (via source_item → project).
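The two ownership paths can be illustrated with plain dictionaries. The sketch below is an in-memory model, not the SQL the library runs; pending_for_project and the dict-based term records are hypothetical, and only the field names group_id, source_item_id, and status come from this page.

```python
from typing import Dict, List


def pending_for_project(
    terms: List[dict],
    group_project: Dict[str, str],
    item_project: Dict[str, str],
    project_id: str,
    limit: int = 50,
) -> List[dict]:
    """Return pending terms owned by project_id via either ownership path."""
    selected = []
    for term in terms:
        if term["status"] != "pending":
            continue
        if term["group_id"] is not None:
            # Grouped term: ownership comes from the group's project.
            owner = group_project.get(term["group_id"])
        else:
            # Orphan term: ownership comes from the source item's project.
            owner = item_project.get(term["source_item_id"])
        if owner == project_id:
            selected.append(term)
    return selected[:limit]
```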

async luplo.core.glossary.approve_term(conn: psycopg.AsyncConnection[Any], term_id: str, *, group_id: str, actor_id: str, as_canonical: bool = False) → luplo.core.models.GlossaryTerm | None

Approve a pending term into a group.

Parameters:
  • conn – Async psycopg connection.

  • term_id – The term to approve.

  • group_id – Target group.

  • actor_id – Who approved.

  • as_canonical – If True, set status to "canonical" (group should have at most one). Otherwise "alias".

Returns:

The updated term, or None if not found.

async luplo.core.glossary.reject_term(conn: psycopg.AsyncConnection[Any], term_id: str, *, actor_id: str, reason: str | None = None) → luplo.core.models.GlossaryRejection | None

Reject a term — the system will never re-propose this match.

Sets the term’s status to "rejected" and inserts a permanent record into glossary_rejections.

Returns:

The rejection record, or None if the term was not found.

async luplo.core.glossary.merge_groups(conn: psycopg.AsyncConnection[Any], source_group_id: str, target_group_id: str, *, actor_id: str) → luplo.core.models.GlossaryGroup | None

Merge source group into target — move all terms, delete source.

Returns:

The target GlossaryGroup after merge, or None if either group was not found.

async luplo.core.glossary.split_term(conn: psycopg.AsyncConnection[Any], term_id: str, *, new_canonical: str, actor_id: str) → luplo.core.models.GlossaryGroup | None

Split a term out of its group into a new group.

The term becomes the canonical member of the new group.

Parameters:
  • conn – Async psycopg connection.

  • term_id – The term to split out.

  • new_canonical – Canonical name for the new group.

  • actor_id – Who performed the split.

Returns:

The new GlossaryGroup, or None if the term was not found.

async luplo.core.glossary.create_glossary_group_with_canonical(conn: psycopg.AsyncConnection[Any], *, project_id: str, canonical: str, definition: str | None = None, created_by: str | None = None) → tuple[luplo.core.models.GlossaryGroup, luplo.core.models.GlossaryTerm]

Create a glossary group AND its canonical surface term in one shot.

The CLI lp glossary group create flow assumes a group always has a canonical term — separating the two creation steps would let users leave the system in a broken intermediate state.

Parameters:
  • conn – Async psycopg connection.

  • project_id – Owning project.

  • canonical – Canonical surface form. Becomes both the group’s canonical field and the surface of the seeded term.

  • definition – Optional one-line definition stored on the group.

  • created_by – Actor who created this.

Returns:

Tuple of (group, canonical_term).

async luplo.core.glossary.add_term_to_group(conn: psycopg.AsyncConnection[Any], group_id: str, *, surface: str, actor_id: str, as_canonical: bool = False) → luplo.core.models.GlossaryTerm

Add a new surface term to an existing group.

Default status is alias. When as_canonical is true the existing canonical (if any) is demoted to alias first — there is at most one canonical per group.

Parameters:
  • conn – Async psycopg connection.

  • group_id – Target group (full UUID or ≥8 hex prefix).

  • surface – New surface form. Stored verbatim; normalized is the lowercased copy.

  • actor_id – Who added this term.

  • as_canonical – Promote this term to canonical, demoting the current canonical to alias.

Returns:

The newly created GlossaryTerm.

Raises:

NotFoundError – If the group does not exist.
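The demote-then-insert behaviour can be sketched on a list of term records. This is an illustrative in-memory model under stated assumptions: add_term is a hypothetical helper, terms are plain dicts rather than GlossaryTerm objects, and no database or NotFoundError handling is involved.

```python
from typing import List


def add_term(terms: List[dict], surface: str, as_canonical: bool = False) -> List[dict]:
    """Return a new term list with `surface` added to the group."""
    updated = [dict(term) for term in terms]  # copy so the input is untouched
    if as_canonical:
        # Demote the current canonical first, so that the group never
        # holds more than one canonical term.
        for term in updated:
            if term["status"] == "canonical":
                term["status"] = "alias"
    updated.append(
        {
            "surface": surface,  # stored verbatim
            "normalized": surface.lower(),  # lowercased copy
            "status": "canonical" if as_canonical else "alias",
        }
    )
    return updated
```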

async luplo.core.glossary.delete_glossary_term(conn: psycopg.AsyncConnection[Any], term_id: str, *, actor_id: str) → bool

Permanently remove a glossary term.

Cascade rule (option B): removing the last canonical/alias term in a group deletes the group as well, including any rejection records and any leftover pending/rejected terms in the same group. Removing the canonical while aliases still exist is refused — promote one alias to canonical first, or remove the aliases.

Parameters:
  • conn – Async psycopg connection.

  • term_id – Term ID or hex prefix (≥8 chars).

  • actor_id – Who is removing the term (audit trail).

Returns:

True if a term was removed, False if the term was not found.

Raises:

GlossaryGroupHasActiveTermsError – When removing the canonical would leave aliases without a canonical anchor.
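The cascade rule reduces to a small decision over term statuses. The sketch below is a model of that decision only, not the library's code: plan_term_delete and its return labels are hypothetical, the exception is redefined locally as a stand-in, and treating the removal of a pending or rejected term as a plain term delete is an assumption.

```python
from typing import List


class GlossaryGroupHasActiveTermsError(Exception):
    """Local stand-in for the library's exception of the same name."""


ACTIVE_STATUSES = {"canonical", "alias"}


def plan_term_delete(target_status: str, sibling_statuses: List[str]) -> str:
    """Decide what removing one term does, given the other terms in its group."""
    active_siblings = [s for s in sibling_statuses if s in ACTIVE_STATUSES]
    if target_status == "canonical" and "alias" in active_siblings:
        # Aliases would be left without a canonical anchor.
        raise GlossaryGroupHasActiveTermsError(
            "promote an alias to canonical or remove the aliases first"
        )
    if target_status in ACTIVE_STATUSES and not active_siblings:
        # Last canonical/alias term: the group goes too, together with
        # any leftover pending or rejected terms.
        return "delete_group"
    return "delete_term"
```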

async luplo.core.glossary.fetch_glossary_map(conn: psycopg.AsyncConnection[Any], words: list[str], project_id: str) → dict[str, list[str]]

Look up glossary aliases for words in a single project.

Parameters:
  • conn – Async psycopg connection.

  • words – Lowercased words to look up.

  • project_id – Project scope for glossary lookup.

Returns:

Mapping lowercased_word → [surface1, surface2, ...]. Words with no glossary hit are absent from the mapping. Each value contains every approved surface in the matching group, including the lookup word itself.
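The shape of the returned mapping can be shown with an in-memory stand-in. This sketch assumes each group is just a list of its approved surfaces and that matching compares lowercased forms; build_glossary_map is a hypothetical helper, and the real function queries the glossary tables instead.

```python
from typing import Dict, List


def build_glossary_map(words: List[str], groups: List[List[str]]) -> Dict[str, List[str]]:
    """Map each lowercased word to every approved surface of its group."""
    result: Dict[str, List[str]] = {}
    for word in words:
        for surfaces in groups:
            if word in {surface.lower() for surface in surfaces}:
                result[word] = list(surfaces)
                break  # first matching group wins in this sketch
    return result


groups = [["vendor", "shop", "NPC벤더"]]
mapping = build_glossary_map(["vendor", "budget"], groups)
# "budget" has no glossary hit, so it is absent rather than mapped to [].
```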

async luplo.core.glossary.expand_query(conn: psycopg.AsyncConnection[Any], query: str, project_id: str) → str

Expand a search query using the glossary (legacy, plain-word only).

Each whitespace-delimited word in query is looked up; approved aliases are OR’d, groups are AND’d. This helper predates the web-search-style query dialect and does not understand phrases, negations, or OR keywords — pipeline callers should parse with luplo.core.search.tsquery.parse_user_query() and consult fetch_glossary_map() directly. Kept for backwards compatibility with any external caller.
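The OR-within-group, AND-between-words rule can be sketched without a database. This is a minimal illustration, not the library's code: expand_with_map is a hypothetical helper that takes an already-fetched mapping shaped like the return value of fetch_glossary_map(), and lowercasing each word before lookup is an assumption.

```python
from typing import Dict, List


def expand_with_map(query: str, glossary_map: Dict[str, List[str]]) -> str:
    """Expand plain words: aliases are OR'd, separate words are AND'd."""
    parts = []
    for word in query.split():
        surfaces = glossary_map.get(word.lower())
        if surfaces:
            # All approved surfaces of the group become one OR clause.
            parts.append("(" + " | ".join(surfaces) + ")")
        else:
            # No glossary hit: the word passes through unchanged.
            parts.append(word)
    return " & ".join(parts)


expanded = expand_with_map("vendor budget", {"vendor": ["vendor", "shop", "NPC벤더"]})
# expanded == "(vendor | shop | NPC벤더) & budget"
```

Like the legacy helper it models, this treats phrases, negations, and OR keywords as ordinary words.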

Example:

>>> await expand_query(conn, "vendor budget", "proj-1")
"(vendor | shop | NPC벤더) & budget"