luplo.core.import_pipeline.sources

Construct SourceFile records from caller-supplied path + content.

Files are never read by this module. The caller (lp import begin CLI, luplo_import_begin MCP wrapper, or the /lp-import slash command) reads markdown content client-side and passes it inline. This keeps the import pipeline filesystem-free on the server, which is what allows it to run on the multi-tenant cloud MCP where the server has no access to the user’s working tree.

The path is treated as a string identifier — used for display and for the context.source_paths audit field — and is never resolved or opened. Dedup is keyed off the sorted set of content hashes (context.content_hash_set), not paths, so the same content imported under different paths or from different working directories still collapses to a single bundle.

Functions

make_source_file(...)

Build a SourceFile from caller-provided path + UTF-8 content.

content_hash_set(...) → tuple[str, ...]

Return the canonical dedup key for a sources bundle.

Module Contents

luplo.core.import_pipeline.sources.make_source_file(*, path: str, content: str) → luplo.core.import_pipeline.manifest.SourceFile

Build a SourceFile from caller-provided path + UTF-8 content.

The hash is computed server-side from content so a malicious caller cannot pre-fabricate a hash to bypass dedup.

Parameters:
  • path – Caller-supplied identifier (filesystem path, URL, etc.). Stored verbatim — the server does not parse or resolve it.

  • content – UTF-8 markdown content. Hashed (sha256) to populate content_hash.

Returns:

A frozen SourceFile with path, content_hash, and raw_markdown populated.
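The behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual source: the SourceFile dataclass here is a hypothetical stand-in for luplo.core.import_pipeline.manifest.SourceFile, assumed to carry only the three fields named in the Returns note.

```python
import hashlib
from dataclasses import dataclass


# Hypothetical stand-in for manifest.SourceFile (assumption: three fields only).
@dataclass(frozen=True)
class SourceFile:
    path: str
    content_hash: str
    raw_markdown: str


def make_source_file(*, path: str, content: str) -> SourceFile:
    # The hash is derived from the content here, never accepted from the caller.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    # The path is stored verbatim; it is not resolved or opened.
    return SourceFile(path=path, content_hash=digest, raw_markdown=content)


# Same content under two different (illustrative) paths.
a = make_source_file(path="docs/notes.md", content="# Notes\n")
b = make_source_file(path="/tmp/other/notes.md", content="# Notes\n")
```

Because the hash depends only on content, a and b carry the same content_hash even though their paths differ.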

luplo.core.import_pipeline.sources.content_hash_set(sources: list[luplo.core.import_pipeline.manifest.SourceFile]) → tuple[str, ...]

Return the canonical dedup key for a sources bundle.

The key is the sorted tuple of sha256 hashes, so it is independent of both source order and path. Two callers passing the same content under different filenames or from different working directories produce identical keys, so the dedup lookup (find_existing_import_wu) matches.

Parameters:

sources – Non-empty list of source files; order is irrelevant.

Returns:

Sorted tuple of content hash hex digests.

Raises:

ValueError – When sources is empty.
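A minimal sketch of the documented contract, under stated assumptions: SourceFile and the _sf helper are hypothetical stand-ins, and the sketch does not collapse duplicate hashes because the reference above does not say whether the real function does.

```python
import hashlib
from dataclasses import dataclass


# Hypothetical stand-in for manifest.SourceFile (assumption).
@dataclass(frozen=True)
class SourceFile:
    path: str
    content_hash: str
    raw_markdown: str


def _sf(path: str, content: str) -> SourceFile:
    # Illustrative helper: hash the content with sha256, keep the path verbatim.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return SourceFile(path=path, content_hash=digest, raw_markdown=content)


def content_hash_set(sources: list[SourceFile]) -> tuple[str, ...]:
    if not sources:
        raise ValueError("sources must be non-empty")
    # Sorting makes the key independent of input order; paths are ignored.
    return tuple(sorted(s.content_hash for s in sources))


# Same two documents, different order and different (illustrative) paths.
key_a = content_hash_set([_sf("a.md", "alpha"), _sf("b.md", "beta")])
key_b = content_hash_set([_sf("/work/y.md", "beta"), _sf("/work/x.md", "alpha")])
```

Here key_a and key_b are equal, which is the property that lets two imports of the same content resolve to a single bundle.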