# Core Concepts

> Understand the core Bulkgrid concepts and how they relate at a high level.

Bulkgrid is easier to adopt once the core concepts are clear. Customers usually care about five things:

* what content is being ingested
* how that content is grouped
* how processing is tracked
* what output is produced
* how those outputs are consumed later

## Core objects

<CardGroup cols={3}>
  <Card title="Sources" icon="database" href="#sources">
    The origin of content Bulkgrid processes.
  </Card>

  <Card title="Collections" icon="layers" href="#collections">
    Retrieval and access boundaries for grouped content.
  </Card>

  <Card title="Runs" icon="play" href="#runs">
    The top-level record for asynchronous work.
  </Card>

  <Card title="Results" icon="file-text" href="#results">
    Per-item outputs produced by a run.
  </Card>
</CardGroup>

## Sources

A source is the origin of the content that Bulkgrid processes.

In practical terms, a source is usually one of these:

* a public website or site section
* a known list of URLs
* a starting URL for a deep crawl
* a document discovered during processing

Customers usually think about sources in terms of scope and trust:

* which domains should be included
* which paths should be excluded
* which source types are allowed in a given workflow
* whether the source is stable enough for production retrieval
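The scope questions above can be expressed as a small allow/deny check. The following is a minimal sketch, not a Bulkgrid API: `SourceScope`, its field names, and the example domains are all hypothetical and only illustrate how domain inclusion and path exclusion might combine.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class SourceScope:
    """Hypothetical scope rules for one source (not a Bulkgrid type)."""

    include_domains: set[str]
    exclude_path_prefixes: list[str] = field(default_factory=list)

    def allows(self, url: str) -> bool:
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        # Reject anything outside the included domains.
        if host not in self.include_domains:
            return False
        # Reject paths that start with an excluded prefix.
        path = parsed.path or "/"
        return not any(path.startswith(p) for p in self.exclude_path_prefixes)


scope = SourceScope(
    include_domains={"docs.example.com"},
    exclude_path_prefixes=["/archive/", "/drafts/"],
)
```

With these assumed rules, `scope.allows("https://docs.example.com/guide/intro")` is true, while URLs under `/archive/` or on any other domain are rejected.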

## Collections

A collection is the boundary used to group content for retrieval and access control.

Collections matter because most workspaces do not want one undifferentiated search corpus. They want to separate knowledge by product, workflow, audience, or trust level.

Typical collection patterns:

* public documentation
* internal operations knowledge
* support content
* product-specific content domains
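To illustrate why a collection acts as a retrieval boundary, here is a small in-memory sketch. It assumes a hypothetical `Document` record tagged with a collection name and a naive substring search; it does not reflect how Bulkgrid actually stores or searches content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    url: str
    text: str
    collection: str  # hypothetical collection name


def search(docs, query, allowed_collections):
    """Naive substring search restricted to the caller's collections."""
    q = query.lower()
    return [
        d for d in docs
        if d.collection in allowed_collections and q in d.text.lower()
    ]


docs = [
    Document("https://example.com/help/refunds", "How refunds work", "support"),
    Document("https://example.com/ops/refunds", "Refund escalation runbook", "internal-ops"),
]

# A support-facing caller only sees the support collection.
support_hits = search(docs, "refund", {"support"})
```

The same query returns different documents depending on which collections the caller is allowed to search, which is the separation by audience or trust level described above.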

## Runs

A run is the top-level record for asynchronous work.

Runs are created for workflows such as:

* extraction
* crawl
* deep crawl
* other run-based API operations

Each run tracks operational state such as:

* status
* timestamps
* URL scope
* progress counters
* error fields
* retry state
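One way to picture this operational state is as a single record. The sketch below is an assumption-laden model: the `Run` class, its field names, and the `RunStatus` values are hypothetical and are not taken from the Bulkgrid API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class RunStatus(Enum):
    # Assumed status names, for illustration only.
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Run:
    id: str
    status: RunStatus
    created_at: datetime
    urls: list[str]                 # URL scope
    processed: int = 0              # progress counter
    failed: int = 0                 # progress counter
    error: Optional[str] = None     # run-level error field
    retry_count: int = 0            # retry state

    @property
    def is_terminal(self) -> bool:
        return self.status in (RunStatus.COMPLETED, RunStatus.FAILED)

    @property
    def progress(self) -> float:
        total = len(self.urls)
        return (self.processed + self.failed) / total if total else 0.0


run = Run(
    id="run_1",
    status=RunStatus.RUNNING,
    created_at=datetime.now(timezone.utc),
    urls=["https://example.com/a", "https://example.com/b",
          "https://example.com/c", "https://example.com/d"],
    processed=1,
    failed=1,
)
```

A client that polls a run would typically stop once something like `is_terminal` becomes true and use the counters to report progress in the meantime.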

## Results

Results are the per-item outputs of a run.

A single run can produce many results. A result usually represents one processed page, document, or item-level output.

Results can include:

* URL and title
* status code
* extraction output
* generated content references
* screenshot-related data
* error information for that item
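Because errors are reported per item, a run can finish with a mix of usable and failed results. The following sketch separates the two; the `Result` class and its field names are hypothetical and only mirror the list above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Result:
    url: str
    title: Optional[str] = None
    status_code: Optional[int] = None
    extraction: Optional[dict[str, Any]] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        # Treat a result as usable only if it has a 2xx status and no error.
        return (
            self.error is None
            and self.status_code is not None
            and 200 <= self.status_code < 300
        )


def partition(results):
    """Split results into (usable, failed) lists, preserving order."""
    good = [r for r in results if r.ok]
    bad = [r for r in results if not r.ok]
    return good, bad


results = [
    Result("https://example.com/a", title="A", status_code=200, extraction={"h1": "A"}),
    Result("https://example.com/b", status_code=404, error="not found"),
    Result("https://example.com/c", error="timeout"),
]
good, bad = partition(results)
```

Handling failures at the item level means one bad page does not have to invalidate the rest of the run's output.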

## How the concepts fit together

```mermaid
flowchart LR
  Source["Source"] --> Collection["Collection"]
  Source --> Run["Run"]
  Collection --> Search["Search Scope"]
  Run --> Result["Result"]
```

## Practical rule

Customers should think about the model in this order:

1. define the source boundary
2. decide which collection the content belongs to
3. create the run
4. monitor the run and its results
5. consume only the result outputs your application actually needs
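The five steps above can be sketched end to end. This is a toy walkthrough under stated assumptions: `FakeClient`, `create_run`, and the dictionary shapes are stand-ins invented for illustration and are not the Bulkgrid SDK or API.

```python
from urllib.parse import urlparse


class FakeClient:
    """Stand-in client; a real run would be asynchronous."""

    def create_run(self, urls, collection):
        # This stub completes immediately and fabricates one result per URL.
        return {
            "status": "completed",
            "collection": collection,
            "results": [
                {"url": u, "title": f"Page {i}", "status_code": 200, "screenshot": None}
                for i, u in enumerate(urls, start=1)
            ],
        }


def ingest(client, candidate_urls, allowed_domain, collection):
    # 1. Define the source boundary.
    urls = [u for u in candidate_urls if urlparse(u).hostname == allowed_domain]
    # 2-3. Decide the collection and create the run.
    run = client.create_run(urls, collection)
    # 4. Monitor: a real client would poll until the run reaches a terminal status.
    if run["status"] != "completed":
        raise RuntimeError(f"run ended with status {run['status']}")
    # 5. Keep only the result fields the application actually needs.
    return [{"url": r["url"], "title": r["title"]} for r in run["results"]]


pages = ingest(
    FakeClient(),
    ["https://docs.example.com/a", "https://evil.example.net/b"],
    allowed_domain="docs.example.com",
    collection="public-docs",
)
```

Here the out-of-scope URL is dropped before the run is created, and the application keeps only `url` and `title` from each result.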
