
warehouse_read

Public callable

Read a table from a Microsoft Fabric warehouse.

This uses Fabric Spark's synapsesql connector to read from a warehouse configured in the framework CONFIG mapping. In Source → Unified → Product workflows, this is commonly used when curated inputs are stored in Fabric Warehouse instead of Lakehouse tables.

Parameters:

- env (str, required): Environment name in the config mapping, for example "Sandbox" or "DE".
- target (str, required): Warehouse target name under the selected environment, for example "Warehouse" or "wh_Bronze".
- schema (str, required): Warehouse schema name, for example "dbo".
- table (str, required): Warehouse table name.
- config (dict, default None): Config mapping from the config notebook. Expected shape: config[environment][target] = Housepath(...).
- spark_session (object, default None): Spark session to use. If omitted, the helper uses the notebook global spark.
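The expected config shape can be illustrated with a minimal sketch. `Housepath` here is a stand-in (its real constructor may differ); the field names `workspace_id`, `house_id`, and `house_name` are the ones the function actually reads from the config entry.

```python
from collections import namedtuple

# Hypothetical stand-in for the framework's Housepath object; only the field
# names used by warehouse_read are assumed here.
Housepath = namedtuple("Housepath", ["workspace_id", "house_id", "house_name"])

# Shape: config[environment][target] = Housepath(...)
CONFIG = {
    "Sandbox": {
        "wh_Bronze": Housepath(
            workspace_id="00000000-0000-0000-0000-000000000000",
            house_id="11111111-1111-1111-1111-111111111111",
            house_name="wh_Bronze",
        ),
    },
}

# warehouse_read resolves the entry via env and target keys:
path = CONFIG["Sandbox"]["wh_Bronze"]
```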

Returns:

- pyspark.sql.DataFrame: Spark DataFrame loaded from the Fabric warehouse table.

Raises:

- RuntimeError: If the Microsoft Fabric Spark connector is unavailable.
- ValueError: If the selected environment or target is missing from the config.
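The ValueError path comes from the config lookup. A minimal sketch of that validation (the real `get_path` helper may differ; `lookup_target` is a hypothetical name):

```python
def lookup_target(env, target, config):
    """Sketch of the config validation behind the ValueError above:
    reject an unknown environment or an unknown target under it."""
    if config is None or env not in config:
        raise ValueError(f"Unknown environment: {env!r}")
    if target not in config[env]:
        raise ValueError(f"Unknown target {target!r} under environment {env!r}")
    return config[env][target]
```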

Examples:

>>> df = warehouse_read(
...     env="EDLH",
...     target="wh_Bronze",
...     schema="dbo",
...     table="Customer",
...     config=CONFIG,
... )
Source code in src/fabricops_kit/fabric_io.py
def warehouse_read(env, target, schema, table, config=None, spark_session=None):
    """Read a table from a Microsoft Fabric warehouse.

    This uses Fabric Spark's `synapsesql` connector to read from a warehouse
    configured in the framework `CONFIG` mapping. In Source → Unified →
    Product workflows, this is commonly used when curated inputs are stored in
    Fabric Warehouse instead of Lakehouse tables.

    Parameters
    ----------
    env : str
        Environment name in the config mapping, for example `"Sandbox"` or `"DE"`.
    target : str
        Warehouse target name under the selected environment, for example
        `"Warehouse"` or `"wh_Bronze"`.
    schema : str
        Warehouse schema name, for example `"dbo"`.
    table : str
        Warehouse table name.
    config : dict, optional
        Config mapping from the config notebook. Expected shape:
        `config[environment][target] = Housepath(...)`.
    spark_session : object, optional
        Spark session to use. If omitted, the helper uses the notebook global
        `spark`.

    Returns
    -------
    pyspark.sql.DataFrame
        Spark DataFrame loaded from the Fabric warehouse table.

    Raises
    ------
    RuntimeError
        If the Microsoft Fabric Spark connector is unavailable.
    ValueError
        If the selected environment or target is missing from the config.

    Examples
    --------
    >>> df = warehouse_read(
    ...     env="EDLH",
    ...     target="wh_Bronze",
    ...     schema="dbo",
    ...     table="Customer",
    ...     config=CONFIG,
    ... )
    """
    spark_obj = _get_spark(spark_session)
    p = get_path(env, target, config=config)

    try:
        import com.microsoft.spark.fabric
        from com.microsoft.spark.fabric.Constants import Constants
    except Exception as exc:
        raise RuntimeError(
            "This function must run inside Microsoft Fabric Spark with "
            "com.microsoft.spark.fabric available."
        ) from exc

    return (
        spark_obj.read.option(Constants.WorkspaceId, p.workspace_id)
        .option(Constants.DatawarehouseId, p.house_id)
        .synapsesql(f"{p.house_name}.{schema}.{table}")
    )