lakehouse_csv_read

Public callable

Read a CSV file from a Fabric lakehouse Files path.

This reads from the lakehouse Files/ area using the ABFSS root stored in a Housepath. In the Source step, use it for raw file ingestion before standardisation or conversion to Delta tables.
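The full ABFSS read path is produced by joining the Housepath root and the relative path with exactly one `/` between them. A minimal sketch of that joining logic, using a hypothetical workspace and lakehouse name:

```python
# Hypothetical ABFSS root, as stored on a Housepath object.
root = "abfss://Sandbox@onelake.dfs.fabric.microsoft.com/Source.Lakehouse"
relative_path = "/Files/raw/orders.csv"

# Strip the seam slashes so a leading "/" on relative_path (or a trailing
# "/" on root) does not produce a double slash in the final path.
path = f"{root.rstrip('/')}/{relative_path.lstrip('/')}"
print(path)
# → abfss://Sandbox@onelake.dfs.fabric.microsoft.com/Source.Lakehouse/Files/raw/orders.csv
```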

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `lh` | `Housepath` | Lakehouse path object returned by `get_path`. | *required* |
| `relative_path` | `str` | Path to the CSV file or folder under the lakehouse root, for example `"Files/raw/orders.csv"` or `"Files/raw/orders/"`. | *required* |
| `spark_session` | `object` | Spark session to use. If omitted, the helper uses the notebook global `spark`. | `None` |
| `header` | `bool` | Whether the first row of the CSV file contains column names. | `True` |

Returns:

| Type | Description |
| --- | --- |
| `DataFrame` | Spark DataFrame loaded from the CSV path. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `lh.root` or `relative_path` is missing. |
| `RuntimeError` | If no Spark session is available. |

Examples:

>>> lh_source = get_path("Sandbox", "Source", config=CONFIG)
>>> df = lakehouse_csv_read(lh_source, "Files/raw/orders.csv")
Source code in src/fabricops_kit/fabric_io.py
def lakehouse_csv_read(lh, relative_path, spark_session=None, header=True):
    """Read a CSV file from a Fabric lakehouse Files path.

    This reads from the lakehouse `Files/` area using the ABFSS root stored in
    a `Housepath`. In the Source step, use it for raw file ingestion before
    standardisation or conversion to Delta tables.

    Parameters
    ----------
    lh : Housepath
        Lakehouse path object returned by `get_path`.
    relative_path : str
        Path to the CSV file or folder under the lakehouse root, for example
        `"Files/raw/orders.csv"` or `"Files/raw/orders/"`.
    spark_session : object, optional
        Spark session to use. If omitted, the helper uses the notebook global
        `spark`.
    header : bool, default True
        Whether the first row of the CSV file contains column names.

    Returns
    -------
    pyspark.sql.DataFrame
        Spark DataFrame loaded from the CSV path.

    Raises
    ------
    ValueError
        If `lh.root` or `relative_path` is missing.
    RuntimeError
        If no Spark session is available.

    Examples
    --------
    >>> lh_source = get_path("Sandbox", "Source", config=CONFIG)
    >>> df = lakehouse_csv_read(lh_source, "Files/raw/orders.csv")
    """
    if not getattr(lh, "root", None):
        raise ValueError("lh.root is required.")
    if not relative_path:
        raise ValueError("relative_path is required.")

    spark_obj = _get_spark(spark_session)
    path = f"{lh.root.rstrip('/')}/{relative_path.lstrip('/')}"
    return spark_obj.read.option("header", header).csv(path)