lakehouse_excel_read_as_spark
Read an Excel file from a Fabric lakehouse Files path.
Spark does not natively read Excel files. This helper reads the Excel file as binary from the lakehouse, writes it to a temporary local file, loads it with pandas, then converts it into a Spark DataFrame.
This is intended for small reference files, mapping tables, and manually maintained business inputs. Large source datasets should be stored as Delta, Parquet, or CSV instead.
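The binary-to-temp-file-to-pandas conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's actual implementation: the function names here are hypothetical, and the real helper in `fabric_io.py` additionally resolves the lakehouse path and falls back to the notebook's Spark session.

```python
import os
import tempfile


def bytes_to_local_tmp(data: bytes, suffix: str = ".xlsx") -> str:
    """Write binary content to a temporary local file and return its path.

    Spark cannot parse Excel directly, so the workbook bytes read from
    the lakehouse are first materialized on local disk for pandas.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        return tmp.name


def excel_bytes_to_spark_df(data: bytes, spark_session, sheet_name=0):
    """Convert raw Excel bytes into a Spark DataFrame via pandas."""
    # pandas is imported lazily so the temp-file step above remains
    # usable in environments without it installed.
    import pandas as pd

    local_path = bytes_to_local_tmp(data)
    try:
        pdf = pd.read_excel(local_path, sheet_name=sheet_name)
        # pandas materializes every row in driver memory before this
        # call, which is why the helper targets small reference files.
        return spark_session.createDataFrame(pdf)
    finally:
        os.unlink(local_path)  # remove the temporary file
```

The lazy pandas import and the `finally` cleanup mirror the side effects listed in the Notes section: a temporary local file exists only for the duration of the conversion.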
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `lh` | `LakehousePath` | Lakehouse path object returned by `get_path`. | required |
| `relative_path` | `str` | Path to the Excel file under the lakehouse root, for example `"Files/reference/faculty_mapping.xlsx"`. | required |
| `sheet_name` | `str` or `int` | Worksheet name or index to read. Defaults to the first worksheet. | `0` |
| `spark_session` | `object` | Spark session to use. If omitted, the helper uses the notebook's global `spark` session. | `None` |
Returns:

| Type | Description |
|---|---|
| `DataFrame` | Spark DataFrame converted from the selected Excel worksheet. |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If … |
| `FileNotFoundError` | If the Excel file cannot be found at the resolved lakehouse path. |
| `RuntimeError` | If no Spark session is available. |
Examples:
>>> lh_source = get_path("Sandbox", "Source", config=CONFIG)
>>> df_mapping = lakehouse_excel_read_as_spark(
... lh_source,
... "Files/reference/faculty_mapping.xlsx",
... sheet_name="Mapping",
... )
Notes:
Side effects:
- Creates a temporary local file during conversion.
- Materializes rows through pandas before creating a Spark DataFrame.
Source code in src/fabricops_kit/fabric_io.py