TidyPanel is an industrial-grade parser designed to
extract clean, standardized data frames from heavily malformed,
human-readable Excel reports. If you have ever struggled to parse
financial statements, ERP exports, or complex tables with N-dimensional
headers, decoy rows, and embedded subtotals, TidyPanel is
built for you.
return_audit parameter.The core function is read_messy_panel(). The following
example creates a minimal “messy” Excel file in a temporary location and
demonstrates the parsing pipeline.
library(TidyPanel)
# Create a minimal example with financial multiplier notation
tmp <- tempfile(fileext = ".xlsx")
df_raw <- data.frame(
Category = c("Revenue", "Cost"),
`FY2022` = c("1.5M", "800k"),
`FY2023` = c("2.0M", "950k"),
check.names = FALSE
)
writexl::write_xlsx(df_raw, tmp)
# Parse and auto-pivot the wide table to long format
result <- read_messy_panel(tmp, auto_pivot = TRUE)
head(result)
#> category time_period value
#> 1 Revenue fy2022 1500000
#> 2 Cost fy2022 800000
#> 3 Revenue fy2023 2000000
#> 4 Cost fy2023 950000
unlink(tmp)By default, TidyPanel standardizes variable names using the
underlying clean_variable_names() engine. This function
converts text to snake_case, removes special currency/percentage
symbols, and maps common synonyms (like gvkey or
permno) to standard terms (id).
One of TidyPanel’s unique features is its ability to understand
structure. Financial tables often use indentation to group rows.
TidyPanel extracts this structure and places it in a new column called
parent_category.
Additionally, data parsing should never be a “black box”. TidyPanel allows you to generate an Audit Log that explicitly records every transformation applied to your data.
# Create a table with a mid-table subtotal row (to be amputated)
tmp2 <- tempfile(fileext = ".xlsx")
df_with_subtotal <- data.frame(
Item = c("Product A", "Product B", "Total", "Product C"),
Revenue = c("100", "200", "300", "150")
)
writexl::write_xlsx(df_with_subtotal, tmp2)
# Parse with audit log enabled
result <- read_messy_panel(tmp2, return_audit = TRUE)
# View the cleaned data (Total row should be amputated)
head(result$data)
#> item revenue
#> 1 Product A 100
#> 2 Product B 200
# View what the engine did
print(result$audit)
#> Operation Count
#> 1 Ghost Bottom Rows Dropped 2
unlink(tmp2)Example Audit Log Output:
| Operation | Count |
|---|---|
| Mid-Table Subtotals Amputated | 1 |
This table guarantees 100% transparency. You can save this log alongside your processed data to maintain a strict audit trail of how the messy raw data was restructured.