ImportError: Missing optional dependency 'pyarrow'
$ python - <<'PY'
import pandas as pd
pd.read_parquet('file.parquet')
PY
Traceback (most recent call last):
  File "<string>", line 2, in <module>
ImportError: Missing optional dependency 'pyarrow'
Why this happens
Pandas delegates reading/writing certain file formats (like Parquet) to optional libraries such as pyarrow or fastparquet. If those packages are not installed, the pandas I/O helper will raise an ImportError asking you to install the required dependency.
Fix
- Install the required package:
python -m pip install pyarrow
or
python -m pip install fastparquet
- Or explicitly select an engine (after installing it):
pd.read_parquet('file.parquet', engine='pyarrow')
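The engine choice above can also be made programmatically. The following is a minimal sketch, not part of pandas: pick_parquet_engine and read_parquet_explicit are hypothetical helper names, and the preference order (pyarrow first, then fastparquet) is an assumption for illustration. It checks which engine is importable and passes that name to read_parquet explicitly.

```python
import importlib.util


def pick_parquet_engine(preferred=("pyarrow", "fastparquet")):
    """Return the first importable module name from `preferred`.

    Hypothetical helper: find_spec only checks that a top-level module
    can be located; it does not import it.
    """
    for name in preferred:
        if importlib.util.find_spec(name) is not None:
            return name
    raise ImportError(
        "No Parquet engine found; install one with: "
        "python -m pip install pyarrow"
    )


def read_parquet_explicit(path):
    """Read a Parquet file, naming the engine instead of relying on the default."""
    import pandas as pd  # imported lazily so the helper above works without pandas

    return pd.read_parquet(path, engine=pick_parquet_engine())
```

Calling read_parquet_explicit('file.parquet') then either reads the file with whichever engine is present or fails early with an install hint.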
Wrong code
import pandas as pd
# pyarrow not installed
pd.read_parquet('data.parquet') # raises ImportError
Fixed code
# Install the required engine first (run this in a shell, not in Python):
#   python -m pip install pyarrow
# After installing, use read_parquet as intended
import pandas as pd
df = pd.read_parquet('data.parquet')
Notes:
- In conda environments, prefer installing from conda-forge over pip so the binaries stay compatible with the rest of the environment:
conda install -c conda-forge pyarrow
- If you need to reduce dependencies, use CSV/JSON instead of Parquet where practical.
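The last note can be sketched as a fallback writer. This is an illustrative assumption rather than a pandas feature: save_table and has_parquet_engine are hypothetical names, and df is expected to be a pandas DataFrame (only its to_parquet and to_csv methods are used). The function writes Parquet when an engine is importable and CSV otherwise.

```python
import importlib.util
from pathlib import Path


def has_parquet_engine():
    """True if pyarrow or fastparquet can be located in this environment."""
    return any(
        importlib.util.find_spec(name) is not None
        for name in ("pyarrow", "fastparquet")
    )


def save_table(df, stem):
    """Write `df` to '<stem>.parquet' if possible, else to '<stem>.csv'.

    Returns the path that was written. Hypothetical helper for illustration.
    """
    if has_parquet_engine():
        path = Path(f"{stem}.parquet")
        df.to_parquet(path)
    else:
        path = Path(f"{stem}.csv")
        df.to_csv(path, index=False)  # drop the index to keep the CSV compact
    return path
```

Note that CSV does not preserve column dtypes the way Parquet does, so a reader of the fallback file may need to re-parse dates or categoricals.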