Skip to content

Foof.group peak_params documentation doesn't mention additional column that is used as a index #192

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
danieltomasz opened this issue Feb 12, 2021 · 6 comments

Comments

@danieltomasz
Copy link

danieltomasz commented Feb 12, 2021

According to the Sphinx documentation on the fooof page (here and here) the code below should work

from fooof import FOOOFGroup
import pandas as pd
from fooof.sim.gen import gen_freqs, gen_group_power_spectra
n_spectra = 2
freq_range = [3, 40]
ap_params = [[0.5, 1], [1, 1.5]]
pe_params = [[10, 0.4, 1], [10, 0.2, 1, 22, 0.1, 3]]
nlv = 0.02
# Simulate a group of power spectra
freqs, powers, sim_params = gen_group_power_spectra(n_spectra, freq_range, ap_params,
                                            pe_params, nlv, return_params=True)
fg = FOOOFGroup(peak_width_limits=[1, 8],min_peak_height=0.05, max_n_peaks=6)
fg.fit(freqs, powers, freq_range=[3, 40],n_jobs=-1, progress='tqdm')
peaks = fg.get_params('peak_params') # prepare peaks dataframe
peaks_df = pd.DataFrame(peaks)
peaks_df.columns = ['CF', 'PW', 'BW']

Code that doesn't work.

ValueError: Length mismatch: Expected axis has 4 elements, new values have 3 elements

Documentation doesn't mention the additional column used as an index.
I was able to run it by changing it after reading this comment in code documentation

 peaks = fg.get_params('peak_params') # prepare peaks dataframe
 peaks_df = pd.DataFrame(peaks)
 peaks_df.columns = ['CF', 'PW', 'BW', 'index']
 peaks_df['index'] = peaks_df['index'].astype(int)

It was probably reworked in fooof 1.0 since my earlier code using the previous version doesn't have that problem (the 1st version worked). And why the index column is not an int?

@danieltomasz danieltomasz changed the title Foof.group peak_params documentation doesn't mention additinal column that is used as a index Foof.group peak_params documentation doesn't mention additional column that is used as a index Feb 12, 2021
@TomDonoghue
Copy link
Member

TomDonoghue commented Feb 12, 2021

Hey @danieltomasz - thanks for trying out the tool, and letting us know about unexpected behaviour!

As you noticed, the get_params method, when requested a whole set of parameters (such as peak_params) will append a column of which model each peak comes from (otherwise, since there are potentially multiple peaks per model - it won't be clear which model each peak is from). This it seems is under documented.

As a quick note on the column being float - this is because the columns are stored together, so the [n_peaks, 4] array is type float. I'm not sure if there is a way to have mixed type arrays? In hindsight, a different strategy may have been to have multiple returns, the parameter array, and a integer column with indices (however, this gets a little messy since indices aren't needed for aperiodic parameters, which can also be returned from get_params).

(I don't remember the exact details, but in a previous version, it might not have returned indices, but this would have made it impossible to map peaks -> spectra, once extracted).

As it is, I think we should definitely add some details to the get_params docstring noting that indices are included when getting peak parameters (I'm thinking something in the Notes section, and/or add more info the the Returns section). There is a quick note of this in tutorial 6, although that could be extended. Do you have any other suggestions of what would be useful to help document this behaviour?

As a sidenote - depending what you are trying to do with the peaks, some of the stand-alone functions may help you process / organize the peaks: https://fooof-tools.github.io/fooof/api.html#analyze-periodic-components

@danieltomasz
Copy link
Author

danieltomasz commented Feb 12, 2021

Thank you for a quick reply.

I am transforming fooof group object results into a pandas data frame to visualize in seaborn distributions of some parameters across different datasets and conditions. Do utilities for such export exist? As one can observe, FOOOF is living in a world of arrays, not data frames :)

Indeed,

NumPy's main object is a homogeneous multidimensional array.
There are structured arrays, but they could add too much complexity and break other things

In documentation, a short remark would be sufficient, FOOOFgroup is inheriting most of the documentation from simple FOOOF object, and there such problem doesn't exist

@ryanhammonds
Copy link
Contributor

ryanhammonds commented Feb 12, 2021

Thanks for bringing this up. I'll add a note about the organization of results in the FOOOFGroup object.

There currently isn't a utility to easily export results to a dataframe. I plan to explore an additional fit parameter that allows the results to be stored as a dictionary, so conversion to dataframe is easy: results = pd.DataFrame(fm.peaks_params_). Or maybe creating a results class that would allow something like fm.peaks_params_.to_dict() or fm.peak_params.to_pandas().

@danieltomasz
Copy link
Author

danieltomasz commented Feb 13, 2021

As you noticed, the get_params method, when requested a whole set of parameters (such as peak_params) will append a column of which model each peak comes from (otherwise, since there are potentially multiple peaks per model - it won't be clear which model each peak is from). This it seems is under documented.

Also this specific fragment of code used to create additional column seems to be deprecated in latest numpy

/fooof/objs/group.py:378: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  out = np.array([np.insert(getattr(data, name), 3, index, axis=1)

@TomDonoghue
Copy link
Member

TomDonoghue commented Feb 25, 2021

Hey @danieltomasz - thanks for the note on the array issue. I had a look, and it turns out we didn't need to be typecasting to arrays in that way at all - this is fixed now in #197. In the same PR, I updated the FOOOFGroup.get_params docstring to explicitly describe the additional column it returns.

Also, potentially of interest - we are currently exploring adding direct support for converting parameters to DataFrames, for example in #196.

I think this issue is now addressed - if you have any suggestions related to the updates and/or works-in-progress, let me know! Otherwise, I think we can close this.

@danieltomasz
Copy link
Author

Hi, I also think it is solved and addressed, I am closing this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants