How to convert pandas dataframe rows into their own dataframes?

0

Issue

I have a set of data like:

     values                site_name    timezone    variable_name                                                      
0    [{'value':            SAN JOAQUIN  PST         degC
     [{'value': '9.3', 
     'qualifiers': ['P'], 
     'date': '2022-01-05'
     },
     {'value': '9.4', 
     'qualifiers': ['P'], 
     'date': '2022-01-05'
     }]
     }]
1    [{'value':            SAN JOAQUIN  PST         pH
     [{'value': '7.5', 
     'qualifiers': ['P'], 
     'date': '2022-01-05'
     },
     {'value': '7.8', 
     'qualifiers': ['P'], 
     'date': '2022-01-05'
     }]
     }]

where the values are a long list and each row has these nested sets of values. How can I use pandas to convert each of the variable_name’s to their own dataframe?

I’d like to have something like:

degC table
     value   date        qualifier
0    9.3     2022-01-05  P
1    9.4     2022-01-05  P 

pH table
     value   date        qualifier
0    7.5     2022-01-05  P
1    7.8     2022-01-05  P

Here’s what I have tried so far:

df = pd.json_normalize(file)
for i in range(len(df.index)):
  pd.json_normalize(df.iloc[i])

Raw input which is shown above:

df = pd.DataFrame({'values':[[{'value': [{'value': '9.3', 'qualifiers': ['P'], 'date': '2022-01-05'},
                                         {'value': '9.4', 'qualifiers': ['P'], 'date': '2022-01-05'}]
                              }], 
                             [{'value': [{'value': '7.5', 'qualifiers': ['P'], 'date': '2022-01-05'},
                                         {'value': '7.8', 'qualifiers': ['P'], 'date': '2022-01-05'}]
                              }]],
                   'variable_name':['degC','pH']})

Solution

With data frame:

>>> df
                                                 values variable_name
0  [{'value': [{'value': '9.3', 'qualifiers': ['P'] ...          degC
1  [{'value': [{'value': '7.5', 'qualifiers': ['P'] ...            pH

Using this:

new_df = df['values'].apply(lambda x: pd.DataFrame(x[0]['value']))
new_df.index = df['variable_name']

>>> new_df.loc['degC']
  value qualifiers        date
0   9.3        [P]  2022-01-05
1   9.4        [P]  2022-01-05

>>> new_df.loc['pH']
  value qualifiers        date
0   7.5        [P]  2022-01-05
1   7.8        [P]  2022-01-05

Answered By – MYousefi

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave A Reply

Your email address will not be published.

This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. Accept Read More