Applying function to a list of values in Pandas row, why am I only getting first result?

Issue

I have a data frame that consists of two columns and I would like to clean the second column ‘tweets’. Each value in the second column ‘tweets’ consists of a list that contains ~ 100 items.

I would like to iterate through each list in each row to clean the text.

A sample of my data frame (each item in the lists is a string with quotes):

import pandas as pd

data = {'user_id': ['324', '242'],
        'tweets': [["NEWS FLASH: popcorn-flavored Tic-Tacs taste as crap as you imagine.",
                    "The 1970s is here to show us the way: https:xxxx",
                    "FB needs to hurry up and add a laugh/cry button 😬😭😓🤢🙄😱"],
                   ["You don't feel like hiding in your personal cave quite so much",
                    "More for Cancer https://xxxx",
                    "You prefer to keep things to yourself today"]]}
df = pd.DataFrame(data)

I wrote this function, which uses a regex to remove URLs (anything starting with http):

import re

# function to remove URLs (the regex matches "http" plus any non-whitespace run)
def remove_html(mylist):
    for item in mylist:
        text = re.sub(r'http\S+', '', item, flags=re.MULTILINE)
        return text

I applied it to each row in the data frame using this code:

df['tweets']=df['tweets'].apply(remove_html)

The problem is that when I apply the function to the data frame, I get only the first element of each list instead of the whole cleaned list.

The output I get:

0    NEWS FLASH: popcorn-flavored Tic-Tacs taste as crap as you imagine.
1    You don't feel like hiding in your personal cave quite so much     
Name: tweets, dtype: object

Any tips would be helpful.

Solution

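The problem is in your remove_html() function.
The return statement sits inside the for loop, so the function exits during the first iteration and returns only the first cleaned element of the list.
Collect the results in a list and return it after the loop finishes. In the function below, notice that the return statement is outside of the for loop.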

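import re

def remove_html(mylist):
    return_list = []  # collects the cleaned version of every item
    for item in mylist:
        text = re.sub(r'http\S+', '', item, flags=re.MULTILINE)
        return_list.append(text)
    return return_list  # reached only after the whole list has been processed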
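
The same fix can also be written as a list comprehension, which builds the whole cleaned list in one expression and so leaves no room for an early return. This is only a sketch of an alternative, not part of the original answer: the names URL_PATTERN and remove_urls are hypothetical, and the sample strings are taken from the question.

```python
import re

# Hypothetical name; compiled once so the pattern is not re-parsed per tweet.
# Matches "http" followed by any run of non-whitespace characters.
URL_PATTERN = re.compile(r"http\S+")


def remove_urls(tweets):
    """Return a new list with URL-like substrings removed from every string."""
    return [URL_PATTERN.sub("", tweet) for tweet in tweets]


sample = [
    "The 1970s is here to show us the way: https:xxxx",
    "More for Cancer https://xxxx",
    "You prefer to keep things to yourself today",
]
cleaned = remove_urls(sample)
print(cleaned)
```

It would be applied the same way as before, with df['tweets'] = df['tweets'].apply(remove_urls), since apply() passes each row's whole list to the function. Note that removing a URL leaves the space that preceded it, so calling .strip() on each result may be worth adding if trailing whitespace matters.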

Answered By – mechanical_meat

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
