python: having trouble returning a pandas data frame from a user defined function (probably user error)
I have a function that creates a DataFrame. Within the function I can print it, but I am doing something wrong in the return process, because I can't seem to access the DataFrame after running the function. Below is my dummy code and the resulting error.

    import pandas as pd

    def testfunction(new_df_to_output):
        new_df_to_output = pd.DataFrame()
        S1 = pd.Series([33,66], index=['a', 'b'])
        S2 = pd.Series([22,44], index=['a', 'b'])
        S3 = pd.Series([11,55], index=['a', 'b'])
        new_df_to_output = new_df_to_output.append([S1, S2, S3], ignore_index=True)
        print new_df_to_output
        print type(new_df_to_output)
        print dir()
        return new_df_to_output

    testfunction('Desired_DF_name')
    print dir()
    print Desired_DF_name

The DataFrame prints properly within the function. The dir() output shows that the DataFrame is not available after the function returns. Trying to print that DataFrame raises the following error:

    Traceback (most recent call last):
      File "functiontest.py", line 21, in <module>
        print Desired_DF_name
    NameError: name 'Desired_DF_name' is not defined

I am sure it is a simple mistake, but I can't find the solution after searching Stack Overflow and Python tutorials. Any guidance is greatly appreciated.
Inside testfunction, the variable new_df_to_output is essentially a label that you are assigning to the passed-in object. testfunction('Desired_DF_name') doesn't do what you think: it assigns the string 'Desired_DF_name' to the variable new_df_to_output; it does not create a new variable named Desired_DF_name. Basically it's the same as writing new_df_to_output = 'Desired_DF_name'.

You want to save the DataFrame that is returned from the function into a variable. So instead of testfunction('Desired_DF_name') you want

    def testfunction():
        ...

    Desired_DF_name = testfunction()

(You can change the definition of testfunction to remove the new_df_to_output parameter. The function wasn't doing anything with it anyway, because you immediately reassign the variable: new_df_to_output = pd.DataFrame().)
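The point about labels can be seen without pandas at all. Below is a minimal sketch in Python 3, using a plain dict in place of the DataFrame; the names build_table, before, and desired_name are hypothetical and chosen only for illustration.

```python
def build_table(label):
    # `label` is a local name bound to whatever the caller passed in.
    # Rebinding it here creates nothing in the caller's namespace.
    label = {"a": [33, 22, 11], "b": [66, 44, 55]}
    return label

# Calling without capturing the result simply discards the returned object.
build_table("desired_name")
before = "desired_name" in globals()
print(before)  # the string argument did not become a variable

# Binding the return value to a name is what makes it usable afterwards.
desired_name = build_table("this argument is ignored")
print(desired_name["a"])
```

The first call mirrors testfunction('Desired_DF_name') in the question; the second mirrors Desired_DF_name = testfunction().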
I think you really want something like this:

    import pandas as pd

    def testfunction():
        result = pd.DataFrame()
        S1 = pd.Series([33,66], index=['a', 'b'])
        S2 = pd.Series([22,44], index=['a', 'b'])
        S3 = pd.Series([11,55], index=['a', 'b'])
        # append returns a new DataFrame rather than modifying result in place,
        # so the return value has to be assigned back
        result = result.append([S1, S2, S3], ignore_index=True)
        return result

    Desired_DF_name = testfunction()

You should carefully read Defining Functions and More on Defining Functions in the Python documentation.
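One caveat for readers on current versions: DataFrame.append was removed in pandas 2.0, so the code in this thread only runs on older releases. A rough Python 3 equivalent, assuming a recent pandas and using the hypothetical function name build_frame, could build the frame directly from the list of Series:

```python
import pandas as pd

def build_frame():
    # Each Series becomes one row; the Series index labels become the columns.
    s1 = pd.Series([33, 66], index=["a", "b"])
    s2 = pd.Series([22, 44], index=["a", "b"])
    s3 = pd.Series([11, 55], index=["a", "b"])
    return pd.DataFrame([s1, s2, s3])

# The caller decides the name by assigning the returned object.
desired_df = build_frame()
print(desired_df)
```

The calling pattern is unchanged: the function returns the object, and the assignment on the caller's side gives it a name.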
print Desired_DF_name fails because nothing in your code snippet is named Desired_DF_name. The string you pass to the function never becomes a variable, and the DataFrame the function returns is not assigned to anything, so the name lookup raises a NameError.