python


What is the most efficient method for accessing and manipulating a pandas DataFrame?


I am working on an agent-based modelling project and have an 800x800 grid that represents a landscape. Each cell in this grid is assigned certain variables. One of these variables is 'vegetation' (i.e. which functional_types the cell possesses). I have a data frame that looks as follows:
Each cell is assigned a landscape_type before I access this data frame. I then loop through each cell in the 800x800 grid and assign more variables. For example, if cell 1 is landscape_type 4, I need to access the above data frame, generate a random number for each functional_type between min_species_percent and max_species_percent, and then assign all the variables (pollen_loading, succession_time, etc.) for that landscape_type to that cell. However, if the cumsum of the random numbers is < 100, I grab functional_types from the next landscape_type (in this example, I would move down to landscape_type 3). This continues until the cumsum is close to 100.
I have this process working as desired; however, it is incredibly slow. As you can imagine, there are hundreds of thousands of assignments! So far I do this (self.model.veg_data is the above df):
def create_vegetation(self, landscape_type):
    if landscape_type == 4:
        veg_this_patch = self.model.veg_data[self.model.veg_data['landscape_type'] <= landscape_type].copy()
    else:
        veg_this_patch = self.model.veg_data[self.model.veg_data['landscape_type'] >= landscape_type].copy()
    veg_this_patch['veg_total'] = veg_this_patch.apply(lambda x: randint(x["min_species_percent"],
                                                                         x["max_species_percent"]), axis=1)
    veg_this_patch['cum_sum_veg'] = veg_this_patch.veg_total.cumsum()
    veg_this_patch = veg_this_patch[veg_this_patch['cum_sum_veg'] <= 100]
    self.vegetation = veg_this_patch
I am certain there is a more efficient way to do this. The process will be repeated constantly, and as the model progresses, landscape_types will change (e.g. a 3 becomes a 4), so it's essential that this becomes as fast as possible! Thank you.
EDIT (as per the comment):
The loop that creates the landscape objects is given below:
for agent, x, y in self.grid.coord_iter():
    # check that patch is land
    if self.landscape.elevation[x, y] != -9999.0:
        elevation_xy = int(self.landscape.elevation[x, y])
        # calculate burn probabilities based on soil and temp
        burn_s_m_p = round(2-(1/(1 + (math.exp(- (self.landscape.soil_moisture[x, y] * 3)))) * 2),4)
        burn_s_t_p = round(1/(1 + (math.exp(-(self.landscape.soil_temp[x, y] * 1))) * 3), 4)
        # calculate succession probabilities based on soil and temp
        succ_s_m_p = round(2 - (1 / (1 + (math.exp(- (self.landscape.soil_moisture[x, y] * 0.5)))) * 2), 4)
        succ_s_t_p = round(1 / (1 + (math.exp(-(self.landscape.soil_temp[x, y] * 1))) * 0.5), 4)
        vegetation_typ_xy = self.landscape.vegetation[x, y]
        time_colonised_xy = self.landscape.time_colonised[x, y]
        is_patch_colonised_xy = self.landscape.colonised[x, y]
        # populate landscape patch with values
        patch = Landscape((x, y), self, elevation_xy, burn_s_m_p, burn_s_t_p, vegetation_typ_xy,
                          False, time_colonised_xy, is_patch_colonised_xy, succ_s_m_p, succ_s_t_p)
        self.grid.place_agent(patch, (x, y))
        self.schedule.add(patch)
Then, in the object itself I call the create_vegetation function to add the functional_types from the above df. Everything else in this loop comes from a different dataset so isn't relevant.
You need to extract as many calculations as you can into a vectorized preprocessing step. For example in your 800x800 loop you have:
burn_s_m_p = round(2-(1/(1 + (math.exp(- (self.landscape.soil_moisture[x, y] * 3)))) * 2),4)
Instead of executing this line 800x800 times, just do it once, during initialization:
burn_array = np.round(2-(1/(1 + (np.exp(- (self.landscape.soil_moisture * 3)))) * 2),4)
Now in your loop it is simply:
burn_s_m_p = burn_array[x, y]
Apply this technique to the rest of the similar lines.
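To make that concrete, here is a rough sketch of what the preprocessing could look like for all four probability lines, plus the same idea applied to the row-wise apply in create_vegetation. The function names precompute_probabilities and draw_vegetation are made up for illustration, and I am assuming that soil_moisture and soil_temp are NumPy arrays and that randint in the question is random.randint (inclusive at both ends), so treat it as a starting point rather than a drop-in replacement.

```python
import numpy as np
import pandas as pd


def precompute_probabilities(soil_moisture, soil_temp):
    """Hypothetical helper: build the four probability grids once.

    Same formulas as in the question's loop, with math.exp replaced by
    np.exp so each expression runs over the whole array at once.
    """
    burn_s_m = np.round(2 - (1 / (1 + np.exp(-(soil_moisture * 3)))) * 2, 4)
    burn_s_t = np.round(1 / (1 + np.exp(-(soil_temp * 1)) * 3), 4)
    succ_s_m = np.round(2 - (1 / (1 + np.exp(-(soil_moisture * 0.5)))) * 2, 4)
    succ_s_t = np.round(1 / (1 + np.exp(-(soil_temp * 1)) * 0.5), 4)
    return burn_s_m, burn_s_t, succ_s_m, succ_s_t


def draw_vegetation(veg_rows, rng):
    """Hypothetical helper: one vectorized draw instead of apply(..., axis=1).

    Assumes the original randint is inclusive at both ends, hence
    endpoint=True on Generator.integers.
    """
    out = veg_rows.copy()
    out["veg_total"] = rng.integers(
        out["min_species_percent"].to_numpy(),
        out["max_species_percent"].to_numpy(),
        endpoint=True,
    )
    out["cum_sum_veg"] = out["veg_total"].cumsum()
    # Keep rows until the running total would exceed 100, as in the question
    return out[out["cum_sum_veg"] <= 100]


# Illustrative use with made-up 3x3 grids (the real ones would be 800x800)
rng = np.random.default_rng(0)
moisture = rng.random((3, 3))
temp = rng.random((3, 3))
burn_s_m, burn_s_t, succ_s_m, succ_s_t = precompute_probabilities(moisture, temp)
# Inside the loop each value is then a plain lookup, e.g. burn_s_m[x, y]
```

The arrays are computed for every cell, including non-land ones; the loop simply never reads those entries. If the soil arrays also carry sentinel values like -9999.0 for non-land cells (the question only shows that for elevation), NumPy may print overflow warnings for them, which should be harmless here.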
