Pass multiple parameters to a groupby function in pandas


I want to apply a custom function over a groupby. For example, suppose my data has the following format:
personid  jobid  start_date  end_date
1         1      2015-01-01  2016-01-30
1         2      2016-01-01  2017-01-01
I want to compute the overlap, in days, between the date ranges of the two different jobs held by the same person. Would it be wise to use
df.groupby('personid').agg(x)
If so, how would I reference both start_date and end_date of the different records inside the function x?
The output would be something like:
personid  overlap
1         30
I think you need groupby with a custom function that takes the first and last rows of each group, builds a daily date_range from each row's start and end dates, and then finds the length of their intersection with numpy.intersect1d:
import numpy as np
import pandas as pd

def f(x):
    # daily range covered by the first job in the group
    a = pd.date_range(x['start_date'].iat[0], x['end_date'].iat[0], freq='D')
    # daily range covered by the last job in the group
    b = pd.date_range(x['start_date'].iat[-1], x['end_date'].iat[-1], freq='D')
    # number of days that appear in both ranges
    return pd.Series(len(np.intersect1d(a, b)), index=['overlap'])

df = df.groupby('personid').apply(f).reset_index()
print(df)
   personid  overlap
0         1      366
1         2        6
Sample:
df = pd.DataFrame({
    'end_date': [pd.Timestamp('2016-01-30'), pd.Timestamp('2016-01-01'),
                 pd.Timestamp('2015-01-25'), pd.Timestamp('2015-01-10')],
    'jobid': [1, 2, 1, 2],
    'personid': [1, 1, 2, 2],
    'start_date': [pd.Timestamp('2015-01-01'), pd.Timestamp('2015-01-01'),
                   pd.Timestamp('2015-01-01'), pd.Timestamp('2015-01-05')],
})
print(df)
    end_date  jobid  personid start_date
0 2016-01-30      1         1 2015-01-01
1 2016-01-01      2         1 2015-01-01
2 2015-01-25      1         2 2015-01-01
3 2015-01-10      2         2 2015-01-05
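
As a possible alternative, here is a minimal sketch that answers the agg part of the question without building full date ranges. It assumes the overlap you want is the inclusive number of days shared by all jobs of a person, which is the latest start date to the earliest end date. The names bounds, latest_start and earliest_end are made up for illustration, and the data is the same sample as above. Note that with more than two jobs per person this differs from the function f above, which only compares the first and last rows.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'personid': [1, 1, 2, 2],
    'jobid': [1, 2, 1, 2],
    'start_date': pd.to_datetime(['2015-01-01', '2015-01-01',
                                  '2015-01-01', '2015-01-05']),
    'end_date': pd.to_datetime(['2016-01-30', '2016-01-01',
                                '2015-01-25', '2015-01-10']),
})

# Named aggregation: each output column reads from a different input column,
# so both start_date and end_date are available per group.
bounds = df.groupby('personid').agg(
    latest_start=('start_date', 'max'),
    earliest_end=('end_date', 'min'),
)

# Inclusive day count between the two bounds; a negative value means
# the jobs do not overlap, so clip it to 0.
overlap = (bounds['earliest_end'] - bounds['latest_start']).dt.days + 1
result = overlap.clip(lower=0).rename('overlap').reset_index()
print(result)
```

On this sample it should give the same counts as the date_range approach (366 for person 1 and 6 for person 2), and it stays cheap when the date ranges span many years.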
