Pass multiple parameters to a groupby function in pandas


I want to apply a custom function over a groupby. For example, suppose my data has the following format:

    personid  jobid  start_date  end_date
    1         1      2015-01-01  2016-01-30
    1         2      2016-01-01  2017-01-01

I want to compute the overlap, in days, between the date ranges of the two different jobs held by the same person. Would it be wise to use

    df.groupby('personid').agg(x)

If so, how would I reference both the start date and the end date of different records inside the function x?

The output would be something like:

    personid  overlap
    1         30

I think you need groupby with a custom function passed to apply. The function selects the first and last values of the start and end dates in each group, builds a daily date_range for each pair, and returns the length of their intersection, found with numpy.intersect1d:

    import numpy as np
    import pandas as pd

    def f(x):
        # Daily dates covered by the first row and by the last row of the group
        a = pd.date_range(x['start_date'].iat[0], x['end_date'].iat[0], freq='D')
        b = pd.date_range(x['start_date'].iat[-1], x['end_date'].iat[-1], freq='D')
        return pd.Series(len(np.intersect1d(a, b)), index=['overlap'])

    df = df.groupby('personid').apply(f).reset_index()
    print(df)
       personid  overlap
    0         1      366
    1         2        6
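
On the question of how the function can see both date columns: with apply, each call receives the group's rows as a DataFrame, so every selected column of every record in the group is available at once. The following is only a minimal sketch to illustrate that, using the two rows from the question; the helper name describe and the output labels first_start and last_end are made up for this example.

```python
import pandas as pd

# The two rows from the question
df = pd.DataFrame({
    "personid": [1, 1],
    "jobid": [1, 2],
    "start_date": pd.to_datetime(["2015-01-01", "2016-01-01"]),
    "end_date": pd.to_datetime(["2016-01-30", "2017-01-01"]),
})

seen = []  # records the type of object passed to the function

def describe(group):
    # `group` is a DataFrame holding all rows of one personid,
    # so both date columns can be read for any record in the group.
    seen.append(type(group).__name__)
    return pd.Series({
        "first_start": group["start_date"].iat[0],
        "last_end": group["end_date"].iat[-1],
    })

# Selecting the date columns explicitly keeps the grouping column
# out of what the function receives.
result = df.groupby("personid")[["start_date", "end_date"]].apply(describe)
print(result)
```

Because the function returns a Series, the result has one row per personid and one column per label in that Series.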

Sample data used above:

    df = pd.DataFrame({
        'end_date': pd.to_datetime(['2016-01-30', '2016-01-01', '2015-01-25', '2015-01-10']),
        'jobid': [1, 2, 1, 2],
        'personid': [1, 1, 2, 2],
        'start_date': pd.to_datetime(['2015-01-01', '2015-01-01', '2015-01-01', '2015-01-05']),
    })
    print(df)

        end_date  jobid  personid start_date
    0 2016-01-30      1         1 2015-01-01
    1 2016-01-01      2         1 2015-01-01
    2 2015-01-25      1         2 2015-01-01
    3 2015-01-10      2         2 2015-01-05
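
If only the number of overlapping days is needed, the date ranges do not have to be materialised. The sketch below is an alternative to the intersect1d approach, not the method from the answer: it takes the latest start and the earliest end in each group and counts the days between them, inclusive. The helper name overlap_days is hypothetical, and the sketch assumes the overlap of interest is the period common to all rows of a group, which equals the first-versus-last comparison only when each person has exactly two jobs.

```python
import pandas as pd

def overlap_days(group):
    # The period shared by all rows runs from the latest start to the earliest end.
    latest_start = group["start_date"].max()
    earliest_end = group["end_date"].min()
    days = (earliest_end - latest_start).days + 1  # +1 counts both endpoints
    return max(days, 0)  # no shared period gives 0 rather than a negative number

# Same sample data as in the answer
df = pd.DataFrame({
    "personid": [1, 1, 2, 2],
    "jobid": [1, 2, 1, 2],
    "start_date": pd.to_datetime(["2015-01-01", "2015-01-01", "2015-01-01", "2015-01-05"]),
    "end_date": pd.to_datetime(["2016-01-30", "2016-01-01", "2015-01-25", "2015-01-10"]),
})

overlap = (
    df.groupby("personid")[["start_date", "end_date"]]
      .apply(overlap_days)
      .rename("overlap")
      .reset_index()
)
print(overlap)
```

On the sample data this gives 366 for personid 1 and 6 for personid 2, the same values as the date_range version, and on the two rows from the question it gives 30.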
