

Deducting the median from each column


I have a dataframe, df, containing numbers, like so:
1 1 1
2 1 1
2 1 3
I'd like to deduct the median from each column so that the median of each becomes 0.
-1 0 0
0 0 0
0 0 2
How do I do this in a Pythonic way? I'm guessing it is possible without iterating over the values, computing the median and then deducting it. I'd like to do it tersely, approximately like so:
from numpy import median
df -= median(df) #does not work, deducts median for whole dataframe
Use the dataframe's own median method, which returns one median per column when axis=0:
df -= df.median(axis=0)
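As a quick check, here is a minimal sketch that applies this to the data from the question. The variable name centered is only illustrative, and the frame is built with default integer column labels, which is an assumption about how the original df looks.

```python
import pandas as pd

# The example data from the question, with default column labels 0, 1, 2.
df = pd.DataFrame([[1, 1, 1],
                   [2, 1, 1],
                   [2, 1, 3]])

# df.median(axis=0) returns a Series with one median per column;
# subtracting it aligns on the column labels, so each column is shifted
# by its own median.
centered = df - df.median(axis=0)

print(centered)
```

Subtracting into a new frame rather than using -= keeps the original df intact, which can be handy while verifying the result.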
By default, numpy's median computes a single median over all of the data, flattened.
To get per-column medians with numpy, pass axis=0:
df -= median(df, axis=0)
For more detail, see the documentation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.median.html
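To make the difference concrete, the sketch below compares the two calls on a plain numpy array holding the question's values. The names values, overall, per_column and centered are illustrative and not from the original post.

```python
import numpy as np

# Same numbers as in the question, as a plain 2-D array.
values = np.array([[1, 1, 1],
                   [2, 1, 1],
                   [2, 1, 3]])

# Without axis, the array is flattened and a single median is returned.
overall = np.median(values)

# With axis=0, the median is taken down each column, giving one value per column.
per_column = np.median(values, axis=0)

# Subtracting the length-3 result broadcasts it across every row.
centered = values - per_column

print(overall)      # 1.0
print(per_column)   # [2. 1. 1.]
print(centered)
```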
Some testing in ipython showed:
In [23]: A = numpy.arange(9)
In [24]: B = A.reshape((3,3))
In [25]: C = numpy.median(B,axis=0)
In [26]: D = B - C[None,:]
In [27]: B
Out[27]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [28]: D
Out[28]:
array([[-3., -3., -3.],
       [ 0.,  0.,  0.],
       [ 3.,  3.,  3.]])
In [29]: C
Out[29]: array([ 3., 4., 5.])
So the following line computes the median of each column:
C = numpy.median(B,axis=0)
And the following line subtracts those medians from the matrix, column by column. C[None,:] turns C into a 1x3 row so that it broadcasts across every row of B:
D = B - C[None,:]
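A small sketch of the broadcasting involved, using the same B as in the session above. It suggests that the explicit C[None,:] is optional for column medians, since numpy already aligns a length-3 array with the last axis, whereas per-row medians do need an explicit new axis. The names col_medians, row_medians, D_explicit, D_implicit and E are illustrative.

```python
import numpy as np

B = np.arange(9).reshape((3, 3))

# Column medians have shape (3,), matching the last axis of B.
col_medians = np.median(B, axis=0)

# Both forms give the same result: the explicit (1, 3) row and the
# plain (3,) array broadcast identically against a (3, 3) matrix.
D_explicit = B - col_medians[None, :]
D_implicit = B - col_medians

# For row medians, the values must line up with the first axis,
# so a trailing axis is added to get shape (3, 1).
row_medians = np.median(B, axis=1)
E = B - row_medians[:, None]

print(D_implicit)
print(E)
```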
