Index JSON files in Elasticsearch using Python?


I have about 100 JSON files, named merged_file 1.json, merged_file 2.json, and so on.
How do I index all of these files into Elasticsearch using Python (elasticsearch_dsl)?
I am using this code, but it doesn't seem to work:
from elasticsearch_dsl import Elasticsearch
import json
import os
import sys
es = Elasticsearch()
json_docs = []
directory = sys.argv[1]
for filename in os.listdir(directory):
    if filename.endswith('.json'):
        with open(filename, 'r') as open_file:
            json_docs.append(json.load(open_file))
es.bulk("index_name", "type_name", json_docs)
The JSON looks like this:
{"one":["some data"],"two":["some other data"],"three":["other data"]}
What can I do to make this work?
For this task you should be using elasticsearch-py (pip install elasticsearch):
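from elasticsearch import Elasticsearch, helpers
import sys, json, os

es = Elasticsearch()

def load_json(directory):
    "Use a generator, no need to load all files in memory"
    for filename in os.listdir(directory):
        if filename.endswith('.json'):
            # os.listdir returns bare file names, so join them with the directory
            with open(os.path.join(directory, filename), 'r') as open_file:
                yield json.load(open_file)

# doc_type is only accepted by older Elasticsearch versions; omit it on newer clusters
helpers.bulk(es, load_json(sys.argv[1]), index='my-index', doc_type='my-type')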
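helpers.bulk also accepts an iterable of action dicts that carry their own metadata (`_index`, `_id`, `_source`), which avoids relying on doc_type. Below is a minimal sketch of such a generator. The function name iter_actions and the choice of the file name as the document id are assumptions made for illustration, not part of the original answer.

```python
import json
import os


def iter_actions(directory, index_name):
    """Yield one bulk action dict per .json file found in `directory`."""
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith('.json'):
            continue
        # os.listdir returns bare names, so build the full path before opening
        path = os.path.join(directory, filename)
        with open(path, 'r') as open_file:
            yield {
                '_index': index_name,
                # Assumption: the file name is used as the document id
                '_id': filename,
                '_source': json.load(open_file),
            }
```

With a client created as above, this would presumably be passed as helpers.bulk(es, iter_actions(sys.argv[1], 'my-index')). Because the generator is lazy, only one file is held in memory at a time.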
