Index JSON files in Elasticsearch using Python?


I have a bunch of JSON files (100), named merged_file 1.json, merged_file 2.json, and so on.
How do I index all these files into Elasticsearch using Python (elasticsearch_dsl)?
I am using this code, but it doesn't seem to work:
from elasticsearch_dsl import Elasticsearch
import json
import os
import sys
es = Elasticsearch()
json_docs = []
directory = sys.argv[1]
for filename in os.listdir(directory):
    if filename.endswith('.json'):
        with open(filename, 'r') as open_file:
            json_docs.append(json.load(open_file))
es.bulk("index_name", "type_name", json_docs)
The JSON looks like this:
{"one":["some data"],"two":["some other data"],"three":["other data"]}
What can I do to make this work?
For this task you should be using elasticsearch-py (pip install elasticsearch):
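from elasticsearch import Elasticsearch, helpers
import json
import os
import sys

es = Elasticsearch()

def load_json(directory):
    """Use a generator, no need to load everything in memory."""
    for filename in os.listdir(directory):
        if filename.endswith('.json'):
            # os.listdir returns bare file names, so join them with the directory
            with open(os.path.join(directory, filename), 'r') as open_file:
                yield json.load(open_file)

# doc_type applies only to older Elasticsearch versions; mapping types were removed in 8.x
helpers.bulk(es, load_json(sys.argv[1]), index='my-index', doc_type='my-type')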
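
The generator approach can also be written so that each document is wrapped in an explicit bulk action, which keeps the index name with the data and avoids the doc_type argument. This is only a minimal sketch: iter_actions and index_directory are hypothetical helper names, 'my-index' is a placeholder, and it assumes each file holds a single JSON object like the sample in the question.

```python
import json
import os


def iter_actions(directory, index_name):
    """Yield one bulk action per .json file in `directory` (hypothetical helper)."""
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith(".json"):
            continue
        # os.listdir returns bare names, so build the full path before opening.
        path = os.path.join(directory, filename)
        with open(path, "r", encoding="utf-8") as open_file:
            # "_index" and "_source" are the action keys understood by helpers.bulk.
            yield {"_index": index_name, "_source": json.load(open_file)}


def index_directory(es, directory, index_name="my-index"):
    """Send all actions to Elasticsearch; assumes `es` is an Elasticsearch client."""
    # Imported here so iter_actions can be used and tested without the package.
    from elasticsearch import helpers

    return helpers.bulk(es, iter_actions(directory, index_name))
```

Assuming a reachable cluster, it could be called as index_directory(Elasticsearch(), sys.argv[1]). Because iter_actions is a generator, the files are read one at a time while the bulk helper consumes them.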
