pandas extractall matching
How can I match the text below with a pandas extractall regex?

    stringwithinmycolumn
    stuff, Duration: 15h:22m:33s, notstuff,
    stuff, Duration: 18h:22m:33s, notstuff,

Currently, I am using:

    df.message.str.extractall(r',([^,]*?): ([^,:]*?,').reset_index()

Expected output:

                  0            1
    match
    0      Duration  15h:22m:33s
    1      Duration  18h:22m:33s

So far I have not been able to get a match.
You may use

    ,\s*([^,:]+):\s*([^,]+),

It matches:

    ,         - a comma
    \s*       - 0+ whitespaces
    ([^,:]+)  - Group 1: one or more chars other than , and :
    :         - a colon
    \s*       - 0+ whitespaces
    ([^,]+)   - Group 2: one or more chars other than ,
    ,         - a comma (this can actually be removed, but may stay to ensure safer matching)

Note that you may want to make the regex more precise when extracting structured information from long strings. For example, use a letter-matching pattern for the Duration label, and allow only digits, colons, h, m or s in the time value. The pattern becomes a bit more verbose, but much safer:

    ,\s*([A-Za-z]+):\s*([\d:hms]+)
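Both patterns above can be checked outside pandas first. The following is a minimal sketch using the standard library re module, assuming each row of the column holds one string shaped like the sample in the question; the names messages, loose and strict are made up for illustration. The same pattern strings can then be passed to str.extractall.

```python
import re

# Sample rows, assumed to mirror the strings in the question's column.
messages = [
    "stuff, Duration: 15h:22m:33s, notstuff,",
    "stuff, Duration: 18h:22m:33s, notstuff,",
]

# General pattern: any label without , or :, then any value without a comma.
loose = re.compile(r",\s*([^,:]+):\s*([^,]+),")

# Stricter pattern: letters-only label, value limited to digits, colons, h, m, s.
strict = re.compile(r",\s*([A-Za-z]+):\s*([\d:hms]+)")

# findall returns one (group 1, group 2) tuple per match.
loose_hits = [pair for msg in messages for pair in loose.findall(msg)]
strict_hits = [pair for msg in messages for pair in strict.findall(msg)]

print(loose_hits)
print(strict_hits)
```

On this sample both patterns capture the same label and value pairs; they only start to differ on strings that contain other "key: value" fragments between commas.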
In : x.message.str.extractall(r',\s*(\w+):\s*([^,]*)').reset_index(level=0, drop=True)
Out:
              0            1
match
0      Duration  15h:22m:33s
0      Duration  18h:22m:33s