Regular Expression Hell


M_D_K
Chaos Rift Demigod
Posts: 1087
Joined: Tue Oct 28, 2008 10:33 am
Favorite Gaming Platforms: PC
Programming Language of Choice: C/++
Location: UK

Regular Expression Hell

Post by M_D_K »

OK, I really hope someone knows regular expressions. I need to read the output of ps (the Linux program) and extract the fields. The output looks like this:

Code:

//PID, USER, ARGS
 6461 mdk      /bin/sh /usr/bin/x-session-manager
 6560 root     start_kdeinit --new-startup +kcminit_startup
How the hell do I represent that as a regular expression? I have only dabbled in regular expressions once before, and this is out of my league.
So far I can only tell that it's formatted correctly using:

Code:

"^[ ]*[[:digit:]]"
And even that is only a partial expression; it pretty much just checks for the spaces at the start and then for the first digit of the PID.
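For reference, a minimal sketch of how that format check could be run from C, assuming the POSIX regcomp/regexec API from <regex.h> is what's in use. The function name looks_like_ps_row is made up for illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `line` starts with optional spaces
 * followed by a digit, 0 if it does not, -1 if the pattern fails to compile. */
int looks_like_ps_row(const char *line)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, "^[ ]*[[:digit:]]", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

This kind of check is mostly useful for skipping the column header line that ps prints before the rows.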
EDIT: Got it cracked for the most part:

Code:

^\\s+*([0-9]*)\\s+*([A-Za-z0-9_]*)\\s+*([A-Za-z0-9_]*)
Gotta use the double backslashes to stop gcc bitching about "unknown escape sequence". The only thing left to do is make it so apps that are enclosed in brackets ([]) are read too.
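A rough sketch of one way to pull out all three fields, bracketed names included, assuming POSIX regcomp/regexec with REG_EXTENDED. The struct ps_row, the buffer sizes, and the function names are invented for the example. It uses POSIX bracket classes like [[:space:]] instead of \s, so the C string literal needs no backslashes at all. The third group takes any run of non-whitespace, which is what lets something like [kthreadd] through.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for one parsed row; sizes are arbitrary. */
struct ps_row {
    char pid[16];
    char user[64];
    char app[256];
};

/* Copy one capture group out of `line` into `out`, truncating if needed. */
static void copy_group(const char *line, const regmatch_t *m,
                       char *out, size_t outsz)
{
    size_t len = (size_t)(m->rm_eo - m->rm_so);
    if (len >= outsz)
        len = outsz - 1;
    memcpy(out, line + m->rm_so, len);
    out[len] = '\0';
}

/* Returns 0 on success, -1 if the line does not match
 * or the pattern fails to compile. */
int parse_ps_line(const char *line, struct ps_row *row)
{
    /* Assumption: user names only use the characters from the original
     * pattern ([A-Za-z0-9_]); widen the class if yours differ. */
    static const char pattern[] =
        "^[[:space:]]*([0-9]+)"         /* group 1: PID */
        "[[:space:]]+([A-Za-z0-9_]+)"   /* group 2: USER */
        "[[:space:]]+([^[:space:]]+)";  /* group 3: first word of ARGS */
    regex_t re;
    regmatch_t m[4];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, line, 4, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;

    /* All three groups are mandatory, so their offsets are set on a match. */
    copy_group(line, &m[1], row->pid, sizeof row->pid);
    copy_group(line, &m[2], row->user, sizeof row->user);
    copy_group(line, &m[3], row->app, sizeof row->app);
    return 0;
}
```

Compiling the pattern on every call keeps the sketch short; a real program would probably compile it once and reuse the regex_t.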
Last edited by M_D_K on Mon Jan 26, 2009 3:04 pm, edited 1 time in total.
Gyro Sheen wrote:you pour their inventory onto my life
IRC wrote: <sparda> The routine had a stack overflow, sorry.
<sparda> Apparently the stack was full of shit.
trufun202
Game Developer
Posts: 1105
Joined: Sun Sep 21, 2008 12:27 am
Location: Dallas, TX
Contact:

Re: Regular Expression Hell

Post by trufun202 »

Sorry dude, I'd love to help you, but there are few things in this world that I despise more than regular expressions... :evil:

I usually go to regexlib.com, find something close to what I need, then hack it. However, in my case, I usually need regex for input validation.
-Chris

YouTube | Twitter | Rad Raygun

“REAL ARTISTS SHIP” - Steve Jobs
dandymcgee
ES Beta Backer
Posts: 4709
Joined: Tue Apr 29, 2008 3:24 pm
Current Project: https://github.com/dbechrd/RicoTech
Favorite Gaming Platforms: NES, Sega Genesis, PS2, PC
Programming Language of Choice: C
Location: San Francisco
Contact:

Re: Regular Expression Hell

Post by dandymcgee »

What in god's name is that used for?!
Falco Girgis wrote:It is imperative that I can broadcast my narcissistic commit strings to the Twitter! Tweet Tweet, bitches! :twisted:
MarauderIIC
Respected Programmer
Posts: 3406
Joined: Sat Jul 10, 2004 3:05 pm
Location: Maryland, USA

Re: Regular Expression Hell

Post by MarauderIIC »

M_D_K wrote: OK, I really hope someone knows regular expressions. I need to read the output of ps (the Linux program) and extract the fields. The output looks like this:

Code:

//PID, USER, ARGS
 6461 mdk      /bin/sh /usr/bin/x-session-manager
 6560 root     start_kdeinit --new-startup +kcminit_startup
I'm not sure, but what about
^\s*(\d+)\s+(\w+)\s+([^\s]*)\s*$

Start of line, zero or more whitespace characters, one or more digits (PID), one or more whitespace, one word (USER), one or more whitespace, zero or more of everything that's not whitespace (APP), zero or more whitespace to end of string? That's what I intended anyway.

"\s" is whitespace so it includes a tab, too. But if it's really spaces then \s+, I think, right?
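Two details are easy to get wrong when writing a pattern like this, so here is a small sketch to check them, assuming POSIX extended regexes via <regex.h>. The helper name first_group is made up. First, where the quantifier sits relative to the parentheses matters: with ([0-9])+ the group is repeated and ends up holding only its last repetition, while ([0-9]+) holds the whole number. Second, "not whitespace" is spelled with the ^ inside square brackets, as in [^[:space:]]; inside parentheses a ^ is an anchor.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies what capture group 1 of `pattern` matched
 * in `line` into `out`. Returns 0 on success, -1 on compile failure,
 * no match, or a group that did not take part in the match. */
int first_group(const char *pattern, const char *line,
                char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[2];
    size_t len;
    int rc;

    if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, line, 2, m, 0);
    regfree(&re);
    if (rc != 0 || m[1].rm_so < 0)
        return -1;

    len = (size_t)(m[1].rm_eo - m[1].rm_so);
    if (len >= outsz)
        len = outsz - 1;
    memcpy(out, line + m[1].rm_so, len);
    out[len] = '\0';
    return 0;
}
```

Calling it on " 6461 mdk" with "^[[:space:]]*([0-9])+" leaves just the final digit in the buffer, while "^[[:space:]]*([0-9]+)" leaves the full PID.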
I realized the moment I fell into the fissure that the book would not be destroyed as I had planned.

Re: Regular Expression Hell

Post by M_D_K »

Damn, yours is close to what I did. I did it the crazy way:

Code:

^\\s+*([0-9]*)\\s+*([A-Za-z0-9_]*)\\s+*([A-Za-z0-9_\\./\\-\\s]*)
Didn't put $ at the end because I only wanted the app name and not all the crap that got passed to it. Oh, BTW, \w wouldn't work; I'm not using advanced regex, just extended, hence the explicit [A-Za-z0-9_] in the second backreference.

And yeah, \s+ is one or more whitespace characters.
LeonBlade
Chaos Rift Demigod
Posts: 1314
Joined: Thu Jan 22, 2009 12:22 am
Current Project: Trying to make my first engine in C++ using OGL
Favorite Gaming Platforms: PS3
Programming Language of Choice: C++
Location: Blossvale, NY

Re: Regular Expression Hell

Post by LeonBlade »

Oh god not these things...
There's no place like ~/

Re: Regular Expression Hell

Post by MarauderIIC »

^\s for app name won't work for you? ... duh yeah, of course it won't, they can have spaces. So I would say first alphanumeric char and everything after that to newline or end?

Re: Regular Expression Hell

Post by M_D_K »

OK, I'm confused. My pattern works: I can extract pid, user, and appname (which is part of args). For example:

Code:

16373 mdk      kio_file [kdeinit] file /tmp/ksocket-mdk/klauncherizZalb.s
[...after extraction]
16373 kio_file
The extracted stuff then gets put into a list control. Also, you'll be hard pressed to find a Linux app with spaces in its name. There is a long-standing tradition of using underscores in place of spaces.

But I added it anyway:

Code:

//third back reference
([A-Za-z0-9_\\./\\-\\s]*) //that \\s is the whitespace; double slash because, well, I already said in my first post.
I wrote: gotta use the double back slashes to stop gcc bitching about "unknown escape sequence".
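One caveat worth testing here: as far as I know, POSIX treats a backslash inside a square-bracket expression as an ordinary character, so \s between [ and ] would mean "a backslash or the letter s" and not whitespace. That would also explain why the extraction above stops at kio_file. The sketch below, assuming regcomp/regexec with REG_EXTENDED, compares that spelling with the POSIX class [:space:] placed inside the brackets. The helper name match_len is made up for the example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: length of the text that `pattern` matches in
 * `text`, or -1 if it does not match or fails to compile. */
long match_len(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, text, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;
    return (long)(m[0].rm_eo - m[0].rm_so);
}
```

On "kio_file [kdeinit] file", the pattern "^[A-Za-z0-9_./\\s]+" should stop after kio_file, while "^[A-Za-z0-9_./[:space:]]+" should also take the space that follows it. Behaviour may differ on regex libraries that do not follow POSIX here, so treat this as something to verify on your own system.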

Re: Regular Expression Hell

Post by MarauderIIC »

"//that \\s is the whitespace double slash because well I allready said in my first post."

Yeah I know. Did I forget to change it? Sorry, my bad.


Confused about what? Is the problem that it's not grabbing everything? Looks like we just need another \\s+(whatever) ? I only had three because I missed the single space between appname and path.

Re: Regular Expression Hell

Post by M_D_K »

MarauderIIC wrote:^\s for app name won't work for you? ... duh yeah, of course it won't, they can have spaces. So I would say first alphanumeric char and everything after that to newline or end?
That's what confused me.

Re: Regular Expression Hell

Post by LeonBlade »

What exactly are you making?

Re: Regular Expression Hell

Post by M_D_K »

It's a secret. Marauder knows, and I'm trusting him not to tell. All will be revealed when it's done.

Re: Regular Expression Hell

Post by LeonBlade »

Ahh, I understand...

I always do things the hard way, not using expressions to get data like this...
I guess if you took the time to learn from scratch, it would be easy.

Re: Regular Expression Hell

Post by MarauderIIC »

M_D_K wrote:
MarauderIIC wrote:So I would say first alphanumeric char and everything after that to newline or end?
That's what confused me.
First alphanumeric char is [A-Za-z0-9], or whatever; add all the chars that can possibly be the first character in a filename.
([A-Za-z0-9]+.*)
All the characters after that are matched by .*
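A sketch of that idea in C, assuming POSIX regcomp/regexec. It is a variation and not the exact expression above: the first character of the last group is "anything that is not whitespace" and not just alphanumerics, so a leading / as in /bin/sh is accepted too. The function name capture_args is made up, and REG_NEWLINE is used so that .* stops before a trailing newline.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the whole ARGS column of one ps line
 * into `out`. Returns 0 on success, -1 on no match or compile failure. */
int capture_args(const char *line, char *out, size_t outsz)
{
    static const char pattern[] =
        "^[[:space:]]*[0-9]+"         /* PID, not captured */
        "[[:space:]]+[^[:space:]]+"   /* USER, not captured */
        "[[:space:]]+([^[:space:]].*)"; /* group 1: first char plus the rest */
    regex_t re;
    regmatch_t m[2];
    size_t len;
    int rc;

    if (outsz == 0 ||
        regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE) != 0)
        return -1;
    rc = regexec(&re, line, 2, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;

    len = (size_t)(m[1].rm_eo - m[1].rm_so);
    if (len >= outsz)
        len = outsz - 1;
    memcpy(out, line + m[1].rm_so, len);
    out[len] = '\0';
    return 0;
}
```

Splitting the captured string on the first space afterwards would give the app name and the arguments separately, if both are wanted.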