[CMD] Extract specific string in text file


Hi everyone, I'm having difficulty extracting a particular string from a text file. For example, here is the content of mytext.txt:

Drive : System
List Directory : System:\Backup\Scripting\PHP\header.php
List Directory : System:\Backup\Scripting\PHP\content.php
List Directory : System:\Backup\Scripting\PHP\footer.php
List Directory : System:\Backup\Scripting\HTML\header.php
List Directory : System:\Backup\Scripting\HTML\content.php
List Directory : System:\Backup\Scripting\HTML\footer.php

How can I output only the "PHP" or "HTML" string with a batch command? With PHP I can use a regex like:

$r = file_get_contents('mytext.txt');
// In a single-quoted PHP string, four backslashes give the regex one literal backslash.
preg_match_all('/\\\\Scripting\\\\(\w+)\\\\/', $r, $w);
print_r($w[1]);

How do I use a batch command to produce similar output?

Edited by ar_seven_am


You may get away with this:

findstr/i "\\scripting\\[a-z]." mytext.txt

I don't think that will work, but this might, as long as you want to find any line containing "ANYTHING_AT_ALL\Scripting\ANYTHING\WHATEVER":

findstr/ri "\\scripting\\.*\\.*" mytext.txt
or this, if you only wanted to find lines with PHP or HTML, and not any others, if they existed:

findstr/ri "\\scripting\\php\\.* \\scripting\\html\\.*" mytext.txt
You can look here for further syntax and examples.
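One thing to keep in mind: findstr prints the whole matching line, not just the part that matched, so the folder name still has to be cut out of each line afterwards. A rough sketch of that extraction step, written in Python only to show the logic (it is not a batch solution). The sample lines are the ones from the first post; the helper names extract_folders and unique_in_order are made up for illustration.

```python
import re

# Sample lines copied from the first post's mytext.txt.
SAMPLE = r"""Drive : System
List Directory : System:\Backup\Scripting\PHP\header.php
List Directory : System:\Backup\Scripting\PHP\content.php
List Directory : System:\Backup\Scripting\PHP\footer.php
List Directory : System:\Backup\Scripting\HTML\header.php
List Directory : System:\Backup\Scripting\HTML\content.php
List Directory : System:\Backup\Scripting\HTML\footer.php
"""

# Same idea as the PHP pattern: capture the folder name right after \Scripting\.
PATTERN = re.compile(r"\\Scripting\\(\w+)\\", re.IGNORECASE)


def extract_folders(text):
    """Return the captured folder name for every match, in file order."""
    return [m.group(1) for m in PATTERN.finditer(text)]


def unique_in_order(names):
    """Drop repeated names while keeping the first-seen order."""
    seen = set()
    result = []
    for name in names:
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result


if __name__ == "__main__":
    print(unique_in_order(extract_folders(SAMPLE)))  # ['PHP', 'HTML']
```

Without the de-duplication step the result has one entry per matching line, which is what the PHP print_r($w[1]) would show as well.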

Cheers and Regards


You may get away with this:

findstr/i "\\scripting\\[a-z]." mytext.txt

yzowl, does "[a-z]" include all characters (a-z and 0-9)? I mean, what if the string starts with a number? Thanks in advance for your help.

It covers letters of the alphabet only; I was assuming that your use of \w was for word characters as opposed to numeric ones. To cover both, you could put both a-z and 0-9 inside the square brackets, or alternatively just use:

findstr/i "\\scripting\\.*" mytext.txt

Even the double quotes shouldn't be necessary.

Or, more simply:

findstr/i \\scripting\\ mytext.txt

It just depends what exactly you're trying to match/filter out!
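To make the bracket choice concrete, here is a small Python sketch. Python's re module is not findstr, so take it only as an illustration of the character-class idea; the path containing "2018" is an invented example of a folder name that starts with a digit, and the helper name matching is hypothetical.

```python
import re

paths = [
    r"System:\Backup\Scripting\PHP\header.php",
    r"System:\Backup\Scripting\2018\header.php",  # hypothetical digit-first folder
    r"System:\Backup\Other\PHP\header.php",       # hypothetical non-matching path
]

# Letters only after \scripting\, like [a-z] with case-insensitive matching.
letters_only = re.compile(r"\\scripting\\[a-z]", re.IGNORECASE)

# Letters or digits, like putting both a-z and 0-9 inside the brackets.
letters_digits = re.compile(r"\\scripting\\[a-z0-9]", re.IGNORECASE)


def matching(pattern, lines):
    """Return the lines in which the pattern is found anywhere."""
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__":
    print(matching(letters_only, paths))
    print(matching(letters_digits, paths))
```

The letters-only pattern keeps just the PHP path, while the letters-or-digits pattern also keeps the "2018" one; neither keeps the path that has no \Scripting\ component.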


jaclaz, yzowl and bhlpt, thanks for the advice. As for:

From; http://www.pagecolumn.com/tool/pregtest.htm

it seems;

\w+ is 'matches any alphanumeric character including the underscore', 'one or more times'.

and matchable character range is actually this: A-Za-z0-9_

actually the "\w+" is only match for alphanumeric character (a-z, A-Z, 0-9), but not underscore or other simbolic character


As for ar_seven_am (EDIT: ar_seven_am apparently never understood that the next line is quoting him, but hey, why prolong the thread.)

"actually the "\w+" is only match for alphanumeric character (a-z, A-Z, 0-9), but not underscore or other simbolic (sic) character"

If you could be so kind next time to state at the start exactly which regular expression (regex) engine, or custom RE engine, you are using, or minimally that the engine you are using is not ECMAScript compliant, that would help prevent time being wasted...

CharacterClassEscape 'w' has, since oh, year 2000, been true for the 63 characters cited [ A-Za-z0-9_ ] in all ECMAScript compliant expression engines - buy yourself a cheap copy of ECMA-262, 3rd Edition, Section 15.10.2.12 and 2.6 if you have any doubts...

Edited by buyerninety

I'm just including your previous comment:

From; http://www.pagecolumn.com/tool/pregtest.htm

it seems;

\w+ is 'matches any alphanumeric character including the underscore', 'one or more times'.

and matchable character range is actually this: A-Za-z0-9_


Besides the \w+ side discussion (which, according to the PHP documentation, seems to match any alphanumeric character including the underscore), I'm curious which batch solution you decided to use. Then you could add [SOLVED] to the thread title.
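For anyone who wants to check the \w question rather than argue about it, here is a quick sketch in Python. It uses Python's re module, not PHP's PCRE or an ECMAScript engine, so it only shows how one engine behaves; with the re.ASCII flag, Python documents \w as [a-zA-Z0-9_]. The folder name my_dir is invented for the example.

```python
import re
import string

# With re.ASCII, \w is limited to ASCII word characters.
WORD = re.compile(r"\w", re.ASCII)

# Test every ASCII character and keep the ones \w accepts.
ascii_chars = [chr(code) for code in range(128)]
word_chars = [ch for ch in ascii_chars if WORD.fullmatch(ch)]

# The set claimed in the thread: letters, digits and the underscore.
expected = sorted(string.ascii_letters + string.digits + "_")

# A folder name containing an underscore is captured whole.
captured = re.findall(
    r"\\Scripting\\(\w+)\\",
    r"System:\Backup\Scripting\my_dir\header.php",
    flags=re.ASCII,
)

if __name__ == "__main__":
    print(len(word_chars))          # 63
    print("_" in word_chars)        # True
    print(word_chars == expected)   # True
    print(captured)                 # ['my_dir']
```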

Cheers and Regards
