renaming files in CMD scripts

**bphlpt** · February 23, 2012

I would be interested in seeing a complete solution in PowerShell and one in CMD script, the OP's original effort, with or without using external apps such as wget - Start with nothing downloaded, you have the link as provided earlier - http://www.slv.dk/Dokumenter/dsweb/View/Collection-357 - and you end up with the PDF files stored locally and named correctly. That is the overall goal, isn't it?

I think it would be very interesting and instructive to be able to compare the various solutions to the VBScript solution already provided. I don't know anything about PowerShell, so that would be a real learning experience for me. It would help me figure out which part does what. For that matter, if anyone wants to volunteer a solution in any other programming language, even better. That would definitely fit in with the theme of this section of the board.

Cheers and Regards

Edited February 23, 2012 by bphlpt

**DosCode** · February 23, 2012

My thanks to everyone who has contributed a solution or some code here. Please be patient for one more day; I had no time to devote to the code today.

bphlpt - yes, that is the goal: to download the files and rename them. But I personally prefer individual scripts, each serving its own purpose, because the situations differ when I download different "folders" of PDF documents. Some files need to be moved, some need to be renamed, some need HTML documentation, some need none. So it depends on what I download. I would be happier if I did not need these scripts at all, but I don't know of any browser add-on that would let you download several levels of documents, create folders and move documents into them according to the organization of the HTML page, and rename folders by chapter or by their purpose and title on the HTML page.

Edited February 24, 2012 by DosCode

**jaclaz** · February 24, 2012

..... but I don't know of any browser add-on that would let you download several levels of documents, create folders and move documents into them according to the organization of the HTML page, and rename folders by chapter or by their purpose and title on the HTML page.

Just for the record, those "browser addons" do not exist (AFAIK) but JavaScript does.

UNrelated example:

http://weblogs.asp.net/cumpsd/archive/2004/02/13/72404.aspx

Additionally:

http://wiki.imacros.net/Introducing_iMacros

jaclaz

**CoffeeFiend** · February 24, 2012

but JavaScript does.

Yes. And using JavaScript, you could use an XMLHttpRequest or ServerXMLHTTP object to get the pages and then parse them using regular expressions. Now if only someone here thought of that before...

**bphlpt** · February 24, 2012

From what I remember about the differences between VBScript and JavaScript, I think the solutions would be very similar, correct?

So how about it, CoffeeFiend? Do you want to show a complete PowerShell solution? And jaclaz, would you mind showing a complete CMD script solution for comparison?

Cheers and Regards

**DosCode** · February 24, 2012

@echo off
setlocal EnableDelayedExpansion
set "source=GEN 0 GENERAL.html"
echo In file:%source%


FOR /F "tokens=* delims=" %%A IN ('FIND ".pdf" "%source%" ^|FIND "href"^|FIND "class="') DO (

This is a sub-function. It tells me that you did not set the %pdf% file to search for (in the HTML file). What are you trying to do in the loop? Are you trying to find all PDFs in the HTML document? What I was trying to do was find one particular PDF in the HTML document. You can start by searching for PDFs, but then you should set the name of the PDF file. This determines where the anchor and title are located.

But the complete code should find all PDFs and, for each PDF, open the HTML document associated with it. So for your code to work as a sub-function, you must define the PDF file you are looking for.

Late answer to bphlpt.

DosCode - No offense was meant, and I'm a CMD script fan.

„There are definitely cases, IMO, where CMD script is faster and more flexible than other options. “

That is not the reason why I’m interested in CMD. For me it is much simpler than VBS. VBS is an object-oriented language and so is harder to learn. CMD combined with GnuWin gives something that seems to me similar to bash. That is why it is my favourite.

„CoffeeFiend and I don't need to continue on. “

I have finished, so I need not continue either.

„Our two posts do not distract from your overall goal as much as your posts which have been scattered over multiple threads and have yet to come up with a working solution. “

The same goes for you. You have seen that I was distracted enough to overlook two posts. I explained that before.

„You have been asking about bits and pieces for a week now, and we didn't even know what your overall goal was until 18 hours ago. “

If I had told you my overall goal, I would have spent a lot of time on long explanations and would not have got answers to my original questions, which concerned individual solutions to particular situations. I did not want you to solve everything for me. The second point is that it was simpler for me to build what I wanted myself, then paste the code and wait for your reactions. It is a way of demonstrating, and that is better than lots of words. Since the start I have learned something; I have moved from nothing to something, and that is what I personally find satisfying. Also, I see no reason why I should spend 24 hours a day, 7 days a week on the forum, and I don't mind that this took more than a week. So maybe you had to wait, but I see no problem in that.

Late answer to Aacini's code

Hi. I am newbie at this forum. I was invited here by DosCode to post my last solution to this problem that was developed in detail in other site. So here it is:

Yes, it works, but there should be some condition, because it returns the title "" and returns the title "GEN 0.1 Preface" twice. You need not fix it, though, because my code is already complete, as you can see in the debugger's code version.

Reply to Yzöwl's CMD code

If that's all you want to do, here's a quick untested script.

This script prints

REN "gen_0_3.pdf" "GEN 0.3 Record of AIP Supplements.pdf"

REN "gen_2_1.pdf" "GEN 2.1 Measuring System, Aircraft Markings, Holidays.pdf"

REN "gen_4_1.pdf" "GEN 4.1 Charges for Aerodromes and Heliports.pdf"

REN "gen_4_2.pdf" "GEN 4.2 Charges for Air Navigation Services.pdf"

Press any key to continue...

So PDFs starting with a digit are not listed.

Reply to CoffeeFiend


gc *.htm|?{$_ -match [regex]'".*/(.*pdf)".*?b>(.*?)<'}|%{ren $matches[1] ($matches[2]+".pdf")}

I briefly searched the internet for GnuWin PowerShell binaries but did not find any, so I did not test your code.
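For anyone without PowerShell at hand, the regex from the one-liner can be tried out in Python. This is only a sketch under assumptions: the sample HTML line is hypothetical, modelled on the paths and titles quoted in this thread rather than on the real page markup, and `rename_pairs` is a made-up helper name. It prints the `ren` commands instead of renaming anything.

```python
import re

# Same pattern as the PowerShell one-liner. PowerShell's -match is
# case-insensitive, hence re.IGNORECASE here.
PATTERN = re.compile(r'".*/(.*pdf)".*?b>(.*?)<', re.IGNORECASE)


def rename_pairs(lines):
    """Return (current_name, new_name) pairs for every line that matches."""
    pairs = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            # Group 1 is the file name after the last slash of the href,
            # group 2 is the text between the <b> tags.
            pairs.append((m.group(1), m.group(2) + ".pdf"))
    return pairs


# ASSUMPTION: hypothetical markup, not copied from the real page.
SAMPLE = [
    "<html><body>",
    '<a class="doc" href="/Dokumenter/dsweb/Get/Document-408/EK_GEN_0_1_en.pdf">'
    "<b>GEN 0.1 Preface</b></a>",
]

if __name__ == "__main__":
    for old, new in rename_pairs(SAMPLE):
        print(f'ren "{old}" "{new}"')
```

On the sample line this yields the pair `EK_GEN_0_1_en.pdf` and `GEN 0.1 Preface.pdf`; lines without a quoted `.pdf` link are skipped.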

Edited February 24, 2012 by DosCode

**jaclaz** · February 24, 2012

This is a sub-function. It tells me that you did not set the %pdf% file to search for (in the HTML file). What are you trying to do in the loop? Are you trying to find all PDFs in the HTML document? What I was trying to do was find one particular PDF in the HTML document. You can start by searching for PDFs, but then you should set the name of the PDF file. This determines where the anchor and title are located.
But the complete code should find all PDFs and, for each PDF, open the HTML document associated with it. So for your code to work as a sub-function, you must define the PDF file you are looking for.

Look, the snippet I posted, when executed on the particular page you mentioned (GEN 0 GENERAL.html) provides this output:

In file:GEN 0 GENERAL.html
Filename="EK_GEN_0_1_en.pdf"
Filepath="/Dokumenter/dsweb/Get/Document-408/EK_GEN_0_1_en.pdf"
FileTitle=GEN 0.1 Preface
Filename="EK_GEN_0_2_en.pdf"
Filepath="/Dokumenter/dsweb/Get/Document-409/EK_GEN_0_2_en.pdf"
FileTitle=GEN 0.2 Record of AIP Amendments
Filename="gen_0_3.pdf"
Filepath="/Dokumenter/dsweb/Get/Document-410/gen_0_3.pdf"
FileTitle=GEN 0.3 Record of AIP Supplements
Filename="EK_GEN_0_4_en.pdf"
Filepath="/Dokumenter/dsweb/Get/Document-411/EK_GEN_0_4_en.pdf"
FileTitle=GEN 0.4 Checklist of AIP Pages
Filename="EK_GEN_0_5_en.pdf"
Filepath="/Dokumenter/dsweb/Get/Document-412/EK_GEN_0_5_en.pdf"
FileTitle=GEN 0.5 List of Hand Amendments to the AIP
Filename="EK_GEN_0_6_en.pdf"
Filepath="/Dokumenter/dsweb/Get/Document-413/EK_GEN_0_6_en.pdf"
FileTitle=GEN 0.6 Table of Contents to Part 0 and 1

Now this may be in part or totally what you were attempting to do (or could be something else altogether), but rest assured, that - having written the snippet - I perfectly know not only what it does, but also how it works, and - with all due respect - you don't seem like being in a condition to tell me what I should or must do.

The posted snippet is intended as a mere example of how to do something that is what I have understood (possibly wrongly) from your confusing, mixed up, incomplete description of a goal.

If you state EXACTLY what your goal is (in its entirety) I may be able to "tune" the above base parsing routine to your requirements (IF I will feel like doing so).

If you prefer, my attempt was, INSTEAD of correcting your attempts (which IMHO use a "wrong" approach to the problem), to provide you with a working example offering some "alternative" ideas (which of course you are perfectly free to use in part or in full, or to ignore completely).

What I find inappropriate is:

the lack of even a little "thank you" for the time I spent in attempting to help you (no matter the result)
the use of "should" or "must" related to anything I might (or completely fail to) do

jaclaz

**DosCode** · February 24, 2012

Now this may be in part or totally what you were attempting to do (or could be something else altogether), but rest assured, that - having written the snippet - I perfectly know not only what it does, but also how it works, and - with all due respect - you don't seem like being in a condition to tell me what I should or must do.

Well, then you know better than I do what it does, and I need not tell you anything. I checked the code at the start and thought the loop looks for PDFs in the HTML. That's why I asked you whether my assumption was right, but you did not answer that question. And I ran your code, but the console closed so I saw no output. If you understand what I wanted to achieve, then I need not tell you anything.

My thanks to everyone who has contributed a solution or some code here.

jaclaz, isn't that what you wanted to hear? I thought your solution simply didn't work. You don't understand my habits. I usually give thanks when I have found a solution to my problem or when I finish my discussion in the thread. So not receiving thanks yet doesn't mean you won't receive them later.

Edited February 24, 2012 by DosCode

**jaclaz** · February 24, 2012

And I ran your code, but the console closed so I saw no output. If you understand what I wanted to achieve, then I need not tell you anything.

Are you telling me that you test batch files by double clicking on them?

For the record, the proper procedure is normally:

open a cmd prompt window
navigate to where the batch is stored
type in the command line the name of the batch

OR

add a PAUSE statement before the end of the file or before the final GOTO :EOF in "main".

jaclaz

**DosCode** · February 24, 2012

This is my final solution. It takes about 20 seconds to rename the files; some of the solutions mentioned here by others may be faster. But I am satisfied that it works.

@echo off
setlocal EnableDelayedExpansion
for %%P in (*.pdf) do (
  set "pdfFile=%%P"
  REM The first character of the PDF name selects the matching "GEN n *.html" page
  set htmlMask="GEN !pdfFile:~0,1! *.html"
  echo Testing "!pdfFile!": Looking for !htmlMask!
  for %%H in (!htmlMask!) do (
    echo "%%H"
    set "pdf=%%P"
    set "source=%%H"
    call :JUMP
  )
)
REM Stop here so the main code does not fall through into the subroutine
goto :EOF

:JUMP
REM Get title for pdf from html file

REM Process each line in %source% file:
for /F "usebackq delims=" %%c in ("%source%") do (
   set "line=%%c"
   REM Test if the line contains pdf file I look for:
   set "pdfline=!line:%pdf%=!"
   if not "!pdfline!" == "!line!" (
      REM Test if the pdfline contains tag b
      if not "!pdfline:*><b>=!" == "!pdfline!" (
         cls
         set "tag=!pdfline:<b>=$!"
         set "tag=!tag:</b>=$!"
         for /F "tokens=2 delims=$" %%b in ("!tag!") do set "title=%%b"
         REM Replace characters that are not allowed in file names
         set "title=!title::=-!"
         set "title=!title:\=-!"
         set "title=!title:/=-!"
         set "title=!title:|=-!"
         set "title=!title:?=-!"
         set "title=!title:GEN =!"
         echo Title found: "!title!"
         REM Use the pdf variable rather than the loop variable of the caller
         ren "%pdf%" "!title!.pdf"
      )
   )
)
goto :EOF

Finally I decided to remove the "GEN " prefix in title.
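For comparison, the title clean-up part of the script above can be sketched in Python roughly as follows. It is a minimal sketch, not a replacement for the batch file: `clean_title` is a hypothetical name, and only the characters the script itself replaces are handled. One difference to be aware of is that the batch substitution `!title:GEN =!` is case-insensitive, while `str.replace` is not.

```python
def clean_title(title, prefix="GEN "):
    """Mirror the batch clean-up: swap : \\ / | ? for '-' and drop 'GEN '."""
    for ch in ':\\/|?':
        title = title.replace(ch, "-")
    # Like the batch substitution, this removes every occurrence of the
    # prefix text, not only a leading one (but it is case-sensitive).
    return title.replace(prefix, "")


if __name__ == "__main__":
    print(clean_title("GEN 2.1 Measuring System, Aircraft Markings, Holidays"))
```

Windows also rejects `*`, `"`, `<` and `>` in file names, so a fuller version would probably want to replace those as well.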

Edit:

I have edited the code because I found some lines that were unnecessary, but I did not test the edited version. When I tested the code earlier it was really not good, so I made some changes to the copy on my PC, but I am not posting them here because the circumstances have changed.

Edited February 25, 2012 by DosCode

**DosCode** · February 24, 2012

Are you telling me that you test batch files by double clicking on them?
(...)
OR
add a PAUSE statement before the end of the file or before the final GOTO :EOF in "main".
jaclaz

I just click it; it is faster. I sometimes run scripts from a cmd console, but I tend to forget to close it and end up with 4 or 5 consoles open, and that many windows is confusing, so I like the window to close automatically. I usually add a pause, but I thought your code might jump to the end of the file somewhere, so I did not add one there. You have seen that I use pause everywhere. I also edited my previous post to you, which is another bad habit.

**Yzöwl** · February 24, 2012

Reply to Yzöwl's CMD code
If that's all you want to do, here's a quick untested script.
This script prints
REN "gen_0_3.pdf" "GEN 0.3 Record of AIP Supplements.pdf"
REN "gen_2_1.pdf" "GEN 2.1 Measuring System, Aircraft Markings, Holidays.pdf"
REN "gen_4_1.pdf" "GEN 4.1 Charges for Aerodromes and Heliports.pdf"
REN "gen_4_2.pdf" "GEN 4.2 Charges for Air Navigation Services.pdf"
Press any key to continue...
So PDFs starting with a digit are not listed.

That's because .pdfs starting with a digit only do so because you've already run a completely unnecessary script on the files to change their names, that's not my fault, it's yours.

I provided code in exact response to the question I asked, (and you answered).

I will certainly not be producing additional code to cater for your inability to use a common sense approach to your work nor for you to bastardise into a script as bad as the one you've just posted.

**CoffeeFiend** · February 24, 2012

From what I remember about the differences between VBScript and JavaScript, I think the solutions would be very similar, correct?

Indeed. Just look at this previous post and see for yourself. The only real difference in both was that you have to use an enumerator in JScript vs the for each in VBScript. Other than that, JScript has proper error handling i.e. try/catch blocks, vs VBScript's ghetto "on error" abomination (by far the worst I've seen in any programming language ever).

So how about it, CoffeeFiend? Do you want to show a complete PowerShell solution?

It's mainly a matter of using the DownloadString method of the WebClient class to get the page contents, then piping that to match with a regular expression like <a.*"(.*?/View/.*?-\d*)"> to get the child pages' URLs. Then you do much of the same: you use WebClient.DownloadString to get the child pages' content, and the regex from my previous post to get the PDFs' URLs, and once more, you use WebClient to download the PDFs, but using the DownloadFile method instead (using the subgroups from the regex matches as URL and file name). I just don't have enough interest in the problem (simply renaming files...) or time to waste to write it again in other languages, sorry.
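The same three steps (index page, child pages, PDFs) can be sketched in Python with the standard library, purely as an illustration of the flow described above and not as the PowerShell solution itself. Assumptions: the function names are made up, the pages are decoded as UTF-8, each link sits on its own line, and the PDF regex is a variant of the earlier one that captures the whole href, since the original only captures the file name.

```python
import os
import re
from urllib.parse import urljoin
from urllib.request import urlopen, urlretrieve

# Child-page regex as given in the post above.
CHILD_RE = re.compile(r'<a.*"(.*?/View/.*?-\d*)">', re.IGNORECASE)
# ASSUMPTION: variant of the earlier PDF regex that captures the whole
# href (needed for downloading) instead of only the file name.
PDF_RE = re.compile(r'"([^"]*\.pdf)".*?b>(.*?)<', re.IGNORECASE)


def fetch(url):
    """Download a page as text (UTF-8 is assumed, bad bytes are replaced)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def child_urls(html, base):
    """Absolute URLs of the child pages linked from an index page."""
    found = (CHILD_RE.search(line) for line in html.splitlines())
    return [urljoin(base, m.group(1)) for m in found if m]


def pdf_links(html, base):
    """(absolute PDF URL, title) pairs found on a child page."""
    found = (PDF_RE.search(line) for line in html.splitlines())
    return [(urljoin(base, m.group(1)), m.group(2)) for m in found if m]


def download_all(start_url, dest="."):
    """Fetch the index, then each child page, then save every PDF by title."""
    for child in child_urls(fetch(start_url), start_url):
        for url, title in pdf_links(fetch(child), child):
            urlretrieve(url, os.path.join(dest, title + ".pdf"))
```

Calling `download_all("http://www.slv.dk/Dokumenter/dsweb/View/Collection-357")` would start from the link given at the top of the thread. Titles containing characters such as `:` or `/` would need cleaning before being used as file names.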

**DosCode** · February 25, 2012

That's because .pdfs starting with a digit only do so because you've already run a completely unnecessary script on the files to change their names, that's not my fault, it's yours.

I have written that already, but I will remind you. It is not a fault! I decided to rename the files (to remove the prefix) before I decided to write a script that renames the files according to the HTML description.

Edited February 25, 2012 by DosCode

**DosCode** · February 25, 2012

I tried to write a script to rename PDF files from this site:

http://www.pl-vacc.org/pol3/files.php?d=maps&language=en


@echo off
setlocal EnableDelayedExpansion
for %%P in (*.pdf) do (
  set "pdfFile=%%P"
  set "htmlMask=info.htm"

  echo Testing "!pdfFile!": Looking in !htmlMask!
  for %%H in (!htmlMask!) do (
    echo "%%H"
    set "pdf=%%P"
    set "source=%%H"
    call :JUMP
  )
)
REM Stop here so the main code does not fall through into the subroutine
goto :EOF

:JUMP
REM Get title for pdf from html file
REM Process only the lines of %source% that mention the pdf:
for /F "delims=" %%c in ('find /N "%pdf%" "%source%"') do (
   set "line=%%c"
   REM Test if the line contains pdf file I look for:
   set "pdfline=!line:%pdf%=!"
   if not "!pdfline!" == "!line!" (
      cls
      set "pdfline=!pdfline: = !"
      set "pdfline=!pdfline:</a>= !"

      REM Strip the leading tags, keeping the text after the fifth closing bracket
      set "pdfline=!pdfline:*>=!"
      set "pdfline=!pdfline:*>=!"
      set "pdfline=!pdfline:*>=!"
      set "pdfline=!pdfline:*>=!"
      set "title=!pdfline:*>=!"

      REM Replace characters that are not allowed in file names
      set "title=!title::=-!"
      set "title=!title:\=-!"
      set "title=!title:/=-!"
      set "title=!title:|=-!"
      set "title=!title:?=-!"
      REM An equals sign cannot be replaced this way and is legal in file names, so it is left alone
      set "title=!title:>=-!"
      set "title=!title:<=-!"
      REM Use the pdf variable rather than the loop variable of the caller
      ren "%pdf%" "!title!.pdf"
   )
)
goto :EOF

This action was fast.

Edited February 25, 2012 by DosCode
