How to merge two text files?


@jaclaz Why "!0,28!"? It won't work for longer filenames, will it?

You can upgrade it to 29.

Try:

@ECHO off
SETLOCAL ENABLEDELAYEDEXPANSION
SET Counter=0
FOR /F "tokens=*" %%A IN ('cabarc L test.cab') do (
SET /A Counter+=1
Echo %%A
IF !Counter!==6 ECHO ^<--- There are 29 dashes ---^>&ECHO 01234567890123456789012345678
)

When you find a CAB with a file name plus extension longer than 29 characters, we'll see how to manage it. Here is one idea, but it is overkill IMHO:

@ECHO off
SETLOCAL ENABLEDELAYEDEXPANSION
FOR /F "tokens=1 delims=/" %%A IN ('cabarc L test.cab ^| FIND "/"') do (
ECHO %%A
ECHO 01234567890123456789012345678901234567890123456789012345678901234567890123456789
SET Line=%%A
CALL :parse_line
CALL :rem_trail_spaces !Line!
ECHO [!Line!]
)
GOTO :EOF

:rem_trail_spaces
SET Line=%*
GOTO :EOF

:parse_line
SET /a Length=80
SET /a token=3

:loop
SET Line=!Line:~0,%Length%!
SET Parse=!Line:~-2,2!
IF "%Parse%"=="  " SET /a Length-=1&GOTO :Loop
SET /a Length-=1
SET /a Shorter=%Length%-1
SET Long=!Line:~0,%Length%!
SET Short=!Line:~0,%Shorter%!
IF "%Short% "=="%Long%" (
ECHO %Length%
SET /a token-=1
ECHO Begin token%token% %Length%
)
IF %Token% gtr 1 GOTO :Loop
SET Line=%Short%
GOTO :EOF

jaclaz
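
For names that do fit in the column, the fixed-width idea from the first snippet can be sketched roughly as below. This is only an illustration: the sample line is made up, and the 29-character name column is an assumption based on the dashes in the listing header, not something cabarc guarantees. It also assumes the name is not blank and contains no "!" characters, which delayed expansion would swallow.

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
REM Hypothetical line shaped like "cabarc L" output, leading spaces already
REM stripped by FOR /F "tokens=*". The column width of 29 is an assumption.
SET "Line=a b c.dll                        12345 2008/04/14 12:00:00 A---"
REM Keep only the first 29 characters, i.e. the assumed name column.
SET "Name=!Line:~0,29!"
REM Strip trailing spaces one at a time until the last character is not a space.
:trim
IF "!Name:~-1!"==" " (
    SET "Name=!Name:~0,-1!"
    GOTO trim
)
REM Brackets make any leftover spaces easy to spot.
ECHO [!Name!]
```

Single spaces inside the name survive because only the tail is trimmed; a name longer than the assumed column would simply be cut off, which is the case the longer script above tries to handle.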



@Yzöwl Expand.exe works but is extremely slow compared to "cabarc L".

In what context?

I know that you've already 'attempted' to show that uncompressing files with each utility on every file recursively is slower with expand, but you're only reading the name of the single file contained within this time. You should also be aware that the more involved your method of determining the file name contained within, the more likely it is that you will influence the time taken.

What are you trying to get the true expanded file name for?

Additionally, if you are using a different version of cabarc.exe, you may find more than one line containing a forward slash for the FIND command to pick up.

Example

Microsoft ® Cabinet Tool - Version 1.00.0601 (03/18/97)

Copyright © Microsoft Corp 1996-1997. All rights reserved.


I know that it's only about reading the contents, not the actual extraction. Even so, "expand.exe -D" is very slow.

To process 2674 cabbed files from XP SP3 it takes:

Expand.exe 6.1.7600.16385

FOR /R test1 %%A IN (*.*_) DO (
FOR /F "skip=2 tokens=3* delims=:" %%B IN ('EXPAND -D "%%A"') DO (
ECHO %%B
)
)

Time: 6m 17s

Cabarc.exe 6.2.9200.16438

FOR /R test1 %%A IN (*.*_) DO (
FOR /F "skip=9" %%B IN ('cabarc l "%%A"') DO (
ECHO %%B
)
)

Time: 0m 51s

"Cabarc L" is more than 7 times faster than "Expand -D".

What are you trying to get the true expanded file name for?

I want to list all files from a Service Pack in order to update them later. At first I thought about just extracting all of them, but the problem is that there are many files which share the same filename once extracted.

Edited by tomasz86

How long would this take, then? It's untested because I'm logged into Linux atm.

@ECHO OFF & SETLOCAL ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION
FOR /F "TOKENS=*" %%# IN ('CABARC L "abc.dl_"') DO (
CALL :SUB %%#
SET _I=!_F:*  =!
CALL :SUB %%_F:!_I!=%%)
ECHO/%_F%
PING -n 4 127.0.0.1 1>NUL
GOTO :EOF
:SUB
SET _F=%*

BTW, you're only finding out the contents of a service pack once; it is not a common task, so the time difference is negligible. Also, as I've already stated, the method used to arrive at the exact file name you require, with no leading or trailing spaces, is what will affect your times, not just the speed of the utility. You also need to understand that your question was how to parse the line from the reading of a single file, not an entire directory full of them. Your previous attempt to test the speeds of the utilities indicated only 33 seconds to uncab all of the files; I would therefore suggest it will be much quicker to uncab them all and directory list the resultant files. It shouldn't matter if "a b c.dll" exists fifteen times: it can only be updated once, so it only needs to be listed once. To overwrite without prompting, try 'CABARC X "abc.dl_" Y' or alternatively use the -o switch!


How long would this take, then? It's untested because I'm logged into Linux atm.

It took 0m 56s

BTW, you're only finding out the contents of a service pack once; it is not a common task, so the time difference is negligible.

Actually the speed does matter since such a list is going to be created each time I run the script. I don't want to create any lists in advance since it would have to be done for all language versions of each Service Pack. Also such a list would be invalid in case you want to update a service pack which has already been updated in the past.

Also, as I've already stated, the method used to arrive at the exact file name you require, with no leading or trailing spaces, is what will affect your times, not just the speed of the utility. You also need to understand that your question was how to parse the line from the reading of a single file, not an entire directory full of them.

100% true. I don't know what approach to choose yet. Maybe I should just assume that there are no spaces in filenames inside CABs and run "cabarc -L" without any subroutines...

Your previous attempt to test the speeds of the utilities indicated only 33 seconds to uncab all of the files; I would therefore suggest it will be much quicker to uncab them all and directory list the resultant files. It shouldn't matter if "a b c.dll" exists fifteen times: it can only be updated once, so it only needs to be listed once.

Hmm but then the problem is that you'll have to repack all of them again after the files have been updated. This was actually the method which I originally tried to use but it got kind of complicated since it was necessary to store information of both the original cabbed filename and the unpacked version of it. It also required a lot of disk space.

Anyway, thank you for all the ideas. I need some time to think about it and test the different scripts.


Why do you need to repack them all? Just delete them and you're in the same situation as you would have been if you hadn't uncabbed them in the first place!

BTW, I would suggest that, against the 51 seconds your basic routine took to output 2674 lines, a 5 second (10%) hit to output just the full file name is pretty darn good.


@Yzöwl Unfortunately your script from #230 also doesn't work with filenames with spaces :( Only the first token is displayed.

I've come back to my original concept:


@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION

FOR /R test1 %%A IN (*.*_) DO CALL :Cabarc "%%A"
REM Stop here so execution does not fall through into :Cabarc with an empty %1.
GOTO :EOF

:Cabarc
SET File=
FOR /F "skip=9 tokens=*" %%A IN ('cabarc l %1') DO SET Line=%%A
SET tokens1=1
:loop
FOR /F "tokens=%tokens1%" %%A IN ("%Line%") DO (
SET Line=!Line:%%A=%%A/!
SET/A tokens1+=1
GOTO :loop
)
SET/A tokens1-=3
SET tokens2=1
:loop2
FOR /F "tokens=%tokens2%-%tokens1% delims=/" %%A IN ("%Line%") DO (
SET File=%File%%%A
SET/A tokens2+=1
GOTO :loop2
)
ECHO "%File%"
GOTO :EOF

:EOF

The script is still unpolished but seems to work regardless of how many spaces are in the filename. It took 1m 16s to process the same set of files so for me it's acceptable.

Edited by tomasz86

@Yzöwl Unfortunately your script from #229 also doesn't work with filenames with spaces :( Only the first token is displayed.

Well, I've just tested that script on an outdated Windows 2000 machine by makecabbing a dll file with spaces in the file name, with the cabbed file name containing spaces too. It worked 100% with version 1 of cabarc, which I alerted you to earlier.

I would suggest that the problem is either due to something you have done to my script when you've tried to implement it for your own purposes or that you are using a file containing two or more spaces together. I'm confident that if file names containing spaces are being used in service packs that there will be none where two spaces exist next to each other.


The script is still unpolished but seems to work regardless of how many spaces are in the filename. It took 1m 16s to process the same set of files so for me it's acceptable.

Well, IMHO the snippet in #222 should be way faster, as it loops not.

There is (and there will always be) debate over using "skip=" or piping through FIND "something" when different versions of tools are used.

In the case Yzöwl pointed out, probably the best option is to combine them:

FOR /F "skip=2 tokens=*" %%A IN ('cabarc L test.cab ^| FIND "/"') do (

jaclaz


As long as the person running the script is always using the same locale/system, using find alone will be fine.

It's bad enough that we're almost two years into this Topic, but what's even worse is that the questions are being asked without giving those expected to help the overall picture, so that responses can be structured to suit the end goal and not just the exact question being asked. I would have thought, based on the assumption that the real file name may not match the cabbed file name, that all files should be in an uncabbed state up until the final stages.

The following examples use the same general idea as before. This time, do not change anything in them at all; just place them alongside the TEST1 directory. Double click them (one at a time) and read the log file produced.

Version using SKIP

@ECHO OFF
SETLOCAL ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION
PUSHD %~dp0
SET "SD=TEST1"
SET "OF=_LIST.LOG"
>%OF% TYPE NUL
FOR /R %SD% %%$ IN (*.*_) DO CALL :SUB1 %%$
GOTO :EOF
:SUB1
SET "CF=%*"
SET "LD=!CF:*%SD%\=!"
FOR /F "SKIP=9 DELIMS=" %%# IN ('CABARC L "%CF%"') DO (
CALL :SUB2 %%#
SET _I=!_F:*  =!
CALL :SUB2 %%_F:!_I!=%%
>>%OF% ECHO/!LD!:!_F!)
GOTO :EOF
:SUB2
SET _F=%*

Version using MORE

@ECHO OFF
SETLOCAL ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION
PUSHD %~dp0
SET "SD=TEST1"
SET "OF=_LIST.LOG"
>%OF% TYPE NUL
FOR /R %SD% %%$ IN (*.*_) DO CALL :SUB1 %%$
GOTO :EOF
:SUB1
SET "CF=%*"
SET "LD=!CF:*%SD%\=!"
FOR /F "DELIMS=" %%# IN ('CABARC L "%CF%"^|MORE +9') DO (
CALL :SUB2 %%#
SET _I=!_F:*  =!
CALL :SUB2 %%_F:!_I!=%%
>>%OF% ECHO/!LD!:!_F!)
GOTO :EOF
:SUB2
SET _F=%*

(Note: file names containing two or more consecutive spaces will not work. That is deliberate, because such names will not exist in the real world.)
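
The reason for that limit can be seen in a cut-down sketch of the same idea: the name is separated from the size column by a run of spaces, so the first pair of consecutive spaces is treated as the end of the name. The sample line below is invented for illustration, and the CALL SET and trim loop are stand-ins chosen for this sketch, not a copy of the :SUB2 routine. Names containing "!" are assumed not to occur.

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
REM Hypothetical listing line, leading spaces already stripped by FOR /F.
SET "_F=a b c.dll                    12345 2008/04/14 12:00:00 A---"
REM _I = everything after the first pair of consecutive spaces,
REM i.e. the remaining padding plus the size, date, time and attribute columns.
SET "_I=!_F:*  =!"
REM Delete that tail from the full line; CALL forces a second expansion pass
REM so that the value of _I can be used as the search string.
CALL SET "_F=%%_F:!_I!=%%"
REM The pair of spaces that marked the split is still attached; trim it off.
:trim
IF "!_F:~-1!"==" " (
    SET "_F=!_F:~0,-1!"
    GOTO trim
)
ECHO [!_F!]
```

With a name such as "a  b.dll" the split would land inside the name and only "a" would survive, which matches the "only the first token is displayed" symptom reported earlier.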


I would suggest that the problem is either due to something you have done to my script when you've tried to implement it for your own purposes or that you are using a file containing two or more spaces together. I'm confident that if file names containing spaces are being used in service packs that there will be none where two spaces exist next to each other.

The test file had more than one space together. I'm sorry. As long as there are only single spaces in filenames, the scripts seem to work fine. Thank you for the help anyway.

As for the end goal, it's to have a script which can:

1) Merge updates / files into an update rollup.

2) Add updates / files to a service pack.

Point 1) is finished for the most part. Point 2) is not yet. At this moment I could probably finish it completely by myself, but many of my own scripts are very inefficient, so I've been working on performance improvements recently. That's why I've been asking those specific questions.


I'm sorry, but I'm curious about the merging of two (or maybe more) files. So the main goal is merging the INF(s) and collecting all the listed files into a single update, right? Is the structure of each update exactly the same? Is there any "qfe" or "gdr" branch like the XP ones? Is it different from a service pack?

So why don't we write a program to automate it? Err, sorry, just my personal thought.


So why don't we write a program to automate it? Err, sorry, just my personal thought.

That's what I've been doing...

As for GDR/QFE, the script uses only QFE files. The main difference between an update rollup and a service pack is that service packs are cumulative, meaning that all previous service packs are included in the newest one, and that service packs support the "/integrate" switch, so you can slipstream them directly without using any 3rd party tools. As the person who prepares such a service pack, you've got full control of the whole process. There's also a structural difference between them: if you unpack a service pack you'll see that everything is located in the "i386" folder and that some files are packed to CAB (*.*_) and the others aren't (same as on a Windows CD). Update rollups are just compilations of updates, and their structure is the same as that of single updates.

