How to merge two text files?

I've done some Googling but couldn't really find any simple solution to this default behaviour of FOR /F. Is there a simple way to preserve it other than running gsar before the script to replace ",," with something else like ","","? :rolleyes:

No :(, "delims" interprets consecutive separators as a single one.

There is more than one way to skin a cat (the cat won't be happy anyway about it ;)), but the "real" issue as I see it is the comma inside the last field. :unsure:

You don't need to use gsar for doing something *like*:

FOR /F "tokens=*" %%A IN ('TYPE 1.txt') DO SET Line=%%A
SET Line=%Line:,,=,"",%

and then go on with processing "Line" instead of "1.txt", but the script you posted (at first sight) seems to me a rather complex one.
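To see the effect outside cmd.exe, here is a minimal Python sketch of the two behaviours discussed above: a tokeniser that, like "delims", treats a run of separators as one, and the ",," replacement workaround. The function names `forf_tokens` and `fill_empty_fields` are made up for this illustration, and the tokeniser only roughly mimics FOR /F.

```python
def forf_tokens(line, delims=","):
    """Roughly mimic FOR /F tokenising: a run of delimiters acts as one."""
    tokens, buf = [], []
    for ch in line:
        if ch in delims:
            if buf:                      # close the current token, drop empty ones
                tokens.append("".join(buf))
                buf = []
        else:
            buf.append(ch)
    if buf:
        tokens.append("".join(buf))
    return tokens


def fill_empty_fields(line, filler='""'):
    """Replace ',,' with ',"",' until no empty field is left."""
    # A single pass is not enough for ',,,' because matches do not overlap.
    while ",," in line:
        line = line.replace(",,", "," + filler + ",")
    return line


if __name__ == "__main__":
    sample = "HKCR,key,,0x2"             # made-up sample line
    print(forf_tokens(sample))
    print(forf_tokens(fill_empty_fields(sample)))
```

Note the loop in `fill_empty_fields`: in this sketch a single replacement pass would leave one ",," behind in a line containing ",,,", so a line with two consecutive empty fields probably needs the substitution applied more than once.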

If I get it right the idea should be to "ignore" the comma when it is inside a double quoted field....

I'll see which trick I can find to get that result in a simpler way (if I can find one) :blink: .

jaclaz



That's right. "Normally" it would be easy to just divide everything by commas but these kinds of fields:

"MSGOTHIC.TTC,MS UI Gothic"

are problematic. As for the script you suggested, won't it run into the same problem? That's why I'm thinking of gsar, since the script just has to work in Windows 2000 with its default cmd.exe.


Here is an idea you may be able to use to help you.

@ECHO OFF
SETLOCAL ENABLEEXTENSIONS DISABLEDELAYEDEXPANSION
FOR /F "TOKENS=*" %%# IN (1.TXT) DO CALL :SUB %%#
IF DEFINED _OUT (
>TEMP.TXT (
ECHO/%FIRST%
ECHO/%SECOND%
ECHO/%THIRD%
ECHO/%FOURTH%
ECHO/%FIFTH%
)
)
GOTO :EOF

:SUB
SET LINE=%*
SET _FIRST=%LINE:*,=%
CALL :OUT %%LINE:%_FIRST%=%%
SET FIRST=%_OUT:~,-1%
SET _SECOND=%_FIRST:*,=%
CALL :OUT %%_FIRST:%_SECOND%=%%
SET SECOND=%_OUT:~,-1%
SET _THIRD=%_SECOND:*,=%
CALL :OUT %%_SECOND:%_THIRD%=%%
SET THIRD=%_OUT:~,-1%
SET _FOURTH=%_THIRD:*,=%
CALL :OUT %%_THIRD:%_FOURTH%=%%
SET FOURTH=%_OUT:~,-1%
CALL :OUT %%LINE:*%FOURTH%,=%%
SET FIFTH=%_OUT%
GOTO :EOF

:OUT
SET _OUT=%*

I would be interested in your reasons for deconstructing .inf files line by line before reconstructing them.


I want to remove duplicates, e.g.

HKCR,"TypeLib\{662901fc-6951-4854-9eb2-d9a2570f2b2e}\5.1","",0x00000002,"Microsoft WinHTTP Services, version 5.1"
HKCR,TypeLib\{662901fc-6951-4854-9eb2-d9a2570f2b2e}\5.1,"",0x00000002,"Microsoft WinHTTP Services, version 5.1"
HKCR,TypeLib\{662901fc-6951-4854-9eb2-d9a2570f2b2e}\5.1,,0x00000002,"Microsoft WinHTTP Services, version 5.1"
HKCR,TypeLib\{662901fc-6951-4854-9eb2-d9a2570f2b2e}\5.1,,0x00000002, "Microsoft WinHTTP Services, version 5.1"

etc.

so I'd like to deconstruct and then reconstruct such lines so that they are exactly the same and can be deduped by the yanklines script.
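As an illustration of the deconstruct-and-reconstruct idea, here is a minimal Python sketch, not the yanklines script and not something that runs under plain cmd.exe. It splits each line on commas outside double quotes, trims spaces and tabs around each field, drops the enclosing quotes, and uses the result as a comparison key. The names `canonical_inf_line` and `dedupe` are hypothetical, and the sketch assumes that quoted and unquoted forms of a field are meant to be equal and that comparison is case-sensitive.

```python
def canonical_inf_line(line):
    """Return a tuple of cleaned fields to use as a comparison key."""
    fields, buf, in_quotes = [], [], False
    for ch in line.strip():
        if ch == '"':
            in_quotes = not in_quotes    # commas inside quotes are data
        if ch == "," and not in_quotes:
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    fields.append("".join(buf))

    cleaned = []
    for field in fields:
        field = field.strip(" \t")       # padding around the field
        if len(field) >= 2 and field[0] == '"' and field[-1] == '"':
            field = field[1:-1]          # "" and an empty field become equal
        cleaned.append(field)
    return tuple(cleaned)


def dedupe(lines):
    """Keep the first line of every group that shares a canonical key."""
    seen, kept = set(), []
    for line in lines:
        key = canonical_inf_line(line)
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


if __name__ == "__main__":
    lines = [
        r'HKCR,"TypeLib\{662901fc-6951-4854-9eb2-d9a2570f2b2e}\5.1","",0x00000002,"Microsoft WinHTTP Services, version 5.1"',
        r'HKCR,TypeLib\{662901fc-6951-4854-9eb2-d9a2570f2b2e}\5.1,,0x00000002, "Microsoft WinHTTP Services, version 5.1"',
    ]
    print(dedupe(lines))
```

With the four example lines above, all four should reduce to the same key, so only the first one is kept.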

Edited by tomasz86

Would that be done manually?

HKCR,	TypeLib\{662901fc-6951-4854-9eb2-d9a2570f2b2e}\5.1,	,	0x2,	"Microsoft WinHTTP Services, version 5.1"

How do you deal with string variables, since they use percent characters? Do you have to convert them all to another character, then convert them back again later? What happens if a string variable from the strings section is used in one key, and another key contains the same data but without using the string variable? How does a non-manual method know that?


All of the merging is done 100% automatically. As for strings, the variables have been replaced with their actual values using this script:

FOR /F "skip=1 delims=" %%B IN ([Strings].inf) DO (
FOR /F tokens^=1-2^ delims^=^=^" %%C IN ("%%B") DO (
FOR /F "delims=" %%E IN ('FINDSTR/ILM "%%%%C%%" "*.inf"') DO (
TOOLS\gsar.exe -i -o -s"%%%%C%%" -r"%%D" "%%E" >NUL
)
)
)

so there are no string variables present at this point.

[strings].inf uses this format:

[Strings]
LangTypeValue=9
WSEDIR="1033"
TSCLIENTDIR="Terminal Services Client"

etc.
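For anyone who wants to check the substitution logic away from cmd.exe and gsar, here is a minimal Python sketch of the same idea: read the name=value pairs that follow the [Strings] header, then replace each %name% with its value, ignoring case as gsar -i does. The function names `parse_strings_section` and `expand_strings` are made up for this example, and it assumes values are at most wrapped in one pair of double quotes.

```python
import re


def parse_strings_section(text):
    """Return {name: value} from the lines following a [Strings] header."""
    table = {}
    for raw in text.splitlines()[1:]:        # skip the header, like skip=1
        line = raw.strip()
        if not line or line.startswith(";") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        value = value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]              # drop the enclosing quotes
        table[name.strip().lower()] = value
    return table


def expand_strings(line, table):
    """Replace every %name% that is present in the table; leave the rest."""
    def repl(match):
        return table.get(match.group(1).lower(), match.group(0))
    return re.sub(r"%([^%\s]+)%", repl, line)


if __name__ == "__main__":
    strings = '[Strings]\nWSEDIR="1033"\nTSCLIENTDIR="Terminal Services Client"\n'
    table = parse_strings_section(strings)
    print(expand_strings(r'HKLM,"Software\%TSCLIENTDIR%"', table))
```

One caveat with this sketch: a doubled percent such as %%SystemRoot%% is only left alone because SystemRoot is not in the table, so a name that appears both as a string variable and inside doubled percents would need extra handling.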

Edit: @Yzöwl I'm not sure what exactly you're asking about above... Do you mean the spaces? I was thinking about lines like the one below which Microsoft somehow managed to produce.

Taken from Windows 2000 SP4's update.inf:

HKLM, "SYSTEM\CurrentControlSet\Services\RASMAN\PPP\EAP\25",                    Path,                   0x00020000, "%%SystemRoot%%\System32\rastls.dll"

Edited by tomasz86

I've got a different problem now...

Basically speaking I'm searching for something like this:

FINDSTR/IR "sp.qfe" 1.inf

and the output is:

?"%sourcepath%\\SP2QFE\\bitsinst.exe /setupservice /resourcedll:%windir%\\system32\\xpob2res.dll"

I need to know that the string found is specifically "SP2QFE" so that I can use gsar to remove it like this:

gsar -o -s"SP2QFE\\" -r 1.inf

It's not possible to get such information directly from FINDSTR, or is it?

Edited by tomasz86

The problem is that I don't know that it's "SP2QFE" before running the script.

I'm also struggling with another problem :}

1.inf

;abc, def

When I run

FOR /F "tokens=1,2 delims=, " %%a in (1.inf) do echo %%a,%%b

the line is ignored because it starts with

;

which is the default EOL character.

I want to disable EOL but also set delims to both comma and space at the same time. Is doing it this way

FOR /F tokens^=1-2^ eol^=^

delims^=^,^ %%a in (1.inf) do echo %%a,%%b

the only possible solution? I know about the other method of setting EOL to an uncommon character but I'd like to avoid it.

The desired output should be:

;abc,def

Edited by tomasz86

This should work:

FOR /F "tokens=1,2 delims=, eol=" %%a in (1.inf) do echo %%a,%%b

Edited by allen2

I'd like to use both comma and space as delims. In your example it's only the former :( And

"eol="

actually sets EOL to the double quote character, so lines like this

"abc, def

will be ignored.
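To pin down the behaviour being asked for (no EOL character at all, comma and space both acting as delimiters, first two tokens kept), here is a minimal Python sketch. It is not a cmd.exe solution, and `first_two_tokens` is a made-up name.

```python
import re


def first_two_tokens(line):
    """Split on runs of commas and spaces, then rejoin tokens 1 and 2."""
    # No character is treated as an end-of-line marker here, so lines that
    # start with ';' or '"' are processed like any other line.
    tokens = [t for t in re.split(r"[, ]+", line) if t]
    return ",".join(tokens[:2])


if __name__ == "__main__":
    for sample in (";abc, def", '"abc, def'):
        print(first_two_tokens(sample))
```

Both sample lines come through with their first character intact, which is the result the FOR /F options need to reproduce.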


Edit: @Yzöwl I'm not sure what you're exactly asking about above... Do you mean the spaces?

No, not just spaces. Tabs are often used too, but the main thing was the hex string you used.

Your automatic routine would need to know how, for instance, the following relate when checking for duplicates:

  • 0x00000000
  • 0x0
  • 0
  • ,,
  • ,<one or more spaces>,
  • ,<one or more tabs>,
  • ,<one or more spaces with one or more tabs>,
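One way to handle the forms listed above is to bring a numeric field to a single spelling before comparing. A minimal Python sketch follows, with the hypothetical name `canonical_flags`. It assumes the field is either empty (possibly padded with spaces or tabs), a 0x-prefixed hex number, or a decimal number, and it deliberately leaves an empty field empty: whether an empty field should count as 0 depends on which field it is, so that decision stays with the caller.

```python
def canonical_flags(field):
    """Normalise a numeric field to 0x followed by eight hex digits."""
    text = field.strip(" \t")            # spaces and tabs are only padding
    if text == "":
        return ""                        # empty stays empty (see note above)
    try:
        if text.lower().startswith("0x"):
            value = int(text, 16)
        else:
            value = int(text, 10)
    except ValueError:
        return text                      # not a number, leave it untouched
    return "0x%08x" % value


if __name__ == "__main__":
    for sample in ("0x00000000", "0x0", "0", " \t ", "0x2"):
        print(repr(sample), "->", repr(canonical_flags(sample)))
```

This should only be applied to the field that holds the flags value; run over every field it would also rewrite data that merely looks like a number.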

As for your query about findstr…

It's not findstr that's causing you the problem; it's a matter of thinking through what your goal is before structuring your code.

You are not looking for sp.qfe; you are looking, for example, for sp1qfe\\, sp2qfe\\, or sp3qfe\\.

Anyhow, regardless of that, here is an example script using your search string and findstr:

@ECHO OFF
SETLOCAL ENABLEEXTENSIONS DISABLEDELAYEDEXPANSION
(SET VAR=)
FOR /F "TOKENS=*" %%# IN ('FINDSTR/I \\SP.QFE\\ 1.INF') DO SET VAR=%%#
IF NOT DEFINED VAR GOTO :EOF
CALL :SUB
ECHO/gsar -o -s"%VAR%" -r 1.inf
PAUSE
GOTO :EOF
:SUB
SET VAR=%VAR:*\\SP=%
SET VAR=SP%VAR:~,6%
GOTO :EOF

Remove the ECHO on line seven to try it when happy with the output!
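For comparison, the extraction step (finding out which SP?QFE name actually occurs) can be written with a capturing group in a language that returns the matched text. A minimal Python sketch, not a cmd.exe solution; `find_qfe_names` is a made-up name, and the pattern assumes the folder name sits between doubled backslashes as in the sample line.

```python
import re

# Two literal backslashes, SP + any one character + QFE, two more backslashes.
QFE_PATTERN = re.compile(r"\\\\(SP.QFE)\\\\", re.IGNORECASE)


def find_qfe_names(lines):
    """Return each distinct SP?QFE name found, in order of first appearance."""
    names = []
    for line in lines:
        for match in QFE_PATTERN.finditer(line):
            name = match.group(1)        # keep the case used in the file
            if name not in names:
                names.append(name)
    return names


if __name__ == "__main__":
    sample = r'"%sourcepath%\\SP2QFE\\bitsinst.exe /setupservice /resourcedll:%windir%\\system32\\xpob2res.dll"'
    print(find_qfe_names([sample]))
```

Each name returned could then be passed, with the trailing backslashes added, as the search string for the gsar call shown earlier.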

Edited by Yzöwl
example script added

@Yzöwl Thank you for the script. I've done some tests and it does work for "\\SP.QFE\\" but I think I need to take a different approach. The script will just check and save a list of the SP*QFE folders which exist and then, based on their actual names from the list, replace the strings. It's much easier for me to write and should be more efficient since there will be no need to check for non-existing folders.

As for the equivalent forms you listed (0x00000000, 0x0, 0, and empty fields padded with spaces and/or tabs): I know about them :) I'll have to add all those exceptions. So far it is just a basic structure.

@jaclaz Thanks. It works fine. Are there any precautions when using TYPE like that?
